
This really highlights how strongly models infer geography from language and how that can shift triage decisions even when severity stays consistent.

Founder of Achievers Dream - SG’s 1st C… AI Safety & Risk filtered out ⌕ thread

Qi Han Wong, you’re absolutely correct. That’s why we built a five-model architecture, with a classification layer introduced first.

Founder & CEO, MindHYVE.ai | Delivering… AI Safety & Risk filtered out ⌕ thread

I wonder if the outcome would have been different if the prompt had specified that the patient was an expat living in the U.S. To me, the model’s behavior seems fairly logical. If a symptom description is written in Japanese, Chinese, or Hindi and no location is provided, the most likely assumption is that the patient is located in a region where that language is commonly spoken. Healthcare systems, care pathways, and thresholds for recommending the ER vary significantly across countries. This becomes even more interesting with languages that are spoken across many regions. French may point to France, Belgium, Switzerland, Quebec, or several African countries. Spanish could mean Mexico, Argentina, Spain, Colombia, or many others. Even English spans countries with very different healthcare practices. The real question may not be whether the model is culturally profiling the user, but whether it should be making geographic assumptions at all. In cases where location materially affects the recommendation, asking for location first might be the safer approach.

Enterprise Systems & Decision Intellige… AI Safety & Risk filtered out ⌕ thread
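
The suggestion above, asking for location first rather than inferring it from the language of the message, could be sketched roughly as follows. This is a minimal illustration only: `triage_precheck`, its parameters, and the wording of the question are hypothetical and do not come from the original study.

```python
from typing import Optional


def triage_precheck(message_language: str, stated_location: Optional[str]) -> Optional[str]:
    """Return a clarifying question when location is unknown, otherwise None.

    Hypothetical guard: message_language is accepted but deliberately NOT
    used to guess a country, since one language can map to many regions.
    """
    if stated_location is None or not stated_location.strip():
        # No explicit location: ask instead of assuming one from the language.
        return "Where are you located right now? Care options differ by country."
    # Location was stated explicitly, so no clarification is needed.
    return None


# A French message with no location triggers a question; one with a location does not.
print(triage_precheck("fr", None))
print(triage_precheck("fr", "Quebec, Canada"))
```

The guard only decides whether to ask. How a stated location should then shape the recommendation is left open here.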

Fascinating! Healthcare recommendations interact with safety instructions too. However, do you think the low ER recommendation rate could be default (conservative) behaviour, or an artifact of scarce supervision in other languages or cultural contexts?

Trainee - ML Engineering @ General Mill… AI Safety & Risk filtered out ⌕ thread

Love this insight! That's why thorough testing against good data will be the only way to make sure that an AI system is working properly and without bias!

AI Safety & Risk relevant value: safety + fairness for: society demanding approval ⌕ thread → raw LLM

Thanks Qi Han Wong, very interesting! This maps very directly to legal AI too. Language is not jurisdiction. A Spanish prompt may require Argentine, Spanish, Mexican, or US law. An English contract may still be governed by Argentine law. If the model silently treats language as a proxy for geography or governing law, it may understand the risk correctly but route the answer through the wrong institutional pathway. For legal and compliance AI, explicit jurisdiction anchoring is not a detail. It is a safety layer: governing law, forum, user location, institutional authority, and role of the user all matter.

Senior Corporate Lawyer | Independent D… AI Safety & Risk relevant value: safety for: individual_users demanding approval ⌕ thread → raw LLM

Geographic anchoring may take care of logistical routing, but it can also erase a patient's biological identity by defaulting to Western clinical baselines, widening genetic and biological blind spots. Medical AI safety requires decoupling genetics from location and prompting for both the patient's physical location and their specific ethnic health predispositions. This problem already exists today, for example when a patient of one ethnicity visits a GP in a different geographical location. My view is that medical AI would be more effective with a regional flavour than as a one-size-fits-all solution.

AI Product & Programme Manager | AI Gov… AI Safety & Risk relevant value: safety + fairness for: vulnerable_groups demanding outrage ⌕ thread → raw LLM

Alex Smirnoff To my understanding, France and Russia both have ER-oriented healthcare cultures (urgences in France, skoraya pomoshch in Russia), so this is aligned with the US.

Google | AI Product Builder | Researchi… AI Safety & Risk filtered out ⌕ thread

This is great research. Thank you for doing this.

Product & Commercial Counsel | Legal AI… AI Safety & Risk filtered out ⌕ thread

Wow! Very interesting finding! 🤯 Do you think the bias of treating language as a proxy for geography is limited to healthcare, or is it a broader issue across domains? And should location be explicitly anchored in prompts to ensure correct grounding? 🤔

MarTech & Operations Leader | AI Busine… AI Safety & Risk filtered out ⌕ thread

Interesting demonstration. This highlights why autonomous medical triage is a regulated, high-risk application. Variability in recommendations from the same presentation can have real clinical and economic consequences. As a neuro-ophthalmologist, I would also question whether the observed differences truly reflect distinct clinical norms. The vignette lacks information that would typically be needed for disposition, making it difficult to know whether the recommendation is being driven by the clinical presentation itself or by assumptions associated with language and geography. In that sense, the finding may be even more important: the model appears willing to make a disposition recommendation despite substantial clinical uncertainty.

Founder & CEO | Neurologist & Neuro-Oph… AI Safety & Risk filtered out ⌕ thread

Ankur Pandey thought you would want to look at this

Trust Layer For Healthcare AI AI Safety & Risk filtered out ⌕ thread

Qi-Han Wong it may be the reason indeed. Most probably it is also true for the rest of Europe.

AI & ML Applications AI Safety & Risk filtered out ⌕ thread

Is it really a safety failure to use a large language model for medical guidance? Doesn't embedding the "solution" in the prompt simply double down on this technology's limitations?

The AI Blindspot Guy - Helping companie… AI Safety & Risk filtered out ⌕ thread

Vivek Khandelwal, very interesting. The models are getting more capable and can decipher more and more context, and they will increasingly decipher more than we want them to.

3x AI/SaaS founder • AI products / impa… AI Safety & Risk filtered out ⌕ thread

Great finding. These nuances needed to be discovered, documented, and communicated. Well done.

PhD * Ensuring the Artificial Intellige… AI Safety & Risk filtered out ⌕ thread

Great testing idea💡

Computer vision engineer in Automotive … AI Safety & Risk filtered out ⌕ thread

Shahnaz Miri, MD, MBA Exactly right - the model's willingness to commit to a disposition despite incomplete clinical information is perhaps the more fundamental finding.

Google | AI Product Builder | Researchi… AI Safety & Risk filtered out ⌕ thread

Angharad Hurley, interesting research!

AI Prompt Engineer | Safety-Focused Red… AI Safety & Risk filtered out ⌕ thread

The detail that should worry people most is that the severity score held at 8.0 across every language. The model understood the danger. It just routed that danger into a different action based on a location nobody asked it to assume. That kind of failure passes a clean English eval and only surfaces once real users hit it. Anchoring location fixes this case, but the bigger lesson is that evals have to hold everything constant except the variable you are testing, or the divergence stays invisible until production.

Technical Program Manager | Shipping AI… AI Safety & Risk filtered out ⌕ thread
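
The point in the comment above, that an eval should hold everything constant except the variable under test, could be sketched along these lines. It assumes each model response has already been parsed into a severity score and a disposition. `TriageResult`, `find_divergences`, the tolerance value, and the sample data are all hypothetical illustrations, not the original study's harness or results.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TriageResult:
    vignette_id: str   # same clinical case across all languages
    language: str      # the only variable allowed to change
    severity: float    # severity score reported by the model
    disposition: str   # recommended action, e.g. "ER" or "clinic"


def find_divergences(results, severity_tolerance=0.5):
    """Flag vignettes where severity is stable but disposition differs by language."""
    by_vignette = defaultdict(list)
    for r in results:
        by_vignette[r.vignette_id].append(r)

    flagged = {}
    for vid, group in by_vignette.items():
        severities = [r.severity for r in group]
        dispositions = {r.language: r.disposition for r in group}
        severity_stable = max(severities) - min(severities) <= severity_tolerance
        # Stable severity with more than one distinct action is the pattern of interest.
        if severity_stable and len(set(dispositions.values())) > 1:
            flagged[vid] = dispositions
    return flagged


# Made-up records for illustration only.
results = [
    TriageResult("case-1", "en", 8.0, "ER"),
    TriageResult("case-1", "ja", 8.0, "clinic"),
    TriageResult("case-2", "en", 3.0, "self-care"),
    TriageResult("case-2", "ja", 3.0, "self-care"),
]
print(find_divergences(results))
```

Grouping by vignette rather than by language keeps the clinical presentation fixed, so any flagged difference is attributable to the language of the prompt and not to the case itself.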
