
Whom Clinical AI Serves


A few years ago the most widely deployed sepsis prediction model in American hospitals was independently evaluated. Sepsis is what happens when an ordinary infection tips the body into a chain reaction that starts shutting down organs. It can kill within hours, often before anyone realizes how sick the patient is. The model built to catch it caught only one in three real cases, and flagged about one in five hospitalized patients as high risk.


Clinicians learned fast. The alert fires constantly on patients who are not deteriorating, and misses most of the ones who are. After a few weeks you stop trusting it. You have to stop trusting it. The alert becomes noise. A system bought to catch deterioration had trained everyone to ignore deterioration.
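The arithmetic behind that loss of trust is short enough to write down. The sketch below takes the two figures from the evaluation (one in three cases caught, one in five patients flagged) and adds one number that is not in this article: an assumed sepsis rate of 7% among hospitalized patients, chosen purely for illustration. The function name `alert_precision` is also just a label for this example.

```python
from fractions import Fraction


def alert_precision(sensitivity, alert_rate, prevalence):
    """Share of alerts that land on a patient who really has sepsis.

    Per hospitalized patient:
      true alerts = sensitivity * prevalence
      all alerts  = alert_rate
    """
    if not 0 < alert_rate <= 1:
        raise ValueError("alert_rate must be in (0, 1]")
    return sensitivity * prevalence / alert_rate


sensitivity = Fraction(1, 3)    # from the evaluation: one in three real cases caught
alert_rate = Fraction(1, 5)     # from the evaluation: one in five patients flagged
prevalence = Fraction(7, 100)   # ASSUMPTION: illustrative sepsis rate, not from this article

precision = alert_precision(sensitivity, alert_rate, prevalence)
false_per_true = (1 - precision) / precision

print(f"{float(precision):.1%} of alerts are real")
print(f"{float(false_per_true):.1f} false alarms for every real one")
```

Under that assumed rate, fewer than one alert in eight points at a real case, and two thirds of real cases never trigger an alert at all. Change the assumed prevalence and the exact figure moves, but the shape of the problem stays the same.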


This is the first problem: bad math. No routing decision fixes it. You cannot wire a broken system to serve patients. The sepsis model failed at validation and nothing downstream changes that.


That was 2021. This month MIRA appeared in Nature. It's an autonomous medical AI agent that reads patient histories, orders tests, interprets results, and writes treatment plans. On identical cases it hit 87.8% diagnostic accuracy against 78.1% for board-certified physicians. It follows standard of care more consistently than the physicians it was measured against. This is what the current frontier looks like.


MIRA passed validation. The math works. Which makes a different question urgent.


That same model, with identical outputs, serves completely different people depending on one choice: where the prediction goes.


If the output reaches a clinician first, it serves the patient, because the clinician can act on it.


If it reaches a coverage decision first, it serves the insurer. High-risk patients get flagged for denial before they ever get sick, and the patient never sees the signal at all.


Route it to bed management and it serves the hospital's logistics, moving patients through beds faster. Same accuracy, different master. The patient still gets diagnosed eventually, but the hospital made the routing decision, not the patient.


The model never changed. The math is sound. Who it serves comes down to who decided where the signal goes.


So the question worth asking about any of these systems is who sees the output, who acts on it, and who pays when it's wrong.


For the sepsis model, that question is almost beside the point, because the model is broken. The cost lands everywhere: on the doctor or nurse who learned to tune out alarms, on the hospital carrying the liability, and underneath all of it on the patient, whose deterioration goes unnoticed while everyone ignores a system that cried wolf too many times.


For MIRA, the answer isn't an accident. Someone decided where the signal goes and built the system to send it there.


And the answer has to be specific. "It serves patients" is the kind of thing that sounds right and commits to nothing. "It serves the bed-management team, who see an output the patient never will" is specific. If you can't name the party, you haven't actually answered.


If you work at a hospital buying this technology, you still have to validate it. Clinical evidence, bias testing, the documented failure modes. That work is real, and MIRA cleared it across eight diagnoses where the sepsis model couldn't clear it at all.


But validation only tells you whether an AI or prediction model works. It tells you nothing about whom it serves. A model can pass every test you throw at it and still be wired to serve the wrong party. Perfect evidence and bad incentives live comfortably in the same system. They are different problems, and passing the first does not touch the second.


You need both. Good math, and then a real answer to who it serves. Most systems never get the math right. Most of the ones that do never ask the second question.


General LLMs are now moving straight into hospitals. Patients are using ChatGPT, Claude, and the rest to read their own lab results and to decide whether something is worth a visit. Hospitals are deploying them without ever deciding where the output is supposed to go. That isn't the unsettling part.


The unsettling part is that the patient has quietly become the router. Someone reads a lab value, asks a chatbot what it means, and the whole exchange happens outside any hospital, any clinician, any framework built to be accountable to them. Whatever shapes that answer was decided upstream, by whoever trained the model and set its incentives. Not by anyone who answers to the patient.


A clinician tuning out a broken alarm is a local failure, one bad procurement decision in one hospital. A patient diagnosing themselves through a consumer chatbot is something larger: clinical judgment leaving the clinic entirely, moving into systems no one in medicine controls. The question is no longer where the hospital sends the signal. It's what the company that built the model decided you should know.


None of this requires understanding how a neural network works. It only requires asking the question. Who does this serve, named specifically enough that you could write it down: the actual party whose interests shaped what you're allowed to see.




June 26, 2026

 
 


Copyright 2024 - Zohaib Akhtar
