Harvard AI Emergency Room Diagnosis Study Finds Model Beat Two Doctors at Triage


A new Harvard-led study highlights how artificial intelligence might reshape emergency medicine. Researchers found that an OpenAI model made more accurate early diagnoses than two human doctors in a real hospital.

The study, published in Science by a team from Harvard Medical School and Beth Israel Deaconess Medical Center, reported that the model outperformed physicians at high-pressure emergency room triage: the first stage of assessment, when patients arrive and doctors must make quick decisions with limited information.

What the Harvard team tested

The researchers studied 76 patients who visited Beth Israel’s emergency room.

TechCrunch explained that the study compared diagnoses from two internal medicine doctors with those from OpenAI’s o1 and GPT-4o models. Two other doctors then reviewed the answers without knowing whether they came from humans or AI.

The Guardian said both the AI and the doctors received the same electronic health records, including vital signs, demographic information, and short nursing notes.

The o1 model made the correct or nearly correct diagnosis in 67% of triage cases, while the human doctors were correct 50% to 55% of the time.

Why the result matters

The AI showed its biggest advantage when doctors had the least information, a gap that was most pronounced during initial ER triage, when decisions must be made quickly with little patient data. As more clinical details became available, o1’s accuracy rose to 82% while the human experts reached 70% to 79%, but this later difference was not statistically significant.

Researchers say this is not a replacement story

Despite these results, the study’s authors and other experts said AI is not ready to replace emergency doctors. The researchers did not claim AI could make real life-or-death decisions alone. Instead, they called for more trials in real patient-care settings.

The study looked only at text-based information, and the researchers said current AI models remain limited when dealing with non-text data. The trial did not test whether AI could recognize a patient’s visible distress or appearance, so the system acted more like a second opinion based on paperwork than a full doctor at the bedside.

Caution remains over accountability and overhype

The conversation about the study soon moved beyond just accuracy.

Lead author Arjun Manrai said the results do not mean “AI replaces doctors,” but they do reflect a “profound change in technology” that could reshape medicine.

Another lead author, Dr. Adam Rodman, said there is “not a formal framework right now for accountability,” and emphasized that patients still want human doctors to guide them through difficult treatment and life-or-death decisions.

Emergency physician Kristen Panthagani added that some headlines were overhyped because the study compared internal medicine doctors, not ER specialists.

Overall, both reports suggest the Harvard study is an important step for AI in diagnosis, but it does not mean machines are ready to run the emergency room.
