Collaborative AI passes U.S. medical exams

News

Abstract

A council of five AI models working together, discussing their answers through an iterative process, achieved 97%, 93%, and 94% accuracy on 325 medical exam questions spanning the three stages of the U.S. Medical Licensing Examination (USMLE), according to a study published in PLOS Medicine by researcher Yahya Shaikh of Baltimore, U.S., and colleagues. When the council's responses diverge, a facilitator algorithm runs a deliberative process: it summarizes the reasoning in each response and asks the council to deliberate and re-answer the original question. Given 325 publicly available USMLE questions, covering foundational biomedical sciences as well as clinical diagnosis and management, the system reached consensus responses that were correct 97%, 93%, and 94% of the time for Step 1, Step 2 CK, and Step 3, respectively, outperforming single-instance GPT-4 models. According to the authors, the work provides the first clear evidence that AI systems can self-correct through structured dialog, with the collective performing better than any single AI.
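The deliberation loop described above can be illustrated with a minimal Python sketch. It is not the study's implementation: the member interface, the `summarize` helper, the round limit, and the majority-vote fallback are all assumptions made for illustration, and the toy members stand in for real model calls.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple

# Hypothetical interface: a council member receives the question plus an
# optional facilitator summary and returns (answer, reasoning).
Member = Callable[[str, Optional[str]], Tuple[str, str]]


def summarize(responses: List[Tuple[str, str]]) -> str:
    """Stand-in facilitator summary: list each answer with its reasoning."""
    return "\n".join(f"Answer {a}: {r}" for a, r in responses)


def deliberate(question: str, council: List[Member],
               max_rounds: int = 5) -> Tuple[str, int]:
    """Re-ask the council until every member agrees or max_rounds is reached.

    Returns (answer, rounds_used). The majority-vote fallback at the round
    limit is an assumption; the article does not describe one.
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    summary: Optional[str] = None
    answers: List[str] = []
    for round_no in range(1, max_rounds + 1):
        responses = [member(question, summary) for member in council]
        answers = [answer for answer, _ in responses]
        if len(set(answers)) == 1:      # consensus reached
            return answers[0], round_no
        summary = summarize(responses)  # divergent: summarize and re-ask
    return Counter(answers).most_common(1)[0][0], max_rounds


def stubborn(answer: str) -> Member:
    """Toy member that always gives the same answer."""
    return lambda question, summary: (answer, "fixed prior")


def persuadable(first: str, later: str) -> Member:
    """Toy member that changes its answer once it sees a summary."""
    def member(question: str, summary: Optional[str]) -> Tuple[str, str]:
        if summary is None:
            return first, "initial guess"
        return later, "revised after reading the summary"
    return member


council = [stubborn("B")] * 3 + [persuadable("A", "B")] * 2
print(deliberate("Toy question", council))
```

In this toy run the council splits three to two in the first round, the two persuadable members switch after seeing the summary, and consensus arrives in the second round. A real system would replace the toy members with calls to language models and would need its own policy for questions that never converge.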


Key Data

Publication Date: 09 October 2025
Primary Author: Public Library of Science
Source: Medical Xpress
Language: English
