More than 100 million Americans will be treated this year by physicians using AI. According to The Information, OpenEvidence — often described as “ChatGPT for doctors” — is on track to become one of only eight applied-AI companies to surpass $10B in valuation and $100M in revenue.
Adoption at Clinical Scale
- Used by ~45% of U.S. physicians
- 65,000+ new clinicians onboarded monthly
- ~20 million physician consultations per month, up from ~385,000 a year ago
- Physician usage far exceeds general AI tools: 45% OpenEvidence vs. 16% ChatGPT vs. 5% Abridge (OffCall survey of 1,000 U.S. physicians)
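The year-over-year growth implied by the consultation figures above can be sanity-checked with quick arithmetic (a sketch using only the numbers reported; the growth multiple is derived here, not stated in the report):

```python
# Sanity-check the year-over-year consultation growth implied by the stats above.
consults_now = 20_000_000    # ~20 million physician consultations per month
consults_year_ago = 385_000  # ~385,000 per month a year ago

growth_factor = consults_now / consults_year_ago
print(f"Implied YoY growth: ~{growth_factor:.0f}x")  # ~52x in twelve months
```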
Economics That Stand Out
- ~$150M annualized revenue (≈$12.5M/month run rate, tripled in four months)
- >90% gross margins
- Only ~10% of ad inventory monetized, implying >$1B potential annual revenue
- Reportedly raising $250M at a $12B valuation, doubling in ~60 days
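The economics bullets above hang together arithmetically; a short sketch makes the derived figures explicit (the monthly run rate and the full-monetization estimate are computed from the reported $150M and ~10% figures, not quoted directly):

```python
# Check the figures implied by the economics section.
annualized_revenue = 150e6  # ~$150M annualized revenue
monthly_run_rate = annualized_revenue / 12
print(f"Monthly run rate: ${monthly_run_rate / 1e6:.1f}M")  # ~$12.5M/month

monetized_share = 0.10  # only ~10% of ad inventory monetized today
implied_at_full_monetization = annualized_revenue / monetized_share
print(f"Implied potential: ${implied_at_full_monetization / 1e9:.1f}B")  # ~$1.5B/yr, consistent with >$1B
```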
How Physicians Actually Use Clinical AI
A new study in npj Digital Medicine (Siden et al.) examined how physicians interact with AI chatbots during real clinical reasoning tasks.
Study design
- Interviews with 22 U.S. physicians
- Chat-log analysis from two randomized controlled trials
- 253 diagnostic and management vignettes
- Clinical reasoning scored by blinded physicians
Four dominant usage patterns emerged
- Full case copy-paste
- Selective copy-paste (labs, key findings)
- Physician-written summaries
- Short, search-like queries (Google-style)
Key finding
No interaction style consistently improved clinical reasoning performance.
Providing more context did not reliably lead to better outcomes.
Instead, performance was influenced more by:
- Task type (diagnosis vs. management)
- Physician interpretation of AI output
- Judgment about when to rely on vs. override AI
Why This Matters Now
AI adoption in medicine is no longer theoretical — it is already shaping care at population scale. The critical questions are shifting:
- Which clinical decisions benefit from AI support — and which may be harmed?
- How does AI influence physician judgment, not just accuracy?
- How should clinical AI be designed around real workflows, not prompt engineering?
- Where does human oversight add the most value?
The takeaway:
Training physicians how to prompt may matter far less than designing AI systems that fit clinical tasks, decision points, and accountability structures.
