Emily Alsentzer

Assistant Professor

Stanford University

I am an Assistant Professor of Biomedical Data Science and, by courtesy, of Computer Science at Stanford University. The core goal of my research is to augment clinical decision making and broaden access to high-quality healthcare through the use of machine learning (ML) and natural language processing. I leverage heterogeneous clinical data, such as electronic health records and genomic data, to provide actionable insights to clinicians, researchers, and patients. I also develop new methods to infuse biomedical knowledge from knowledge graphs and text into machine learning algorithms. My work is motivated by the question: how can we design trustworthy machine learning methods that excel in settings with limited annotated data and can be deployed safely and effectively into clinical workflows?

Previously, I was a postdoctoral fellow at Brigham and Women’s Hospital and Harvard Medical School (HMS), where I worked to deploy ML models within the Mass General Brigham healthcare system. I received my PhD from the Health Sciences & Technology (HST) program at MIT & HMS, co-advised by Zak Kohane and Pete Szolovits. During my PhD, I created ClinicalBERT, a language model trained on electronic health records that has millions of downloads on Hugging Face, and developed SHEPHERD, a graph neural network approach for the diagnosis of patients with rare genetic diseases in the Undiagnosed Diseases Network.

Interests
  • Deployable Machine Learning
  • Few-Shot Learning
  • LLMs & Foundation Models
  • Graph Neural Networks
  • Summarization
  • Rare Disease Diagnosis
Education
  • PhD in Medical Engineering & Medical Physics (HST), 2022

    Massachusetts Institute of Technology

  • MS in Biomedical Informatics, 2017

    Stanford University

  • BS in Computer Science, 2016

    Stanford University

[October 2024] I start as an Assistant Professor at Stanford University. My lab is recruiting students and postdocs to advance trustworthy, deployable AI methods for healthcare.

[March 2024] Our perspective on leveraging large language models to foster equity in healthcare was published in JAMIA.

[July 2023] Our work on assessing racial and gender bias in GPT-4 for medical applications was featured in STAT News. Our work on few-shot diagnosis of rare disease patients received a Best Oral Presentation Award at ISMB.

[June 2023] Our paper "Do We Still Need Clinical Language Models?" received a Best Paper Award at CHIL 2023.

[April 2023] I was awarded a grant from Microsoft’s Accelerate Foundation Models Research Initiative to study the use of LLMs for clinical summarization.

Recent Publications

Coding Inequity: Assessing GPT-4's Potential for Perpetuating Racial and Gender Biases in Healthcare. medRxiv, 2023.
Do We Still Need Clinical Language Models? Conference on Health, Inference, and Learning, 2023.
Towards Medical Billing Automation: NLP for Outpatient Clinician Note Classification. medRxiv, 2023.
Zero-shot Interpretable Phenotyping of Postpartum Hemorrhage Using Large Language Models. medRxiv, 2023.
Deep learning for diagnosing patients with rare genetic diseases. medRxiv, 2022.