Emily Alsentzer

Assistant Professor

Stanford University

I am an Assistant Professor of Biomedical Data Science and, by courtesy, of Computer Science at Stanford University. The core goal of my research is to augment clinical decision making and broaden access to high-quality healthcare through machine learning (ML) and natural language processing. I leverage heterogeneous clinical data, such as electronic health records and genomic data, to provide actionable insights to clinicians, researchers, and patients, and I develop new methods to infuse biomedical knowledge found in knowledge graphs and text into machine learning algorithms. My work is motivated by the question: how can we design trustworthy machine learning methods that excel in settings with limited annotated data and can be deployed safely and effectively into clinical workflows?

Previously, I was a postdoctoral fellow at Brigham and Women’s Hospital and Harvard Medical School (HMS), where I worked to deploy ML models within the Mass General Brigham healthcare system. I received my PhD from the Health Sciences & Technology (HST) program at MIT & HMS, co-advised by Zak Kohane and Pete Szolovits. During my PhD, I created ClinicalBERT, a language model trained on electronic health records that has millions of downloads on HuggingFace, and developed SHEPHERD, a graph neural network approach for the diagnosis of patients with rare genetic diseases in the Undiagnosed Diseases Network.

Interests

Deployable Machine Learning
Few-Shot Learning
LLMs & Foundation Models
Graph Neural Networks
Summarization
Rare Disease Diagnosis

Education

PhD in Medical Engineering & Medical Physics (HST), 2022
Massachusetts Institute of Technology
MS in Biomedical Informatics, 2017
Stanford University
BS in Computer Science, 2016
Stanford University

News


[October 2024] I start as an Assistant Professor at Stanford University. My lab is recruiting students and postdocs to advance trustworthy, deployable AI methods for healthcare.

[March 2024] Our perspective on leveraging large language models to foster equity in healthcare was published in JAMIA.

[July 2023] Our work on assessing racial and gender bias in GPT-4 for medical applications was featured in STAT News. Our work on few-shot diagnosis of rare disease patients received a Best Oral Presentation Award at ISMB.

[June 2023] Our paper "Do We Still Need Clinical Language Models?" received a Best Paper Award at CHIL 2023.

[April 2023] I was awarded a grant from Microsoft’s Accelerate Foundation Models Research Initiative to study the use of LLMs for clinical summarization.

Recent Publications


Travis Zack, Eric Lehman, Mirac Suzgun, Jorge A Rodriguez, Leo Anthony Celi, Judy Gichoya, Dan Jurafsky, Peter Szolovits, David W Bates, Raja-Elie E Abdulnour, Atul Butte, Emily Alsentzer. Coding Inequity: Assessing GPT-4's Potential for Perpetuating Racial and Gender Biases in Healthcare. medRxiv, 2023.

Eric Lehman, Evan Hernandez, Diwakar Mahajan, Jonas Wulff, Micah J Smith, Zachary Ziegler, Daniel Nadler, Peter Szolovits, Alistair Johnson, Emily Alsentzer. Do We Still Need Clinical Language Models? Conference on Health, Inference, and Learning, 2023.

Matthew G Crowson, Emily Alsentzer, Julie M Fiskio, David Bates. Towards Medical Billing Automation: NLP for Outpatient Clinician Note Classification. medRxiv, 2023.

Emily Alsentzer, Matthew J Rasmussen, Romy Fontoura, Alexis L Cull, Brett Beaulieu-Jones, Kathryn J Gray, David W Bates, Vesela P Kovacheva. Zero-shot Interpretable Phenotyping of Postpartum Hemorrhage Using Large Language Models. medRxiv, 2023.

Emily Alsentzer, Michelle M Li, Shilpa N Kobren, Undiagnosed Diseases Network, Isaac S Kohane, Marinka Zitnik. Deep learning for diagnosing patients with rare genetic diseases. medRxiv, 2022.
