Hello! I am currently an Applied Scientist at Amazon Store Foundational AI (SFAI), where I work on large-scale LLM post-training and alignment for Amazon's Rufus LLM. My work focuses on reinforcement learning, instruction fine-tuning, synthetic data generation, and evaluation methods to improve LLM reasoning, reliability, and controllability.
Previously, I was a Postdoctoral Researcher at Stanford University, advised by Prof. Nigam H. Shah and Prof. Sanmi Koyejo, where I worked on LLM post-training, synthetic data pipelines, and evaluation methods for healthcare. I received my Ph.D. in Computer Science from Emory University, advised by Prof. Carl Yang and working closely with Prof. Joyce C. Ho. I earned my bachelor's degree in Software Engineering from Tongji University, where I graduated as Valedictorian (GPA 4.9/5.0, Rank 1/164) and received National Scholarships for three consecutive years; there I worked with Prof. Tianwei Yu on machine learning research. I also spent a summer as a research intern at the Perk Lab at Queen's University in Canada, working with Prof. Gabor Fichtinger.
News
- [2025.12] Our work MedHELM: Holistic Evaluation of Large Language Models for Medical Tasks has been officially accepted for publication in Nature Medicine. Huge thanks to all collaborators for making this possible!
- [2025.08] Our work on temporal instruction tuning (TIMER ⏱️) has been officially accepted for publication in npj Digital Medicine. Grateful to Dr. Shah and all collaborators for their valuable contributions and support!
- [2025.07] Our work CuraBench: A Benchmark Dataset Generation System for Healthcare AI Evaluation is accepted to KDD 2025 HealthDay. Thanks to all the collaborators!
- [2025.06] One paper on knowledge-enhanced reasoning is accepted to AMIA 2025.
- [2025.06] Our survey on knowledge graphs for healthcare is accepted to the Journal of Biomedical Informatics. Congrats!
- [2025.06] Our work on multimodal EHR is accepted to MedInfo 2025.
- [2025.05] Our work MedHELM: Holistic Evaluation of Large Language Models for Medical Tasks is now available online. Thank you to all the coauthors for their hard work!
- [2025.05] Our work CLIMB: Data Foundations for Large Scale Multimodal Clinical Foundation Models is accepted to ICML 2025. Congrats Wei!
- [2025.05] Our work Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey is accepted to ACM Computing Surveys.
- [2025.03] We build MedHELM ✨: a comprehensive benchmark evaluating AI on realistic clinical tasks that healthcare professionals perform daily, rather than just medical exams. Check out our HAI blogpost and the MedHELM leaderboard for more details.
- [2025.02] We release three de-identified, longitudinal EHR datasets from Stanford (more details), now freely available for non-commercial research use worldwide.
- [2025.02] Pleased to share that our grant proposal CuraBench has been selected for funding through the Stanford RAISE Health Seed Grant Program and Stanford HAI. A big thank you to all the collaborators for their support!
- [2024.12] Served as a Junior Chair at the Foundation Models and Multimodal AI Round Table at the ML4H 2024 Symposium. The topic summary can be found here.
- [2024.09] Our work on clinician-preference-aligned synthetic instruction generation for visual instruction tuning is accepted to the NeurIPS'24 Datasets and Benchmarks Track.
- [2024.09] Our work on network recall by large language models is accepted to the NeurIPS'24 Research Track. An abstract version was presented at IC2S2'24 as an Oral.
- [2024.08] Excited to be selected as a Rising Star Spotlight Speaker to give a talk about my research at the University of Michigan 2024 AI Symposium.
- [2024.06] Our work on multi-agent LLM reasoning for EHR-based few-shot disease prediction is accepted to the AMIA Annual Symposium as an Oral.
- [2024.05] Our work on disease subtyping is accepted to the KDD Applied Data Science Track.
- [2024.04] My PhD work is selected for the CHIL Doctoral Symposium. Thank you CHIL!
- [2024.04] I successfully defended my dissertation. Officially, Dr. Cui!
- [2024.03] Our survey paper on LLM domain specialization is cited by the 2024 Economic Report of the President!
- [2023.12] Glad to receive an NSF Student Travel Support Award for ICDM 2023.
- [2023.09] Our paper on artificial node features for non-attributed graphs has been selected as one of the Most Influential CIKM Papers by Best Paper Digest.
- [2023.09] One work on visual knowledge extraction is accepted to NeurIPS'23.
- [2023.09] Two papers are accepted to PSB'24. Congrats to Alexis (a high-schooler)!
- [2023.08] Humbled to be selected as a 2023 EECS Rising Star!
- [2023.05] One work on biological data augmentation is accepted to KDD'23.
- [2023.05] One work on multimodal extraction is accepted to ACL'23 Findings.
- [2023.04] Our work on brain network pre-training is accepted to CHIL'23 as an Oral.
- [2022.11] Our paper on few-shot learning is accepted to AAAI'23 as an Oral.
- [2022.11] Glad to receive the NeurIPS AI4Science Travel Award!
- [2022.10] Our benchmark paper on Graph Neural Networks for brain networks has been officially accepted to IEEE TMI.
- [2022.09] Our paper on brain transformers is accepted to NeurIPS'22 as a Spotlight.
- [2022.08] One paper on node features for non-attributed graphs is accepted to CIKM'22.
- [2022.06] One paper on interpretable GNNs is accepted to MICCAI'22 as an Oral.
Interests
- LLM Post-training and Alignment (RL/IFT)
- Multimodal LLMs
- LLM Evaluation and Benchmarking
- AI for Healthcare
Education
Ph.D. in Computer Science, 2019-2024
Emory University
B.Eng. in Software Engineering, 2015-2019
Tongji University