Hejie Cui

Applied Scientist, Amazon SFAI Rufus LLM & Postdoctoral Researcher

Amazon Store Foundation AI (SFAI)

Stanford University

Hello! I am currently an Applied Scientist at Amazon Store Foundational AI (SFAI), where I work on large-scale LLM post-training and alignment for Amazon's Rufus LLM. My work focuses on reinforcement learning, instruction fine-tuning, synthetic data generation, and evaluation methods to improve LLM reasoning, reliability, and controllability.

Previously, I was a Postdoctoral Researcher at Stanford University, advised by Prof. Nigam H. Shah and Prof. Sanmi Koyejo, where I worked on LLM post-training, synthetic data pipelines, and evaluation methods for healthcare.

I received my Ph.D. in Computer Science from Emory University, advised by Prof. Carl Yang and working closely with Prof. Joyce C. Ho. I earned my bachelor's degree in Software Engineering from Tongji University, graduating as Valedictorian (GPA 4.9/5.0, Rank 1/164) and receiving National Scholarships for three consecutive years. During that time, I worked with Prof. Tianwei Yu on machine learning research. I also spent a summer as a research intern at the Perk Lab at Queen's University in Canada, working with Prof. Gabor Fichtinger.

News

  • [2025.12] Our work MedHELM: Holistic Evaluation of Large Language Models for Medical Tasks has been officially accepted for publication in Nature Medicine. Huge thanks to all collaborators for making this possible!
  • [2025.08] Our work on temporal instruction tuning (TIMER ⌛) has been officially accepted for publication in npj Digital Medicine. Grateful to Dr. Shah and all collaborators for their valuable contributions and support!
  • [2025.07] Our work CuraBench: A Benchmark Dataset Generation System for Healthcare AI Evaluation is accepted to KDD 2025 HealthDay. Thanks to all the collaborators!
  • [2025.06] One paper on knowledge-enhanced reasoning is accepted to AMIA 2025.
  • [2025.06] Our survey on knowledge graphs for healthcare is accepted to Journal of Biomedical Informatics. Congrats!
  • [2025.06] Our work on multimodal EHR is accepted to MedInfo 2025.
  • [2025.05] Our work MedHELM: Holistic Evaluation of Large Language Models for Medical Tasks is now available online. Thank you to all the coauthors for their hard work!
  • [2025.05] Our work CLIMB: Data Foundations for Large Scale Multimodal Clinical Foundation Models is accepted to ICML 2025. Congrats Wei!
  • [2025.05] Our work on Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey is accepted to ACM Computing Surveys.
  • [2025.03] We built MedHELM ✨: a comprehensive benchmark evaluating AI on realistic clinical tasks that healthcare professionals perform daily instead of just medical exams. Check out our HAI blogpost and MedHELM leaderboard for more details.
  • [2025.02] We release three de-identified, longitudinal EHR datasets from Stanford (more details), now freely available worldwide for non-commercial research use.
  • [2025.02] Pleased to share that our grant proposal CuraBench has been selected for funding by the Stanford RAISE Health Seed Grant Program and Stanford HAI. A big thank you to all the collaborators for their support!
  • [2024.12] Served as a Junior Chair at the Foundation Models and Multimodal AI Round Table at the ML4H 2024 Symposium. The topic summary can be found here.
  • [2024.09] Our work on clinician preference aligned synthetic instruction generation for visual instruction tuning is accepted to NeurIPS'24 Datasets and Benchmarks Track.
  • [2024.09] Our work on network recall by large language models is accepted to NeurIPS'24 Research Track. An abstract version was presented at IC2S2'24 as an Oral.
  • [2024.08] Excited to be selected as a Rising Star Spotlight Speaker to give a talk about my research at the University of Michigan 2024 AI Symposium.
  • [2024.06] Our work on multi-agent LLM reasoning for EHR-based few-shot disease prediction is accepted to AMIA Annual Symposium as an Oral.
  • [2024.05] Our work on disease subtyping is accepted to KDD Applied Data Science Track.
  • [2024.04] My PhD work is selected for the CHIL Doctoral Symposium. Thank you CHIL!
  • [2024.04] I successfully defended my dissertation. Officially, Dr. Cui!
  • [2024.03] Our survey paper on LLM domain specialization is cited by the 2024 Economic Report of the President!
  • [2023.12] Glad to receive the NSF Student Travel Support Award for ICDM 2023.
  • [2023.09] Our paper on artificial node features on non-attributed graphs has been selected as one of the Most Influential CIKM Papers by Best Paper Digest.
  • [2023.09] One work on visual knowledge extraction is accepted to NeurIPS'23.
  • [2023.09] Two works are accepted to PSB'24. Congrats to Alexis (a high schooler)!
  • [2023.08] Humbled to be selected as a 2023 EECS Rising Star!
  • [2023.05] One work on biological data augmentation is accepted to KDD'23.
  • [2023.05] One work on multimodal extraction is accepted to ACL'23 Findings.
  • [2023.04] Our work on brain network pre-training is accepted to CHIL'23 as an Oral.
  • [2022.11] Our paper on few-shot learning is accepted to AAAI'23 as an Oral.
  • [2022.11] Glad to receive NeurIPS AI4Science Travel Award!
  • [2022.10] Our benchmark paper on Graph Neural Networks for brain networks has now been officially accepted to IEEE TMI.
  • [2022.09] Our paper on brain transformer is accepted to NeurIPS'22 as a Spotlight.
  • [2022.08] One paper on node features for non-attributed graphs is accepted to CIKM'22.
  • [2022.06] One paper on interpretable GNNs is accepted to MICCAI'22 as an Oral.

Interests

  • LLM Post-training and Alignment (RL/IFT)
  • Multimodal LLMs
  • LLM Evaluation and Benchmarking
  • AI for Healthcare

Education

  • Ph.D. in Computer Science, 2019-2024

    Emory University

  • B.Eng. in Software Engineering, 2015-2019

    Tongji University

Latest