Michael Hu

I am a fourth-year PhD student at the NYU Center for Data Science, advised by Kyunghyun Cho and Tal Linzen. I am supported by the NSF GRFP.
I study how to train and adapt large language models.
- Curriculum learning: [Aioli], [pre-pretraining]
- Online adaptation: in-context active learning
I’m also interested in ML x {cognitive science, visualization}.
- Cognitive science: [On human-scale LMs], [BabyLM]
- Visualization: [Latent state models of training dynamics], [How to visualize training dynamics]
Previously, I completed a BSE at Princeton CS, where I spent two lovely years working with Karthik Narasimhan and Tom Griffiths. I then spent two years at Yobi AI as its first employee.
In my spare time, I enjoy cooking, running, and playing basketball.
news
- Jul 6, 2025: New preprints: “Scaling Laws Are Unreliable for Downstream Tasks” and “RELIC: Evaluating Compositional Instruction Following via Language Recognition”.
- Jun 26, 2025: Gave talks on “Between Circuits and Chomsky” at École Normale Supérieure CoML, FLaNN, Ryco Lab Reading Group, and CDS Seminar.
- Mar 15, 2025: “Aioli: A Unified Optimization Framework for Language Model Data Mixing” and “How to visualize training dynamics in neural networks” accepted to ICLR 2025.
- Feb 27, 2025: New preprint: “Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases”.
- Jul 17, 2024: New preprint: “The importance of human-scale language modeling for psycholinguistics”.
selected publications
2025
- ACL: Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases. In Proceedings of the Association for Computational Linguistics, Jul 2025
- JML: Bigger is not always better: The importance of human-scale language modeling for psycholinguistics. Journal of Memory and Language, Jul 2025
2023
- TMLR: Latent State Models of Training Dynamics. Transactions on Machine Learning Research, Jul 2023