Michael Hu

I am a third-year PhD student at the NYU Center for Data Science, working with Kyunghyun Cho and Tal Linzen. I work on algorithms that optimize the training data of language models. I am supported by an NSF Graduate Research Fellowship.
Previously, I completed a BSE at Princeton CS, where I spent two lovely years working with Karthik Narasimhan and Tom Griffiths.
In my spare time, I enjoy cooking, running, and playing basketball.
news
| Date | News |
| --- | --- |
| Mar 26, 2025 | Gave talks on “Between Circuits and Chomsky” at École Normale Supérieure CoML and FLaNN. |
| Mar 15, 2025 | “Aioli: A Unified Optimization Framework for Language Model Data Mixing” and “How to visualize training dynamics in neural networks” accepted to ICLR 2025. |
| Feb 27, 2025 | New preprint: “Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases”. |
| Jul 17, 2024 | New preprint: “The importance of human-scale language modeling for psycholinguistics”. |
| Nov 21, 2023 | “Latent State Models of Training Dynamics” accepted to TMLR. |
selected publications
2025
- Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases. Preprint, 2025.
2024
- Bigger is not always better: The importance of human-scale language modeling for psycholinguistics. Preprint, 2024.
2023
- Latent State Models of Training Dynamics. TMLR, 2023.