Pang Wei Koh

I'm interested in making machine learning systems more useful, responsible, and reliable in the real world. For example:

Access. How can we expand our access to foundation models, so that we can better understand, build upon, and adapt them? We develop new methods, architectures, and data for efficiently training and deploying fully open models.
Reliability. How do we make our models more reliable and trustworthy? We design new approaches to evaluation and are working on the next generation of retrieval-based models that can reason directly over data.
Impact. What can we do with AI that we could not do before, e.g., accelerate scientific discovery or provide universal access to medical advice?

I received my PhD in Computer Science from Stanford, advised by Percy Liang. Before that, I was the 3rd employee and Director of Partnerships at Coursera. I was also an undergraduate at Stanford, advised by Andrew Ng and Daphne Koller.

I'm part of the UW ML and NLP groups, and I'm also a visiting research scientist at AI2. If you're interested in joining our group, please read this. This cycle, I'm also looking for prospective students/postdocs interested in AI for science.

Current students

Scott Geng
(with Ranjay Krishna)

Jacqueline He
(with Luke Zettlemoyer)

Rulin Shao
(with Luke Zettlemoyer)

Rui Xin
(with Sewoong Oh)

Ian Magnusson
(with Noah Smith)

Zhiyuan Zeng
(with Hanna Hajishirzi)

Alumni

Irena Gao (MS 2023, now PhD student at Stanford University)
Kendrick Shen (MS 2022, now ML research engineer at Genesis Therapeutics)
Henrik Marklund (MS 2021, now PhD student at Stanford University)
Kai-Siang Ang (MS 2021, now ML engineer at Nuro)
Erik Jones (MS 2020, now PhD student at UC Berkeley)
Hubert Teo (MS 2019, now senior software engineer at Flock Freight)
Thao Nguyen (BS 2019, now PhD student at the University of Washington)
Yew-Siang Tang (BS 2019, now senior software engineer at You.com)

Publications

* = equal contribution.

2 OLMo 2 Furious

Team OLMo, Pete Walsh, Luca Soldaini, Dirk Groeneveld, Kyle Lo, Shane Arora, Akshita Bhagia, Yuling Gu, Shengyi Huang, Matt Jordan, Nathan Lambert, Dustin Schwenk, Oyvind Tafjord, Taira Anderson, David Atkinson, Faeze Brahman, Christopher Clark, Pradeep Dasigi, Nouha Dziri, Michal Guerquin, Hamish Ivison, Pang Wei Koh, Jiacheng Liu, Saumya Malik, William Merrill, Lester James V. Miranda, Jacob Morrison, Tyler Murray, Crystal Nam, Valentina Pyatkin, Aman Rangapur, Michael Schmitz, Sam Skjonsberg, David Wadden, Christopher Wilhelm, Michael Wilson, Luke Zettlemoyer, Ali Farhadi, Noah A. Smith, and Hannaneh Hajishirzi

arXiv 2025

(paper) (website)

OLMoE: Open Mixture-of-Experts language models

Niklas Muennighoff, Luca Soldaini, Dirk Groeneveld, Kyle Lo, Jacob Morrison, Sewon Min, Weijia Shi, Pete Walsh, Oyvind Tafjord, Nathan Lambert, Yuling Gu, Shane Arora, Akshita Bhagia, Dustin Schwenk, David Wadden, Alexander Wettig, Binyuan Hui, Tim Dettmers, Douwe Kiela, Ali Farhadi, Noah A. Smith, Pang Wei Koh, Amanpreet Singh, and Hannaneh Hajishirzi

ICLR 2025

(paper) (code)

Language models scale reliably with over-training and on downstream tasks

Samir Yitzhak Gadre, Georgios Smyrnis, Vaishaal Shankar, Suchin Gururangan, Mitchell Wortsman, Rulin Shao, Jean Mercat, Alex Fang, Jeffrey Li, Sedrick Keh, Rui Xin, Marianna Nezhurina, Igor Vasiljevic, Jenia Jitsev, Luca Soldaini, Alexandros G. Dimakis, Gabriel Ilharco, Pang Wei Koh, Shuran Song, Thomas Kollar, Yair Carmon, Achal Dave, Reinhard Heckel, Niklas Muennighoff, and Ludwig Schmidt

ICLR 2025