Pang Wei Koh

pangwei@cs.stanford.edu
twitter | github | google scholar

I'm a PhD student at Stanford working on machine learning with Percy Liang.

Bio

I received my BS and MS in Computer Science from Stanford University in 2013, where I worked with Andrew Ng and Daphne Koller in the Stanford AI Lab. I grew up in Singapore and served as an "AI" (armored infantry) officer before coming to Stanford.

In 2012, I joined Coursera as its third employee. For two years I served as Director of Partnerships and Course Operations, building a team of 25 people that worked with thousands of instructors and staff from 100+ schools; I then became the product manager in charge of university-facing products. I returned to Stanford in 2015 and worked for a year with Anshul Kundaje on computational biology. In 2016, I started my PhD in Computer Science at Stanford, working with Percy Liang. I'm supported by a Facebook PhD Fellowship.

Research

* = equal contribution.

WILDS: A benchmark of in-the-wild distribution shifts
Pang Wei Koh*, Shiori Sagawa*, Henrik Marklund, Sang Michael Xie, Marvin Zhang, Akshay Balsubramani, Weihua Hu, Michihiro Yasunaga, Richard Lanas Phillips, Irena Gao, Tony Lee, Etienne David, Ian Stavness, Wei Guo, Berton A. Earnshaw, Imran S. Haque, Sara Beery, Jure Leskovec, Anshul Kundaje, Emma Pierson, Sergey Levine, Chelsea Finn, and Percy Liang
ICML 2021
Long talk
Just Train Twice: Improving group robustness without training group information
Evan Zheran Liu*, Behzad Haghgoo*, Annie S. Chen*, Aditi Raghunathan, Pang Wei Koh, Shiori Sagawa, Percy Liang, and Chelsea Finn
ICML 2021
Long talk
Accuracy on the line: On the strong correlation between out-of-distribution and in-distribution generalization
John Miller, Rohan Taori, Aditi Raghunathan, Shiori Sagawa, Pang Wei Koh, Vaishaal Shankar, Percy Liang, Yair Carmon, and Ludwig Schmidt
ICML 2021
Supporting COVID-19 policy response with large-scale mobility-based modeling
Serina Chang, Mandy L. Wilson, Bryan Lewis, Zakaria Mehrab, Komal K. Dudakiya, Emma Pierson, Pang Wei Koh, Jaline Gerardin, Beth Redbird, David Grusky, Madhav Marathe, and Jure Leskovec
KDD (Applied Data Science track) 2021
Best paper award
On the opportunities and risks of foundation models
Rishi Bommasani, Drew A. Hudson, ..., Pang Wei Koh, ..., and Percy Liang (116 authors, alphabetical within ellipses)
arXiv 2021
Selective classification can magnify disparities across groups
Erik Jones*, Shiori Sagawa*, Pang Wei Koh*, Ananya Kumar, and Percy Liang
ICLR 2021
Spotlight talk at the NeurIPS 2020 ICBINB Workshop
Also presented at the NeurIPS 2020 Algorithmic Fairness Workshop
Mobility network models of COVID-19 explain inequities and inform reopening
Serina Y Chang*, Emma Pierson*, Pang Wei Koh*, Jaline Gerardin, Beth Redbird, David Grusky, and Jure Leskovec
Nature 2020
Commentary in Nature News and Views by Kevin Ma and Marc Lipsitch
Interactive article in The New York Times by Yaryna Serkez
Other press by The New York Times; The Washington Post; The Telegraph; Bloomberg; CNN; MIT Technology Review; Wired; STAT; and Stanford News
Also presented at NetSci 2021 (oral), the NeurIPS 2020 ML for Health Workshop, and the NeurIPS 2020 COVID-19 Symposium (invited talk)
Concept bottleneck models
Pang Wei Koh*, Thao Nguyen*, Yew Siang Tang*, Steve Mussmann, Emma Pierson, Been Kim, and Percy Liang
ICML 2020
Spotlight talk at the ICML 2020 Workshop on Human Interpretability in Machine Learning
An investigation of why overparameterization exacerbates spurious correlations
Shiori Sagawa*, Aditi Raghunathan*, Pang Wei Koh*, and Percy Liang
ICML 2020
ExpBERT: Representation engineering with natural language explanations
Shikhar Murty, Pang Wei Koh, and Percy Liang
ACL 2020
Toward trustworthy AI development: Mechanisms for supporting verifiable claims
Miles Brundage*, Shahar Avin*, Jasmine Wang*, Haydn Belfield*, Gretchen Krueger*, Gillian Hadfield, Heidy Khlaaf, Jingying Yang, Helen Toner, Ruth Fong, Tegan Maharaj, Pang Wei Koh, Sara Hooker, ..., Thomas Krendl Gilbert, Lisa Dyer, Saif Khan, Yoshua Bengio, and Markus Anderljung
arXiv 2020
Distributionally robust neural networks for group shifts: On the importance of regularization for worst-case generalization
Shiori Sagawa*, Pang Wei Koh*, Tatsunori B. Hashimoto, and Percy Liang
ICLR 2020
On the accuracy of influence functions for measuring group effects
Pang Wei Koh*, Kai-Siang Ang*, Hubert H. K. Teo*, and Percy Liang
NeurIPS 2019
Temporal FiLM: Capturing long-range sequence dependencies with feature-wise modulations
Sawyer Birnbaum*, Volodymyr Kuleshov*, S. Zayd Enam, Pang Wei Koh, and Stefano Ermon
NeurIPS 2019
Inferring multi-dimensional rates of aging from cross-sectional data
Emma Pierson*, Pang Wei Koh*, Tatsunori B. Hashimoto*, Daphne Koller, Jure Leskovec, Nicholas Eriksson, and Percy Liang
AISTATS 2019
Contributed talk at the ICML/IJCAI 2018 Workshop on Computational Biology
Spotlight talk at the NeurIPS 2018 Workshop on Machine Learning for Health
Stronger data poisoning attacks break data sanitization defenses
Pang Wei Koh*, Jacob Steinhardt*, and Percy Liang
arXiv 2018
Certified defenses for data poisoning attacks
Jacob Steinhardt*, Pang Wei Koh*, and Percy Liang
NeurIPS 2017
Understanding black-box predictions via influence functions
Pang Wei Koh and Percy Liang
ICML 2017
Best paper award
Localized hepatic lobular regeneration by central-vein-associated lineage-restricted progenitors
Jonathan M. Tsai, Pang Wei Koh, Ania Stefanska, Liujing Xing, Graham G. Walmsley, Nicolas Poux, Irving L. Weissman, and Yuval Rinkevich
Proceedings of the National Academy of Sciences (PNAS) 2017
An atlas of transcriptional, chromatin accessibility, and surface marker changes in human mesoderm development
Pang Wei Koh*, Rahul Sinha*, Amira A. Barkal, Rachel M. Morganti, Angela Chen, Irving L. Weissman, Lay Teng Ang, Anshul Kundaje, and Kyle M. Loh
Scientific Data 2016
Mapping the pairwise choices leading from pluripotency to human bone, heart, and other mesoderm cell types
Kyle M. Loh*, Angela Chen*, Pang Wei Koh, Tianda Z. Deng, Rahul Sinha, Jonathan M. Tsai, Amira A. Barkal, Kimberle Y. Shen, Rajan Jain, Rachel M. Morganti, Ng Shyh-Chang, Nathaniel B. Fernhoff, Benson M. George, Gerlinde Wernig, Rachel E.A. Salomon, Zhenghao Chen, Hannes Vogel, Jonathan A. Epstein, Anshul Kundaje, William S. Talbot, Philip A. Beachy, Lay Teng Ang, and Irving L. Weissman
Cell 2016
Denoising genome-wide histone ChIP-seq with convolutional neural networks
Pang Wei Koh*, Emma Pierson*, and Anshul Kundaje
Intelligent Systems for Molecular Biology (ISMB) / Bioinformatics 2017
Spotlight talk and best poster award at the ICML 2016 Workshop on Computational Biology
Top 10 papers of 2016-2017 in regulatory and systems genomics at RECOMB/ISMB
Dissecting an online intervention for cancer survivors
Zhenghao Chen, Pang Wei Koh, Philip L. Ritter, Kate Lorig, Erin O’Carroll Bantum, and Suchi Saria
Health Education & Behavior 2014
Peer and self assessment in massive online classes
Chinmay Kulkarni, Pang Wei Koh, Huy Le, Daniel Chia, Kathryn Papadopoulos, Justin Cheng, Daphne Koller, and Scott Klemmer
ACM Transactions on Computer-Human Interaction 2013
Identifying genetic drivers of cancer morphology
Pang Wei Koh, Andrew Beck, and Daphne Koller
Undergraduate honors thesis 2012
Firestone Medal for Excellence in Research
Ben Wegbreit Prize for Best Undergraduate Honors Thesis in Computer Science
David M. Kennedy Honors Thesis Prize (best thesis in Stanford Engineering & Applied Sciences)
Undergraduate Award in Computer Science (an international research award)
Sparse filtering
Jiquan Ngiam, Pang Wei Koh, Zhenghao Chen, Sonia Bhaskar, and Andrew Y. Ng
NeurIPS 2011
Spotlight paper
Learning deep energy models
Jiquan Ngiam, Zhenghao Chen, Pang Wei Koh, and Andrew Y. Ng
ICML 2011
On random weights and unsupervised feature learning
Andrew Saxe, Pang Wei Koh, Zhenghao Chen, Maneesh Bhand, Bipin Suresh, and Andrew Y. Ng
ICML 2011
Tiled convolutional neural networks
Quoc V. Le, Jiquan Ngiam, Zhenghao Chen, Daniel Chia, Pang Wei Koh, and Andrew Y. Ng
NeurIPS 2010
Lower bound on the time complexity of local adiabatic evolution
Zhenghao Chen, Pang Wei Koh, and Zhao Yan
Physical Review A 2006

Teaching

At Coursera, we were fortunate to have troves of data on what makes for effective teaching. I spoke frequently at workshops and conferences about online education and worked with many instructors on their courses. My team designed authoring tools and analytics dashboards for our instructors.

In 2012, I was head TA for CS228 at Stanford, Daphne Koller's class on Probabilistic Graphical Models. Together with 8 other TAs, I revamped the class to make it application-focused and auto-gradable, and we successfully taught it to 200+ Stanford students and 100,000+ online learners on the Coursera platform.

Before college, Zhenghao Chen and I created and taught a series of 14 full-day workshops for 100+ high school students, covering introductions to programming, artificial intelligence, cryptography, and computer networking.