I am a postdoc at the Institute for Foundations of Data Science (IFDS) at the University of Washington, where I work with Simon Du and Maryam Fazel. Previously, I received my Ph.D. in Computer Science from Duke University, where I was fortunate to be advised by Rong Ge. Before coming to Duke, I received my B.S. in Statistics from Peking University.
In summer 2023, I worked with Prof. Tengyu Ma at Stanford. In summer 2022, I was an applied science intern at AWS AI. In summer 2018, I was an intern in Industrial and Systems Engineering (ISyE) at Georgia Tech, working with Prof. Tuo Zhao.
My research interests are in optimization and theoretical machine learning. Recently, I have been particularly interested in deep learning theory.
Publications and Preprints
* denotes equal contribution; (α-β order) denotes alphabetical ordering.
- How Does Gradient Descent Learn Features -- A Local Analysis for Regularized Two-Layer Neural Networks
Mo Zhou, Rong Ge.
Conference on Neural Information Processing Systems (NeurIPS), 2024.
Short version appeared at the NeurIPS Mathematics of Modern Machine Learning (M3L) workshop, 2023.
- Implicit Regularization Leads to Benign Overfitting for Sparse Linear Regression
Mo Zhou, Rong Ge.
International Conference on Machine Learning (ICML), 2023.
- Understanding Edge-of-Stability Training Dynamics with a Minimalist Example
Xingyu Zhu*, Zixuan Wang*, Xiang Wang, Mo Zhou, Rong Ge.
International Conference on Learning Representations (ICLR), 2023.
- Depth-Separation with Multilayer Mean-Field Networks
Yunwei Ren, Mo Zhou, Rong Ge.
International Conference on Learning Representations (ICLR), 2023. Notable-top-25%.
- Plateau in Monotonic Linear Interpolation -- A “Biased” View of Loss Landscape for Deep Networks
Xiang Wang, Annie N. Wang, Mo Zhou, Rong Ge.
International Conference on Learning Representations (ICLR), 2023.
- Understanding The Robustness of Self-supervised Learning Through Topic Modeling
Zeping Luo*, Shiyou Wu*, Cindy Weng*, Mo Zhou, Rong Ge.
International Conference on Learning Representations (ICLR), 2023.
- Understanding Deflation Process in Over-parametrized Tensor Decomposition
(α-β order) Rong Ge*, Yunwei Ren*, Xiang Wang*, Mo Zhou*.
Conference on Neural Information Processing Systems (NeurIPS), 2021.
- A Local Convergence Theory for Mildly Over-Parameterized Two-Layer Neural Network
Mo Zhou, Rong Ge, Chi Jin.
Conference on Learning Theory (COLT), 2021.
- Towards Understanding the Importance of Shortcut Connections in Residual Networks
Tianyi Liu*, Minshuo Chen*, Mo Zhou, Simon S. Du, Enlu Zhou, Tuo Zhao.
Conference on Neural Information Processing Systems (NeurIPS), 2019.
- Towards Understanding the Importance of Noise in Training Neural Networks
Mo Zhou*, Tianyi Liu*, Yan Li, Dachao Lin, Enlu Zhou, Tuo Zhao.
International Conference on Machine Learning (ICML), 2019. Long Talk.
Presentations
- A Local Convergence Theory for Mildly Over-Parameterized Two-Layer Neural Network
COLT 2021, Aug. 2021
Theory of Overparameterized Machine Learning (TOPML) 2021, Apr. 2021
Duke Deep Learning Reading Group, Apr. 2021
THEORINET Journal Club/MODL Reading Group, Feb. 2021
Teaching
- CPS590.04 Machine Learning Algorithms, Spring 2021. TA
- CPS330 Design and Analysis of Algorithms, Fall 2020. TA
- CPS330 Design and Analysis of Algorithms, Spring 2020. TA
Services
- Reviewer for ICML, ICLR, NeurIPS, JMLR, Mathematical Programming, and STOC.
Education
- Duke University, 2019 - 2024
Ph.D. in Computer Science
- Peking University, 2015 - 2019
B.S. in Statistics