Publications
Note: (α-β) indicates alphabetical ordering
Preprints
Learning When to Stop: Selective Imitation Learning Under Arbitrary Dynamics Shift
(α-β) , Jonathan Pei, James Wang
Narrowing the Collaboration Gap, Probably
Mirah Shi, Marcel Hussing, Natalie Collina, Ira Globus-Harris, Aaron Roth,
Personalization Aids Pluralistic Alignment Under Competition
(α-β) Natalie Collina, , Aaron Roth, Mirah Shi
Best paper award, Workshop on AI for Mechanism Design and Strategic Decision Making, ICLR 2026
Reliable Abstention under Adversarial Injections: Tight Lower Bounds and New Upper Bounds
(α-β) Ezra Edelman,
FORC 2026 (Highlight track)
Why Do Transformers Fail to Forecast Time Series In-Context?
Yufa Zhou, Yixiao Wang, , Anru Zhang
Spotlight presentation, What Can('t) Transformers Do? Workshop, NeurIPS 2025
Weight Clipping for Robust Conformal Inference under Unbounded Covariate Shifts
James Wang,
Conference Papers
Testing Noise Assumptions of Learning Algorithms
(α-β) , Adam R. Klivans, Konstantinos Stavropoulos, Arsen Vasilyan
COLT 2026
Best paper award, Reliable ML Workshop, NeurIPS 2025
Model Agreement via Anchoring
(α-β) Eric Eaton, , Marcel Hussing, Michael Kearns, Aaron Roth, Sikata Bela Sengupta, Jessica Sorrell
COLT 2026
Emergent Alignment via Competition
(α-β) Natalie Collina, , Aaron Roth, Emily Ryu, Mirah Shi
ICML 2026
Less Data, Faster Training: Repeating Smaller Datasets Speeds up Learning via Sampling Biases
Jingwen Liu, Ezra Edelman, , Bingbin Liu
ICML 2026
Contributed talk, Workshop on Scientific Methods for Understanding Deep Learning, ICLR 2026
Is Code Better than Language for Algorithmic Reasoning?
Terry Tong, Yu Feng, , Dan Roth
ICML 2026
In Good GRACES: Principled Teacher Selection for Knowledge Distillation
Abhishek Panigrahi, Bingbin Liu, Sadhika Malladi, Sham M. Kakade,
ICLR 2026
Collaborative Prediction: Tractable Information Aggregation via Agreement
(α-β) Natalie Collina, Ira Globus-Harris, , Varun Gupta, Aaron Roth, Mirah Shi
SODA 2026
Spotlight presentation, EC 2025 Workshop on Human-AI Collaboration
Spotlight presentation, 2025 TTIC Workshop on Incentives for Collaborative Learning and Data Sharing
Probabilistic Stability Guarantees for Feature Attributions
Helen Jin, Anton Xue, Weiqiu You, , Eric Wong
NeurIPS 2025
A Theory of Learning with Autoregressive Chain of Thought
Nirmit Joshi, Gal Vardi, Adam Block, , Zhiyuan Li, Theodor Misiakiewicz, Nathan Srebro
COLT 2025
Tractable Agreement Protocols
(α-β) Natalie Collina, , Varun Gupta, Aaron Roth
STOC 2025
Conformal Language Model Reasoning with Coherent Factuality
Maxon Rubin-Toles, Maya Gambhir, Keshav Ramji, Aaron Roth,
ICLR 2025
Progressive Distillation Induces an Implicit Curriculum
Abhishek Panigrahi, Bingbin Liu, Sadhika Malladi, Andrej Risteski,
Oral presentation, ICLR 2025
Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference
Anton Xue, Avishree Khare, Rajeev Alur, , Eric Wong
ICLR 2025
The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains
Ezra Edelman, Nikolaos Tsilivis, Benjamin L. Edelman, Eran Malach,
NeurIPS 2024
Tolerant Algorithms for Learning with Arbitrary Covariate Shift
(α-β) , Abhishek Shetty, Konstantinos Stavropoulos, Arsen Vasilyan
Spotlight presentation, NeurIPS 2024
Complexity Matters: Feature Learning in the Presence of Spurious Correlations
GuanWen Qiu, Da Kuang,
ICML 2024
Stochastic Bandits with ReLU Neural Networks
Kan Xu, Hamsa Bastani, , Osbert Bastani
ICML 2024
Adversarial Resilience in Sequential Prediction via Abstention
(α-β) , Steve Hanneke, Shay Moran, Abhishek Shetty
NeurIPS 2023
Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck
(α-β) Benjamin L. Edelman, , Sham M. Kakade, Eran Malach, Cyril Zhang
Spotlight presentation, NeurIPS 2023
Exposing Attention Glitches with Flip-Flop Language Modeling
Bingbin Liu, Jordan T. Ash, , Akshay Krishnamurthy, Cyril Zhang
Spotlight presentation, NeurIPS 2023
Learning Narrow One-Hidden-Layer ReLU Networks
(α-β) Sitan Chen, Zehao Dou, , Adam R. Klivans, Raghu Meka
COLT 2023
Transformers Learn Shortcuts to Automata
Bingbin Liu, Jordan T. Ash, , Akshay Krishnamurthy, Cyril Zhang
Notable top-5% paper, ICLR 2023
Recurrent Convolutional Neural Networks Learn Succinct Learning Algorithms
(α-β) , Sham M. Kakade, Adam T. Kalai, Cyril Zhang
NeurIPS 2022
Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit
(α-β) Boaz Barak, Benjamin L. Edelman, , Sham M. Kakade, Eran Malach, Cyril Zhang
NeurIPS 2022
Inductive Biases and Variable Creation in Self-Attention Mechanisms
(α-β) Benjamin L. Edelman, , Sham M. Kakade, Cyril Zhang
ICML 2022
Understanding Contrastive Learning Requires Incorporating Inductive Biases
Nikunj Saunshi, Jordan T. Ash, , Dipendra Misra, Cyril Zhang, Sanjeev Arora, Sham M. Kakade, Akshay Krishnamurthy
ICML 2022
Anti-Concentrated Confidence Bonuses For Scalable Exploration
Jordan T. Ash, Cyril Zhang, , Akshay Krishnamurthy, Sham M. Kakade
ICLR 2022
Investigating the Role of Negatives in Contrastive Representation Learning
(α-β) Jordan T. Ash, , Akshay Krishnamurthy, Dipendra Misra
AISTATS 2022
Gone Fishing: Neural Active Learning with Fisher Embeddings
Jordan T. Ash, , Akshay Krishnamurthy, Sham M. Kakade
NeurIPS 2021
Acceleration via Fractal Learning Rate Schedules
(α-β) Naman Agarwal, , Cyril Zhang
ICML 2021
Statistical Estimation from Dependent Data
Anthimos-Vardis Kandiros, Yuval Dagan, Nishanth Dikkala, , Constantinos Daskalakis
ICML 2021
Tight Hardness Results for Learning One-Layer ReLU Networks
(α-β) , Adam R. Klivans, Pasin Manurangsi, Daniel Reichman
ITCS 2021
From Boltzmann Machines to Neural Networks and Back Again
(α-β) , Adam R. Klivans, Frederic Koehler
NeurIPS 2020
Statistical-Query Lower Bounds via Functional Gradients
(α-β) , Aravind Gollakota, Adam R. Klivans
NeurIPS 2020
Superpolynomial Lower Bounds for Learning One-Layer Neural Networks using Gradient Descent
(α-β) , Aravind Gollakota, Zhihan Jin, Sushrut Karmalkar, Adam R. Klivans
ICML 2020
Efficiently Learning Adversarially Robust Halfspaces with Noise
Omar Montasser, , Ilias Diakonikolas, Nathan Srebro
ICML 2020
Learning Mixtures of Graphs from Epidemic Cascades
Jessica Hoffmann, Soumya Basu, , Constantine Caramanis
ICML 2020
Approximation Schemes for ReLU Regression
(α-β) Ilias Diakonikolas, , Sushrut Karmalkar, Adam R. Klivans, Mahdi Soltanolkotabi
COLT 2020
Learning Ising and Potts Models with Latent Variables
AISTATS 2020
Time/Accuracy Trade-offs for Learning a ReLU with respect to Gaussian Marginals
(α-β) , Sushrut Karmalkar, Adam R. Klivans
Spotlight presentation, NeurIPS 2019
Learning Ising Models with Independent Failures
(α-β) , Daniel Kane, Adam R. Klivans
COLT 2019
Learning Neural Networks with Two Nonlinear Layers in Polynomial Time
(α-β) , Adam R. Klivans
COLT 2019
Learning One Convolutional Layer with Overlapping Patches
(α-β) , Adam R. Klivans, Raghu Meka
Oral presentation, ICML 2018
Eigenvalue Decay Implies Polynomial-Time Learnability for Neural Networks
(α-β) , Adam R. Klivans
NeurIPS 2017
Reliably Learning the ReLU in Polynomial Time
(α-β) , Varun Kanade, Adam R. Klivans, Justin Thaler
COLT 2017
Oral presentation, Optimization for Machine Learning (OPT-ML) Workshop, NeurIPS 2016
Reports
Encoding Structural Symmetry is Key for Length Generalization in Arithmetic Tasks
Mahdi Sabbaghi, George J. Pappas, Hamed Hassani,
Recovering the Lowest Layer of Deep Networks with High Threshold Activations
(α-β) , Rina Panigrahy
Quantifying Perceptual Distortion of Adversarial Examples
Matthew Jordan, Naren Manoj, , Alexandros Dimakis
Improved Learning of One-hidden-layer Convolutional Neural Networks with Overlaps
(α-β) Simon Du,