Note: (α-β) indicates alphabetical ordering of authors
Personalization Aids Pluralistic Alignment Under Competition
(α-β) Natalie Collina, , Aaron Roth, and Mirah Shi
Emergent Alignment via Competition
(α-β) Natalie Collina, , Aaron Roth, Emily Ryu, and Mirah Shi
Why Do Transformers Fail to Forecast Time Series In-Context?
Yufa Zhou, Yixiao Wang, , Anru Zhang
Spotlight presentation, What Can('t) Transformers Do? Workshop, NeurIPS 2025
Reliable Abstention under Adversarial Injections: Tight Lower Bounds and New Upper Bounds
Ezra Edelman,
Weight Clipping for Robust Conformal Inference under Unbounded Covariate Shifts
James Wang,
Testing Noise Assumptions of Learning Algorithms
(α-β) , Adam R. Klivans, Konstantinos Stavropoulos, Arsen Vasilyan
Best paper award, Reliable ML Workshop, NeurIPS 2025
In Good GRACES: Principled Teacher Selection for Knowledge Distillation
Abhishek Panigrahi, Bingbin Liu, Sadhika Malladi, Sham M. Kakade,
ICLR 2026
Collaborative Prediction: Tractable Information Aggregation via Agreement
(α-β) Natalie Collina, Ira Globus-Harris, , Varun Gupta, Aaron Roth, Mirah Shi
SODA 2026
Spotlight presentation, EC 2025 Workshop on Human-AI Collaboration
Spotlight presentation, 2025 TTIC Workshop on Incentives for Collaborative Learning and Data Sharing
Probabilistic Stability Guarantees for Feature Attributions
Helen Jin, Anton Xue, Weiqiu You, , Eric Wong
NeurIPS 2025
A Theory of Learning with Autoregressive Chain of Thought
Nirmit Joshi, Gal Vardi, Adam Block, , Zhiyuan Li, Theodor Misiakiewicz, Nathan Srebro
COLT 2025
Tractable Agreement Protocols
(α-β) Natalie Collina, , Varun Gupta, Aaron Roth
STOC 2025
Conformal Language Model Reasoning with Coherent Factuality
Maxon Rubin-Toles, Maya Gambhir, Keshav Ramji, Aaron Roth,
ICLR 2025
Progressive Distillation Induces an Implicit Curriculum
Abhishek Panigrahi, Bingbin Liu, Sadhika Malladi, Andrej Risteski,
Oral presentation, ICLR 2025
Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference
Anton Xue, Avishree Khare, Rajeev Alur, , Eric Wong
ICLR 2025
The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains
Ezra Edelman, Nikolaos Tsilivis, Benjamin L. Edelman, Eran Malach,
NeurIPS 2024
Tolerant Algorithms for Learning with Arbitrary Covariate Shift
(α-β) , Abhishek Shetty, Konstantinos Stavropoulos, Arsen Vasilyan
Spotlight presentation, NeurIPS 2024
Complexity Matters: Feature Learning in the Presence of Spurious Correlations
GuanWen Qiu, Da Kuang,
ICML 2024
Stochastic Bandits with ReLU Neural Networks
Kan Xu, Hamsa Bastani, , Osbert Bastani
ICML 2024
Adversarial Resilience in Sequential Prediction via Abstention
(α-β) , Steve Hanneke, Shay Moran, Abhishek Shetty
NeurIPS 2023
Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck
(α-β) Benjamin L. Edelman, , Sham M. Kakade, Eran Malach, Cyril Zhang
Spotlight presentation, NeurIPS 2023
Exposing Attention Glitches with Flip-Flop Language Modeling
Bingbin Liu, Jordan T. Ash, , Akshay Krishnamurthy, Cyril Zhang
Spotlight presentation, NeurIPS 2023
Learning Narrow One-Hidden-Layer ReLU Networks
(α-β) Sitan Chen, Zehao Dou, , Adam R. Klivans, Raghu Meka
COLT 2023
Transformers Learn Shortcuts to Automata
Bingbin Liu, Jordan T. Ash, , Akshay Krishnamurthy, Cyril Zhang
Notable top-5% paper, ICLR 2023
Recurrent Convolutional Neural Networks Learn Succinct Learning Algorithms
(α-β) , Sham M. Kakade, Adam T. Kalai, Cyril Zhang
NeurIPS 2022
Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit
(α-β) Boaz Barak, Benjamin L. Edelman, , Sham M. Kakade, Eran Malach, Cyril Zhang
NeurIPS 2022
Inductive Biases and Variable Creation in Self-Attention Mechanisms
(α-β) Benjamin L. Edelman, , Sham M. Kakade, Cyril Zhang
ICML 2022
Understanding Contrastive Learning Requires Incorporating Inductive Biases
Nikunj Saunshi, Jordan T. Ash, , Dipendra Misra, Cyril Zhang, Sanjeev Arora, Sham M. Kakade, Akshay Krishnamurthy
ICML 2022
Anti-Concentrated Confidence Bonuses For Scalable Exploration
Jordan T. Ash, Cyril Zhang, , Akshay Krishnamurthy, Sham M. Kakade
ICLR 2022
Investigating the Role of Negatives in Contrastive Representation Learning
(α-β) Jordan T. Ash, , Akshay Krishnamurthy, Dipendra Misra
AISTATS 2022
Gone Fishing: Neural Active Learning with Fisher Embeddings
Jordan T. Ash, , Akshay Krishnamurthy, Sham M. Kakade
NeurIPS 2021
Acceleration via Fractal Learning Rate Schedules
(α-β) Naman Agarwal, , Cyril Zhang
ICML 2021
Statistical Estimation from Dependent Data
Anthimos-Vardis Kandiros, Yuval Dagan, Nishanth Dikkala, , Constantinos Daskalakis
ICML 2021
Tight Hardness Results for Learning One-Layer ReLU Networks
(α-β) , Adam R. Klivans, Pasin Manurangsi, Daniel Reichman
ITCS 2021
From Boltzmann Machines to Neural Networks and Back Again
(α-β) , Adam R. Klivans, Frederic Koehler
NeurIPS 2020
Statistical-Query Lower Bounds via Functional Gradients
(α-β) , Aravind Gollakota, Adam R. Klivans
NeurIPS 2020
Superpolynomial Lower Bounds for Learning One-Layer Neural Networks using Gradient Descent
(α-β) , Aravind Gollakota, Zhihan Jin, Sushrut Karmalkar, Adam R. Klivans
ICML 2020
Efficiently Learning Adversarially Robust Halfspaces with Noise
Omar Montasser, , Ilias Diakonikolas, Nathan Srebro
ICML 2020
Learning Mixtures of Graphs from Epidemic Cascades
Jessica Hoffmann, Soumya Basu, , Constantine Caramanis
ICML 2020
Approximation Schemes for ReLU Regression
(α-β) Ilias Diakonikolas, , Sushrut Karmalkar, Adam R. Klivans, Mahdi Soltanolkotabi
COLT 2020
Learning Ising and Potts Models with Latent Variables
AISTATS 2020
Time/Accuracy Trade-offs for Learning a ReLU with respect to Gaussian Marginals
(α-β) , Sushrut Karmalkar, Adam R. Klivans
Spotlight presentation, NeurIPS 2019
Learning Ising Models with Independent Failures
(α-β) , Daniel Kane, Adam R. Klivans
COLT 2019
Learning Neural Networks with Two Nonlinear Layers in Polynomial Time
(α-β) , Adam R. Klivans
COLT 2019
Learning One Convolutional Layer with Overlapping Patches
(α-β) , Adam R. Klivans, Raghu Meka
Oral presentation, ICML 2018
Eigenvalue Decay Implies Polynomial-Time Learnability for Neural Networks
(α-β) , Adam R. Klivans
NeurIPS 2017
Reliably Learning the ReLU in Polynomial Time
(α-β) , Varun Kanade, Adam R. Klivans, Justin Thaler
COLT 2017
Oral presentation, Optimization for Machine Learning (OPT-ML) Workshop, NeurIPS 2016
Encoding Structural Symmetry is Key for Length Generalization in Arithmetic Tasks
Mahdi Sabbaghi, George J. Pappas, Hamed Hassani,
Recovering the Lowest Layer of Deep Networks with High Threshold Activations
(α-β) , Rina Panigrahy
Quantifying Perceptual Distortion of Adversarial Examples
Matthew Jordan, Naren Manoj, , Alexandros Dimakis
Improved Learning of One-hidden-layer Convolutional Neural Networks with Overlaps
(α-β) Simon Du,