Research

Publications

Note: (α-β) indicates alphabetical ordering

Preprints

Personalization Aids Pluralistic Alignment Under Competition
(α-β) Natalie Collina, Surbhi Goel, Aaron Roth, Mirah Shi
Oral presentation, Workshop on AI for Mechanism Design and Strategic Decision Making, ICLR 2026
Reliable Abstention under Adversarial Injections: Tight Lower Bounds and New Upper Bounds
(α-β) Ezra Edelman, Surbhi Goel
Model Agreement via Anchoring
(α-β) Eric Eaton, Surbhi Goel, Marcel Hussing, Michael Kearns, Aaron Roth, Sikata Bela Sengupta, Jessica Sorrell
Emergent Alignment via Competition
(α-β) Natalie Collina, Surbhi Goel, Aaron Roth, Emily Ryu, Mirah Shi
Why Do Transformers Fail to Forecast Time Series In-Context?
Yufa Zhou, Yixiao Wang, Surbhi Goel, Anru Zhang
Spotlight presentation, What Can('t) Transformers Do? Workshop, NeurIPS 2025
Weight Clipping for Robust Conformal Inference under Unbounded Covariate Shifts
James Wang, Surbhi Goel
Testing Noise Assumptions of Learning Algorithms
(α-β) Surbhi Goel, Adam R. Klivans, Konstantinos Stavropoulos, Arsen Vasilyan
Best paper award, Reliable ML Workshop, NeurIPS 2025

Conference Papers

In Good GRACES: Principled Teacher Selection for Knowledge Distillation
Abhishek Panigrahi, Bingbin Liu, Sadhika Malladi, Sham M. Kakade, Surbhi Goel
ICLR 2026
Collaborative Prediction: Tractable Information Aggregation via Agreement
(α-β) Natalie Collina, Ira Globus-Harris, Surbhi Goel, Varun Gupta, Aaron Roth, Mirah Shi
SODA 2026
Spotlight presentation, EC 2025 Workshop on Human-AI Collaboration
Spotlight presentation, 2025 TTIC Workshop on Incentives for Collaborative Learning and Data Sharing
Probabilistic Stability Guarantees for Feature Attributions
Helen Jin, Anton Xue, Weiqiu You, Surbhi Goel, Eric Wong
NeurIPS 2025
A Theory of Learning with Autoregressive Chain of Thought
Nirmit Joshi, Gal Vardi, Adam Block, Surbhi Goel, Zhiyuan Li, Theodor Misiakiewicz, Nathan Srebro
COLT 2025
Tractable Agreement Protocols
(α-β) Natalie Collina, Surbhi Goel, Varun Gupta, Aaron Roth
STOC 2025
Conformal Language Model Reasoning with Coherent Factuality
Maxon Rubin-Toles, Maya Gambhir, Keshav Ramji, Aaron Roth, Surbhi Goel
ICLR 2025
Progressive Distillation Induces an Implicit Curriculum
Abhishek Panigrahi, Bingbin Liu, Sadhika Malladi, Andrej Risteski, Surbhi Goel
Oral presentation, ICLR 2025
Logicbreaks: A Framework for Understanding Subversion of Rule-based Inference
Anton Xue, Avishree Khare, Rajeev Alur, Surbhi Goel, Eric Wong
ICLR 2025
The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains
Ezra Edelman, Nikolaos Tsilivis, Benjamin L. Edelman, Eran Malach, Surbhi Goel
NeurIPS 2024
Tolerant Algorithms for Learning with Arbitrary Covariate Shift
(α-β) Surbhi Goel, Abhishek Shetty, Konstantinos Stavropoulos, Arsen Vasilyan
Spotlight presentation, NeurIPS 2024
Complexity Matters: Feature Learning in the Presence of Spurious Correlations
GuanWen Qiu, Da Kuang, Surbhi Goel
ICML 2024
Stochastic Bandits with ReLU Neural Networks
Kan Xu, Hamsa Bastani, Surbhi Goel, Osbert Bastani
ICML 2024
Adversarial Resilience in Sequential Prediction via Abstention
(α-β) Surbhi Goel, Steve Hanneke, Shay Moran, Abhishek Shetty
NeurIPS 2023
Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck
(α-β) Benjamin L. Edelman, Surbhi Goel, Sham M. Kakade, Eran Malach, Cyril Zhang
Spotlight presentation, NeurIPS 2023
Exposing Attention Glitches with Flip-Flop Language Modeling
Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang
Spotlight presentation, NeurIPS 2023
Learning Narrow One-Hidden-Layer ReLU Networks
(α-β) Sitan Chen, Zehao Dou, Surbhi Goel, Adam R. Klivans, Raghu Meka
COLT 2023
Transformers Learn Shortcuts to Automata
Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang
Notable top-5% paper, ICLR 2023
Recurrent Convolutional Neural Networks Learn Succinct Learning Algorithms
(α-β) Surbhi Goel, Sham M. Kakade, Adam T. Kalai, Cyril Zhang
NeurIPS 2022
Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit
(α-β) Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham M. Kakade, Eran Malach, Cyril Zhang
NeurIPS 2022
Inductive Biases and Variable Creation in Self-Attention Mechanisms
(α-β) Benjamin L. Edelman, Surbhi Goel, Sham M. Kakade, Cyril Zhang
ICML 2022
Understanding Contrastive Learning Requires Incorporating Inductive Biases
Nikunj Saunshi, Jordan T. Ash, Surbhi Goel, Dipendra Misra, Cyril Zhang, Sanjeev Arora, Sham M. Kakade, Akshay Krishnamurthy
ICML 2022
Anti-Concentrated Confidence Bonuses for Scalable Exploration
Jordan T. Ash, Cyril Zhang, Surbhi Goel, Akshay Krishnamurthy, Sham M. Kakade
ICLR 2022
Investigating the Role of Negatives in Contrastive Representation Learning
(α-β) Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Dipendra Misra
AISTATS 2022
Gone Fishing: Neural Active Learning with Fisher Embeddings
Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Sham M. Kakade
NeurIPS 2021
Acceleration via Fractal Learning Rate Schedules
(α-β) Naman Agarwal, Surbhi Goel, Cyril Zhang
ICML 2021
Statistical Estimation from Dependent Data
Anthimos-Vardis Kandiros, Yuval Dagan, Nishanth Dikkala, Surbhi Goel, Constantinos Daskalakis
ICML 2021
Tight Hardness Results for Learning One-Layer ReLU Networks
(α-β) Surbhi Goel, Adam R. Klivans, Pasin Manurangsi, Daniel Reichman
ITCS 2021
From Boltzmann Machines to Neural Networks and Back Again
(α-β) Surbhi Goel, Adam R. Klivans, Frederic Koehler
NeurIPS 2020
Statistical-Query Lower Bounds via Functional Gradients
(α-β) Surbhi Goel, Aravind Gollakota, Adam R. Klivans
NeurIPS 2020
Superpolynomial Lower Bounds for Learning One-Layer Neural Networks using Gradient Descent
(α-β) Surbhi Goel, Aravind Gollakota, Zhihan Jin, Sushrut Karmalkar, Adam R. Klivans
ICML 2020
Efficiently Learning Adversarially Robust Halfspaces with Noise
Omar Montasser, Surbhi Goel, Ilias Diakonikolas, Nathan Srebro
ICML 2020
Learning Mixtures of Graphs from Epidemic Cascades
Jessica Hoffmann, Soumya Basu, Surbhi Goel, Constantine Caramanis
ICML 2020
Approximation Schemes for ReLU Regression
(α-β) Ilias Diakonikolas, Surbhi Goel, Sushrut Karmalkar, Adam R. Klivans, Mahdi Soltanolkotabi
COLT 2020
Learning Ising and Potts Models with Latent Variables
Surbhi Goel
AISTATS 2020
Time/Accuracy Trade-offs for Learning a ReLU with respect to Gaussian Marginals
(α-β) Surbhi Goel, Sushrut Karmalkar, Adam R. Klivans
Spotlight presentation, NeurIPS 2019
Learning Ising Models with Independent Failures
(α-β) Surbhi Goel, Daniel Kane, Adam R. Klivans
COLT 2019
Learning Neural Networks with Two Nonlinear Layers in Polynomial Time
(α-β) Surbhi Goel, Adam R. Klivans
COLT 2019
Learning One Convolutional Layer with Overlapping Patches
(α-β) Surbhi Goel, Adam R. Klivans, Raghu Meka
Oral presentation, ICML 2018
Eigenvalue Decay Implies Polynomial-Time Learnability for Neural Networks
(α-β) Surbhi Goel, Adam R. Klivans
NeurIPS 2017
Reliably Learning the ReLU in Polynomial Time
(α-β) Surbhi Goel, Varun Kanade, Adam R. Klivans, Justin Thaler
COLT 2017
Oral presentation, Optimization for Machine Learning (OPT-ML) Workshop, NeurIPS 2016

Reports

Encoding Structural Symmetry is Key for Length Generalization in Arithmetic Tasks
Mahdi Sabbaghi, George J. Pappas, Hamed Hassani, Surbhi Goel
Recovering the Lowest Layer of Deep Networks with High Threshold Activations
(α-β) Surbhi Goel, Rina Panigrahy
Quantifying Perceptual Distortion of Adversarial Examples
Matthew Jordan, Naren Manoj, Surbhi Goel, Alexandros Dimakis
Improved Learning of One-hidden-layer Convolutional Neural Networks with Overlaps
(α-β) Simon Du, Surbhi Goel