
Investigating the Role of Negatives in Contrastive Representation Learning

Noise contrastive learning is a popular technique for unsupervised representation learning. In this approach, a representation is obtained via reduction to supervised learning, where given a notion of semantic similarity, the learner tries to …
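The reduction described here has a standard concrete form: score one similar (positive) pair against k dissimilar (negative) pairs and minimize a classification loss. A minimal InfoNCE-style sketch of that objective (the function name and temperature are illustrative assumptions, not this paper's notation or exact formulation):

```python
import numpy as np

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss: classify the positive pair against
    k negatives. A standard instance of noise contrastive learning,
    illustrative rather than this paper's exact setup.

    anchor, positive: (d,) embeddings of a semantically similar pair
    negatives: (k, d) embeddings of dissimilar examples
    """
    # similarity of the positive pair, then of the k negative pairs
    logits = np.concatenate(([anchor @ positive], negatives @ anchor)) / temperature
    # cross-entropy with the positive pair as the correct "class"
    return -logits[0] + np.log(np.sum(np.exp(logits)))
```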

Gone Fishing: Neural Active Learning with Fisher Embeddings

There is an increasing need for effective active learning algorithms that are compatible with deep neural networks. While there are many classic, well-studied sample selection methods, the non-convexity and varying internal representation of neural …
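For intuition, a common construction in this line of work embeds each unlabeled point by the gradient its (hallucinated) label would induce on the network's last layer, and selects batches using these embeddings. The sketch below is that generic gradient embedding, not the paper's specific Fisher-based selection rule:

```python
import numpy as np

def gradient_embeddings(probs, feats):
    """Last-layer gradient embeddings under hallucinated labels.

    With softmax cross-entropy, the loss gradient w.r.t. the last-layer
    weights is (p - onehot(y)) outer feats; substituting the model's own
    most likely label for the unknown y gives an unsupervised embedding.
    A generic sketch, not the paper's Fisher construction.

    probs: (n, C) softmax outputs; feats: (n, d) penultimate features.
    returns: (n, C*d) per-example embeddings.
    """
    n, C = probs.shape
    onehot = np.eye(C)[probs.argmax(axis=1)]  # hallucinated labels
    return ((probs - onehot)[:, :, None] * feats[:, None, :]).reshape(n, -1)
```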

Acceleration via Fractal Learning Rate Schedules

When balancing the practical tradeoffs of iterative methods for large-scale optimization, the learning rate schedule remains notoriously difficult to understand and expensive to tune. We demonstrate the presence of these subtleties even in the …
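For context, the classical accelerated schedule for quadratics with eigenvalues in [mu, L] takes step sizes that are inverses of Chebyshev roots on that interval; the subtleties alluded to include how those steps must be ordered for numerical stability. A sketch of the unordered schedule (the fractal permutation itself is the paper's subject and is not reproduced here):

```python
import numpy as np

def chebyshev_steps(mu, L, T):
    """Inverse Chebyshev roots on [mu, L]: the classical accelerated
    step-size schedule for quadratics. Returned in the naive order;
    choosing a stable ordering is exactly where the subtleties arise.
    """
    t = np.arange(1, T + 1)
    roots = (L + mu) / 2 + (L - mu) / 2 * np.cos(np.pi * (2 * t - 1) / (2 * T))
    return 1.0 / roots
```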

Statistical Estimation from Dependent Data

We consider a general statistical estimation problem wherein binary labels across different observations are not independent conditioned on their feature vectors, but dependent, capturing settings where, e.g., these observations are collected on a …
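One standard way to formalize such dependence (our notation and symbols, not the abstract's) is a logistic model perturbed by Ising-type interactions along a known network $E$:

\[
\Pr[y \mid X] \;\propto\; \exp\Big(\sum_{i} y_i \langle \theta, x_i \rangle \;+\; \beta \sum_{(i,j) \in E} y_i y_j\Big), \qquad y \in \{-1,+1\}^n,
\]

where $\beta = 0$ recovers independent logistic labels and $\beta > 0$ couples the labels of neighboring observations.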

Tight Hardness Results for Training Depth-2 ReLU Networks

We prove several hardness results for training depth-2 neural networks with the ReLU activation function; these networks are simply weighted sums (that may include negative coefficients) of ReLUs. Our goal is to output a depth-2 neural network that …
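Concretely, the function class in question is (in our notation; the abstract does not fix whether biases are included)

\[
f(x) \;=\; \sum_{i=1}^{k} a_i \,\mathrm{ReLU}(\langle w_i, x \rangle + b_i), \qquad a_i \in \mathbb{R},
\]

with the coefficients $a_i$ allowed to be negative, as the abstract notes.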

From Boltzmann Machines to Neural Networks and Back Again

Graphical models are powerful tools for modeling high-dimensional data, but learning graphical models in the presence of latent variables is well-known to be difficult. In this work we give new results for learning Restricted Boltzmann Machines, …
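For reference, a Restricted Boltzmann Machine is the standard bipartite graphical model

\[
p(v, h) \;\propto\; \exp\big(v^\top W h + b^\top v + c^\top h\big), \qquad v \in \{0,1\}^{n},\ h \in \{0,1\}^{m},
\]

where the hidden units $h$ are precisely the latent variables that make learning difficult.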

Statistical-Query Lower Bounds via Functional Gradients

We give the first statistical-query lower bounds for agnostically learning any non-polynomial activation with respect to Gaussian marginals (e.g., ReLU, sigmoid, sign). For the specific problem of ReLU regression (equivalently, agnostically learning …
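In the ReLU case, the agnostic learning problem referred to is, in standard notation,

\[
\min_{w \in \mathbb{R}^d} \ \mathbb{E}_{(x,y) \sim \mathcal{D}}\big[\big(\mathrm{ReLU}(\langle w, x \rangle) - y\big)^2\big], \qquad x \sim \mathcal{N}(0, I_d),
\]

with no realizability assumption on the labels $y$.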

Learning Ising and Potts Models with Latent Variables

We study the problem of learning graphical models with latent variables. We give the first efficient algorithms for learning: 1) ferromagnetic Ising models with latent variables under arbitrary external fields, and 2) ferromagnetic Potts model …
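For reference, the Ising model in question is (standard notation)

\[
p(\sigma) \;\propto\; \exp\Big(\sum_{(i,j) \in E} J_{ij}\, \sigma_i \sigma_j \;+\; \sum_i h_i \sigma_i\Big), \qquad \sigma \in \{-1,+1\}^n,
\]

where ferromagnetic means $J_{ij} \ge 0$, the external fields $h_i$ are unconstrained (arbitrary), and latent variables correspond to unobserved coordinates of $\sigma$.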

Approximation Schemes for ReLU Regression

We consider the fundamental problem of ReLU regression, where the goal is to output the best fitting ReLU with respect to square loss given access to draws from some unknown distribution. We give the first efficient, constant-factor approximation …
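As a baseline for contrast, plain (sub)gradient descent on the square loss is the naive approach to this problem; the sketch below is that baseline only, not the paper's constant-factor approximation scheme, which must cope with the loss's non-convexity:

```python
import numpy as np

def fit_relu_gd(X, y, lr=0.1, steps=1000):
    """Naive (sub)gradient descent on (1/n) * sum_i (relu(<w, x_i>) - y_i)^2.

    Illustrative baseline only; the paper's approximation scheme is a
    different, more careful algorithm.
    """
    n, d = X.shape
    rng = np.random.default_rng(0)
    w = rng.standard_normal(d) / np.sqrt(d)  # nonzero init so ReLUs can fire
    for _ in range(steps):
        z = X @ w
        resid = np.maximum(z, 0.0) - y
        w -= lr * (2.0 / n) * (X.T @ (resid * (z > 0)))  # subgradient step
    return w
```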

Efficiently Learning Adversarially Robust Halfspaces with Noise

We study the problem of learning adversarially robust halfspaces in the distribution-independent setting. In the realizable setting, we provide necessary and sufficient conditions on the adversarial perturbation sets under which halfspaces are …
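Concretely, for a perturbation set $\mathcal{U}$, the robust error of a halfspace $x \mapsto \mathrm{sign}(\langle w, x \rangle)$ is, in standard notation,

\[
\mathrm{err}_{\mathcal{U}}(w) \;=\; \Pr_{(x,y) \sim \mathcal{D}}\big[\exists\, \delta \in \mathcal{U}(x):\ y\,\langle w, x + \delta \rangle \le 0\big],
\]

and the realizable setting assumes some halfspace achieves zero robust error.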