
Learning Mixtures of Graphs from Epidemic Cascades

We consider the problem of learning the weighted edges of a balanced mixture of two undirected graphs from epidemic cascades. While mixture models are popular modeling tools, algorithmic development with rigorous guarantees has lagged. Graph mixtures …
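
As a purely illustrative sketch of the generative setup (not the paper's recovery algorithm): one cascade is drawn by picking one of the two graphs uniformly at random and spreading an infection along its weighted edges. The dictionary representation, the independent-cascade-style dynamics, and the function names below are all assumptions for illustration.

```python
import random

def sample_cascade(G1, G2, source, rng=random.Random(0)):
    """Draw one epidemic cascade from a balanced mixture of two
    weighted undirected graphs (hypothetical illustration only).

    G1, G2: dicts mapping node -> {neighbor: infection probability}.
    Each cascade picks one mixture component uniformly at random,
    then spreads in discrete rounds: every newly infected node
    infects each uninfected neighbor independently with the
    corresponding edge weight.
    """
    G = G1 if rng.random() < 0.5 else G2   # balanced mixture
    infected, frontier = {source}, [source]
    cascade = [(source, 0)]                # (node, infection time)
    t = 0
    while frontier:
        t += 1
        new_frontier = []
        for u in frontier:
            for v, p in G.get(u, {}).items():
                if v not in infected and rng.random() < p:
                    infected.add(v)
                    new_frontier.append(v)
                    cascade.append((v, t))
        frontier = new_frontier
    return cascade

# Example: two mixture components over the same three nodes.
G1 = {0: {1: 0.9}, 1: {0: 0.9, 2: 0.1}, 2: {1: 0.1}}
G2 = {0: {1: 0.1}, 1: {0: 0.1, 2: 0.9}, 2: {1: 0.9}}
print(sample_cascade(G1, G2, source=0))
```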

Superpolynomial Lower Bounds for Learning One-Layer Neural Networks using Gradient Descent

We prove the first superpolynomial lower bounds for learning one-layer neural networks with respect to the Gaussian distribution using gradient descent. We show that any classifier trained using gradient descent with respect to square-loss will fail …
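
For concreteness, here is a minimal sketch of the training procedure the lower bound concerns: gradient descent on the square loss of a one-hidden-layer ReLU network over Gaussian inputs. The network width, labels, step size, and iteration count are arbitrary illustrative choices, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 500, 10, 16                     # samples, input dim, width (arbitrary)
X = rng.standard_normal((n, d))           # Gaussian marginals
y = np.sign(X[:, 0])                      # placeholder labels

W = rng.standard_normal((k, d)) / np.sqrt(d)
a = rng.standard_normal(k) / np.sqrt(k)
lr = 0.01

for step in range(200):
    H = np.maximum(X @ W.T, 0.0)          # hidden ReLU activations
    resid = H @ a - y                     # square-loss residual
    # Gradients of the mean square loss w.r.t. a and W.
    grad_a = H.T @ resid / n
    grad_W = ((resid[:, None] * (H > 0)) * a).T @ X / n
    a -= lr * grad_a
    W -= lr * grad_W

print("square loss:", np.mean((np.maximum(X @ W.T, 0) @ a - y) ** 2))
```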

Time/Accuracy Tradeoffs for Learning a ReLU with respect to Gaussian Marginals

We consider the problem of computing the best-fitting ReLU with respect to square-loss on a training set when the examples have been drawn according to a spherical Gaussian distribution (the labels can be arbitrary). Let $opt$ …
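
A minimal sketch of the objective in question: the empirical square loss of a candidate ReLU on examples with spherical Gaussian marginals. The data-generating labels below are an arbitrary choice for illustration.

```python
import numpy as np

def relu_square_loss(w, X, y):
    """Empirical square loss of the ReLU x -> max(0, w . x)."""
    return np.mean((np.maximum(X @ w, 0.0) - y) ** 2)

rng = np.random.default_rng(0)
d, n = 5, 10_000
X = rng.standard_normal((n, d))           # spherical Gaussian examples
w_star = np.ones(d) / np.sqrt(d)
y = np.maximum(X @ w_star, 0) + 0.1 * rng.standard_normal(n)  # arbitrary labels

print(relu_square_loss(w_star, X, y))     # roughly the noise level, ~0.01
```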

Learning Ising Models with Independent Failures

We give the first efficient algorithm for learning the structure of an Ising model that tolerates independent failures; that is, each entry of the observed sample is missing with some unknown probability p. Our algorithm matches the essentially …
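
A hedged sketch of the observation model: draw a sample from an Ising model (here via a crude Gibbs sampler, purely for illustration) and then erase each coordinate independently with probability p. Neither the sampler nor the coupling matrix reflects the paper's algorithm.

```python
import numpy as np

def gibbs_ising_sample(J, n_steps, rng):
    """Crude Gibbs sampler for an Ising model with coupling matrix J
    (zero diagonal); illustration only, not the paper's method."""
    p = J.shape[0]
    x = rng.choice([-1, 1], size=p)
    for _ in range(n_steps):
        i = rng.integers(p)
        field = J[i] @ x                   # local field at site i
        prob_plus = 1.0 / (1.0 + np.exp(-2.0 * field))
        x[i] = 1 if rng.random() < prob_plus else -1
    return x

def observe_with_failures(x, p_fail, rng):
    """Each entry is independently missing (np.nan) with prob p_fail."""
    x = x.astype(float)
    x[rng.random(x.shape) < p_fail] = np.nan
    return x

rng = np.random.default_rng(0)
J = np.array([[0.0, 0.5, 0.0], [0.5, 0.0, 0.5], [0.0, 0.5, 0.0]])
sample = gibbs_ising_sample(J, 1000, rng)
print(observe_with_failures(sample, p_fail=0.2, rng=rng))
```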

Learning Neural Networks with Two Nonlinear Layers in Polynomial Time

We give a polynomial-time algorithm for learning neural networks with one layer of sigmoids feeding into any Lipschitz, monotone activation function (e.g., sigmoid or ReLU). We make no assumptions on the structure of the network, and the algorithm …
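
A minimal sketch of the function class described: one layer of sigmoid units whose weighted sum feeds into a Lipschitz, monotone outer activation. The particular shapes and the choice of tanh as the outer activation are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def two_nonlinear_layer_net(x, W, a, outer=np.tanh):
    """f(x) = u(a . sigmoid(W x)) with a Lipschitz, monotone outer
    activation u (tanh here; sigmoid or ReLU also fit the class).
    Shapes: W (k, d), a (k,), x (d,). Illustrates the function class
    only, not the paper's learning algorithm."""
    return outer(a @ sigmoid(W @ x))

rng = np.random.default_rng(0)
d, k = 4, 3
W, a = rng.standard_normal((k, d)), rng.standard_normal(k)
print(two_nonlinear_layer_net(rng.standard_normal(d), W, a))
```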

Learning One Convolutional Layer with Overlapping Patches

We give the first provably efficient algorithm for learning a one hidden layer convolutional network with respect to a general class of (potentially overlapping) patches. Additionally, our algorithm requires only mild conditions on the underlying …
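
A minimal sketch of the model class, assuming 1-D inputs, a single shared filter, stride-1 (hence overlapping) patches, and average pooling; all of these are illustrative choices rather than the paper's exact setup.

```python
import numpy as np

def conv_one_layer(x, w, stride=1):
    """One-hidden-layer convolutional net: a single filter w slid over
    (potentially overlapping) patches of x, a ReLU nonlinearity, and an
    average over the patch outputs."""
    r = len(w)
    patches = [x[i:i + r] for i in range(0, len(x) - r + 1, stride)]
    return np.mean([max(0.0, float(w @ p)) for p in patches])

rng = np.random.default_rng(0)
x, w = rng.standard_normal(16), rng.standard_normal(4)
print(conv_one_layer(x, w))   # stride 1 => consecutive patches overlap
```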

Eigenvalue Decay Implies Polynomial-Time Learnability for Neural Networks

We consider the problem of learning function classes computed by neural networks with various activations (e.g. ReLU or Sigmoid), a task believed to be computationally intractable in the worst case. A major open problem is to understand the minimal …
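
To make the spectral condition concrete, here is a hedged sketch that computes the eigenvalue spectrum of a kernel Gram matrix on Gaussian data; the RBF kernel and all parameters are assumptions for illustration, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.standard_normal((n, d))

# Gram matrix of an RBF kernel; rapidly decaying spectra of such
# matrices are the kind of eigenvalue-decay condition the title refers to.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-sq_dists / (2.0 * d))

eigs = np.sort(np.linalg.eigvalsh(K))[::-1]
print(eigs[:10] / eigs[0])    # normalized leading eigenvalues decay quickly
```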

Reliably Learning the ReLU in Polynomial Time

We give the first dimension-efficient algorithms for learning Rectified Linear Units (ReLUs), which are functions of the form $x \rightarrow \max(0, w \cdot x)$ with $w \in S^{n-1}$. Our algorithm works in the challenging Reliable Agnostic learning …
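
A hedged sketch of the two one-sided errors tracked in the reliable agnostic model, evaluated for a thresholded ReLU hypothesis; the threshold, labels, and helper name are illustrative assumptions.

```python
import numpy as np

def reliable_errors(w, X, y, tau=0.0):
    """One-sided errors of the hypothesis h(x) = 1[max(0, w . x) > tau]
    against labels y in {0, 1}. In the reliable agnostic model, the
    false-positive rate must be near zero while the false-negative
    rate competes with the best hypothesis. Illustration only."""
    h = (np.maximum(X @ w, 0.0) > tau).astype(int)
    fp = np.mean((h == 1) & (y == 0))     # false positives
    fn = np.mean((h == 0) & (y == 1))     # false negatives
    return fp, fn

rng = np.random.default_rng(0)
d, n = 5, 10_000
X = rng.standard_normal((n, d))
w = np.ones(d) / np.sqrt(d)               # unit vector, w in S^{n-1}
y = (X @ w > 0.5).astype(int)
print(reliable_errors(w, X, y, tau=0.5))  # (0.0, 0.0) for the true w
```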