Graphical models are powerful tools for modeling high-dimensional data, but learning graphical models in the presence of latent variables is well-known to be difficult. In this work we give new results for learning Restricted Boltzmann Machines, …

We give the first statistical-query lower bounds for agnostically learning any non-polynomial activation with respect to Gaussian marginals (e.g., ReLU, sigmoid, sign). For the specific problem of ReLU regression (equivalently, agnostically learning …

We study the problem of learning graphical models with latent variables. We give the $first$ efficient algorithms for learning: 1) ferromagnetic Ising models with latent variables under $arbitrary$ external fields, and 2) ferromagnetic Potts model …

We consider the fundamental problem of ReLU regression, where the goal is to output the best fitting ReLU with respect to square loss given access to draws from some unknown distribution. We give the first efficient, constant-factor approximation …

We study the problem of learning adversarially robust halfspaces in the distribution-independent setting. In the realizable setting, we provide necessary and sufficient conditions on the adversarial perturbation sets under which halfspaces are …

We consider the problem of learning the weighted edges of a balanced mixture of two undirected graphs from epidemic cascades. While mixture models are popular modeling tools, algorithmic development with rigorous guarantees has lagged. Graph mixtures …

We prove the first superpolynomial lower bounds for learning one-layer neural networks with respect to the Gaussian distribution using gradient descent. We show that any classifier trained using gradient descent with respect to square-loss will fail …

We consider the problem of computing the best-fitting ReLU with respect to square-loss on a training set when the examples have been drawn according to a spherical Gaussian distribution (the labels can be arbitrary). Let $opt

We give the first efficient algorithm for learning the structure of an Ising model that tolerates independent failures; that is, each entry of the observed sample is missing with some unknown probability p. Our algorithm matches the essentially …

We give a polynomial-time algorithm for learning neural networks with one layer of sigmoids feeding into any Lipschitz, monotone activation function (e.g., sigmoid or ReLU). We make no assumptions on the structure of the network, and the algorithm …