Learning protein structure with a differentiable simulator
Abstract: While the problem of predicting protein structure from sequence is among the oldest in computational biology, current methods leave a significant fraction of the protein universe out of reach. Standard methodology involves two steps: (1) defining an energy landscape, whether with physics, statistics, or homology, and (2) sampling low-energy conformations. Often, even "correct" energy landscapes that assign the lowest energy to the correct structure will not generate it as a prediction, because the conformational sampling algorithm cannot find it. We have been developing an alternative approach to bridge this gap by directly training energy landscapes in tandem with the conformational sampling algorithms that operate on them. I will talk about this approach, backpropagation through simulators in general, and how we built a deep neural energy function that is trained by backpropagating through the *entire* protein folding process.
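
As a rough illustration of what backpropagating through a simulator means, here is a deliberately tiny sketch. It is not the model from the talk: the one-dimensional "conformation" `x`, the quadratic energy `E(x; theta) = 0.5 * (x - theta)**2`, the gradient-descent sampler, and every function name are assumptions made up for this example. The energy parameter `theta` is trained so that the state reached after a fixed number of simulation steps matches a target, with the gradient obtained by a hand-written reverse pass through each step.

```python
# Toy sketch only: all names and the quadratic energy are hypothetical,
# chosen to keep the reverse pass short enough to write by hand.

def simulate(theta, x0=0.0, step=0.1, n_steps=20):
    """Unrolled sampler: n_steps of gradient descent on E(x; theta).

    dE/dx = x - theta, so each update is x <- x - step * (x - theta).
    Returns the whole trajectory [x_0, ..., x_T].
    """
    xs = [x0]
    for _ in range(n_steps):
        x = xs[-1]
        xs.append(x - step * (x - theta))
    return xs


def loss_and_grad(theta, target, step=0.1, n_steps=20):
    """Squared error of the final state, and dLoss/dtheta via a reverse pass."""
    xs = simulate(theta, step=step, n_steps=n_steps)
    loss = (xs[-1] - target) ** 2

    adj_x = 2.0 * (xs[-1] - target)  # dLoss/dx_T
    grad_theta = 0.0
    for _ in range(n_steps):         # walk the simulation backwards
        grad_theta += adj_x * step   # direct effect: dx_{t+1}/dtheta = step
        adj_x *= 1.0 - step          # carry back:    dx_{t+1}/dx_t = 1 - step
    return loss, grad_theta


def train(target=1.5, lr=0.5, n_iters=200):
    """Fit theta so that the *simulated* end state lands on the target."""
    theta = 0.0
    for _ in range(n_iters):
        _, grad = loss_and_grad(theta, target)
        theta -= lr * grad
    return theta


if __name__ == "__main__":
    theta = train()
    print("learned theta:", theta)
    print("simulated end state:", simulate(theta)[-1])
```

In this toy the learned `theta` ends up larger than the target, because the 20-step sampler never fully reaches the energy minimum. Training through the sampler shifts the landscape so that the sampler's actual output is correct, which is the sense in which the energy and the sampling procedure are fitted together.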