Good morning. My name is Emiliani Polipin. I work in the Unix Research Centre in Germany, and this morning my task is to give you a short introduction to QM/MM approaches. So far, during this week, you have seen and used methods based on classical physics. However, if we want to talk about QM/MM, which is short for quantum mechanics/molecular mechanics, we need to talk about quantum physics. I also know that some of you are not yet very familiar with quantum physics and quantum mechanics. For this reason, in the first part of this lesson, I am going to review some basic elements of quantum mechanics and quantum chemistry that are essential to understand, let's say, the big picture. But I would like you to keep in mind from the beginning that if you want to use QM/MM approaches to investigate your systems, what we will see and learn today is absolutely not enough, and you will need to make the effort to go deeper into the theory of computational quantum chemistry if you really plan to employ these approaches in your investigations.
After my theoretical lesson, my colleague, Maudi Vaibab, has prepared for you some tutorials where you will have the chance to see QM/MM at work with practical examples and real calculations using the CP2K code. CP2K is a rather popular quantum code with many features, some of which I will mention in this lesson, and others will be introduced by Maudi himself in the tutorial. Summarizing, my intent today is to provide you with a quick overview of the essential concepts of quantum mechanics and computational quantum chemistry, and then to briefly introduce QM/MM in order to make it easier to understand what Maudi is going to show you in the tutorial.
In particular, here is the outline. In the first part, we will discuss when and why quantum mechanics is useful in biology. Moreover, as I said, I will give you a short recap of the crucial concepts in quantum mechanics and computational quantum chemistry.
Then we will talk about why we need to introduce hybrid QM/MM approaches, and in the second part of this lesson, after the break, we will go more in depth into how to build a QM/MM method and how the quantum and classical regions can be coupled together in a QM/MM scheme, focusing in particular on the approaches implemented in the CP2K code.
Let's start by answering the question: why do we need quantum mechanics in describing biological systems? In fact, in the previous days, you used approaches such as force-field-based molecular mechanics or docking, where the finest resolution is the atom. What does it mean? The atoms are described as points in space moving according to the classical Newtonian equations, if the dynamics is of interest. However, there are many phenomena in nature, including in biology, where such a level of detail is not sufficient. Here are some examples. First of all, when chemical reactions are involved, for example if you want to study enzymatic reactions. Second, when you want to work with a system containing metal atoms, for which in general no universal classical parametrization is available, and in each specific case an ad hoc force field parameter set has to be tuned. The third example is when you want to study a phenomenon that involves proton transport, such as in the aerobic generation of ATP or in photosynthesis. In fact, in hydrogen-bonding solvents like water, protons do not diffuse as the other common cations do, that is, by random Brownian motion due to thermal fluctuations. Instead, the excess proton diffuses via the so-called Grotthuss mechanism, sketched here in the picture on the right: a mechanism that implies the formation and the concomitant cleavage of covalent bonds involving neighboring molecules.
As a last example, quantum mechanics is necessary when we need to perform first-principles-based predictions of spectroscopic data, such as absorption or fluorescence spectra or even NMR, because empirical parametrizations are usually unavailable or unreliable. What do all these examples have in common? The fact that the dynamical behavior of the electrons inside the atoms is fundamental for a correct description of the phenomena, and cannot be neglected as is done, for example, when we use the force-field-based approach. Unfortunately, electrons, as well as the lightest nuclei such as protons, cannot be dynamically described through the classical Newton equations, and a more complex theory is required, that is, quantum mechanics.
The fundamental equation of quantum mechanics is the so-called Schrödinger equation, here at the top of this slide. Calculating the properties of a quantum system implies solving the corresponding Schrödinger equation, which you can consider the equivalent of the Newtonian equation in the classical world. The unknown variable of the Schrödinger equation is psi, the so-called wave function of the entire quantum system. It is a function of the coordinates of the quantum elements in the system, that is, electrons and nuclei for a molecular system, and of time. Knowing the wave function of the system at a certain time allows us to compute the properties of that system at that time. How? By solving, in principle, an integral like this. The square modulus of psi is proportional to the charge density distribution of the system. The H in the Schrödinger equation is instead the so-called Hamiltonian, and it represents the physics of the system. It is the equivalent of the force field at the classical level. That is, it contains the interaction energy terms of the electrons and the nuclei.
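The slide is not part of this transcript, so the equations referred to above are not reproduced here. As a hedged reference, they can be written in a common textbook notation as follows. Hartree atomic units are assumed, the symbols $\mathbf{r}$ and $\mathbf{R}$ stand for all electronic and all nuclear coordinates, $\hat{A}$ is a generic observable, and the Hamiltonian shown is the standard nonrelativistic molecular one (including the electron-electron repulsion), which may differ in detail from the one on the slide.

```latex
% Time-dependent Schroedinger equation (atomic units, \hbar = 1)
i\,\frac{\partial}{\partial t}\,\Psi(\mathbf{r},\mathbf{R},t)
  = \hat{H}\,\Psi(\mathbf{r},\mathbf{R},t)

% Property (expectation value) of an observable \hat{A} at time t
\langle A \rangle(t)
  = \int \Psi^{*}(\mathbf{r},\mathbf{R},t)\,\hat{A}\,
         \Psi(\mathbf{r},\mathbf{R},t)\,d\mathbf{r}\,d\mathbf{R}

% Standard nonrelativistic molecular Hamiltonian:
% nuclear kinetic, electronic kinetic, electron-electron,
% electron-nucleus, and nucleus-nucleus terms
\hat{H} = -\sum_{I}\frac{1}{2M_I}\nabla_{I}^{2}
          -\sum_{i}\frac{1}{2}\nabla_{i}^{2}
          +\sum_{i<j}\frac{1}{|\mathbf{r}_i-\mathbf{r}_j|}
          -\sum_{i,I}\frac{Z_I}{|\mathbf{r}_i-\mathbf{R}_I|}
          +\sum_{I<J}\frac{Z_I Z_J}{|\mathbf{R}_I-\mathbf{R}_J|}
```

Here $M_I$ and $Z_I$ are the mass and charge of nucleus $I$, lowercase indices run over electrons, and uppercase indices run over nuclei.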
For example, here at the bottom I reported the typical Hamiltonian used to describe molecular systems in quantum chemistry, where small and capital R now represent electronic and nuclear coordinates respectively. From left to right you can recognize the kinetic term for the nuclei, the kinetic term for the electrons, the Coulomb interaction terms involving the electrons (electron-electron and electron-nucleus), and finally the Coulomb interaction term between the nuclei.
The Schrödinger equation is mathematically extremely complex to solve. Just to give you an example, even with only one particle, so N equal to 1, and a not-so-complex Hamiltonian, exact analytical solutions are often not available. Therefore, for the many-body systems we usually deal with, the Schrödinger equation can only be solved approximately, by numerical solutions generated by our computers, and many approximate approaches have been devised over the years for this purpose. In particular, if we are interested only in the static properties of the system and not in its dynamical behavior, it can be shown that it is enough to solve a slightly simpler equation, called the time-independent Schrödinger equation, where the wave function psi is now independent of time. Here on the right I listed some of the many schemes developed to approximately solve this simpler equation: Hartree-Fock theory, coupled cluster, density functional theory, etc.
When instead we are also interested in the dynamics of the quantum system, the full time-dependent Schrödinger equation needs to be solved, and several different approaches are nowadays available to approximately find the solution of this equation. These are the so-called ab initio molecular dynamics schemes, such as the Ehrenfest, the Born-Oppenheimer, or the Car-Parrinello molecular dynamics schemes.
All these schemes share the assumption, or better, the approximation, that the motion of the atomic nuclei and that of the electrons in the molecule can be treated separately, and in addition the assumption that the nuclear motion can be considered as a classical motion. This is mostly because the masses of the nuclei are at least three orders of magnitude larger than the electronic mass and, thinking classically, the nuclei are much slower than the electrons. This assumption is often referred to as the Born-Oppenheimer approximation, not to be confused with the molecular dynamics scheme just mentioned, even if nowadays this name refers to a more technical aspect of the assumption itself that I cannot describe here. In the large majority of molecular systems, including the typical large biological systems, the Born-Oppenheimer approximation is well verified and can be safely employed.
I do not have time to go into much more detail about the different ab initio molecular dynamics schemes, but I want to briefly mention how the Born-Oppenheimer dynamical scheme works, because it is implemented in the CP2K code and you will use it in the tutorial. On this slide I wrote down the two equations that describe the Born-Oppenheimer molecular dynamics scheme. The first thing to note is the separation between the electronic degrees of freedom, first equation, and the nuclear degrees of freedom, second equation. The second thing to note is that the electronic problem does not evolve in time: it is a time-independent Schrödinger equation, while the nuclear degrees of freedom evolve in time as classical entities, that is, according to a Newton-like equation, mass times acceleration. The two dots over the capital R mean the second derivative with respect to time. So mass times acceleration is equal to minus the gradient of a quantity that represents the potential felt by the nuclei.
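The slide with the two equations is not included in this transcript. In the notation commonly used for Born-Oppenheimer molecular dynamics they can be written as below. This is a hedged reconstruction, not a copy of the slide: the symbols $\hat{H}_e$, $\Psi_0$, $E_0$ and $M_I$ are assumed, and $\hat{H}_e$ is taken to contain everything except the nuclear kinetic energy.

```latex
% Electronic problem: time-independent Schroedinger equation,
% with the nuclear positions R entering only as parameters
\hat{H}_e(\mathbf{r};\mathbf{R})\,\Psi_0(\mathbf{r};\mathbf{R})
  = E_0(\mathbf{R})\,\Psi_0(\mathbf{r};\mathbf{R})

% Nuclear problem: classical Newton-like equation for nucleus I
M_I\,\ddot{\mathbf{R}}_I(t)
  = -\nabla_{I}\,\min_{\Psi_0}
    \langle \Psi_0 \,|\, \hat{H}_e \,|\, \Psi_0 \rangle
```

The gradient $\nabla_I$ is taken with respect to the position $\mathbf{R}_I$ of nucleus $I$, and the minimization runs over normalized electronic wave functions.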
Schematically, the algorithm associated with this molecular dynamics approach proceeds this way. First, at each step, a time-independent Schrödinger equation involving only the electronic degrees of freedom is solved via one of the electronic structure methods mentioned before, such as Hartree-Fock, density functional theory, etc. Note that in the Hamiltonian H_e in this equation, the nuclear coordinates, capital R, are not dynamical variables, but just parameters. In this approximation, the electrons move within a static electric field due to the presence of the nuclei. Second, the electronic wave function psi 0, found by solving this time-independent Schrödinger equation, is used in the algorithm to calculate the forces on the nuclei via the right-hand side of the second equation. In fact, the forces are obtained as minus the gradient of a potential that depends on the electronic wave function psi 0. Finally, having obtained the forces, the nuclei are moved according to the Newton-like equation.
Note that the electronic problem in the first equation has, in principle, an infinite number of solutions, each one corresponding to a different energy state. However, among those solutions, here we are interested in the wave function corresponding to the state with the smallest energy, that is, the ground state, as indicated by the subscript 0. The min symbol in the second equation, instead, refers to the fact that the electronic problem, which consists of solving the time-independent Schrödinger equation, that is, of finding psi 0, can be recast as a variational problem, that is, a problem of finding a minimum. This is the so-called wave function minimization or optimization procedure, and on a computer it is more convenient than any other algorithm that tries to solve the time-independent Schrödinger equation directly. We will see an example of this minimization algorithm in a few slides. Let's now focus on the methods to solve the first equation.
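Before moving on, the three-step Born-Oppenheimer loop described above (solve the electronic problem, compute the forces, move the nuclei) can be sketched on a toy model. This is only an illustration under stated assumptions, not how CP2K works: the "electronic structure" is a made-up 2x2 Hamiltonian depending on a single nuclear coordinate R, the ground state is obtained by direct diagonalization (which gives the same result as the variational minimum for such a small matrix), the force is evaluated with the Hellmann-Feynman expression, and the nuclei are propagated with velocity Verlet. All function names and parameter values are hypothetical.

```python
import numpy as np

def electronic_hamiltonian(R, k=1.0, coupling=0.2):
    """Toy 2x2 electronic Hamiltonian; the nuclear coordinate R is only a parameter."""
    return np.array([[0.5 * k * (R - 1.0) ** 2, coupling],
                     [coupling, 0.5 * k * (R + 1.0) ** 2 + 0.5]])

def hamiltonian_gradient(R, k=1.0):
    """Derivative of the toy Hamiltonian with respect to R."""
    return np.array([[k * (R - 1.0), 0.0],
                     [0.0, k * (R + 1.0)]])

def solve_electronic_problem(R):
    """Step 1: ground-state energy E0(R) and wave function psi0 at fixed nuclei."""
    energies, states = np.linalg.eigh(electronic_hamiltonian(R))
    return energies[0], states[:, 0]

def nuclear_force(R, psi0):
    """Step 2: force = -<psi0| dH/dR |psi0> (Hellmann-Feynman, toy model)."""
    return -psi0 @ hamiltonian_gradient(R) @ psi0

def run_bomd(R0, V0, mass=10.0, dt=0.05, n_steps=200):
    """Step 3, repeated: move the nucleus classically with velocity Verlet."""
    R, V = R0, V0
    e0, psi0 = solve_electronic_problem(R)
    F = nuclear_force(R, psi0)
    trajectory = []
    for _ in range(n_steps):
        V += 0.5 * dt * F / mass               # first half kick
        R += dt * V                            # move the nucleus
        e0, psi0 = solve_electronic_problem(R)  # re-solve electrons at new R
        F = nuclear_force(R, psi0)
        V += 0.5 * dt * F / mass               # second half kick
        trajectory.append((R, 0.5 * mass * V ** 2 + e0))
    return np.array(trajectory)  # columns: position, total energy
```

Calling `run_bomd(R0=0.5, V0=0.0)` returns the position and the total energy at each step; the total energy should stay nearly constant, which is a simple sanity check that the forces are consistent with the ground-state energy surface.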
As I mentioned before, many methods have been devised over the years to approximately solve the time-independent Schrödinger equation. Some are very accurate and computationally expensive. Others are computationally less demanding but also limited in accuracy. However, almost every quantum code used in computational biophysics and biochemistry implements density functional theory, including CP2K, the code you will use in the tutorial. In fact, this relatively recent approach represents probably the best compromise between accuracy and computational cost, and it is therefore currently one of the very few approaches that offer the possibility to deal with systems of the order of hundreds of atoms with sufficient accuracy.
Let's briefly describe density functional theory. The theory is based on the following two theorems proved by the physicists Hohenberg and Kohn. The first: the ground state energy, and therefore the ground state properties, of a many-electron system is a unique functional of the electronic density rho. Here functional means a function of another function. In fact, the density rho is a function of the three space coordinates: at each point of three-dimensional space, you have a value of the electronic density. The second theorem says that the functional for the ground state energy, E of rho, is variational in the sense we have mentioned before. The benefit of using this method is that, instead of calculating the wave function psi 0, which depends on all the electronic coordinates, in DFT the properties of the system depend only on rho, which in turn depends only on three coordinates, the spatial coordinates. The drawback is that the functional E of rho is not known, and therefore, as it is, the theory cannot be used in practice. Fortunately, the physicists Kohn and Sham in 1965 had the idea to recast the problem in order to make density functional theory a practical method. Their idea is simple.
They hypothesized a fictitious system of non-interacting electrons with a local potential that, by construction, generates the same electronic density rho as that of the real, fully interacting electron system. In this way, the problem of finding the density rho of the real, fully interacting system that minimizes the functional E of rho is recast as the problem of solving N single-electron equations, which are much easier to solve than one equation for N electrons. Above all, all the terms in the single-electron equations are known apart from one, the so-called exchange-correlation functional. In these Kohn-Sham equations the phi i's represent the single-electron wave functions, not to be confused with the psi 0 of the previous slides, which is the wave function of the entire electronic system formed by N electrons, and rho can be obtained from the N different phi i's by using the first relation on the slide.
In practice, quantum chemists have proposed many recipes to approximate the unknown exchange-correlation functional E_XC, for example by calculating it for the simplest cases, such as the homogeneous electron gas, or by fitting experimental data. When you have to specify the level of theory you are going to use to solve the electronic problem with density functional theory, you also need to state explicitly the exchange-correlation functional you decided to use.
Having decided which exchange-correlation functional to use, how do we get the electronic density rho? In other words, how does the algorithm that solves the system of N single-electron Kohn-Sham equations work? This is done through an iterative procedure. Why an iterative procedure? Because the Kohn-Sham equations are non-linear, which means that some of the terms in the equations depend on the electronic density itself, that is, on the solution that you want to find. The iterative procedure to solve these equations can be summarized this way.
First, we start with an arbitrary electronic density in order to completely define the equations to be solved. Then we find the phi i's, that is, the single-electron wave functions, which can be used to get the new electronic density by using this equality. We measure the difference between the new and the previous density: if the difference is below some predetermined threshold, we consider the new density converged and we stop the iteration. Otherwise we take the new density and go back to the first step. This self-consistent approach is very commonly used in quantum chemistry and, in general, whenever we have to solve non-linear equations like the Kohn-Sham equations.
Now, all the equations we have seen so far, including the Kohn-Sham equations, are continuous equations. Their solutions, for example the phi i's in the Kohn-Sham equations, are functions defined over all space. However, to solve such a problem on a computer we need to discretize it. How can we discretize the problem of solving the Kohn-Sham equations? This is done by expanding the wave functions or the density over a finite set of known functions, and we usually refer to this set of functions as the basis set. In this way, the problem of solving a continuous differential equation is recast as the problem of diagonalizing a matrix and finding its eigenvalues and eigenvectors. When you want to specify the type of quantum chemistry calculation you are going to perform on a computer, you need to specify both the method, for example DFT together with the chosen exchange-correlation functional, and the basis set employed. Together, the method, the exchange-correlation functional, and the basis set completely define the level of theory employed. Commonly, two classes of basis sets can be identified.
The localized basis sets, such as atom-centered Gaussian functions, are very suitable to describe the wave functions of localized objects like molecules, while non-local basis sets, such as plane waves, were originally employed to describe the wave functions of condensed matter or solid state systems. Both types of basis sets have advantages and disadvantages. On the slide I listed some of them, but I do not want to go into more detail now.
Actually, the code that we use in the tutorial, CP2K, also implements a more sophisticated approach which combines both classes. This is the so-called hybrid or dual Gaussian and plane wave method, in short GPW. The method uses an atom-centered Gaussian-type basis to describe the wave functions, the phi i's, but also an auxiliary plane wave basis to describe the density. Note that using a plane wave basis set for the charge density means using grids in real space to represent the charge density. In fact, by a mathematical operation called the Fourier transform, which can be computed very efficiently on a grid with an algorithm called the fast Fourier transform, one can pass from the representation on the real-space grid to the representation in reciprocal space, that is, the G space of the plane waves. Here you can see how the plane waves are defined and what G represents. Let's say that finer grids, that is, grids with smaller cells, correspond to larger cutoffs in reciprocal space, that is, to larger grids in reciprocal space.
What is the advantage of this dual representation, that is, a localized basis set for the wave functions and a non-local basis set for the density? The advantage is mainly in performance or, better, in scaling. In fact, with the density represented as a sum of plane waves or, which is the same, on a regular grid, the efficiency of the fast Fourier transform algorithm can be exploited to obtain the long-range energy terms, which we will see shortly, in a time that scales linearly with the system size.
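As a rough illustration of why a regular grid is convenient, the sketch below solves the Poisson equation for a periodic charge density with NumPy's FFT, assuming that the long-range term of interest is the electrostatic (Hartree) potential. It is not CP2K's implementation: the function name `hartree_potential_fft`, the cubic box, the use of atomic units, and the choice to drop the G = 0 component (equivalent to a neutralizing background) are all assumptions made for this example.

```python
import numpy as np

def hartree_potential_fft(density, box_length):
    """Electrostatic potential of a periodic density on a cubic n x n x n grid.

    In reciprocal space the Poisson equation becomes algebraic:
    V(G) = 4 * pi * rho(G) / |G|^2 for every G != 0 (atomic units).
    """
    n = density.shape[0]
    # Angular wave numbers G along one axis of the periodic box.
    g_1d = 2.0 * np.pi * np.fft.fftfreq(n, d=box_length / n)
    gx, gy, gz = np.meshgrid(g_1d, g_1d, g_1d, indexing="ij")
    g_squared = gx ** 2 + gy ** 2 + gz ** 2

    # Real-space grid -> reciprocal space (plane-wave coefficients).
    density_g = np.fft.fftn(density)

    # Divide by |G|^2 everywhere except G = 0, which is set to zero.
    potential_g = np.zeros_like(density_g)
    nonzero = g_squared > 0.0
    potential_g[nonzero] = 4.0 * np.pi * density_g[nonzero] / g_squared[nonzero]

    # Reciprocal space -> real-space grid.
    return np.fft.ifftn(potential_g).real
```

For a density that is a single cosine wave along x with the period of the box, the returned potential is the same cosine multiplied by `box_length**2 / np.pi`, which gives a quick way to check the routine. A finer grid (larger `n` at fixed `box_length`) includes larger values of |G|, which is the grid and cutoff relation mentioned above.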
This favorable scaling is very important because the evaluation of these long-range terms is one of the major bottlenecks of standard Gaussian-based calculations, the typical calculations that are done in quantum chemistry. In the second part of this lesson I will come back to this with more details. As you will see in the tutorial, in order to set up a calculation in CP2K that uses the GPW approach you will have to provide information on both the Gaussian-type orbitals and the plane wave basis set that you want to use.
Now, the largest systems investigated so far via full quantum mechanical approaches, that is, by describing the entire molecular system through quantum mechanics, include fewer than 10,000 atoms and were very probably investigated only with density functional theory. In contrast, typical sizes of biological systems are much larger than 10,000 atoms. Therefore, investigating interesting biological systems at the full quantum mechanical level is beyond the current state of hardware and software technologies. But, as we have seen at the beginning, there are cases where a quantum mechanical resolution is also required for large biological systems. This implies that for these cases, at present, the only viable way is to resort to multi-scale approaches, for example the hybrid quantum mechanical/molecular mechanical one. In fact, in biological systems the region where the electronic description is necessary is usually a spatially limited area of the system, for example the region where the chemical reaction takes place, and this feature makes a QM/MM approach very suitable for these systems, because in the QM/MM approach the system is fictitiously separated into parts that are described at different levels of theory. A smaller part, the QM or quantum part, usually the chemically active region or, in general, the region where the electronic degrees of freedom are important, is treated at the quantum level by computationally demanding electronic structure methods, for example density functional theory.
And the rest of the system, which contains the atoms that, for example, do not directly participate in the reaction, is instead described efficiently at a lower level of theory, usually by a classical force field. This part is referred to as the MM or classical part. A QM/MM interface is the part of a code, or a standalone code, that couples the two different resolutions in a coherent way.
Okay, we have reached the end of the first part of the lesson. In the second part we will go more in depth into how the coupling between the quantum and the classical regions can be done, and we will describe different QM/MM approaches, including the one implemented in the CP2K code that we will see in the tutorial.