Hello, good afternoon and welcome to the latest webinar in the BioExcel webinar series. My name's Adam Carter and I'll be the host for today. I'm just going to give you a very brief introduction, just a couple of minutes, to tell you a little bit about BioExcel and our speaker today, Raffaello Potestio, and then I'll hand over to Raffaello for his presentation. Just a quick note to let you know that this webinar is being recorded and will go up later on YouTube, so that includes the question and answer session at the end.

I'm sure many of you will already be aware of what BioExcel is doing. We're a centre of excellence, been around for over two years now, a centre of excellence for computational biomolecular research, which is built around three main pillars. The first is the biomolecular software that we're working with, three codes in particular in this round of the project: GROMACS, HADDOCK, and CPMD. We're working to improve the performance, efficiency, and scalability of these codes that are widely used in the community. The second important part of what we're doing in the project is usability. So as well as improving the codes in terms of what they do and how fast they do it, we want to make sure that they can all be used easily. Part of the project is looking at workflow environments with data integration and different ways to make it easier to use these three tools and also other biomolecular research tools. The final part of what we do is consultancy and training, and the webinar, to some extent, falls into this part of what we do: we promote best practices, we train end users, and we try to let people know about what's going on in this community.

As part of what we're doing, we have a number of interest groups. If you've not already joined, then you may find that at least one of these interest groups is of interest to you; particularly if you're here, it suggests that your interests coincide with at least one of them already. So do have a look at our web pages at bioexcel.eu and you can sign up to an interest group from there.

As we go through today's webinar, we will probably run all the way through and save the questions till the end; it just makes it easier rather than trying to hand back and forth. But we will leave time at the end of the presentation for some questions. So if you do have a question, you can type it into the questions section at the side of the GoToWebinar panel and I'll come to you at the end. If you've got a microphone and you want to ask your question directly, I'll invite you to ask the speaker directly. If you'd prefer, I can read out the question that you've typed in and pass it on to the speaker. I'll put that slide up at the end again to remind you, but that's the way we'll take questions at the end.

OK, so I'm now shortly going to hand over to Raffaello Potestio. He's a researcher in the physics department at the University of Trento. He graduated originally from the University of Rome, La Sapienza, in 2006 with his master's thesis on lattice quantum chromodynamics under the supervision of Martinelli. In 2010, he defended his PhD thesis on coarse-grained models of protein structure and interactions, supervised by Micheletti. He then joined Kremer's group at the Max Planck Institute for Polymer Research in Mainz, and in August 2013 he became the project leader of the statistical mechanics of biomolecules group.
More recently, he was awarded an ERC Starting Grant for the VARIAMOLS project on the development and application of variable-resolution modeling strategies for the computational study of large biomolecules. This project is underway at the University of Trento, where he's enrolled as a tenure-track assistant professor. His main research interest is in the development and application of coarse-grained models and coarse-graining strategies for soft matter, in particular biologically relevant systems. The goals of his approach are to understand the most fundamental or universal features of a system and to improve the computational efficiency of the simulations. He also works on the study of topologically self-entangled biopolymers, things like knotted proteins and DNA, and he makes use of both standard and ad hoc coarse-graining methods developed specifically for these systems. As you can tell, Raffaello is well placed to give today's webinar, so I will now hand over the baton to Raffaello and he should be able to start his presentation from here.

Good afternoon. Thank you very much, Adam, for your very kind presentation. I hope everything is going fine and everyone is able to see the slides. It's a great pleasure for me to be here today and tell you something about the work that I mainly carried out during my years at the Max Planck Institute for Polymer Research; most of what I will present today was carried out in that context. Here is a list of the topics I will touch on. First of all, I will go through the basic concepts of multi-scale problems and simulations, especially in a biophysical and biochemical context. Then I will discuss how these problems are tackled by means of modeling based on coarse-grained methods. Then I will move towards the core of the presentation, that is, dual-resolution models of liquids first, and I will focus on two methodologies that have been developed in our group in Mainz, the AdResS method and the H-AdResS method. I will go through those thoroughly, and then I will discuss their applications, in particular for simulations of proteins. After that I will discuss dual-resolution models of the proteins themselves, not only of the liquids in which the proteins are to be found. And then I will draw some conclusions and think about some perspectives for this kind of research.

I would like to start with a cartoon that gives an intuitive feeling for multi-scale problems in biology, in a maybe dramatic manner. What I like about this picture is the fact that it conveys the idea of similar, if not the same, things happening at the different levels of resolution or detail and size and time scale that you might look at. Here, pretty much the same thing happens at the different levels. Obviously, in biological and biophysical problems, it is not the same thing that happens at all the different levels, but what is interesting is the fact that all these levels are interconnected, and this is what makes these kinds of problems at the same time very interesting and very challenging. Regarding the systems that we focus on, we are in the very low region of this, let's call it, diagram: we focus on molecules, biomolecules and, at most, macromolecular assemblies, so let's say up to the size of a viral capsid.
In a more formal and pragmatic manner, when we talk about multi-scale problems, we think about the fact that the phenomena that take place in soft and biological matter, and the length and time scales over which these phenomena take place, are smeared over a very broad range. This goes from the very small, like the phenomena that take place at an atomic or molecular level, where the molecules are, for example, a water molecule with its vibrations and rotations and so on and so forth. Here, in order to understand what goes on, we have to take explicitly into account the quantum nature of a system like that. If we increase the range of what we look at, we go to the size of complex molecules with several atoms, tens, hundreds, thousands, molecules composed of this large number of atoms, whose interesting behavior can be described, in many cases, not all, but in many cases, in terms of classical atomistic force fields. And the larger we go, the coarser the description typically becomes that we have to employ to deal with these systems, where to deal with them typically means to perform a simulation of them. The different levels at which we look determine, at the same time, the properties that we are interested in and the tools that we have to employ, again typically at the computational level, to study the system. If I want to study the vibrational spectrum of a molecule, I have to look at a very small length and time scale and make use of methods that take into account the explicit quantum properties of the forces from which this spectrum emerges. If I go to the other end of the diagram, if I want to look, for example, at the large-scale conformational fluctuations of an entire virus, it is clear that I cannot provide a quantum description of the system. Rather, a fairly coarse-grained model is definitely more appropriate and is the right one to get at that particular level of detail.

However, many problems are interconnected, in the sense that the phenomena that we might be interested in might require a high level of detail, so a very fine-grained description of our system, and yet the system is particularly large. This represents a problem because, as in this particular example, which is not recent, it's 12 years old, and things have improved since then, but we still face similar problems: we would like to understand, for example at the atomistic level, the properties, the mechanics, the dynamics of a system as complex as a virus, with the complexity given not only by the number of atoms but also by the diversity of the molecules involved, that is, the proteins that constitute the viral capsid, the DNA or RNA that is inside, the water and the ions that completely permeate the system. And yet the size of the system is that of a system that, from our perspective, from a computational point of view, is better described by means of a coarse-grained model. So at the same time, we have the desire to know how the system works in atomistic detail, and yet the computational resources are such that the ideal description to use for a system like that would be a coarse-grained one. There are several reasons why we would like to use a simplified representation for a system like that.
The first one, and most obvious, is the one that I just mentioned: a lot of atoms and a lot of forces to compute means that we need large computers and a large amount of time to perform simulations; we need large, important computational resources. Another interesting point, which from my perspective is typically overlooked, is the fact that a model that is a simplified description of a system provides useful information about the system already at the modeling stage. Putting it in a very simple manner, you could say that if you set up a model of your system that is a simplified representation of it, and it works, meaning that it is capable of providing quantitatively and qualitatively correct information, it means that you have selected the right input for the system. Of course, rubbish in, rubbish out, and you might even get the right response for the wrong reasons, but typically, if you've set up a model that works, this means that you have selected the important features for the particular property or phenomenon you were interested in. And this tells you a lot about the system, because it means that you have captured the essence of it. Finally, if you have a model that is relatively cheap from the computational point of view, you can do something like an exploration of the parameter space, which means that you can run not one but many simulations simultaneously. What you can do is, for example, figure out what kind of responses the system has in different conditions of temperature, of pH, or whatever other external knob you might want to turn. Again, modeling is a source of insight in itself, as I mentioned, because setting up a system that is simple and works means that you have gotten the right ingredients; but you can also look at this process from the other end, that is, the fact that a simplified description allows us to deal with a considerably smaller amount of data to analyze. And this helps from the practical point of view, from the point of view of storage, analysis and interpretation of the data.

Focusing on the aspect of coarse-graining, several approaches have been developed to deal with complex systems in soft matter. Coarse-grained methods can be, for example, top-down, in the sense that some specific information that you have a priori about your system, at the final level of description you are interested in, is available, for example the structure of a protein, and this information is employed to build the model at the level of the fundamental interactions that constitute it. What does this mean? It means that, for example, you can construct protein models like elastic network models, in which you start from the structure and construct the interactions based on that, in order to characterize the conformational fluctuations of the model. These interactions are not the real interactions that you have among the atoms or residues in the system; they come from the structure itself. They're not transferable, they're not the real ones, so to speak, but they allow you to perform simulations or calculations that provide interesting, useful information. Gō models proceed in a similar manner, even though they are typically used for different things, like folding, and they also rely on the knowledge of the structure in order to determine the interactions, and, based on those interactions, to study the process of folding, for example.
In the opposite direction go bottom-up approaches, or systematic coarse-graining as it is called, in which basically you already have a fine-grained model of your system, for example a fully atomistic representation, and you want to construct a coarse-grained model based on an algorithmic procedure, in which you have a well-defined set of rules that you have to follow to go from a fine-grained to a coarse-grained level of description. What you have to do is to start from a mapping function that tells you which atoms have to be employed to construct an effective interaction centre and how. Then you have to provide a criterion that tells you what your model has to do in order to be considered to work nicely; for example, it has to sample the conformational space according to the same distribution that the fine-grained system would. And then you have to provide some target for the interactions, like, for example, the many-body potential of mean force. The big problem in this area is to approximate the many-body potential of mean force. Several methods have been developed to do that; these are only three, iterative Boltzmann inversion, multi-scale coarse-graining or force matching, and relative entropy, which are among those more commonly used to perform systematic coarse-graining.

And then comes the problem, that is, the fact that typically you do not have the chance of separating neatly the different levels of resolution, and therefore you cannot represent the whole system one way or another, with one coarse-graining method or the other, because the length and time scales are interconnected. A typical example is an enzyme whose size makes it reasonable to treat it at the coarse-grained level, but it has an active site in which the chemistry goes on, and this chemistry requires a fairly high level of description, which has to be at least an atomistic one, if not a quantum one. You would like to describe the whole thing at the coarse-grained level, yet you cannot, because coarse-graining smears out the detail and washes away the chemistry that is important for the active site. In order to tackle this particular kind of problem, different methods have been developed that are concurrent and/or multi-resolution. What does this mean? It means that there are at least some cases in which you can provide a different description of your system, with different levels of resolution, depending on the position that a certain part of the system occupies or depending on the role that different parts of the system play. These kinds of approaches are developed and applied differently depending on the particular properties of the system, which can mainly be divided into diffusive systems, like liquids or gases, and, so to speak, static systems, like a protein. In the first case, it is clear that there exists the problem of allowing molecules to diffuse back and forth between the two regions that you define to be the one at the high level of description and the one at the low level of description. So you decide that a subpart of your system can and should be described at a high level of resolution, for example atomistic; the rest can and should be described at a coarse-grained level; and you provide a geometric separation between these two parts and allow molecules to diffuse back and forth between these two regions. The fundamental idea, as we'll see, is that the molecules that cross the boundary between these two domains have to change their resolution on the fly.
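Going back to the iterative Boltzmann inversion mentioned a moment ago, here is a minimal sketch of its update rule, in which the coarse-grained pair potential is corrected by kT ln(g_i(r)/g_target(r)) until the coarse-grained radial distribution function matches the fine-grained target. This is an illustration, not code from the talk; the damping factor, units and the placeholder simulation call are assumptions.

```python
import numpy as np

KT = 2.494  # kJ/mol at roughly 300 K (assumed units)

def ibi_update(V, g_cg, g_target, damping=0.2, eps=1e-12):
    """One iterative Boltzmann inversion step:
    V_{i+1}(r) = V_i(r) + kT * ln(g_i(r) / g_target(r)),
    damped for numerical stability."""
    return V + damping * KT * np.log((g_cg + eps) / (g_target + eps))

# Hypothetical refinement loop: each iteration would rerun the
# coarse-grained simulation with the updated potential.
# for step in range(30):
#     g_cg = run_cg_simulation(V)      # placeholder, not a real API
#     V = ibi_update(V, g_cg, g_target)
```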
This is not the case in systems like proteins, where you can decide that a certain group of atoms or a certain group of amino acids has to be treated at the coarse-grained level and the rest has to be treated at the high-resolution level, the second part typically being the smallest one and the chemically active, or at least the interesting, one.

So, starting with liquids, the method to tackle this problem that was developed in the theory group of the Max Planck Institute for Polymer Research quite some time ago is the adaptive resolution simulation scheme, the AdResS method. This method is as simple as it gets, in the sense that it defines the resolution that a molecule is supposed to have, depending on its position, through a function that is either zero or one. It is zero in the low-resolution, or coarse-grained, region; it is one in the atomistic, or high-resolution, region; and it goes smoothly from one to zero in a layer that is called the hybrid region, which allows a smooth transition in the resolution. As you can imagine, this is relevant because of the need to avoid forces that would jump when changing from one model to the other. Fully in the atomistic region or in the coarse-grained region things are easy, because you have molecules interacting either through the atomistic or through the coarse-grained potentials. Problems occur, of course, when at least one molecule is in the hybrid region. What do you do in that case? How do you allow molecules to change resolution? This is done by mixing the forces acting between a pair of molecules, which can be either atomistic or coarse-grained or a linear combination of the two, and the weights that determine the mixed character of the interaction between the two molecules are given by the product, or one minus the product, of the resolutions of the two molecules. In this way, if both molecules are in the atomistic region, the forces are fully atomistic; if they're both in the coarse-grained region, the interaction is purely coarse-grained; and of course you have the whole spectrum of cases if at least one molecule is in the hybrid region.

This method has been thoroughly applied in many cases. I would like to mention just one application in which I was not involved, but it is a very nice one: employing it to mimic a quasi-grand canonical simulation. Essentially, this work by Debashish Mukherji and Kurt Kremer tackled the problem of systems whose size has an important effect on properties related to the free energy, the solvation free energies of the liquid. What was possible to do by means of the adaptive resolution simulation method was to reduce the size of the simulation box to a fairly small number of molecules, making use of the fact that in the coarse-grained region that you see here, the molecules interact through a very smooth coarse-grained potential and can be exchanged in time. So essentially a water molecule can become a methanol molecule or vice versa. What happens is that you can change the instantaneous relative concentration of the two chemical species by means of a Monte Carlo algorithm, in order to keep the relative density of the two species constant, irrespective of what happens in the high-resolution region, where something does happen because a polymer, PNIPAM, absorbs the solvent, thereby changing its concentration. With this trick, it is possible to change the relative concentration of molecules.
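Going back to the force interpolation described above, the following sketch shows the idea in code: a switching function w(x) equal to one in the atomistic region, zero in the coarse-grained region, smoothly interpolating in the hybrid layer, with the pair force mixed through the product of the two molecules' resolutions. The slab geometry, the cos² form and the numerical values are illustrative assumptions, not the settings of any particular study.

```python
import math

D_AT, D_HY = 1.0, 1.0  # nm: half-width of the atomistic slab and of the hybrid layer (illustrative)

def resolution(x):
    """AdResS switching function: 1 (atomistic), 0 (coarse-grained),
    smooth cos^2 interpolation across the hybrid layer."""
    r = abs(x)
    if r < D_AT:
        return 1.0
    if r > D_AT + D_HY:
        return 0.0
    return math.cos(math.pi * (r - D_AT) / (2.0 * D_HY)) ** 2

def adress_pair_force(x_a, x_b, f_atomistic, f_cg):
    """Force between molecules a and b, weighted by the product of their
    resolutions: fully atomistic when both are in the atomistic region,
    purely coarse-grained when both are beyond the hybrid layer."""
    w = resolution(x_a) * resolution(x_b)
    return w * f_atomistic + (1.0 - w) * f_cg
```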
And this is something that you cannot do easily if the whole system is atomistic, because the forces are much stronger, the energy landscapes are much rougher, and because of that the forces that come out are higher and the acceptance rates are lower. So the AdResS method is very simple, very effective, and it allows you to do essentially what you want to do. However, it has some limitations. As I said, it is a very simple strategy. It satisfies Newton's third law, because by construction the force is antisymmetric. The force doesn't contain any derivative of the switching function, and this is of course by construction, because the interpolation is performed at the level of the forces themselves. However, no Hamiltonian formulation exists, because it can be proven that this force field cannot be derived from a full Hamiltonian. Therefore, the force field is non-conservative; a local thermostat is required in order to have stable simulations. And again, the absence of a Hamiltonian implies the impossibility of performing energy-conserving or Monte Carlo simulations, or of writing down the partition function of the system, which for some particular theoretical studies might be necessary.

So the question is whether a Hamiltonian formulation is viable, and the answer is yes. Together with our colleagues at the Max Planck Institute for Polymer Research and others, we worked on the definition of a Hamiltonian in which, as you see here, the interactions are weighted through the resolution of a molecule directly at the level of the potentials and not of the forces. The terms that you see here, the atomistic (AT) and coarse-grained (CG) ones, are essentially related to the sum of all interactions of one type or the other that a certain molecule has with all the other molecules in its interaction range, appropriately weighted in order to have correct normalization. This Hamiltonian generates forces that look like this. The first part is related to the internal forces within a molecule, and this doesn't change with the resolution. The rest is very similar to what one has in the AdResS method, if not for the fact that the weight is given by the average, and not the product, of the resolutions. And then you have this particular term that is related to the gradient of the switching function. Now, this is something that is a bit tedious, because you have a force that acts only in the hybrid region, since the gradient of lambda is non-zero only there, and it pushes molecules here and there depending on the sign of these two terms. So the question is how to deal with this particular term, which one would like to eliminate.

If we look at what happens in a simulation, in particular in the hybrid region of an H-AdResS simulation, we concentrate on a molecule at position R with resolution lambda, and we ask ourselves what happens to the average of what we call the drift force. What happens is that if you constrain the position of the molecule and average over the force, of course you average only over the difference between the two potentials, because the rest is constant, and you realize that this term can be related to the gradient of the Helmholtz free energy as a function of the resolution, and it is related to the free energy difference between the two models that you have, the atomistic one and the corresponding coarse-grained one.
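For reference, the Hamiltonian and the drift force just described can be written out explicitly. This is a reconstruction from the description above, so signs and symbols may differ slightly from the slides:

```latex
H = K + V^{\mathrm{int}}
  + \sum_{\alpha}\Big[\lambda_{\alpha}\,V^{\mathrm{AT}}_{\alpha}
  + \big(1-\lambda_{\alpha}\big)\,V^{\mathrm{CG}}_{\alpha}\Big],
\qquad
V^{\mathrm{AT/CG}}_{\alpha} \equiv \tfrac{1}{2}\sum_{\beta\neq\alpha} V^{\mathrm{AT/CG}}_{\alpha\beta} ,
```

and the force on molecule alpha contains, besides the intramolecular part and the pair forces weighted by the average resolution of the two molecules, the drift term

```latex
\mathbf{F}^{\mathrm{dr}}_{\alpha}
  = -\big(V^{\mathrm{AT}}_{\alpha}-V^{\mathrm{CG}}_{\alpha}\big)\,
    \nabla_{\alpha}\lambda_{\alpha},
\qquad
\big\langle \mathbf{F}^{\mathrm{dr}}_{\alpha}\big\rangle_{\lambda}
  \propto \frac{\mathrm{d}\,\Delta F(\lambda)}{\mathrm{d}\lambda}\,
    \nabla_{\alpha}\lambda_{\alpha},
```

which is non-zero only in the hybrid region and is the term one would like to compensate.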
So it tells you that essentially the force that you see in the hybrid region is related to the free energy change that you have in going from one domain to the other, from one model to the other. In order to eliminate that, what you can do is to modify the Hamiltonian so as to introduce a compensating term, which is, as a first approximation, given by the free energy difference between the two domains as a function of the resolution, so Delta F of lambda, which can be computed by means of a thermodynamic integration. Introducing this term, you can on average remove the drift force without having to disrupt the Hamiltonian character of the system. If, on top of that, you include a term that is related to the difference in pressure between the two models at a certain state point, so at given conditions of temperature and density, what you obtain is that you flatten the density uniformly throughout the system. I will not go into the details; I will be happy to take questions about this. But essentially, what happens is that if you include a term that compensates for the difference in chemical potential between the two domains, as a function of the resolution and as a function of the position, you obtain a system, a liquid, which has a dual resolution and a uniform density profile. In qualitative terms, what happens is that the two models in the system follow two different equations of state, one for the atomistic model and one for the coarse-grained model, and the two parts of the system sit at different points on these two equations of state. If you correct only for the drift force, you go into a condition in which you have the same pressure in the two regions but different densities. If you include a term so as to have the chemical potential difference in the correction, you obtain different pressures for the same density, and the different pressures are the ones for which each system separately has that particular density at that value of the temperature.

Okay, I will very briefly go through a particular application of this methodology, which is the coupling of a classical and a quantum representation of molecules. Quantum here means delocalization, so no electrons are taken into account, just the delocalization effects due to the quantum nature of molecules; especially for very light molecules, like hydrogen atoms or molecules, this is relevant. What you can do is to couple different representations of the same system, a quantum one and a classical one, by means of this Hamiltonian-based approach, and have the system described appropriately, with the same density and the appropriate level of quantumness, throughout the different regions. Recently this approach was extended from the static regime of Monte Carlo simulations to the dynamic regime, and it is possible to perform quantum-classical simulations with ring polymers in the high-resolution domain so as to calculate dynamical properties at the quantum level.

Another interesting application of adaptive resolution methods in general, and in this particular case of H-AdResS, is the coupling with an ideal gas. This coupling is particularly relevant because the computational cost of an ideal gas is practically zero. Here, by ideal gas, what I mean is that the molecules that find themselves in the coarse-grained region feel no interaction but the thermostat.
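Returning to the compensation term Delta F(lambda) mentioned at the start of this passage, here is a minimal sketch of how it could be obtained in practice by thermodynamic integration: the mean atomistic-minus-coarse-grained energy per molecule is sampled at fixed values of the resolution and integrated over lambda. The grid, the sampled averages and the trapezoidal quadrature are purely illustrative.

```python
import numpy as np

def free_energy_compensation(lambdas, mean_dV):
    """Thermodynamic-integration estimate of the compensation term:
    Delta F(lambda) = integral_0^lambda <V_AT - V_CG>_l dl,
    evaluated on a grid of resolution values with the trapezoidal rule."""
    lambdas = np.asarray(lambdas, dtype=float)
    mean_dV = np.asarray(mean_dV, dtype=float)
    segments = 0.5 * (mean_dV[1:] + mean_dV[:-1]) * np.diff(lambdas)
    return np.concatenate(([0.0], np.cumsum(segments)))

# Made-up sampled averages (kJ/mol per molecule) on a 6-point lambda grid:
lam = np.linspace(0.0, 1.0, 6)
dV = [4.1, 3.6, 3.0, 2.2, 1.3, 0.5]
print(free_energy_compensation(lam, dV))  # compensation applied across the hybrid region
```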
So what happens is that you just evolve the dynamics of your molecules in the coarse-grained region with the Langevin thermostat, and no other interaction is taken into account. However, even a strongly interacting liquid like water is described within the high-resolution domain with the correct thermodynamics, and all the structural and some equilibrium dynamical properties are reproduced correctly. So here you see radial distribution functions, diffusion profiles as a function of time, and also the fluctuations in the number of molecules in the atomistic region, which are the ones that you would expect in a fully atomistic simulation, and which of course deviate strongly as you go into the coarse-grained, ideal-gas region. As I mentioned, the computational cost is remarkably reduced, because if you increase the size of the simulation box while keeping the atomistic region fixed in size, what you get is a linear gain in the speedup, meaning that essentially you can increase the size of the coarse-grained, ideal-gas domain as much as you want, but the computational cost will be completely dominated by the atomistic part. So essentially what you perform is a simulation of a system with a large reservoir of particles at the cost of only what you have in the atomistic plus hybrid region. As an extreme case, you can think of solvating a very large box at the atomistic level, with a number of molecules in the hundreds of thousands, which corresponds to the size of an entire virus completely solvated, and increasing the size of the simulation box in the coarse-grained domain; what you would get is a speedup that increases with the size of the simulation box if you keep the size of the atomistic region fixed.

Now, this is a nice technical instrument, but from the practical point of view, how can you use it, especially in the context of biological simulations? The first and most obvious thing to do is to take a protein, solvate it in water and treat the solvent itself in dual resolution. There is a computational advantage that one gets from that, although for small systems it is not a particularly large one. A definitely more interesting way of using it is to understand what kind of correlations you have in the system. What you can do is to change the size of the atomistic region; in particular, you can decrease the radius of the sphere that solvates the system and investigate some properties of the system, like the number of hydrogen bonds at the interface between protein and water, or compute, by means of fluctuations, the correlation times of the velocities of water molecules at the surface of the protein, and see at which point you begin to affect these properties and change their value. In this way, you can try to understand what is the correlation length between the water at the surface of the protein and the water in the bulk, and by this you can provide a sort of operational description, an operational value, of the thickness of the solvation shell of your system. If you do it like this with a coarse-grained model that is an interacting one, so not the ideal gas but an IBI coarse-grained model, you see that it is sufficient to have a 1.3-nanometer-thick water layer, starting from the radius of gyration of the molecule, in order to have these values correctly reproduced with respect to a fully atomistic simulation.
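To put a rough number on the linear gain mentioned above: in the ideal-gas best case the speed-up is approximately the volume of the whole simulation box divided by the volume of the atomistic-plus-hybrid region. The geometry and the numbers below are purely illustrative.

```python
from math import pi

def best_case_speedup(box_side_nm, at_radius_nm, hy_width_nm):
    """Best-case AdResS speed-up estimate: only the atomistic + hybrid
    sphere contributes to the cost, the ideal-gas reservoir is free."""
    v_box = box_side_nm ** 3
    v_expensive = 4.0 / 3.0 * pi * (at_radius_nm + hy_width_nm) ** 3
    return v_box / v_expensive

# A 2.5 nm atomistic sphere plus a 1 nm hybrid shell in boxes of growing size:
for side in (10.0, 20.0, 40.0):
    print(f"{side:.0f} nm box -> ~{best_case_speedup(side, 2.5, 1.0):.0f}x")
```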
And this is interesting because it tells you how much coupling exists between the water at the surface of the protein and the water in the bulk, not from all points of view, but in particular from the point of view of the internal degrees of freedom. The water molecules in the system can freely diffuse back and forth, and the thermodynamics of the system is correctly reproduced everywhere; what is lost as you go away from the atomistic region into the coarse-grained region is the internal degrees of freedom of the molecule. So the structure of water is lost in a gradual, smooth way, and measuring at which point you can place this decoupling allows you to understand how strongly the interaction between these internal degrees of freedom penetrates into the bulk of the water. This simulation was performed by means of the adaptive resolution simulation scheme, so the force-based AdResS scheme. The same setup can be implemented with the Hamiltonian scheme, and this is what we did recently with the group of Paolo Carloni: we performed simulations of proteins in dual-resolution water in which the Hamiltonian adaptive resolution simulation scheme was employed, implemented in an in-house version of GROMACS. Vania Calandrini has recently given a webinar in the BioExcel context discussing, among other things, precisely this kind of work, and I invite you to look into that if you want more details.

Finally, regarding this aspect of dual-resolution simulations of liquids, I would like to mention the fact that the geometric separation between the high-resolution and low-resolution descriptions can be made flexible, and has been made flexible, in such a way as to adapt the shape of the high-resolution region to the shape of a particular solute. This is especially important in the case of protein folding, for example, because you can start from a setup in which the protein is completely swollen, in which the peptide chain is open and extended and has a certain conformation, and as it folds, it collapses, thereby reducing the amount of solvent that is required to be treated at the atomistic level. By adapting the shape of the high-resolution region to the shape of the solute itself, you can keep the amount of high-resolution solvent really at the minimum. This is done essentially by combining many high-resolution regions of spherical shape, each centered on an atom of the solute, which merge in the appropriate manner so as to provide a flexible description of the high-resolution domain in its full glory. And obviously it was tested that this setup doesn't have negative impacts on the properties of the system, like, for example, the free energy landscape of certain particular degrees of freedom.

Moving from the liquids to the structure of the proteins, as I mentioned, one can envisage the possibility of modeling the same system, a given protein for example, with two different models. A very simple model, an elastic network model for example, can be appropriate to describe the conformational fluctuations of a protein, so the large-scale, not large-amplitude but large-scale, collective motions that are the most characteristic of a globular protein and the ones that are functionally relevant. However, it lacks the chemistry that needs to be present in the active site.
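Before moving on to the protein models, a brief aside on the flexible, solute-shaped high-resolution region just described: one simple way to realize a union of spherical regions centered on the solute atoms is to give each solute atom its own spherical switching function and assign to every solvent molecule the maximum over all of them. This is only a sketch of the idea, with assumed radii and the same cos² switching form used earlier, not the actual implementation.

```python
import numpy as np

R_AT, R_HY = 0.6, 0.9  # nm: per-atom atomistic radius and hybrid shell width (illustrative)

def spherical_resolution(dist):
    """1 inside the atomistic radius, 0 beyond the hybrid shell,
    smooth cos^2 interpolation in between."""
    if dist < R_AT:
        return 1.0
    if dist > R_AT + R_HY:
        return 0.0
    return np.cos(np.pi * (dist - R_AT) / (2.0 * R_HY)) ** 2

def flexible_resolution(solvent_pos, solute_positions):
    """Resolution of a solvent molecule in a solute-shaped region:
    the maximum of the switching functions of all spheres centered
    on the solute atoms, i.e. the union of the spheres."""
    dists = np.linalg.norm(np.asarray(solute_positions) - np.asarray(solvent_pos), axis=1)
    return max(spherical_resolution(d) for d in dists)
```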
So ideally you could put together the atomistic description of the active site with a coarse-grained representation of the rest of the protein, parametrized appropriately so as to reproduce the conformational fluctuations of some reference data, for example experiments or simulations or both. And of course you have to check that these collective properties are correctly reproduced by the dual-resolution system. This is relatively easy to do, but the challenge, let's say, lies in what happens in the active site. The active site, in this particular model, in the work that we did, was treated at the atomistic level, not only as far as the protein is concerned but also regarding the water, so we used AdResS to solvate the active site with atomistic water, while the rest of the water molecules, which are not represented in this image for clarity, are treated at the coarse-grained level. Then, of course, it is necessary to validate this approach in such a way as to be sure that all the chemical and biochemical properties of interest are correctly reproduced in the dual-resolution model. So you can look at the distance distributions or, for example, at the electrostatics of the system, which is of course particularly relevant in the case of proteins, and in particular for the active sites of enzymes. What is important in a model like this is not only that it can reproduce what happens with respect to a fully atomistic simulation; this is of course the basics, but it is not just for this that you build it. You might want to get something more out of a system like this, and more means that you can save time in the simulation and perform, for example, free energy calculations in a batch, so that you have cheaper simulations to perform and therefore the possibility of running more simulations with the same computational resources. This means that, for example, you can calculate the binding free energies of different substrates on the same enzyme with a reduced amount of computational resources. Together with that, you also have the possibility of employing this tool in order to figure out what are the important, so to speak, residues of the active site: what happens to the binding free energies if I describe the active site at the atomistic level using a certain number of residues or another number of residues? If I add or remove residues from the active site, how does the binding free energy change? This is a fairly large set of important questions that are currently being investigated by means of these kinds of models. And of course, the big goal of these approaches is precisely to go big, that is, to put together a setup that is computationally as cheap as possible in order to investigate systems that are large. The two approaches that I described, that is, the dual-resolution model of a protein and the adaptive resolution model of the solvent, can be combined, with the perspective of having the most accurate description possible for the solvent, so that the solvated protein feels as if it were in a grand canonical simulation, so to speak, and no finite-size effects due to the solvent are present.
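Since the coarse-grained part of such a dual-resolution protein model is an elastic network parametrized to reproduce the conformational fluctuations, here is a minimal sketch of how an elastic-network (anisotropic network model) Hessian can be built from Cα coordinates and diagonalized to obtain the low-frequency collective modes. The cutoff and uniform spring constant are illustrative choices, not the parametrization used in the work described.

```python
import numpy as np

def enm_hessian(coords, cutoff=1.0, k=1.0):
    """Anisotropic-network-model Hessian from C-alpha coordinates
    (N x 3 array, nm): harmonic springs between all pairs closer
    than the cutoff, with uniform spring constant k."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    H = np.zeros((3 * n, 3 * n))
    for i in range(n):
        for j in range(i + 1, n):
            d = coords[j] - coords[i]
            r2 = d @ d
            if r2 > cutoff ** 2:
                continue
            block = -k * np.outer(d, d) / r2      # off-diagonal 3x3 super-element
            H[3*i:3*i+3, 3*j:3*j+3] = block
            H[3*j:3*j+3, 3*i:3*i+3] = block
            H[3*i:3*i+3, 3*i:3*i+3] -= block      # diagonal blocks accumulate
            H[3*j:3*j+3, 3*j:3*j+3] -= block
    return H

# The collective (functional) motions are the eigenvectors with the
# smallest non-zero eigenvalues; the first six modes are rigid-body.
# evals, evecs = np.linalg.eigh(enm_hessian(calpha_coords))
```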
The protein itself, in turn, is treated at the atomistic level only where necessary, or where the structure is known, because in many cases you might have to deal with proteins whose structure is not completely known; this is yet another topic that was tackled in Vania Calandrini's seminar, and I would refer you to that.

So, summing up, the adaptive resolution simulation methods are interesting and important because they cover and touch different areas of science. You can do fundamental statistical physics with them, because they allow you to manipulate the system in ways that are rather unconventional, and therefore they give you a further degree of freedom to understand liquids and gases. The quantum-to-classical coupling has been an important topic in our recent work, both at the level of molecules whose quantumness can be represented by means of path integrals, but also as far as QM/MM methods are concerned, especially adaptive QM/MM methods that have been developed inspired by the Hamiltonian adaptive resolution simulation scheme. And finally, obviously, the simulation of biomolecules in which the solvent and/or the protein are treated at the dual-resolution level. These methods are implemented in different software packages. The main ones are the ESPResSo++ software and the LAMMPS software, in a sort of proprietary version modified by us in-house. And of course, we also have a tentative GROMACS implementation of the H-AdResS method, together with the methods developed in the group of Paolo Carloni.

So, to conclude with a platitude: soft matter systems are intrinsically multi-scale. This means that the system often has to be tackled in a holistic manner, if you allow me the expression. Coarse-grained models are fundamental to understand these biological systems, but they are limited in that they smear out, they decrease, the resolution in a uniform manner, and this is often not sufficient. Because of that, it is important to go towards methods that put together different approaches in the same simulation setup. With this, I would like to conclude by thanking all the people who have been involved in this work, especially my former group in Mainz. And of course, I would like to acknowledge the funding that came from the Max Planck Institute for Polymer Research, the German Physical Society, the German Research Foundation, the Kavli Institute for Theoretical Physics and the ERC, especially this last one as far as the project that funds me at the moment is concerned, in which we will work precisely on the development of methods for the modeling and simulation of systems at different levels of resolution. With this, I conclude and I switch to the Q&A session.

Thank you very much, Raffaello. Thanks very much for a very interesting talk. So we already have a couple of questions that I can see in the questions section. In a moment, I will come to each person in turn; I will unmute your microphone and you can ask your question directly if you want to, otherwise I can read it out from the chat window. Incidentally, if you're watching this recording later on YouTube or the BioExcel website, you can also ask a question in the Ask BioExcel web forum, and we'll try and make sure that it is followed up for you and answered there. So I'm going to start by taking a question from Armin. I'm going to try and unmute your microphone and let you speak if you want to. Let me just try.
Okay, Armin, would you like to ask your question directly? Otherwise, if you can't do that, I will read it out. Okay, I'm not hearing from Armin, so let me read out the question. Sorry, my questions are opening up in front of me, just a second. Okay, here we go. The first question was: is there a modified version of GROMACS with AdResS available to download? So I think you touched on that at the end, that there is work going on in GROMACS. Can you comment on what state that's at and what it can actually do?

Yes, so at first there was a version of AdResS, so the force-based method, implemented in GROMACS and actually available within the distribution of GROMACS. However, it was dismissed after a few years. So I think that it is still possible to get the AdResS version of GROMACS if one looks at the previous versions; there is also a description of the method and how to use it in the manual. However, this is not maintained anymore, as far as I know. So take it with a grain of salt, but as far as I know, there is no implementation in the latest version of GROMACS; in the previous ones, you can find it. Regarding the H-AdResS method, what we have is a relatively hacky implementation, because we had to merge both the dual-resolution protein model from the group of Paolo Carloni and the H-AdResS method that we had already implemented in GROMACS, starting from the implementation of AdResS. It is definitely in our interest to make this merged GROMACS version, containing both methods, as neat as possible and obviously available, and I think that within the BioExcel framework this is something that is currently being dealt with.

Okay, thank you. And there was a follow-on question as well, which I think is separate. The follow-on question was: are there adaptive resolution methods available to perform steered molecular dynamics simulations, like pulling, unfolding, et cetera, to compare with atomic force microscopy experiments?

The methods as they are implemented in ESPResSo++ and LAMMPS are certainly more than prone, let me say that, to these kinds of problems. I say more than prone, instead of perfectly ready for, because it might really depend on what you are interested in. For example, in LAMMPS we are currently using LAMMPS and H-AdResS to perform umbrella sampling; essentially we have restraints, constraints and enhanced sampling methods like this, which are used seamlessly without any problems. It was our interest to implement H-AdResS in LAMMPS in such a way that the largest compatibility with everything else already available in LAMMPS can be granted. So I would say that if you want to perform steered molecular dynamics, for example on a protein, so pulling, if I understood correctly what the question was referring to, if you want to perform a pulling experiment on a protein immersed in a dual-resolution solvent, so to speak, I think this is something that you can do with no problems, both in ESPResSo++ and in LAMMPS.

Okay, thank you very much. I hope that answers your question, Armin; if not, or if you have a follow-on, do just type it in and we can find out more. And the next question is from Alberto De Pietre, who asks: regarding the capability of adaptive resolution schemes to reduce the computational cost of full-atom simulations, could you provide some numbers or quantitative examples? Yeah, that's a very relevant question.
Of course, one of the main reasons why we do this is to save time in computer simulations. The hard wall against which you break your head is the computational cost of the high-resolution part. Essentially, in the best-case scenario, as it is, for example, with the ideal gas, the cost of the simulation of the solvent is completely given by the atomistic and hybrid regions, where you have to compute atomistic forces. So the speed-up that you get is essentially given by the volume of your simulation box divided by the volume of the atomistic plus hybrid region; the larger the system with respect to the high-resolution part, the larger the speed-up that you get. This is essentially the essence of the slide that I presented with the simulation of the huge water bubble, as large as a viral capsid, immersed in the ideal gas. Additionally, the more complex your solvent is, the better, because of course, if you have a liquid composed of a very complex molecule, you have many forces to compute, and this goes to nothing or very little in the coarse-grained region. Naturally, there is a limit to the gain that you get, and this limit is given by the fact that you cannot go below the amount of calculation that you have to do in the coarse-grained region. So the reason why one has to go big in order to exploit these kinds of approaches at their best is the fact that for very large systems you can employ an arbitrarily large reservoir of molecules and yet have the solvated part very small; and by very small I do not mean in absolute terms, but relative to the total volume of the simulation. So you can have a fairly large amount of water, or whatever solvent, treated at the atomistic level; however, the rest would be immensely larger, and that costs nothing in the simulation if you have a very cheap model. A side remark: of course, there is also a contribution from the workload distribution in a multi-core simulation. If you can parallelize your simulation and break the atomistic region into pieces in a meaningful way, which means that it has to be large enough, then you can reduce the computational cost further, because you can essentially distribute the calculation of the high-resolution part among different cores. If it is large enough, you can take advantage of that as well.

Okay, thank you very much indeed. So both Armin and Alberto have passed on their thanks through the chat window for those answers. We're coming nearly to the top of the hour. This is your last chance to ask a question directly to Raffaello, so if you do have any other questions, you can quickly push the hand-up button if you don't want to type the full question. No, I don't think we've got any other questions from the floor today, and we are coming up to the top of the hour. So with that, I would like to say thank you very much indeed to Raffaello for his presentation today. If any other questions do come up, then do visit ask.bioexcel.eu to ask them on the forum. One final comment before we go today: we have our next webinar coming up next week, that's Qingdao Wang from IBM. This is a slightly different angle on the work that BioExcel is doing, so I don't know if it'll directly be of interest to you, but please do spread the word.
So this is on CWLEXEC, a new open source tool to run Common Workflow Language workflows on LSF; this is to do with the workflows aspect of BioExcel. So if you're interested in that, or you know somebody who might be, do pass it on, do spread the word. Okay, thank you all very much for coming along today and we'll speak to you again soon in the next BioExcel webinar. Bye bye.