 OK, so let's start the last session of today before the poster session. For the next speaker is Thierry Deutsch, of COA Grenoble, who is talking about adaptive and localized basis functions for linear scaling, large systems, and complex quantum mechanical system simulations. Thanks, Paolo. So I would like first to thank the organizer for this invitation. So in my case, I will talk about wavelets. So the idea is to replace the plane waves by wavelets. Because with wavelets, in fact, we don't have the problem of to treat all the systems, for instance, to calculate the kinetic term. But because wavelets have a support compact, then we can localize our whole operation in a small range of the real space. So wavelets are systematic and orthogonal, localized and adaptive. So for instance, in this case, you have two levels of wavelets. Here, you have only scaling function, so in red. And in blue, you have the details, only in a region near the nucleus. Because the support is compact, we developed different potion solvers because we can have naturally different boundary conditions, free wire surfaces. And we used for that a formulation of base on green function to solve the potion solver. And of course, we can explicitly treat the charge systems. So for a water molecule, first, we have a grid step, an extension for the finer region, where we have two levels of resolution. And then we have a second level with an extension where we have only one level of resolution. So this is a coarse region. So in the case of wavelets, we don't have a closed form. So what we have is a scaling relation. So we have a set of filters. In our case, we have roughly 15 filters. And we have a relation with the scaling function and the scaling function at an higher resolution. And what we manipulate, in fact, is the filters. So all operations for us are short convolutions. So it's quite GPU friendly. And this is why we have been developing big DFT on GPU for 10 years. So we have our two level of resolution using two kinds of wavelets. And with that, we use pseudo-potential. So because we use pseudo-potential, we need only two level of resolution. And it's like we have a cut-off energy multiplied by 4 for some region of the space. So it's enough to have two levels. And you can see that for the first three rows, we have a very good accuracy. And for delta test, we have a benchmark of one for all tests in this benchmark. Now, at the present day, we have the version 1.8. So we can do 3D periodic surface and three boundary conditions. We use HGH pseudo-potential. We have a very high precision because everything is analytic. Our filters are analytics. And we can do conchamps or ground states for metals. We have van der Waals and hybrid functionals. We develop also a potion solver for systems embedding in an electronic environment. So we can do a calculation with implicit solvent. We have also a library of structural prediction based on minimal oping as a method developed by Sufane Godeker. And we also develop the order n implementation. Now, I think this year we will have we could calculate non-orthorobic cells, P-A-W pseudo-potential, and we are developing also linear response, TD-DFT. OK, so in our code, we choose to underline our code in order to have a better maintainability. So we have different components. For instance, we have foothold, which is a low-level library. So this is used to process the output, the input files to have an idea of the timing also for the allocation of memory. And it's quite useful and decreases a lot the number of lines needed. We have also our own Gaussian integral for some part of the code based on Fiestal GW code. You have a talk from Stéphane about chess. And we have also the potion solver which is used in the other code, for instance, Habinit and CP2K, not this version, but another one. And we use our potion solver to calculate, of course, the R3 potential and the exchange part. So about the exchange part, so last year, we developed on GPU the calculation of hybrid functionals. So when you use an exchange part, you need to evaluate the potion solver N squared if N is the number of atoms. And for that, if you compare on a blue gene machine, you can see that the ratio between PB and PB is zero on the same system, atomic system, is 25. In our case, for a quite large system, more than 300 atoms. If you use piece dent on CPU and if you compare on... Sorry, gamma is the ratio between the time to calculate in PB is zero and PB is E. And if you compare between PB and PB is zero on piece dent, you have a ratio of 15. And if you port the potion solver part, you can decrease this ratio even more. If everything is done on GPU, you have only a ratio of 3.5. It means that with the GPU and GPU direct, you can decrease a lot the amount of time needed to calculate the exchange part. So here you can see that for some systems. So it was a collaboration with the Argon laboratory. So we did that for the uranium oxide systems. So you can see the number of orbitals. It's quite important in function of the number of atoms. So even for a large number of orbitals, we can do that on a piece dent. And we decrease considerably the ratio to calculate PB is zero. For instance, here in function of the computing nodes, you can see that one self-constitution takes 10 seconds. And if we use GPU, then we can also, for not so large amount of time, we can do a hybrid function on large systems. Okay, so this was about the cubic version of big DFT. So it's like plane wave. Our quantum orbitals are totally spread in the whole system. And so we have a scaling, of course, of n cubed due to the linear algebra because we have dense matrix. And now the idea is to decrease the scaling. So to do that, the idea in fact is to localize. So with the Potion solver, we have n log n, so it's okay. The convolution is n squared. One convolution takes n operation in function of the number of atoms. But if we multiply by two, we have two times convolution to perform for orbitals. So it's n squared. And the main part is linear algebra. So the idea is to have a new approach. And the idea is to localize our orbitals. So we use how we've led to express a localized, optimized minimal basis set. So the quantum orbitals are in the whole system. And then what we decide is for each atom, we have a radius. And we optimize a minimal basis set for each atom. For example, for silicon, we can minimize four orbitals, four localized orbitals. These orbitals are expressed on wavelet. Okay. And we optimize in situ during the iterations. So in this case, density matrix is expressed in our support function, or minimal basis set. And because you don't have an overlap between the different support functions, so localized orbitals, then our matrix, so the Hamiltonian and the density matrix, becomes path. And so what we have is a set of localized, optimized minimal basis set with a very small number. For example, for silicon, we can use four, or even nine, if you like, orbitals per atoms. So we have two steps. The first is to optimize. And the second is to express the density and the matrix. And calculate the next iteration. Okay, so if we compare with the cubic version. So in the cubic version, because you have the dense matrix, you have some difficulties to do a large calculation. And contrary to the linear scaling, you can say that we can calculate for a large number of atoms. Linear, the CPU time is linear, but also the memory. So we decrease a lot the memory, because the orbitals are localized. Another point is we know the times what we can do per atom. For instance, in our case here, we do 20 minutes for 8,000 atoms, but what we can calculate, in fact, is a CPU time per atom. Roughly for these systems, so silicon, germanium, the first three rows, it takes between 5 and 10 minutes to calculate the self-consistent iteration for one atom. So you know that if you have 1,000 atoms, you will take roughly 10,000 minutes for the whole system. And if you double the size of the C10, it takes 20,000 minutes. And you divide by the number of processors, of course, to have the whole time. But this is possible to predict easily the time you need to calculate. If we compare with the cubic code, we have an absolute energy difference of the order of 10 milliv per atom, and this is constant and not depending on the position of the atoms. And the forces are almost exact in the sense that we have the same forces. Another point which is really important is for large systems, because we have few localized, adapted orbitals per atom, and we have a few number of basis functions, okay? One million, for instance, for 2,000 key atoms. And this is interesting because you can use easily, you can have small matrices, you have small matrices, and you perform larger calculations. So just some features about the localized, optimal, minimal basis sets. You saw this transparent slide this morning from Stefan Moore. So I would like to point out about the characteristics of our basis set. So we have accurate results because we have a good localization and because for each atom we have a different minimal basis set, okay? For two different silicon atoms, if you have a different environment, you have different optimized orbitals. So we have a low number of degrees of freedom. We have a low condition number in the sum that the overlap matrix is quasi-orthogonal. So here you can see that it's almost one. It means that it's quasi-orthogonal. The second point is because we use pseudo-potential and because due to our scheme to optimize our minimal orbitals, we have a small spectral width. And this is really important for the chess algorithm because due to this small spectral width, we can use very efficiently this kind of algorithms. And of course, the sparsity is between 90 and almost more than 90%. It depends on the system, of course, and the size of the systems. So Stefan Moore did a simulation of more than 200,000 atoms. We show in this article that we can do for very different systems. So DNA, water, a pair of skites, amino acid, silicon, nanowires, many things. And sometimes it's the linear scaling. The linear version is converged better than the cubic one. It depends, but sometimes this is okay. So the question, what we asked, in fact, or ourselves two years ago, is what do we need so large-scale DFT? So we can do very large calculations. Okay? And do we need to perform very large calculations? So for biology, traditionally, they use force fields, coarse-grading. For chemistry, what is really important is not the size, but the transferability and the accuracy of results. So for quantum chemistry methods, they can use few atoms. And in the case of DFT, traditionally, we use up to 1,000 atoms. And 1,000 is quite large. So we think that, effectively, it's important to have the possibility to do a full DFT calculation in order to test a different solution and in order to reduce the information. So in this article, we review the different large-scale quantum mechanical calculations. So we try to have a quite fair review with some results from one TEP, some results from CESTA and other codes, linear scaling, and not only. And what we think is linear scaling is important to bridge the gap between the different methods. In the fragment approach, you use an approximation. When you say, in fact, my basic set doesn't relax for the different fragments. For instance, if you have a solvent, you describe all molecules of the solvent in the same way. And this is quite fair if you have only some electrostatic interaction. Of course, if you have some quantum interaction, if you have some overlap between the orbital, it's not feasible. So with the linear scaling, now we are testing different approaches on models. So fragments, as I said, constrain DFT because we have a localized adaptive orbital. You can constrain the charge in a different part of the molecule or in different molecules. So you can do charge transfer. One possibility is to calculate, and we did that, for instance, with the GW CESTA code. We calculate some excitation. We do the charge transfer, and we have, using only delta ICF, we have the right energy of the excitation. So we can calculate excitation as soon as we know the charge transfer associated to that. We can also do some atomic charge analysis, and this is important if you want to go to the QMMM and if you want to compare with force field. For instance, polarizable force field uses a representation of the atomic charge on the atom and also the dipole, and it's important if you want to compare with force field, polarizable force field to know if we have the right, the same answer or not. So we do that in water, for instance, and an idea what we explore now is to use force field to do molecular dynamics. Then we extract some snapshot, and we calculate some quantum mechanics quantities, for instance, the atomic charge, the overlap of the orbitals, and then we can have a statistic on that. So combining classical and DFT is feasible. Another point is to have an idea of the impact of the electrostatic environment. So using an explicit solvent or an implicit solvent and comparing all these results. And so to that, we need first to duplicate our localized adapted orbitals. So for instance for water, you optimize the orbitals for one molecule of water and then you duplicate your basis set without optimizing. So this is an approximation. Yes, but it's quite fair if you are very far from your protein you want to calculate or if you want to have an idea of the influence of the water. So it enables manipulation of optimized basis set. And we develop an efficient and precise rototranslation of localized orbitals because behind that we have a grid and we have wavelength. So we can reformat very accurately our minimized basis set and so avoiding to optimize all the system. I would like to point out that if you avoid to optimize the basis set, we gain a factor of 10 for the calculations. So one application is to calculate the charge transport in organic lead. So from coarse graining we have some statistics. So some organic films composed to two kinds of molecules. So the host molecule and the gas molecule. Then we optimize our localized basis set for these two kinds of host molecule and gas molecule. And we calculate for each molecule the site energies and the transfer energies for an electron to go to one molecule to another one. So we have a statistic and then we use this statistic in some model based on Marcus Thelori. So because we have with linear scaling, we need to have more complex files, more complex input files because we need to express our fragment. We need to say where the solvent is explicit or whether the solvent is explicit. And we need also to process our output files. So what we want is to use an input file which is human readable. So for that we use a markup language. To have the notion to go to workflows what could be nice is to use an output file as an input file. In the sense that we can use the output files to rearrange the calculation. So we want to pass easily and process this input and output file. And for that we develop the Fortran Futai library and we have a class of objects, Python dictionaries. And with that we can build easily in the code some YAML input files. So we use YAML because YAML is a markup language which is really easy to read. It's used for all configuration files now for Unix. And you have a very efficient YAML parser in different script language. So the ID, so we developed that. So I show you an example of YAML. So you have a key and some values. And for the values you can have also another dictionary or you can have a list or another dictionary. And so you can put also some comments. And what is really interesting also if you change, for instance, your input file, you can add another line and because for the program it's a tree. The tree is not changed. You have only one leaf. So with that we develop many things using notebooks. And we use that to process our code for instance for the timing. And it's easy for us to process and to have an idea of the performance of our code. So in conclusion, I think with linear scaling DFT we open up new possibilities. So we can reduce the degrees of freedom to perform large systems even of moderate computers. We have different levels of description. I think it's important to explore all these descriptions. And the future directions are in Constraint DFT, QMMM, doing statistics. Also trying to extract atomic multiples from our QMM calculation. And we are also exploring a linear response time-dependent DFT. Thanks. Thank you. Very nice talk. I'd like to know in your fragment approach, how do you select the fragments? Is it manual operation? Yes, it's manual operations. But it's manual operation so we know the fragment. And now we are working to try to use the fragment approach for solid. And what we can do, so for instance for here, what we can calculate in fact is a kind of density matrix for these molecules. And we can say if this density matrix is highly important. So if it's almost important then you can say that the approximation is quite correct. So this is easy for this kind of systems. But if we want to do for instance for graphene, we can do also the same things. We can say okay, I select a small fragment of graphene and I do the same thing. And in this case it's more tricky. But it works. But we don't have an operator like this kind of density matrix to say yes, this approximation of fragment is correct. But for molecular system, yes, it's easy to do that. So you're also working on embedding approach in the case? Yes, so we have a potion solver using PCM. So we have also an embedding approach here. And we compare between explicit solvent and implicit solvent. Okay, we thank our speaker again. The next speaker is Jörg Hutter.