Okay, so up to now I've described what we would call the plain vanilla version of the algorithm, but there are a number of things one can do relatively easily that greatly improve its efficiency, and from the point of view of applications that is of course very important. The first concerns systems in which the number of up and down electrons is the same, which is actually a very common situation: most systems we deal with are singlets, with the same number of up electrons as down electrons. In that case you can do better than working with individual Slater determinants. I should point out that an arbitrary Slater determinant with some number of open-shell orbitals, in other words orbitals containing only one electron, is typically not a spin eigenfunction: if you apply the S^2 operator to it, you do not get back a constant multiplied by the original determinant. It is a very interesting question how one might reformulate FCIQMC in terms of pure spin eigenfunctions, that is, linear combinations of determinants that are spin eigenfunctions. We know how to do that in principle, but practical implementations are unfortunately very slow, and we haven't cracked that problem yet. There is, however, a useful symmetry one can adopt in the meantime: rather than working with linear combinations of determinants carrying a fixed, definite spin, in cases with equal numbers of alpha and beta electrons you can work with linear combinations of each determinant and its partner determinant in which all the spins have been flipped.
In other words, if you have an open shell, say an up electron in this orbital and a down electron in that one, flipping the spins gives you the partner determinant, and it is easy to show that in the exact solutions these linear combinations always occur with either a plus sign or a minus sign: the plus sign corresponds to even spin and the minus sign to odd spin. These very simple linear combinations, which do not greatly complicate the matrix-element calculation, are something we now routinely impose; the neon-atom calculation we ran yesterday was in fact using this spin-flip symmetry. The advantages are immediate: it filters out the odd-spin states (so no triplets are left, for example), it halves the Hilbert space, it allows a somewhat longer time step, and so on. It is a very beneficial transformation, almost trivial to implement, and is done routinely now. We'll do some demonstrations on the N2 molecule where, again, we use it throughout.
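As a toy illustration of how pairing each determinant with its spin-flipped partner roughly halves the Hilbert space (my own sketch, not the production implementation): represent a determinant by its alpha and beta occupations, pair it with the determinant obtained by swapping the two spin channels, and keep one representative per pair. Closed-shell determinants are their own partners, so in small spaces the reduction is slightly less than a factor of two.

```python
from itertools import combinations

def spin_flip(det):
    """Partner determinant with all spins flipped: alpha and beta swap roles."""
    alpha, beta = det
    return (beta, alpha)

def symmetrized_space(n_orb, n_alpha, n_beta):
    """Count determinants before and after keeping one representative
    per spin-flip pair."""
    orbs = range(n_orb)
    dets = [(a, b)
            for a in combinations(orbs, n_alpha)
            for b in combinations(orbs, n_beta)]
    reps = {min(d, spin_flip(d)) for d in dets}
    return len(dets), len(reps)

# 4 orbitals with 2 alpha + 2 beta electrons: 6*6 = 36 determinants,
# of which 6 are closed shell (self-paired), leaving (36-6)/2 + 6 = 21 reps
full, paired = symmetrized_space(4, 2, 2)
```

The asymmetric open-shell determinants collapse pairwise, which is where the near-halving of the space (and the filtering of odd-spin states, once the sign is fixed) comes from.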
Okay. Now, a big improvement on our fully stochastic algorithm was proposed two or three years after we introduced FCIQMC. Cyrus Umrigar's group pointed out that doing everything stochastically is all very well, but if you can incorporate some degree of determinism into the calculation, which you generally can, that will only help the algorithm. So they proposed the following variation on FCIQMC, which they call the semi-stochastic algorithm. The idea is to select a small subset of determinants, which they call the deterministic space D. They had a particular prescription for how to select this subset; we have a different one, but let's leave that question aside for the moment. Typically D will contain maybe a thousand determinants; in our implementation it can go up to about a million. Now, whenever you perform the walker update on a particular determinant, that is, the imaginary-time Schrodinger equation step that updates the number of walkers on determinant i, you divide the update into three terms where before we had only two. You have the same death step as before, but, as Umrigar observed, for the determinants inside the deterministic space you can perform the summation over connections exactly: you take the set of walkers residing in the deterministic space and carry out that matrix-vector multiplication explicitly. For the terms sitting outside the deterministic space, the complement D', you do the spawning step of normal FCIQMC as usual. So you have a deterministic step, in which the coefficients within the deterministic space are updated according to the exact formula, and then three classes of spawning terms handled stochastically: spawning from the deterministic space out into the rest of the huge Hilbert space, spawning from the rest of the space back into the deterministic space, and spawning within the huge space itself. As we'll demonstrate, this dramatically reduces the stochastic fluctuations of the algorithm. In practice we select the deterministic set on the fly: we run a normal FCIQMC calculation, and at some point we harvest the, say, thousand most important determinants as determined by the simulation itself at that stage, put those into the deterministic space, and continue the calculation with the semi-stochastic update. That preserves the black-box character of the algorithm, which we think is very important: we don't want an a priori selection of what should and shouldn't be in the deterministic space. The other thing Umrigar pointed out concerns the projected energy. Rather than simply using the Hartree-Fock determinant as your trial wave function, you can use a multi-determinant expansion: you take another subset of determinants to form the trial wave function. This subset need not be the same as the deterministic set; in fact the deterministic space is usually much larger than what we can afford for the trial wave function. The trial wave function is then a specific linear combination of Slater determinants, whose coefficients c_i are obtained by diagonalizing the Hamiltonian within the set of determinants chosen for the trial space: you set up the Hamiltonian in that space, diagonalize it, and that gives you the c_i coefficients. You then perform your projection onto that trial wave function rather than onto the Hartree-Fock wave function.
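To make the three-way split concrete, here is a minimal dense-matrix sketch (my own toy, not the production code): the diagonal death step is applied exactly everywhere, connections with both ends inside D are summed exactly, and every remaining connection is sampled and unbiased by its generation probability. With D equal to the whole space, the step reduces to the exact projector iteration.

```python
import numpy as np

def semistochastic_step(c, H, shift, tau, D, rng):
    """One toy update c <- c + tau*(shift - H) c.  Off-diagonal connections
    with BOTH ends in the deterministic space D are applied exactly; every
    other connection is drawn uniformly at random and unbiased by its
    generation probability (one spawning attempt per occupied entry)."""
    n = len(c)
    Dset = set(D)
    new = c + tau * (shift - np.diag(H)) * c          # exact death/clone step
    for j in range(n):
        if c[j] == 0.0:
            continue
        if j in Dset:                                  # exact D-block matvec
            for i in Dset:
                if i != j:
                    new[i] -= tau * H[i, j] * c[j]
        stoch = [i for i in range(n)
                 if i != j and not (i in Dset and j in Dset)]
        if stoch:                                      # stochastic spawning
            i = stoch[rng.integers(len(stoch))]
            p_gen = 1.0 / len(stoch)
            new[i] -= tau * H[i, j] * c[j] / p_gen
    return new

H = np.array([[0.00, 0.10, 0.02],
              [0.10, 0.50, 0.05],
              [0.02, 0.05, 0.80]])
c = np.array([1.0, 0.2, 0.1])
exact = c + 0.01 * (-(H @ c))                 # shift = 0 reference step
full_D = semistochastic_step(c, H, 0.0, 0.01, [0, 1, 2],
                             np.random.default_rng(1))
```

With the full space deterministic, `full_D` equals `exact`; with D empty, the step is the usual fully stochastic spawning, unbiased only on average.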
Now, in practice this trial-wave-function method does incur a memory bottleneck, because applying H to the trial wave function connects it to the double excitations of the trial wave function, and you have to store that vector, the action of the Hamiltonian on the trial wave function. That storage is essentially the memory bottleneck of the technique, and it means we cannot afford huge trial spaces in general, but it is still much better than a single Slater determinant. In particular, in problems where the Hartree-Fock determinant has a small overlap with the exact wave function, so that you end up with small denominators, this method helps greatly, because you accumulate more overlap between the trial wave function and the sampled wave function in the denominator, and that greatly stabilizes the simulation. Okay, so let me fire up a simulation so you can see this in practice. We're going to do the N2 molecule with a hundred thousand walkers, and I'm going to specify that five thousand steps after the system has gone into variable-shift mode we switch on the semi-stochastic algorithm. So you'll see the simulation converging with one level of fluctuations, and then what happens when the semi-stochastic algorithm is switched on. At the same point, five thousand steps into variable-shift mode, we'll also switch on the trial wave function, with 500 determinants in the trial wave function; and here I've put a thousand determinants into the deterministic space. So let's launch this job; with any luck it'll run. The first thing we'll do is a standard simulation, so let's bring up the number of walkers.
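The projected-energy estimator just described can be sketched as follows (a toy with a dense Hamiltonian; in the real method the ket is the stochastic walker vector and the stored vector H|psi_T> is the memory bottleneck mentioned above): diagonalize H in the trial space to get the trial coefficients, then evaluate E = <psi_T|H|psi> / <psi_T|psi>. If the sampled wave function were exact, this returns the exact ground-state energy for any trial space with nonzero overlap.

```python
import numpy as np

def trial_projected_energy(H, trial_idx, psi):
    """E = <psi_T|H|psi> / <psi_T|psi>, with the trial coefficients
    obtained by diagonalising H inside the chosen trial space."""
    T = np.asarray(trial_idx)
    w, v = np.linalg.eigh(H[np.ix_(T, T)])
    psi_T = np.zeros(len(psi))
    psi_T[T] = v[:, 0]                 # ground state of the small trial block
    h_psi_T = H @ psi_T                # the stored vector H|psi_T>
    return (h_psi_T @ psi) / (psi_T @ psi)

# toy 3x3 Hamiltonian; feed in its exact ground state as the sampled vector
H = np.array([[0.00, 0.10, 0.05],
              [0.10, 0.60, 0.02],
              [0.05, 0.02, 0.90]])
w, v = np.linalg.eigh(H)
E = trial_projected_energy(H, [0, 1], v[:, 0])
```

Because H|psi> = E0|psi> here, the estimator is exact for both the single-determinant and the multi-determinant trial; the practical benefit shows up when |psi> is a noisy walker snapshot, where the larger overlap in the denominator tames the fluctuations.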
Okay, so you can see the number of walkers here: it started off at 10 and is growing, approaching the 100,000 that was our target, and this is the number of walkers on the Hartree-Fock determinant, which is also growing, reaching about a thousand or so. Now we've reached the hundred-thousand limit, so it has gone into variable-shift mode: it is stabilizing the population in the whole space, and you can see it is also stabilizing the population on the Hartree-Fock determinant. That's fine. Let's see what the energy looks like. Again it starts off at the Hartree-Fock energy and exhibits large fluctuations, because there are very few walkers in the simulation at that time, and now it is stabilizing. Let me put the exact energy on the plot, displayed from the beginning; there it is. It's pretty remarkable: to get that exact energy from a conventional full CI calculation would probably take a day or so on a large machine, and we have hit it here more or less instantaneously. Of course there are still some fluctuations; this is a problem with a Hilbert space of about five hundred million determinants, 540 million to be precise. Let me just check when it went into variable shift. Okay, that's the shift expressed in terms of the correlation energy, so it was roughly around time step four thousand. So it went into variable-shift mode roughly here, and we set it to go into semi-stochastic mode five thousand steps afterwards, so in a few seconds the simulation should be reaching that point. I think it's already got there: you see this reduction in noise as it goes into semi-stochastic mode. Let's blow it up; I'll plot from, say, time step four thousand. You clearly see the point at which the semi-stochastic algorithm turns on.
The fluctuations before are on the order of several millihartrees (that distance on the plot is one millihartree), and as soon as you turn on the semi-stochastic update you get this reduction in fluctuations. Now, this is still with respect to projection onto the Hartree-Fock determinant. Let's look at the energy if we also project onto the trial wave function, plotted from the same point: the second estimator here, in green, is with respect to projection onto the 500-determinant trial wave function, and you can see the oscillations in green are smaller still than the oscillations in red. You can let this simulation equilibrate under its own steam, but it's pretty interesting to see that the fluctuations we're now measuring are essentially on the order of a millihartree, which is a tiny energy. The idea is then to let this run for many thousands of steps and accumulate averages over these values. So that is the semi-stochastic algorithm. It improves the efficiency by reducing the fluctuations: to hit the same stochastic error with the original method you would have to run considerably longer. Cyrus actually quantified this in a plot from his paper. Here you see the carbon dimer in a VTZ basis, very similar to the calculation we're running at the moment, and efficiency, essentially the inverse of the squared error times the computer time, is plotted as a function of the size of the deterministic space. You get some modest increase in efficiency as you increase the deterministic space, but if you also increase the size of the trial wave
function, then with a trial wave function of a hundred and sixty-five determinants you already gain a factor of two hundred in efficiency, and if you can afford an even larger trial wave function, with one thousand seven hundred or so determinants, and a deterministic space on the order of ten thousand, the efficiency gain is on the order of a thousand. That's a staggering amount of efficiency gain. And I should point out that in our implementation of the semi-stochastic algorithm there is virtually no overhead: as long as you can store the action of the Hamiltonian on the trial wave function, which is the only memory bottleneck, all subsequent calculations incur virtually negligible cost, so you get this benefit more or less for free. The red crosses here are for an inappropriately chosen trial wave function, so the choice of trial wave function really makes a difference: you can have a huge trial wave function that was inappropriately chosen and it doesn't buy you very much. This is why we decided not to use their prescription for choosing the deterministic and trial spaces, and instead just let them emerge from the calculation itself. Okay. Now, a second major improvement. So we gained a factor of a thousand with respect to the original algorithm by adopting the semi-stochastic scheme, but there is another pretty serious bottleneck, which has to do with the time step in FCIQMC. You'll recall the generation probabilities: if you're sitting on a determinant and have to randomly choose another determinant connected to it, then as the connectivity of the Hamiltonian increases, and it increases both with the basis-set size and with the number of electrons, you end up having to reduce your time step at the same rate at which the generation probabilities are
shrinking. So if you just do uniform generation, the time step scales as (N^2 M^2)^(-1), with N the number of electrons and M the number of orbitals. Here is an example for the nitrogen molecule. At the VDZ level, a moderate-size basis, the time step is about three or four times 10^-3, and as you increase the basis set to triple-zeta and then quadruple-zeta the time step gets smaller and smaller, simply because, keeping the number of electrons fixed for the same molecule, M increases as the cube of the cardinal number, so M grows quite sharply, and we further incur the M^2 factor. That really hits us badly: for large basis sets we end up with tiny time steps, so from one step to the next the system hardly moves at all, and that's a problem. You can see there are actually two problems here. One is that in realistic calculations the distribution of Hamiltonian matrix elements can vary quite wildly: you might have one connection of strength 0.1 and many, many connections a couple of orders of magnitude smaller. If you select connections with uniform probability, say you're sitting on this determinant and you select connection A with probability one quarter, then the spawning probability is tau times the Hamiltonian matrix element divided by the generation probability, in this case 0.1 divided by a quarter, so you would be spawning with probability 0.4 tau; whereas an attempted spawn on B, by the same formula, has probability 0.004 tau, simply because that matrix element is so much smaller. And so the largest allowable value of
tau is actually set by the less probable event, this one. You would find that in these units tau would be 2.5 for this problem, which means that every time you attempted to spawn on A you would spawn with probability one, but every attempted spawn on B would succeed only one time in a hundred. And this is a problem: you have a high rejection rate, which is always a bad thing from the Monte Carlo perspective, because all the computational effort spent generating a move is discarded every time it is rejected. What we would much prefer is a scheme in which the generation probability becomes roughly, and in the ideal case exactly, proportional to the Hamiltonian matrix elements themselves. In that case the spawning probability, which is the time step times the Hamiltonian matrix element divided by the generation probability, would just be tau times a constant if the proportionality were exact, and that would be great: you could choose your time step to be one over that constant, so that every attempted spawn would actually succeed. Zero rejection rate would, in some sense, be the ideal algorithm. Now, the problem is that although you can ensure exact proportionality, doing so involves setting up cumulative probability distributions that themselves cost of order N^2 M^2 to construct, which would make the numerical effort untenable for each spawning attempt. So instead we want a smart way of making the selection that doesn't involve the numerical effort of evaluating the Hamiltonian matrix elements over all the connections.
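The arithmetic in this example is easy to reproduce (the 0.1 and 0.001 matrix elements and the uniform 1/4 generation probability are the toy numbers from above): the maximum time step is the smallest p_gen/|H| over the connections, and making p_gen proportional to |H| flattens every spawning probability to one.

```python
def max_timestep(h_elems, p_gen):
    """Largest tau keeping every spawning probability tau*|H|/p_gen <= 1."""
    return min(p / abs(h) for h, p in zip(h_elems, p_gen))

# one strong connection (0.1) and three weak ones (0.001), selected uniformly
h = [0.1, 0.001, 0.001, 0.001]
uniform = [0.25] * 4
tau_u = max_timestep(h, uniform)                 # 2.5, set by the strong element
p_spawn_weak = tau_u * abs(h[1]) / uniform[1]    # 0.01: one success in a hundred

# proportional selection: p_gen ~ |H| gives a flat spawning probability
norm = sum(abs(x) for x in h)
prop = [abs(x) / norm for x in h]
tau_p = max_timestep(h, prop)                    # 1/norm, and every attempt succeeds
p_spawn = [tau_p * abs(x) / p for x, p in zip(h, prop)]
```

With exact proportionality every entry of `p_spawn` is one, which is the zero-rejection ideal; the Cauchy-Schwarz scheme described next gets close to this without the O(N^2 M^2) setup cost.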
This can actually be achieved, at least this is the way we do it, by using a Cauchy-Schwarz decomposition of the Hamiltonian matrix elements, and these Cauchy-Schwarz distributions take only order M effort to set up rather than order N^2 M^2. So you pay a little, but you gain a lot. Here is how our Cauchy-Schwarz scheme works; I'll explain it for opposite-spin excitations. Say we have two electrons i and j of opposite spins, an alpha electron and a beta electron. In that case the Hamiltonian matrix element consists of only one term, because the exchange term is zero for opposite-spin excitations: the matrix element is simply the integral (ia|jb), electron i with alpha spin going to hole a with alpha spin, and electron j with beta spin going to hole b with beta spin. Now, what we want to do is take this four-index object and factorize it into quantities that each refer to only two indices, because setting up distributions over two indices is much simpler than setting up distributions over four. This is where the Cauchy-Schwarz inequality comes in: one can show that the four-index integral satisfies |(ia|jb)| <= sqrt((ia|ia)(jb|jb)), where (ia|ia) is effectively the self-interaction of the transition density ia with itself, and likewise (jb|jb) for the transition density jb. That's Cauchy-Schwarz, and it turns out to be a rather sharp inequality: whenever the matrix element is non-zero, this is a good upper bound on it. Of course there are terms for which (ia|jb) is actually zero while the bound is not, but you can usually discard those, because when (ia|jb) is zero it is usually zero by symmetry, and you
wouldn't bother to generate it in the first place. So you need to be able to take symmetry into account in your excitation generation, which is relatively easy to do; otherwise it is a good inequality. Okay, so the idea is as follows. What your excitation generator has to do is return four numbers, i, j, a, b: ij is the electron pair and ab is the hole pair. First, we write the probability of generating ijab in the form of a conditional probability: we first select the electron pair ij according to some known scheme, with uniform probability for example, and then select the hole pair ab given that we have selected the electron pair ij. If both factors are computable within the algorithm, that is sufficient to compute the overall probability you need. In the specific case of opposite-spin excitations, this conditional probability can itself be factorized into the probability of selecting hole a given electron i, for one spin, multiplied by the probability of selecting hole b given electron j, for the opposite spin. In the case of same-spin excitations you also have to consider the reverse way of selecting the pair, because for a given electron pair ij you can select b first and then a, so to speak; that is the only extra thing to bear in mind. So far so good. Now the idea is to select hole a given electron i according to that portion of the Cauchy-Schwarz factor. What this means in practice is that for every given electron i you form a cumulative probability distribution, by summing over all available holes a, that sets up this probability distribution, and this is
what costs order M to calculate. Once you have set up this cumulative probability function you can select your hole a from it, and it turns out to be a really smart way of doing it: whenever hole a interacts strongly with electron i, and hole b interacts strongly with electron j, you tend to get large (ia|jb) matrix elements, and that's the trick. What you actually see is a double benefit. This was how the time step behaved for the nitrogen molecule as you increased the basis set; with the Cauchy-Schwarz trick you still see a decrease in the time step, so it is not absolutely ideal, but the decrease is now very much contained. Even more important is the fact that you are rejecting many, many fewer moves at the same time, and that increases the overall efficiency of the algorithm. What's shown here is the CPU cost per walker per successful spawn per unit time, which is a measure of the efficiency of the algorithm, so to speak. You can see that with the standard uniform scheme this cost rises very sharply as you increase the basis set, simply because you are rejecting so many moves: if you just make random moves in large basis sets, the probability of hitting large matrix elements goes right down, and the rejection rate goes up. Here, by contrast, the cost increases only moderately as you go from a small basis set to a large one. So that is our trick, and for something like the quadruple-zeta basis sets we again gain almost a factor of a thousand in efficiency. For these large-basis-set calculations, combining our Cauchy-Schwarz sampling with Umrigar's semi-stochastic algorithm makes us about a factor of a million more efficient than we were with the plain vanilla algorithm.
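A minimal sketch of the hole-selection step (my own toy: the array `V[i, a]` stands in for the non-negative self-interaction integrals (ia|ia), and symmetry screening is omitted): for each electron the O(M) weight table w_a proportional to sqrt((ia|ia)) is built once, each hole is drawn from its distribution, and the generation probability needed for unbiasing the spawn is the product of the two conditional probabilities.

```python
import numpy as np

def hole_distribution(i, occ, V):
    """O(M) Cauchy-Schwarz weights for holes a given electron i:
    w_a ~ sqrt((ia|ia)), with occupied orbitals excluded."""
    w = np.array([0.0 if a in occ else np.sqrt(V[i, a])
                  for a in range(V.shape[0])])
    return w / w.sum()

rng = np.random.default_rng(2)
m = 6                                       # toy number of orbitals
V = rng.uniform(0.01, 1.0, size=(m, m))     # stand-in for (ia|ia) >= 0
occ_alpha, occ_beta = {0, 1}, {0, 1}
i, j = 0, 1                                 # alpha electron i, beta electron j

p_a = hole_distribution(i, occ_alpha, V)
p_b = hole_distribution(j, occ_beta, V)
a = rng.choice(m, p=p_a)                    # draw hole a for the alpha electron
b = rng.choice(m, p=p_b)                    # draw hole b for the beta electron
p_gen = p_a[a] * p_b[b]                     # p(ab|ij) = p(a|i) * p(b|j)
```

The spawning probability is then tau*|(ia|jb)|/p_gen as before; because the bound is sharp, large matrix elements are generated with large p_gen and the rejection rate stays low.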
So that's a really very significant gain. Okay, that finishes the portion of the talk describing the state of the art as we have it with our code. We are still working a little on these Cauchy-Schwarz schemes, but the gains we're seeing now are no longer dramatic; there is maybe a factor of two more to be had, but I don't think there is much beyond that. Okay, so now comes another very important topic. You've done this Monte Carlo simulation and obtained a very accurate energy, but what have you actually learned about the system? How can you analyze the results you have? That brings us to another topic, which is how to calculate properties. The wave function is a very heavy object in general, and when you want to calculate anything from it, or understand anything from it, you have to somehow compress the information it contains. The best way to do that is to evaluate expectation values of primitive operators, and the most primitive operators you can think of that preserve the number of electrons are a double annihilation followed by a double creation; that, if you like, is the generic operator you want to be able to handle. The four indices r, s, p, q give rise to a tensor that we call the two-body reduced density matrix. I should point out that there are many theories on how to optimize reduced density matrices directly as a means of doing electronic-structure calculations, but that is not what we're doing: we are not concerned with N-representability conditions on these reduced density matrices. The only thing we're concerned with is, given our wave function psi, how can we calculate this two-body reduced density matrix? Of course it has four indices, so
it has roughly M^4 storage associated with it. Once you have the two-body reduced density matrix you can trace out one of the electrons to get the one-body density matrix, so the two-body density matrix is in a way the most fundamental quantity we want to be able to deal with; it is a static quantity referring only to the ground state. Now, that's all very well, but here is the problem we face: whereas before we were doing projections, with a trial wave function in the bra and the exact wave function in the ket, here we unfortunately have the exact wave function in both the bra and the ket. So we have a quadratic functional of the exact wave function, as opposed to a mixed estimator, which is only a linear functional of it. Let me point out that once you have your reduced density matrices, any operator that can be written as a sum of one- and two-body operators can be obtained simply by tracing the operator, represented in the same basis, against the density matrix; for example, you can calculate the energy by tracing the Hamiltonian against the two-body density matrix. Here psi is an N-particle wave function, and if you work in first-quantized form you would write the two-body density matrix as a function of r1, r2, r1', r2': you take the wave function psi(r1, r2, ..., rN), multiply by psi*(r1', r2', ..., rN'), integrate out particles r3 to rN with the primed and unprimed variables set equal for those particles, and multiply by the number of electron pairs, since it doesn't matter which pair of electrons you choose. That is the reduction: you integrate over the remaining N - 2 particles, and that's essentially what you're doing
here. So this is a four-index object pqrs, exactly analogous to the four indices you see here; it's just the second-quantized form. Essentially it says that if all you are ever concerned with are two-body operators, then the only information you need is contained in the two-particle density matrix. There are some applications in which you need higher-body density matrices, but let's concentrate on this. So, first, you can calculate the energy, which is of course known to be a simple functional of the two-particle density matrix: give me the two-particle density matrix and I can return you the energy just by tracing it out, with this term for the one-body operators and this term for the two-body operators. Now, the nice thing about this formalism is that there is a whole host of other properties you can also calculate, and one of the most valuable of all is nuclear gradients. If R describes some underlying set of degrees of freedom, such as the nuclear positions, and you move your nuclei around, you want to know how much the energy changes; that is a very important quantity to be able to calculate. It can itself be written as a trace of the one-body density matrix with the derivative of the single-particle Hamiltonian with respect to the nuclear coordinates, which is traditionally called the Hellmann-Feynman force, plus the two-body density matrix traced with the derivative of the two-particle interactions as you tweak the nuclei. If you weren't dealing with atom-centered basis sets, the extra basis-derivative terms would be zero and you wouldn't have to worry about them, but of course we are dealing with atom-centered basis sets, and these are the so-called Pulay terms, which we can also calculate.
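The energy trace can be written down directly. In this sketch (my own conventions, not the lecture's notation: gamma_pq = <a_p^dag a_q>, Gamma_pqrs = <a_p^dag a_q^dag a_s a_r> in a spin-orbital basis, and g holds physicist-notation integrals <pq|rs>), I check the trace formula on a single determinant, whose reduced density matrices are known in closed form.

```python
import numpy as np

def energy_from_rdms(h, g, dm1, dm2):
    """E = sum_pq h_pq gamma_pq + (1/2) sum_pqrs <pq|rs> Gamma_pqrs."""
    return np.einsum('pq,pq->', h, dm1) + 0.5 * np.einsum('pqrs,pqrs->', g, dm2)

rng = np.random.default_rng(3)
n, occ = 4, [0, 1]                            # 4 spin orbitals, 2 occupied
h = rng.normal(size=(n, n))
h = 0.5 * (h + h.T)                           # Hermitian one-body part
g = rng.normal(size=(n, n, n, n))
g = 0.5 * (g + g.transpose(1, 0, 3, 2))       # impose <pq|rs> = <qp|sr>

dm1 = np.zeros((n, n))
for i in occ:
    dm1[i, i] = 1.0                           # single-determinant 1-RDM
# single-determinant 2-RDM: Gamma_pqrs = g1_pr g1_qs - g1_ps g1_qr
dm2 = np.einsum('pr,qs->pqrs', dm1, dm1) - np.einsum('ps,qr->pqrs', dm1, dm1)
E = energy_from_rdms(h, g, dm1, dm2)
```

For a single determinant this reduces to the familiar sum of occupied h_ii plus half the Coulomb-minus-exchange sum, which is what the test below checks; for the FCIQMC wave function the same trace applies with the sampled RDMs.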
Now, fortunately, if you're using quantum-chemical codes with Gaussian basis sets, these entities are available in analytic form, so they can just be read in together with the other four-index integrals, which means we can directly calculate the nuclear gradients by tracing them against the density matrices. Then there are other quantities, such as the total spin, which can be obtained from the elements of the two-particle density matrix, and a whole host of other properties. Now, the question is: how can you calculate your two-body density matrix? If you take the original expression and substitute the CI expansion for your wave function, you get the following expression for an element of the two-body density matrix. It is a sum over the Hilbert space of the system: for each determinant you take its CI coefficient and multiply it into the CI coefficient of the determinant obtained by applying the relevant excitation operators to the determinant i you are on, and that sum over the space gives you this element of the density matrix. You then have to repeat this essentially M^4 times to fill out the density matrix; there are symmetry conditions on the indices that you would exploit, but those are relatively easy to implement as well. Okay, so this is obviously an exceedingly expensive thing to do, and the question is: can we generate this density matrix on the fly, as we are generating the wave function itself? Now, the crucial point is that in FCIQMC we are actually already sampling over the Hamiltonian matrix elements, which means we are sampling over the double excitations: for any given determinant we
sample over its connections uh via the fci qmc spawning step and we want to use that a sampling step to uh to calculate the elements of the of the density matrix so how it works is you've got um let's say we're sitting on uh determinant i and we now want to find we now attempt to spawn on determinant j so that's that's the basic step and we know the probability of spawning so we know that i go from determinant i to determinant j with a certain probability so i'm now going to unbiased with that probability when i come to multiply the relevant ci coefficients together so ci times cj i seem to have missed out a complex conjugate if it's relevant so i divide essentially by the spawning probability and i continue to accumulate this uh for each um for each element of the uh of the density matrix okay okay now um well that sounds all very well and good but there is a problem and the problem is that we don't actually have the ci coefficients to begin with all we've got are the populations of walkers and we only the only thing that we know is that the is that the populations of walkers time average to give you the ci coefficients but instantaneously they're actually fluctuating entities so the only thing that we can actually calculate in practice is not ci times cj divided by the generate by the spawning probability but is the number of walkers on i instantaneously multiplied by the number of walkers on j instantaneously and then time average that and of course we all know that the the average of a product ain't the same thing as the product of the averages so what we want is the product of the averages but that's not something we have access to okay so that is uh what we have access to so nevertheless you can say well okay let's uh let's uh proceed boldly as they say and ignore the fact that uh we have this correlation problem of of uh of only being able to average the product rather than getting the product the averages what happens well the answer is that uh we get uh 
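As a rough illustration of the accumulation step just described (this is a toy sketch, not the actual implementation; the function and variable names are hypothetical), each attempted spawning event i → j can be made to contribute an unbiased sample n_i · n_j / p(i→j) to the corresponding density-matrix element:

```python
# Minimal sketch of on-the-fly RDM accumulation, assuming we are handed
# the instantaneous walker populations and the probability of each
# attempted i -> j spawning event. All names here are hypothetical.

def accumulate_rdm_element(rdm, key, n_i, n_j, p_spawn):
    """Add one unbiased sample for the RDM element labelled by `key`.

    Dividing by the spawning probability p(i -> j) undoes the bias of
    only visiting the connection i -> j a fraction p_spawn of the time.
    """
    rdm[key] = rdm.get(key, 0.0) + n_i * n_j / p_spawn

# Example: two attempted spawns contributing to the same element.
rdm = {}
accumulate_rdm_element(rdm, ("p", "q", "r", "s"), n_i=10.0, n_j=2.0, p_spawn=0.5)
accumulate_rdm_element(rdm, ("p", "q", "r", "s"), n_i=8.0, n_j=3.0, p_spawn=0.25)
```

The accumulated sums would then be time-averaged over the simulation; the catch, as the lecture goes on to explain, is that doing this naively with instantaneous populations introduces a bias.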
Well, the answer is that we get very poor results. Here is an example, I think for the C2 molecule. In green you see the projected energy as it normally converges in ordinary FCIQMC: essentially, by the time you get to 10 million walkers you have absolutely nailed the energy. Yet if you compute the energy by first computing the density matrix and then computing the energy from that density matrix, you get this purple line. Two observations. First, it is variational: it is always above the exact energy, as it should be, because we are dealing with symmetric expectation values, and that is the source of the variationality. So the good news is that at least you have variational energies; the bad news is that it is a bit of a Pyrrhic victory, because although it is variational it is completely out, 50 millihartrees out, and furthermore it converges exceedingly slowly towards the exact answer. You would probably have to go to a few billion walkers before this variational energy coincided with the exact energy to any reasonable accuracy. The same is seen, a little less dramatically, for a stretched nitrogen molecule, but the result is not very good. So why is our density matrix so poor? What is actually going on? To analyse the problem we took a toy model that can be diagonalized exactly, so that the exact two-body density matrix is available, and compared the elements of the exact calculation with those of the stochastic calculation. What we found was that the diagonal matrix elements we were calculating were systematically biased: they were too large. And it did not matter whether we did the fully stochastic simulation or the semi-stochastic simulation; in both cases the diagonal matrix elements of our density matrix were too large, the errors always positive, whereas the errors in the off-diagonal matrix elements were somewhat better behaved, distributed about zero. So what is wrong? When you calculate a diagonal matrix element gamma_pqpq, what you have to do is take the population on a determinant i that contains the orbitals p and q, square it, and average the result. You can see that this induces a bias. Suppose we decompose the instantaneous number of walkers N_i into some unknown hypothetical number N_i^exact (we do not know what that number is) plus a fluctuation, and assume by hypothesis that the fluctuation time-averages to zero, so that the simulation samples the exact solution on average but is instantaneously away from it. Then if you time-average the square, you end up with the component you actually want, the square of the unknown exact component, plus the time average of the square of the fluctuation, and that is always a positive contribution. That is what leads to the systematically biased density-matrix elements. So how do we solve this problem? The solution is really very beautiful, and essentially employs what we call the replica trick. We run two simulations independent of each other, two totally independent simulations that do not talk to each other at all, and when we come to evaluate the matrix elements we take the population on determinant i from simulation one and the population on determinant i from simulation two, rather than squaring the same population, and we time-average that.
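In symbols (my notation, simply restating the decomposition just described): writing the instantaneous population as $N_i = N_i^{\mathrm{exact}} + \delta_i$ with $\langle \delta_i \rangle = 0$, the single-simulation estimator picks up the fluctuation variance, while the two-replica estimator does not:

```latex
% Single-simulation estimator for a diagonal element: biased upward,
% since the variance of the fluctuation is always non-negative.
\left\langle N_i^2 \right\rangle
  = \left(N_i^{\mathrm{exact}}\right)^2 + \left\langle \delta_i^2 \right\rangle
  \;\ge\; \left(N_i^{\mathrm{exact}}\right)^2

% Two independent replicas: the fluctuations are uncorrelated, so the
% cross term averages to zero and the bias disappears.
\left\langle N_i^{(1)} N_i^{(2)} \right\rangle
  = \left(N_i^{\mathrm{exact}}\right)^2
```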
Because the two populations are strictly uncorrelated, the fluctuations are strictly uncorrelated: if determinant i shows a positive fluctuation in simulation one, there is no reason to suppose it will also be positive in simulation two; sometimes it is positive, sometimes negative, and the cross term cancels out exactly. So with this replica trick you manage to sample exactly the quantity you previously had no access to, namely the product of the averages. If you now come back to our toy model and look at the error in the diagonal matrix elements: whereas before you had this positive bias, with the replica trick the bias is entirely removed and the error is symmetrically distributed about zero, which is what you want it to be. That is the paper in which this was published, and the results are really staggering. Here you have in red the result obtained from the density matrix now calculated through the replica trick, and already at a totally trivial number of walkers, whereas before you had this huge error, you are more or less bang on top of the exact result. In this blow-up you can see the behaviour, and you can see it is a very, very fine energy scale we are dealing with now. Interestingly enough, the density-matrix formulation is much better when you are dealing with strongly correlated systems; the fluctuations in the energy are much more controlled than the corresponding fluctuations you get from the projected energy. This is for the stretched C2 molecule, a very difficult, strongly correlated system. There are actually error bars in red here, though they are not really visible.
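A quick numerical illustration of why the replica trick works (a toy model of the statistics, not the actual algorithm; the populations and fluctuation size are made up): squaring one noisy population picks up the fluctuation variance, while multiplying two independent replicas does not.

```python
import random

random.seed(1)

N_EXACT = 5.0      # hypothetical "exact" walker population on a determinant
SIGMA = 2.0        # size of the zero-mean instantaneous fluctuations
SAMPLES = 100_000

biased_sum = 0.0
replica_sum = 0.0
for _ in range(SAMPLES):
    n1 = N_EXACT + random.gauss(0.0, SIGMA)  # replica 1 snapshot
    n2 = N_EXACT + random.gauss(0.0, SIGMA)  # replica 2 snapshot, independent
    biased_sum += n1 * n1    # single-replica estimator: time average of N^2
    replica_sum += n1 * n2   # two-replica estimator: time average of N1 * N2

biased = biased_sum / SAMPLES    # tends to N_EXACT**2 + SIGMA**2 = 29
replica = replica_sum / SAMPLES  # tends to N_EXACT**2 = 25, bias removed
```

The biased estimate sits above the exact value by exactly the fluctuation variance, while the replica product fluctuates symmetrically about the exact value.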
Those are the error bars you already have for the projected energy, and they are much larger. Okay, so I said you can calculate nuclear gradients now: we can routinely calculate these density matrices, and from them we can routinely calculate the nuclear derivatives. This is an example for the N2 molecule, where stretching breaks the triple bond, so it is again a very difficult, strongly correlated limit. Shown here is the energy, the binding curve, and here the force curve calculated from this gradient, and you can see, for example, that at the minimum, where you expect the force to be zero, that is indeed what is observed. So this is an example of a property we now have access to. I am running short of time, so I will skip the section on other properties and response functions, and maybe just quickly finish with an extension of our algorithm to calculating excited states. Excited states are always a more difficult problem: if the ground state has a sign problem, you can expect the excited states to have an even worse sign problem, so to speak. We struggled with different ways of getting excited states, including quadratic operators, (H minus a constant) squared, and so on, but none of them worked. What does work is in a way the simplest thing you can think of, which is to perform a Gram-Schmidt orthogonalization, applied instantaneously to the different distributions of walkers. Now that, thanks to the replica trick, we have code that can run two or even multiple simulations simultaneously, what we decided to do was to evolve one population according to the original Hamiltonian, and a second population according to the original Hamiltonian with the instantaneous distribution of the ground-state calculation projected out; from those two you can then instantaneously project out both distributions to get an evolution of a second excited state, and so on. I was pretty sure this was not going to work, because at any snapshot you only have a very coarse description, and you might not think that orthogonalizing with respect to a coarse snapshot would get you anywhere, but it more or less worked out of the box. When I asked Simon Smart, one of my postdocs, to code this up, twenty minutes later he came back and said: you had better come and have a look at these results. This is the first calculation we did, for the C2 molecule, and we actually had three populations, one shown in red, one in blue and one in green. The blue and green calculations were started from the Hartree-Fock determinant with a random single excitation into another determinant of the same symmetry as the ground state, so these are genuine excited states. And what we found was: bang, the original population settles down here, and then these two settle down as well; the first and second excited states are very nearly degenerate with each other. If we blow this up a little more, you see this population settles on the ground state, one of the others comes and settles on the first excited state, and this purple population, they kind of repel each other at this point, now settles down on the second excited state. The first and second excited states are separated by only six millihartrees. (Yes, in this case they were using this simple spin-flip symmetry to do that.) Because they are separated by only about six millihartrees, and the projection time is inversely proportional to the splitting, you can see that you have to simulate for many hundreds of units of imaginary time.
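The instantaneous orthogonalization just described can be sketched as ordinary Gram-Schmidt applied to the walker-population vectors (a toy dense-vector version; in the real method the populations are sparse, stochastic snapshots, and the names here are hypothetical):

```python
def gram_schmidt_step(populations, lower_states):
    """Project the components of `lower_states` out of `populations`.

    Both arguments hold walker-population vectors over the (here, dense)
    determinant space. In the actual method this is applied at each
    snapshot to the instantaneous, fluctuating distributions: the
    first excited-state population is orthogonalized against the
    ground-state population, the second against both, and so on.
    """
    out = list(populations)
    for state in lower_states:
        norm2 = sum(s * s for s in state)
        if norm2 == 0.0:
            continue  # nothing to project out of an empty distribution
        overlap = sum(o * s for o, s in zip(out, state))
        out = [o - (overlap / norm2) * s for o, s in zip(out, state)]
    return out

# Example: remove the "ground state" direction from a trial vector.
ground = [1.0, 1.0, 0.0]
trial = [2.0, 0.0, 1.0]
orthog = gram_schmidt_step(trial, [ground])
```

After the step, the overlap of `orthog` with `ground` is zero, so the projected population can only evolve towards states orthogonal to the ground-state snapshot.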
You have to simulate for several hundred atomic units of imaginary time, but that is just the name of the game; you cannot project it out any more quickly with this type of projector algorithm. And sure enough, you can see that they settle down onto their respective excited states. Here is an application to the Hubbard model, for a system you can exactly diagonalize: a 14-site Hubbard model, and this is for five excited states, and you can see the kind of accuracy one is able to hit; in other words, if there is a bias arising from the way we do this projection, it is really very small. Here is another excited-state case, the C2 molecule again, but in a larger, quadruple-zeta basis set, for which very accurate DMRG calculations have been done by Sandeep Sharma. What we show here is the convergence of the ground state and three excited states, as a function of the number of walkers, with respect to the DMRG results. Essentially, by the time you get to about 10 million walkers you have really nailed the calculation: sub-kcal/mol accuracy with respect to the DMRG calculations, at two different geometries of C2, equilibrium and stretched. These are again binding curves; there are actually error bars on these plots, but they are essentially invisible on this scale. Okay, should I continue or should I stop here? Okay. As a final topic I will mention another application, to what in quantum chemistry is called CASSCF. Of course it is nice to be able to do full CI calculations, but at some point you run out of steam; that is the reality of the problem. Although we have a weakly exponential scaling algorithm, for a large enough number of electrons you
really feel the exponential scaling, and then you have to start making approximations. The common approximation used in quantum chemistry is the CASSCF approximation, whereby you categorize the orbitals describing your system into three types: core orbitals, whose occupancy is kept doubly occupied; virtual orbitals, high-energy orbitals whose occupancy is assumed to be strictly zero; and active orbitals, whose occupancy can range from two to zero. The idea is then to perform a full CI expansion assuming this type of truncation. You still have a combinatorially large number of determinants to handle, because you are distributing n electrons among m orbitals in your active space, but it is obviously a smaller combinatorial space than the full CI space. That single full CI calculation is what would be called CAS, complete active space. But what you then want to do is perform orbital rotations in order to obtain, variationally, the lowest-energy wave function you can construct. The kappa here is an anti-Hermitian operator, so e to the minus kappa is a unitary operator, a rotation; the matrix kappa consists of parameters kappa_pq, which are now the variational parameters of the theory, and this E_pq is a single-particle excitation operator with the spins traced out. Now what you can show is that the derivative of the energy with respect to the parameters in your kappa matrix can be computed essentially as the commutator of the Hamiltonian with this excitation operator, sandwiched between psi and psi, and that is an entity we can calculate via our density-matrix code. In other words, we can calculate the electronic gradient; and in fact you can show that you can calculate not only the electronic gradient but also the second derivatives, which bring in a nested commutator, but that is not a problem either. Once you have your electronic gradients you can rotate your orbitals, which involves mixing the core orbitals with the active orbitals, the virtual orbitals with the active orbitals, or the core orbitals with the virtuals. You do not have to worry about active-active rotations: because you have calculated the full CI energy, it is invariant with respect to those, so they are redundant. You can then iterate: having performed a rotation, you come back and once again perform your CAS calculation, which we now do with FCIQMC. So there is an outer macro-iteration that does the self-consistency step, and an inner FCIQMC calculation that is the CI solver. I should say that the largest CAS spaces you can handle in practice with conventional algorithms are around 18 orbitals with 18 electrons; that is pretty much exact diagonalization taken as far as it can go, and 24-in-24, which is of course factorially larger, is completely out of the question. So here is an example for the benzene molecule, where we treat the pi system, six pi orbitals and six electrons, so something you can do a conventional CASSCF on. You see these are the macro-iterations, and you get the behaviour both for a deterministic calculation and for our stochastic calculation; the point is that we end up with the same energy at the end of the day, the same solution. Then the naphthalene molecule, then phenanthrene, which is 14-in-14, still possible with a deterministic calculation, then triphenylene, 18-in-18, which is really pushing deterministic calculations.
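The two-level structure just described, an outer macro-iteration rotating the orbitals down the electronic gradient and an inner CI solve, might be sketched as follows (everything here is a hypothetical stand-in, not the actual code: the inner FCIQMC solve is modelled by a simple quadratic energy in a single rotation parameter, with its minimum placed arbitrarily at kappa = 1.5):

```python
# Toy sketch of the CASSCF two-level loop. The inner "CI solve" below
# stands in for the stochastic FCIQMC calculation, which in reality
# also yields the density matrices from which the gradient is traced.

def solve_cas_ci(kappa):
    """Stand-in inner CI solve: returns the energy and its derivative
    with respect to the (single, toy) orbital-rotation parameter."""
    energy = (kappa - 1.5) ** 2 - 75.0
    gradient = 2.0 * (kappa - 1.5)
    return energy, gradient

def casscf_loop(kappa=0.0, step=0.4, tol=1e-8, max_macro=200):
    for _ in range(max_macro):
        energy, gradient = solve_cas_ci(kappa)  # inner CI solve
        if abs(gradient) < tol:                 # self-consistency reached
            return energy, kappa
        kappa -= step * gradient                # outer orbital rotation
    raise RuntimeError("macro-iterations did not converge")

energy, kappa = casscf_loop()
```

The real optimization is over a whole matrix of kappa_pq parameters and uses the second derivatives as well, but the iterate-rotate-resolve structure is the same.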
But we can now actually go to a molecule like coronene, correlate 24 electrons in 24 orbitals, and perform the self-consistent optimization of the orbitals as well; it takes about 13 iterations to get there, but it is a relatively smooth optimization process. So anyway, that is where this is heading: towards treating large molecules like these within this CASSCF framework. And I think I will stop there. Thank you.