Everybody, it's time to start the afternoon session with the second part of this lecture. Welcome back after lunch; we will continue from where we stopped before lunch. We had derived that the maximum amount of entanglement you can store in such a matrix product state is limited logarithmically by the dimension of the matrices, and this essentially sets the course for what we can do, because there is one statement about how much entanglement there actually is in a physical system. The statement in one sentence, and it has to be qualified in a second, is that the entanglement grows with the system surface. This is what you call an area law. It goes back to black hole physics, the famous Bekenstein entropy of the black hole in 1973. But this is sometimes stated without the extremely important qualification: it holds for ground states, certainly for low-lying excited states, and also for some other states, but not in general. So it is a highly specific property. It just turns out, and I will get to that in a second, that for the states we are interested in it is often true. So let's assume that for a moment and explore the consequences. Now, the surface of a one-dimensional system, if you cut it into two, is of course a point; that does not grow as you go to the thermodynamic limit. So if the area law holds, you expect the entanglement entropy to be a constant. In 2D the cut grows linearly in L; in 3D it is an area, L squared, which is where the black hole physics comes from. If we take the formula we just derived, we see that the matrix dimension we have to allow in our numerical algorithm will scale exponentially in the entanglement; I just turn the formula around. Whether you take base e or base 2 does not make much of a difference, it depends on the type of logarithm, but the consequence is always the same. In one dimension you can hope that the matrix dimension is independent of system size. In two dimensions you see you are running into trouble: it grows exponentially with the system size. And in three dimensions it looks really terribly bad. So this already suggests that the method, or the class of states, we are talking about will work best for one-dimensional quantum systems, perhaps in limiting cases in two dimensions, and in three dimensions you can more or less forget about it. The same considerations will come back when we talk about non-equilibrium algorithms. You can also come up with the following idea. In the first lecture I said we use these states to parametrize basically every state of the Hilbert space, which we can do in principle; but in practice the relevance of the huge Hilbert space we are talking about might just be an illusion. Because one can prove, and mathematicians have done that, that if you pick a random state in Hilbert space, the entanglement entropy is in fact extensive, just like the normal entropy you know from thermodynamics. And in some sense it is even worse: if you average over all states of Hilbert space, it turns out that this expectation value is not only extensive, as stated here, it is even maximal; the typical state is essentially maximally entangled. That is really bad. But it leaves open the little gap that there are states with non-extensive entanglement; they must, however, form a set of measure zero.
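To restate the scaling argument in formulas (a schematic summary of what was just said; the constants c are generic and not from the slides):
\[
S \;\le\; \log_{2} D
\quad\Longrightarrow\quad
D \;\gtrsim\; 2^{S},
\qquad
S \sim
\begin{cases}
\text{const.} & \text{1D} \;\Rightarrow\; D \sim \text{const.}\\[2pt]
c\,L & \text{2D} \;\Rightarrow\; D \sim 2^{cL}\\[2pt]
c\,L^{2} & \text{3D} \;\Rightarrow\; D \sim 2^{cL^{2}}
\end{cases}
\]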
So it is an extremely small set: if this is Hilbert space, you are sitting in this tiny little corner here, and the merit of matrix product states, because they are good at encoding low-entanglement states, is that they parametrize exactly this corner of states with non-extensive entanglement efficiently. The ground states and the other guys with very low entanglement sit in this corner; but you have to expect that if you do something to your physical system which kicks it out of this cozy corner, the entanglement grows a lot, and then of course the simulation, or the use of such a description, breaks down. Okay, so before we get into that, let's start working with these guys, and I have to introduce some diagrammatics. I assume Miles has shown you similar pictures already on Monday and Tuesday, because this is standard notation. Remember these matrices A, or M, which we had to form our matrix product states. We represent them in this way here: these are the row and column indices of the matrix, and this is the physical leg sticking out, the sigma. Some people draw exactly the same picture but with the physical leg going downwards; here the convention is that it goes upwards. You remember that on the left end and on the right end we had this issue of the dummy index, because we need row and column vectors to ultimately arrive at a scalar, so I simply drop the dummy leg there. If you want the complex conjugate of these matrices, which you will need (not the A dagger, just the A star, the complex conjugate), I turn the picture upside down; if I wanted the dagger I would also have to invert the order of the legs. This is the pictorial representation we will use quite a bit, because we do not want to deal with all the indices we had before lunch. The one rule is: you draw this picture, and wherever lines are connected, like here, here, and here, these connected lines are contracted, that is, multiplied and summed, exactly what the matrix multiplications do. So that is what the matrix product state looks like in its graphical representation. Now let me make one little remark which connects this MPS representation to more standard RG schemes. Imagine an RG scheme where you have a block described by effective states, effective degrees of freedom, which I call |a_{l-1}>; they live on a block of length l-1. Now you add site l and you want a new effective description of the next larger block. This is the typical RG way of thinking, but the basis dimension would be the product of the dimensions of these two objects, and that leads to exponential growth, so you have to decimate again. Whatever your decimation prescription is, totally independently of that, the new states will be linear superpositions of the old block states and the local states, with some expansion coefficients. These expansion coefficients can now be rearranged in a fancy way; this is just standard quantum mechanics. I rearrange them in the form of matrices, where I take this as the row index, this as the column index, and this, the physical index, labels the various matrices. So you come up with this expression here. Now, that should indeed look very familiar to you after what we had before lunch.
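As a minimal illustration of "the coefficient is a product of matrices" (a sketch assuming numpy; shapes, names, and the random tensors are mine, not from the slides):

```python
import numpy as np

# A toy MPS: a list of L tensors with shape (d, D_left, D_right).
# The dummy legs at the two ends have dimension 1, as discussed above.
L, d, D = 6, 2, 4
np.random.seed(0)
dims = [1] + [D] * (L - 1) + [1]
mps = [np.random.rand(d, dims[l], dims[l + 1]) for l in range(L)]

def amplitude(mps, config):
    """<sigma_1 ... sigma_L | psi> as the product M^{sigma_1} ... M^{sigma_L}."""
    mat = mps[0][config[0]]              # 1 x D row "vector" (dummy left leg)
    for l in range(1, len(mps)):
        mat = mat @ mps[l][config[l]]    # contract the shared bond index
    return mat[0, 0]                     # dummy right leg -> scalar

print(amplitude(mps, [0, 1, 0, 1, 1, 0]))
```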
So this simple rearrangement of the expansion coefficients into a set of matrices, the same thing we did before, allows you to recurse the whole construction back to the first site. The state you end up with simply looks like this: a superposition over all basis states in the usual computational basis, with expansion coefficients that are products of matrices, exactly as we had it beforehand. We could pursue this further, but it is a hint why it is possible to frame RGs, in particular if there is only a one-parameter flow, in terms of matrix product states. The famous Wilson NRG is an ideal case of that; we had a paper on this now almost ten years ago where you can show that NRG can be formulated exactly in this framework. It is just that the decimation procedure of NRG is a different one from the one used here, as we will see. This is just a hint. Now, you can take the picture I just showed and ask about normalization, because the basis states we form for these blocks, the effective states, should of course be chosen orthonormal, or at least we want that; that is the choice we always want to make in physics. So I demand that these states, constructed from the states of the next smaller block of the previous step with the matrices M as expansion coefficients, are orthonormal, meaning their overlaps give the delta. If I work out this expression, I find that this object here has to be a Kronecker delta. That does not look very nice, but if you get rid of the indices, which here just spell out a matrix-matrix multiplication (you achieve that by transposing this object, so you get the dagger), you obtain a relatively simple relationship. For a matrix product state to have this nice property that the block states are orthonormal, what you need is that the sum over sigma of M sigma dagger times M sigma is the identity matrix. That is the same statement written in a more fancy way; with the indices it is more explicit. Because the blocks were built by starting on the left and adding one site after another, matrices obtained in this way are usually called left-normalized: the identity is the sum of A sigma dagger A sigma, and I call them A to distinguish them from the right-normalized ones, where you grow the block from the right and add sites on the left, so you come from both sides of the chain; there the normalization prescription is the same except that the dagger has changed its position. Graphically the whole thing becomes very simple: down here you have the matrix A sigma, and up here you have A sigma star, because turning the picture around gives the star. A matrix multiplication would match row with column, but here I am multiplying row with row, and I turn it into a matrix multiplication by daggering it; in the picture it simply says that I contract over the left indices. So if you look at this picture formula, it is exactly this object here: you sum over the sigmas, that is this contraction, and you sum over the a_l's, the left legs, that is this contraction here; otherwise you multiply, and what sticks out is the a_l prime, and here is a typo, embarrassingly: there should be no prime.
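A quick numerical check of the left-normalization condition just stated (a sketch assuming numpy; the tensor is a random isometry built only for illustration):

```python
import numpy as np

d, Dl, Dr = 2, 3, 6   # physical dim and bond dims (need d * Dl >= Dr for an isometry)

# Build a left-normalized tensor A[sigma, a_{l-1}, a_l] from a random isometry.
Q, _ = np.linalg.qr(np.random.rand(d * Dl, Dr))
A = Q.reshape(d, Dl, Dr)

# Left normalization:  sum_sigma  A^{sigma,dagger} A^{sigma}  =  identity
check_left = np.einsum('sab,sac->bc', A.conj(), A)
print(np.allclose(check_left, np.eye(Dr)))   # True

# The right-normalized condition would read  sum_sigma B^sigma B^{sigma,dagger} = identity,
# i.e.  np.einsum('sab,scb->ac', B, B.conj())  equal to the identity matrix.
```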
So this should be the identity, and the identity is just a straight line, which connects a state with itself. The right normalization looks just the other way around in the picture: you bring together B and B dagger, but now you contract over the right legs. So these are the two types of matrices you want to have. In practice (for mathematical proofs you will typically choose one normalization or the other, but for computational reasons, which we will see in a second) you give your matrix product state the following form: you start out with left-normalized matrices, then you may have one or so in between which does not obey any specific convention, and then you have right-normalized matrices all the way to the right. We will see in a second why we want that, because at first sight you would ask: why not stay within just one of these normalizations? This is basically exploiting the gauge degree of freedom I mentioned before. Before we do that, to actually see why this is useful, let's make a small excursion into the world of operators, because ultimately we want to do something with our states, and for that we need operators. It is in fact very simple to come up with a matrix product representation of operators; it is as easy as for states. The most general operator you can write is one which takes an arbitrary basis state as input, kicks out another basis state, and does this with this amplitude here. If you look at this object and rearrange the indices into pairs sigma one, sigma one prime, and so on, the naive decomposition, like what we did with our mean-field ansatz for states, would be to say that somehow this is likely to factorize like this. And in fact this is true for local observables: if you take, for example, a spin operator S^z on site i, in reality this means the identity acting everywhere except on site i, where you have the S^z operator, and this takes exactly this form here: deltas on all sites, site one, site two, and so on, except for the S^z operator on the one site you are interested in. That is of course not the most general form, but we can again play the same game as before, the SVD-reshape-whatever game, to prove that any operator can be expressed in that form, namely by replacing this coefficient c by a product of matrices. This is exactly what we had for the matrix product state, except that now each of these matrices is labeled by two local states, ingoing and outgoing. The state had only one index; the operator needs two, ingoing and outgoing. So that is the representation. It looks as before, just with the additional indices, which means we can also invent a graphical notation, and it is very simple: the state was this object here, and the operator is this object here. Each ball is one of these matrices, just as the balls were the matrices for the states, and now, instead of one physical leg sticking out, there are physical legs everywhere, ingoing and outgoing. That is the graphical representation, and again the pictorial contraction rule holds: when I want to apply an MPO to an MPS, I take my MPS, which is this object down here, I put the MPO on top of it, and then I contract over all the new lines that appear. You can also work this out as an analytical formula, but there is actually no need for that.
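A minimal sketch of that contraction (assuming numpy; all names and dimensions are illustrative), which also shows how the MPO and MPS bond dimensions multiply:

```python
import numpy as np

d, D, Dw = 2, 10, 5              # physical dim, MPS bond dim, MPO bond dim
M = np.random.rand(d, D, D)      # MPS tensor  M[sigma, a_left, a_right]
W = np.random.rand(d, d, Dw, Dw) # MPO tensor  W[sigma_out, sigma_in, b_left, b_right]

# Contract the shared physical leg; the new MPS tensor carries
# multi-indices (b_left, a_left) and (b_right, a_right).
N = np.einsum('tsbc,sad->tbacd', W, M)   # shape (d, Dw, D, Dw, D)
N = N.reshape(d, Dw * D, Dw * D)         # new bond dimension is Dw * D

print(M.shape, '->', N.shape)            # (2, 10, 10) -> (2, 50, 50)
```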
And the nice thing is that you can absorb this internal contraction into one new object with just one physical leg sticking out, like here. But what happens, and here you see it better, is that where the MPS bond had a certain dimension and the MPO bond had another, the new bond is a multi-index of two indices, one labeling the MPS bond and the other the MPO bond, so the dimensions have multiplied. A typical situation might be: this is a matrix of dimension 1000, and these MPOs are often very small, say dimension five; then the dimension of the new MPS is five times 1000, to accommodate the multi-index, so 5000. This means that, except in the very simple case where the MPO is a product of numbers rather than a product of matrices, applying an operator to a state will increase its dimension. This is something you cannot circumvent. Applying something simple and local, an S^z or a density or the like, is not a problem, because that can be represented as a product of numbers, as we have seen; there the matrices do not grow. But if you work out what a Hamiltonian looks like, for example, and we will do that, the Hamiltonian will not be such a nice object. So what is the problem? Now I do quantum mechanics: you give me an MPS, I give you an operator, we apply it to the state and we have a new state, standard quantum mechanics, but the dimension has grown. Do this a few times and your computer will basically start throwing up, because it is too much for it. So what do we have to do? This leads to an issue which is often swept a little bit under the rug, but in the practice of using matrix product states it is one of the points where you can say: if this works reasonably, the algorithms will work well; if not, you may expect numerical problems. Here I am just confronting you with situations where it definitely works. Say your computer can typically handle matrix dimensions of a few thousand; that would nowadays be normal for a tensor network application, the best codes do a few tens of thousands, up to a hundred thousand, but that is pretty extreme, so let's say 1000 or 2000. Now you apply these operators and you run out of this envelope. What can you do? The solution is to say: well, the matrix product state is only approximate anyway. We have not yet seen how to produce it, but in some sense we have seen that an exact representation would again be exponentially complicated, so it is approximate anyway. You can also understand that matrix product states approximate in a hierarchy: if you have an MPS with matrices of this size and one with matrices of that size, you can always embed the smaller one in the top-left corner of the larger one and set the entire rest of the matrix to zero. So larger matrix product states contain all smaller ones, and because you have additional degrees of freedom to write something there which is not zero, the larger one will by necessity be at least as good a representation as the smaller one. So there is a kind of monotonic hierarchy in the quality of the approximation: if you make your matrices larger, you simply have more parameters. And what you say now is: my calculation has produced an object of this larger size, without all these zeros; that is what we have.
I now want to compress this to matrices of a smaller dimension, because that is what my computer can still handle, of course at the minimum loss of information; and the minimum loss of information in our case means that the distance between the state you have (which is typically already an approximation itself) and its compressed version should be as small as possible. Okay, and how do you put this into practice? It is very simple if you are in the mixed normalized form, and this is where I get back to why I said it is so important. You take this state here, where in reality you have an entire set of matrices, and you stack these M matrices. This reads as if there were a single matrix M, but in reality there are d of them, d being the local dimension. We obtained them previously by slicing the big matrix into M^1, M^2, down to M^d; now I put them back together, I undo the slicing, I stack them, and the index which previously labeled the matrices goes into the multi-index, in this case on the column side. (You could also do this on the row side; that just changes which normalization you produce, the result is analogous.) On this object I now do, as always, a singular value decomposition. Once I have that, on the right-hand side I have the V dagger matrix, which I know has an orthonormality property: V dagger times V is the identity, by the properties of the singular value decomposition. If you insert this into the formula for the normalizations, you realize that V dagger has exactly the property you want the B matrices to have. So I slice this V dagger into B matrices, and if I associate B with V dagger in this way I get B, B dagger, and the sum over the sigmas which is implied in this product becomes explicit, because I have pulled sigma out as an index. Try to work out on your own that this formula implies this result here, which is just the right normalization. So at this point you have replaced the M matrix by a B matrix, meaning you now have the correct normalization on that site. What has happened on the left? There you have the U, which I multiply into the neighbouring A matrix; because of its orthonormality this is in some sense just a local basis change, totally harmless. And you have the S left over; if I multiply the S into that A as well, what definitely happens is that I get, at this position, a matrix which no longer has a well-defined normalization property. I simply do not know anything about it, so I call it M again. So we again have a mixed form A ... A M B ... B, but the position of the M has jumped one site to the left. This means you can repeat the procedure again and again until the entire state consists only of B matrices; or you could have done the whole thing going from left to right, and then everything would be A. In that sense you have achieved the normalization, because the product of these A or B matrices ultimately gives the identity, which means that, if your state was normalized from the beginning, a factor of one pops out at the left end, and that is it. If it was not normalized, the norm pops out, which is also a convenient way of finding out how big a state is. So you see how you can achieve that it is either all A or all B. Now you may wonder: that is nice, but why do I do that, why not stay in the A notation or the B notation right from the beginning? The point is that if I want to compress, I have to do a little bit more. If I look at the state psi, I can write it in a simple way by taking all the block states I can form on the left from the A matrices and the corresponding block states on the right from the B matrices, and then I immediately have a Schmidt decomposition, where the coefficients are the elements of the S matrix. And then there is the mathematical theorem that says: if I can only afford a certain dimension, the best approximation to the state is obtained by throwing away the smallest singular values. Compression just means: if I can afford a matrix of size D, I keep the D largest singular values in that object and throw away everything else. In the picture where we have the block states from the left with all these A's, the block states on the right with all these B's, and the matrix S in between (I assume it is arranged so that the largest singular values sit at the top, s_1 down to the smallest), I simply shrink this S matrix and set everything else to zero. And because the discarded rows and columns no longer find anyone to talk to, I can immediately shrink the column dimension of this A here and the row dimension of this B here as well. That is how I compress. But having done that, I have only achieved the compression at a single bond, namely for the matrices that talk to these singular values; nothing has happened at all the other sites. So I have to do this everywhere, and the point of the mixed representation is that, while I go through the chain like a zipper, compressing and converting B's into A's or vice versa, I always stay in the form where I have A's on the left, then the hot site (I think Miles and Steve White like to call this the orthogonality center), and B's to the right. Why do I need that? Because this property of the Schmidt decomposition, that the most important information sits in the largest coefficients, only holds if the block states on the two sides each form orthonormal sets.
That property in turn only holds if the matrices on the left are of the A type and those on the right are of the B type. If you had, say, A matrices here and also A matrices on the right-hand side, you would not have a Schmidt decomposition: you would be orthonormal on the left but not on the right, because there you need the B type. This is the reason you go into the mixed representation, so that you can freely shift the boundary between the A's and the B's around, and at that position you can compress and make your matrix product state smaller again. Honestly, I do not expect those of you who see this for the first time to have an immediate grasp of all the details, but just remember this: to be able to compress based on a Schmidt decomposition you have to maintain certain forms of gauging, and they are ensured by mixing the A and the B matrices in this very specific way. That is perhaps the take-home message, and whoever wants more can read it up, for example in my 2011 review; I think other people have written about this as well. Yeah, sure, of course. [Audience question, inaudible on the recording.] This is an extremely good question, because on this question rests, of course, the quality of the approximation that a matrix product state provides. Assume, for the sake of the argument, that we had a matrix product state representation so huge that it is essentially exact. Then we can plot, versus i, the singular values s_i, and the quality of your approximation depends on whether this curve drops off quickly, so that your computer can manage it, or whether it looks more like this, decaying slowly, so that it cannot. This is clearly dictated by the physical problem. Now, the entanglement entropy was minus the sum over i of s_i squared times the logarithm of s_i squared (for normalized singular values), and what you see is that this slowly decaying curve corresponds to larger entanglement than this one. Actually, that is not totally rigorous; you can come up with very weird states where the relationship is not so simple, but they are weird in the sense of mathematicians' constructions, and as a practitioner you would say this curve means the entanglement is higher than for that one. So in that sense you can say it will work well if the entanglement is low, because low entanglement means a very rapid drop-off of the singular values, and then you can compress. It really depends on your physical problem. As I said, a typical good code nowadays will handle 1,000 to 10,000 states, and if your problem fits into that it will work. For one-dimensional spin chains you typically get away with the low hundreds, often around 100. For a one-dimensional Hubbard model I would say 2,000 looks good. But again, the parameters decide in the details; this is really a rule of thumb. If you try to apply this to two dimensions, and I can anticipate that result, you arrange a two-dimensional lattice as a one-dimensional snake, and then the short-ranged interaction looks very long-ranged on the one-dimensional system. There, basically, the sky is the limit for the Hubbard model, which is currently probably the most difficult problem we are studying, and so the hardest. For reasons that have to do with the area law, there is one dimension whose length basically does not matter, 10, 20, 30, whatever you want.
But the other dimension, the width, has to be really small. For the Hubbard model I would say: at width four, several thousand states; at width six, several tens of thousands; at width eight, where I think we currently keep the most states, I would say even 100,000 is not good enough. So that is the Hubbard model. For the t-J model everything shifts to much smaller numbers, and for frustrated spin models too; for kagome, for example, you can go up to widths of 12 or 14 with 20,000 states, but there we are really talking, how should I say, hardcore; only the few and very brave in that field actually manage that in reasonable computation time. Where this problem becomes really acute is in time evolution, but I will treat that separately, because there even relatively simple problems will rather soon lead you to the limits. Okay, good. So that brings me to... yes, yes? [Audience question, inaudible.] Yeah, no, actually it is not. The norm you use here is the two-norm, but this is not the optimal state in that norm. It is often close to the optimum, but there is a very simple argument why it is not: I go through the chain like a zipper, which means the truncation I do at this bond, coming from this side, knows what has already happened over here but does not know what is going to happen over there. So there is an asymmetric flow of information. What you can do is extend this way of compressing to a variational compression scheme, where you can even show that the first step of the variational scheme is equivalent to what I have just described; then you go back and forth through your chain several times with a slightly different approach and iron out the remaining differences. It turns out that in many, many circumstances, for example for the time evolution we will get to now, the change from one time step to the next is so small that the error made by this asymmetry is not the one that will keep you sleepless; it is other things. If, on the other hand, the changes are big, as people found when doing non-unitary, let's call them, time evolutions in that field, this simple way of truncating is quite good but not good enough, and you have to do the variational compression, which I also describe in the review; it is just too long for here. There you can really say it is the best compression, unless it gets stuck: with variational algorithms you always have the fear of getting stuck in a local minimum, which in these cases I have not seen very often, so it is not a big worry, and you can improve on the simple scheme in that way. Typically, however, the zipper-like truncation is what most people do, and it is good for 98% of applications; a small numerical sketch of the single-bond truncation step follows below. Okay, good. So now let's go to time evolution. Historically, everything was the other way around: people knew how to do ground states, and time evolution was a big mystery. In fact, I think nobody was really interested, because before the cold atom people could do high-precision experiments on time evolution, the question of what happens to a quantum state as it evolves in time was of high academic but little experimental interest, and that has changed totally.
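Here is that single-bond truncation as a minimal sketch (assuming numpy; names, shapes, and the assumption that M is the orthogonality center with right-normalized tensors to its right are illustrative):

```python
import numpy as np

def truncate_bond(M, B_next, D_max):
    """One zipper step: M is the 'hot' tensor (d, Dl, Dr), B_next the
    right-normalized neighbour (d, Dr, Dr2). Returns (A, M_next) with the
    bond between them truncated to at most D_max singular values."""
    d, Dl, Dr = M.shape
    # group (left bond, sigma) into rows and SVD the resulting matrix
    U, S, Vh = np.linalg.svd(M.transpose(1, 0, 2).reshape(Dl * d, Dr),
                             full_matrices=False)
    keep = min(D_max, len(S))
    U, S, Vh = U[:, :keep], S[:keep], Vh[:keep, :]
    A = U.reshape(Dl, d, keep).transpose(1, 0, 2)      # left-normalized tensor
    # push S * V^dagger into the next tensor; it becomes the new hot site
    M_next = np.einsum('ab,sbc->sac', np.diag(S) @ Vh, B_next)
    return A, M_next

# toy usage
M      = np.random.rand(2, 8, 8)
B_next = np.random.rand(2, 8, 8)
A, M_next = truncate_bond(M, B_next, D_max=4)
print(A.shape, M_next.shape)   # (2, 8, 4) (2, 4, 8)
```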
With hindsight, pedagogically, I would say it is much easier to explain time evolution than the ground state search. Remembering those days: when we had the ground state codes, it was a PhD student of mine, Corinna Kollath, who now has a big professorship in Bonn and whose name you may have come across in papers, and part of her PhD thesis was basically one afternoon of turning the ground state code into time evolution, because that is the simpler direction: you have the more complicated code and you simplify it. That is why, in explaining it to you, I start with the simple thing. What you want to achieve is to apply a time evolution operator e^{-iHt}, and you could ask quite generically what the way of representing that as a matrix product operator is. There are definitely other ways of doing this than the one I put up here; the oldest, and perhaps the simplest to explain in an introductory lecture, is to Trotterize the time evolution operator into small time steps. So I have a time step tau which goes to zero, and I make many of these time steps such that they always add up to the total time I want to reach. Think of the Heisenberg model as an example, which decomposes into a sum of nearest-neighbour local Hamiltonians. You do this Trotterization into small time steps, you decompose your Hamiltonian into this sum, and then you do the forbidden thing: you say, well, this is e to the minus i times a sum, and e to a sum is the product of the exponentials, so I turn this into a product of e^{-i h tau}. We all know this is wrong; it is correct for numbers, but these are operators and they do not commute. So this is certainly not an exact expression; it is the so-called first-order Trotter decomposition, and you may read up on the higher-order ones that improve it, but the conceptual issues are the same in all cases. The next step is to say: I have made that wrong step, but why not; now I calculate this object here. That is simple: it acts on two neighbouring sites with d local states each, so the Hilbert space is d squared, and it pops out another d squared; as an operator it is a d squared times d squared matrix, and this is just the formal expression for it. Then I have to address the factorization problem. Here we have the Glauber formula, but the nice thing is that the A and B in the commutator are in reality h times tau and h times tau, so while these operators do not commute, the factor tau squared basically takes care of the problem: if you make tau small enough, the error becomes negligible. And if you think about a spin chain with short-ranged interactions, these two bond terms commute, they do not talk to each other; it is the other bond, the one that shares a site with this one, which causes the problems. So what you typically do is decompose the time step: do all the odd bonds first and then all the even bonds, and so on and so forth. This is just the notation for that; we will get a picture on the next slide anyway. The only thing I still need, because I want to stay in my MPS formalism, is to bring the local evolution operator into MPO form, and that is very simple.
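In formulas, a compact restatement of the first-order scheme just described, with h_{i,i+1} the bond terms (schematic; boundary terms suppressed):
\[
e^{-i\hat H\tau}
\;\approx\;
\prod_{i\,\text{odd}} e^{-i\hat h_{i,i+1}\tau}
\;\prod_{i\,\text{even}} e^{-i\hat h_{i,i+1}\tau}
\;+\;\mathcal{O}(\tau^{2}),
\qquad
e^{-i\hat H t}\;\approx\;\Bigl(e^{-i\hat H\tau}\Bigr)^{t/\tau}.
\]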
So here is the local evolution operator: two spins evolve into another two spins under e^{-i h tau}; this is the d squared times d squared matrix I just mentioned. What I do is play the game again: reshaping, rearranging, SVD, and so on. I rearrange it so that I group not ingoing and outgoing, but ingoing and outgoing on site one, and ingoing and outgoing on site two; I just rearrange the elements of the matrix. On this object I do a singular value decomposition, and, without going through all the steps, as before I bring the physical indices up to label matrices, and it turns out this evolution operator is just the product of two tensors of MPO form, ingoing-outgoing, ingoing-outgoing; in fact, on their open sides they are not even matrices but vectors, which makes life somewhat cheaper (a small sketch of this construction follows below). So I have everything I need: this I can calculate, this I can calculate, and I have the MPO for the time evolution. If this is my initial state, I apply the MPOs on all the odd bonds, these ones here, sites one-two, three-four, five-six, and so on. In the next step I do it on the even bonds, two-three, four-five, six-seven, and then, because I have many of these time steps, I continue with this game; the result is always an MPS, because contracting an MPS with an MPO gives back an MPS. The problem in all of this, however, is that, as we learned, MPOs make MPS dimensions grow multiplicatively, and the multiplication factor is the MPO dimension. If you work out the dimension of this object, you find that the growth factor is d squared, because that was the dimension of this gate. The SVD will not help you there. So if you do even just a few of these time steps, what is the consequence? It explodes in your face. That means that after essentially every time step, or after one complete time step combining odd and even (you can do various things, honestly I have forgotten which is best, it depends a little on the situation, you have to work it out), you have to do a compression; then the next time step, compression, time step, compression, and so on. So time evolution is just a sequence of: apply a matrix product operator, which we have worked out how to do; compress the MPS back to a reasonable dimension; apply a time step; compress; and in this way, at least in principle, you could imagine going on calculating up to time equals infinity. At least the algorithm allows that; the question is whether you get any reasonable physics out of it. In the beginning, people were very concerned about the Trotter errors you make when you do many, many time steps. That is not really the problem. The heading, perhaps I should have explained, refers to the fact that there is a whole family of Trotterized time evolution algorithms: tDMRG, tMPS and TEBD. What I have explained here is tMPS, but essentially all these algorithms are, up to small details, more or less equivalent. This is really beyond what I should discuss here; it is like local dialects of the same language, in German the Bavarian dialect, the Saxonian, whatever. That is more or less it. Okay, but to lighten up the story: in the beginning, when we were first able to do such things, we were totally excited about it, so we made movies. Nowadays I could make much nicer movies in terms of software, but I am no longer as excited about doing it.
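A minimal sketch of that gate construction (assuming numpy and scipy, a spin-1/2 Heisenberg bond, and an illustrative time step; none of these choices come from the slides):

```python
import numpy as np
from scipy.linalg import expm

# spin-1/2 operators
sx = np.array([[0, 1], [1, 0]]) / 2
sy = np.array([[0, -1j], [1j, 0]]) / 2
sz = np.array([[1, 0], [0, -1]]) / 2

# two-site Heisenberg bond Hamiltonian h = S_i . S_{i+1}  (a 4x4 matrix)
h = sum(np.kron(s, s) for s in (sx, sy, sz))

tau = 0.05
gate = expm(-1j * h * tau)                  # d^2 x d^2 evolution operator

# regroup indices: (out1, out2, in1, in2) -> rows (out1, in1), columns (out2, in2)
d = 2
gate = gate.reshape(d, d, d, d).transpose(0, 2, 1, 3).reshape(d * d, d * d)

# the SVD splits the gate into two local MPO-like tensors joined by one bond
U, S, Vh = np.linalg.svd(gate)
k = int(np.sum(S > 1e-12))                  # bond dimension of the gate MPO
W1 = (U[:, :k] * np.sqrt(S[:k])).reshape(d, d, k)            # W1[out1, in1, bond]
W2 = (np.sqrt(S[:k])[:, None] * Vh[:k, :]).reshape(k, d, d)  # W2[bond, out2, in2]
print(k)   # at most d**2 = 4
```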
So what you will see looks very old-fashioned, and that is because it is old; it was a bit of trouble to get it embedded into this presentation, so I have to explain first what I am going to show you. It is a Hubbard chain with a moderate interaction, the value does not matter so much, at quarter filling, and the important thing is that at time zero I add an electron with spin up at the chain center into the ground state; this will be shown in green. Because of spin-charge separation in one-dimensional quantum systems, you expect that this does something to the system, that some wave propagates, and that the charge disturbance, which will be in red, and the spin disturbance move at different speeds; that would show the separation of charge and spin. Actually, Immanuel Bloch and company have done that sort of experiment a year or so ago. This was, I think, the first picture in that direction. I will let it loop several times so you can see it: it runs up to time 10 and then restarts with the insertion of the charge; then you see how the charge spreads with time, here you see how the spin spreads with time, and you see that this clearly happens at totally different velocities. At that time this was done with ridiculously small matrix dimensions, 200 or so; nowadays one could do incredibly beautiful calculations there. Anyway, the question is why I stop at time 10. One reason is that we are approaching the ends of the chain, but the real reason is of course that, unfortunately, we cannot go much further. First observation: one can prove that the growth of entanglement in a real-time evolution, if you start out with zero entanglement, is bounded; it grows at most linearly in time. Now, since the entanglement is bounded by the logarithm of our matrix dimension, this linear growth of entanglement implies an exponential growth of the matrices you need in your simulation to maintain the same accuracy. And this is where we get back to that picture; it simply kills you. Unfortunately, in many of the situations of interest, like the global quenches in cold atom experiments, you actually realize that bound; it is not a mathematician's bound, it is a physics bound, and you can explain why. I think I have a slide on that in a second, so let me hold that thought for a moment. The second remark I want to make is that you can also calculate ground states in this way, which might be of interest for those of you who want to write, for fun, a little code that does time evolution: by imaginary time evolution you can of course project down to the ground state. I do not have to go through the formula in this audience; it is clear that this projects out all the more highly excited states, unless you are careless enough to start with a state that contains no component of the ground state. The nice thing is that you get it for free; the not so nice things are that it is comparatively slow compared to a real ground state search, and that for more complicated systems it has a tendency to get stuck at something which is not the ground state. But if you just want to play around with this for fun, I would encourage you to do exactly that to find ground states.
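For completeness, the projection formula being referred to (standard quantum mechanics; |n> and E_n are the eigenstates and energies, and the overlap c_0 must be nonzero):
\[
e^{-\tau\hat H}\,|\psi\rangle
=\sum_{n} c_{n}\,e^{-\tau E_{n}}\,|n\rangle
\;\xrightarrow{\;\tau\to\infty\;}\;
c_{0}\,e^{-\tau E_{0}}\,|0\rangle,
\qquad
|0\rangle=\lim_{\tau\to\infty}\frac{e^{-\tau\hat H}|\psi\rangle}{\lVert e^{-\tau\hat H}|\psi\rangle\rVert}.
\]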
Now, a question that I think currently concerns quite a lot of people is how to implement long-range interactions, because what I did just now was a very short-range example, namely the Heisenberg model. So what do you do there? That is probably what I will do in the last five or six minutes today, and then show the applications, as promised, tomorrow. There is basically a stupid way of doing it, which, if you need a quick fix to get a result, may not be so stupid after all. Imagine you have a frustrated spin chain, very often a J1-J2 model. The whole thing becomes nearest-neighbour again, and you can do your standard Trotter business, if you just create fat sites which contain two sites each. Then only this fat site speaks to that one, that one to the next, and each speaks to itself in a kind of schizophrenic way. That you can do. The problem is that the local state-space dimension goes up from d to d squared, and then the entire algorithm becomes very sluggish. So this is not really an option to pursue beyond playing around, and perhaps for next-nearest neighbours. That is number one. Number two: I don't know whether Miles told you about it; did Miles explain swap gates? No? Good. This is still Trotter, plus swap gates. It is what quite a lot of people currently use. It is nice because it is quick to implement and relatively fast; it is again not very good, as you will see in a second, if you have really long-range interactions, really long-range, not just next-nearest neighbours. The idea is the following; let me do it in this picture. You have a J1 interaction and you have this J2 interaction, and you want to Trotterize that. So this is site one, two and three. In your first step you apply the time evolution on the nearest-neighbour bonds as usual. Then you do the swap step: you physically exchange sites two and three; we will see how in a second. Then, in the usual Trotter way, you can apply the time evolution step for what has now become a nearest-neighbour bond, and afterwards you go back to the original configuration with another swap. Depending on what your interactions look like, there may be very smart ways of keeping the number of swaps as small as possible. Why do you want that? Because the swaps are a little bit time-consuming: they involve a singular value decomposition. Let's think about what we want to do; this is why I want to do it on the blackboard. Say I have this matrix product state here; at this point it does not really matter whether these are A or M or B matrices. Now I want to exchange physical sites two and three. What you cannot do is say: well, I simply exchange these two matrices. The simplest reason why this fails in general is that the matrix dimensions need not match; that is the most dumb reason. And even in cases where you can exchange them, you will see it is no longer the same state. What you really want, let me put in indices, is this: here we have a matrix with indices a and b, here one with b and c, here c and d, and here d and e, and in these matrix multiplications we sum over b, c and d. When you exchange sites two and three, you want to keep the rest of the state out of it; the others do not want to get involved in this business. So you form one big object M: you multiply these two tensors together, and you get an M with indices sigma two, sigma three, b and d, okay?
Just by multiplying these two out. Then, what do we do next; what did we learn we have to do before a singular value decomposition? We have to reshape. So from this object M with indices b, sigma two, d, sigma three, one big matrix instead of the d squared matrices we had beforehand, I reshape by rearranging the entries such that the label sigma three goes with b and sigma two goes with d; this is just rearranging matrix elements. I get a new matrix M-tilde with row index (b, sigma three) and column index (d, sigma two). On this I now do, again, a singular value decomposition: a U with indices (b, sigma three) and s, the singular values S, and a V dagger with indices s and (d, sigma two). What I can now do is turn this U into a tensor M^{sigma three} with indices b and s, and the S times V dagger into a tensor M^{sigma two} with indices s and d; I could just as well have multiplied the S to the left instead, it does not really matter. So these are again the same operations as before. And what you see is that we have taken this two-site object out and transformed it into this pair of objects, and it talks to the left through the same index b as before and to the right through the same index d as before, so it connects correctly to the rest of the chain. But what has also happened is that the positions of the physical sites have been exchanged. In that sense you have managed to exchange two physical sites. For those of you who want to pursue this business a little further: take a picture with your smartphone and try to go through this calculation, because it is a relatively simple one in which all the steps of a typical matrix product calculation appear, the reshaping, multiplying, slicing, unslicing, SVD-ing, just put together in a smart way so that two sites get exchanged; a small numerical sketch follows below. Okay, so if you give me one more minute to finish this off: there are two more methods I would just like to advertise, because I am running a little short of time, if you want to do long-range time evolutions. The Trotter-plus-swaps approach of course finds its limits if you have to swap over long distances; there is no way of directly swapping a site here with a site way over there, you really have to go through it site by site. Another way of doing it is the Krylov time evolution (Krylov, the Russian mathematician), where you calculate successive powers of H applied to psi, turn these into orthonormal states, the Krylov vectors, which you may know from Krylov space theory, or which those of you who have done Lanczos diagonalization have seen, and then you express your time evolution operator in this small effective basis. Surprisingly, for small time steps, say 0.1 or so, it is totally enough to calculate the first four or five powers to get an essentially quasi-exact time evolution. Here you really have to go into the details; I have Les Houches lecture notes from 2012, from a summer school on strongly interacting systems out of equilibrium, where this is described.
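Coming back to the blackboard calculation, here is a minimal sketch of exchanging two neighbouring physical sites (assuming numpy; names, shapes, and the optional truncation are illustrative):

```python
import numpy as np

def swap_sites(M2, M3, D_max=None):
    """Exchange the physical indices of two neighbouring MPS tensors.
    M2[s2, a, b], M3[s3, b, c] -> tensors carrying s3 and s2, in that order."""
    d2, Da, Db = M2.shape
    d3, _, Dc = M3.shape
    # contract the shared bond and reorder so sigma3 sits left of sigma2
    theta = np.einsum('sab,tbc->atsc', M2, M3)        # [a, s3, s2, c]
    theta = theta.reshape(Da * d3, d2 * Dc)
    U, S, Vh = np.linalg.svd(theta, full_matrices=False)
    if D_max is not None:                              # optional truncation
        U, S, Vh = U[:, :D_max], S[:D_max], Vh[:D_max, :]
    k = len(S)
    new_left  = U.reshape(Da, d3, k).transpose(1, 0, 2)                  # [s3, a, k], left-normalized
    new_right = (np.diag(S) @ Vh).reshape(k, d2, Dc).transpose(1, 0, 2)  # [s2, k, c]
    return new_left, new_right

M2 = np.random.rand(2, 5, 7)
M3 = np.random.rand(2, 7, 6)
L3, L2 = swap_sites(M2, M3)
print(L3.shape, L2.shape)   # (2, 5, k) (2, k, 6)
```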
Because there is not really good literature on that Krylov method yet, we are currently putting together a paper comparing various of these current time evolution methods, with the dirty details for people who want to implement them. The advantage is that you do not care what H is: the interaction can be as long-ranged as you wish, at least methodologically. The disadvantage is that the method is relatively slow. My current favourite, but it is a little bit like a Ferrari, in application you really have to know what you are doing, is the time-dependent variational principle, TDVP, proposed two or three years ago, which, applied judiciously, is in my view the most performant method. I say that because the way it was originally proposed in the literature is not the way I would recommend you use it; there are variants of it which are extremely efficient and extremely powerful, unfortunately beyond what I can explain briefly in a lecture like this one, but we will also describe that in this forthcoming paper, which we hopefully have in the next few weeks, and for those who are interested I really recommend that. So at this point I would like to stop. Tomorrow morning I will show you applications of time evolution, say a few words about ground states, which will become very simple after all the tough stuff you have gone through today, and then show how to relate this to more materials-related questions with DMFT. That will be the program for tomorrow, and thank you for your attention.