Okay, thanks again, and welcome back this morning. As you remember, yesterday we discussed quite a lot about the construction of matrix product states, issues of normalization and compression, and related questions. We also had a look at time evolution algorithms, because in some sense this is the simplest thing to do, and we discussed time evolution with long-ranged interactions a little bit. So what I want to do today is basically continue with the topic of time evolution, and to lighten up the whole thing I have put in an application, which is a couple of years old but has lots of pictures; I use it to show you the power of quantum simulation based on these methods. After that, I will move on to two topics, again related to time evolution: namely, what do you do to work at finite temperature, and what do you do to overcome this problem of time limitations? The latter is currently an active field of research, obviously, because this limitation is something we don't want; you will see in the application that the methods are currently limited to relatively short times. There are some ideas which are too early to be presented, but for a special class of problems there is already a very successful technique of extending time evolution into the future, which I will show. And then I will discuss what many people would have thought is the most fundamental thing, but in my view is not: how to obtain ground states. This will mainly be a discussion of how to construct the matrix product operator of a Hamiltonian, so that you get a general idea of how to write down these guys when things get more complicated. Because unlike the states, which are produced by the algorithms, the operator you somehow have to encode by hand. And then finally, I want to show, in whatever time remains, the application of DMRG and matrix product states in the context of dynamical mean-field theory, which is a method that is at its best in higher dimensions and can address real materials nowadays, but has certain bottlenecks. And there might be a chance that the type of methods we have been discussing yesterday and will be discussing today is going to overcome this issue; well, in certain cases, and in other cases other methods will, we will have to see.

OK, so one thing which we have not covered yet, which however is very important: if I have a state, what do I actually do with it? Ultimately, I want to be able to evaluate observables, and that's a topic we haven't discussed yet. Or we want to calculate the overlap between two quantum states, which is, as you will see, a very closely related question. The slide that comes is a little bit full, but we will see that it's ultimately relatively simple. Say I want to calculate the overlap of two states, psi at time t and psi at time 0. I may call the second state phi, just to have simpler notation. So what I have to do is draw the matrix product state of psi; that's the object down here. And then I have to draw the bra, and the bra is all these coefficients complex conjugated. In our picture, it's the same thing, just that the legs point downwards. So this is then the matrix product state for the bra, phi or psi of t or whatever you want. We will get to that expression in a second. In order to evaluate the overlap, what you have to do in our graphical notation is to carry out all the contractions here, and I have put arrows everywhere where you have one of these contractions.
And then, of course, the big question comes up: in which order do you do these contractions? For example, you could say, well, because this is a physical object, the state psi, what I do is I carry out all these contractions here first, then perhaps all these contractions there, and ultimately I contract the two states. That would be one way of doing it. But if you look at it, if you did it that way, what happens is that here you have matrix-matrix multiplications, that's what the contractions mean, but at the same time you have the physical legs sticking out. And you have to do this contraction for each configuration of the physical legs, which means, as there are 2^L spin configurations for spin one-half, that you would have to carry out an exponential number of these contractions to be able to write down the state psi. In effect, this would bring us back to the standard notation with the expansion coefficients c. So that's certainly not what you want to do; you would lose all the advantages of the method.

But here is what you can do, perhaps both graphically and in a formula; people understand it better in different ways. Let's have a look first at the analytical formula, and then we look at the picture. This is, again, the overlap formula, and I have written down explicitly the MPS for the state psi. This is this object here, the sum over these sigmas; and that's the bra, the sum over the computational basis here, and these are the matrices that go with the state phi, complex conjugated in order to get the bra. Of course, this whole thing collapses into an overlap of basis states; this is an orthonormal basis, so it's just a single sum over this product of matrices. And ultimately it's a one-times-one matrix, if you look at the dimensions. Now what you can do, and this is just rewriting this expression here: as this is a number, one of these expansion coefficients, and the transpose of a number is just the number itself, you can take this entire expression and transpose it, which turns it into this expression here. The stars turn into daggers, and the sequence of the matrices gets inverted; the usual rule that (AB)^dagger equals B^dagger A^dagger. Nothing more. And now you see that the sum over sigma_1 concerns just these two matrices, like you see here, the sum over sigma_2 concerns the two matrices adjacent to that, and so on. So it gets an onion-like structure, and unlike the onion, which you peel from the outside, here you work your way from the inside to the outside, in the following way. You carry out this object here; this is a matrix-matrix multiplication. Well, in this special case it's actually a vector-vector multiplication, but that is just the specialty of the first site; in general, we are talking about a matrix-matrix multiplication that costs the third power of the matrix dimension, and you have to do this for each of the local states, lowercase d. Then you have to do this entire thing here, because this gives you a matrix, then you evaluate that object there, then the next one, and so on and so forth. You have to do this of the order of L times.
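Since we are at it, here is what this zipper looks like in code: a minimal numpy sketch, not a production implementation. The index convention is my own assumption for this sketch: each tensor has shape (left bond, physical, right bond), with the first and last bond dimensions equal to 1.

import numpy as np

def overlap(phi, psi):
    """Contract <phi|psi> site by site, left to right (the zipper)."""
    # E starts as a 1x1 "left environment" and is updated through the chain.
    E = np.ones((1, 1))
    for B, A in zip(phi, psi):
        # E[a',a] * A[a,s,c] -> T[a',s,c]   (one matrix-matrix product per local state)
        T = np.tensordot(E, A, axes=(1, 0))
        # conj(B)[a',s,c'] * T[a',s,c] -> new E[c',c]: sum over bond a' and spin s
        E = np.tensordot(B.conj(), T, axes=([0, 1], [0, 1]))
    return E[0, 0]

# Usage: two random 4-site spin-1/2 MPS with bond dimension 3.
d, D, L = 2, 3, 4
dims = [1] + [D] * (L - 1) + [1]
psi = [np.random.rand(dims[i], d, dims[i + 1]) for i in range(L)]
phi = [np.random.rand(dims[i], d, dims[i + 1]) for i in range(L)]
print(overlap(phi, psi))

The point of the ordering is visible in the loop: the environment E never grows beyond bond dimension squared, so the exponential sum over spin configurations never appears explicitly.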
If you are more precise, you would find that it's 2L minus 1 contractions, because in general you get a product of three matrices, which you carry out as (AB) times C, a sequence of two matrix multiplications. But anyway, the important point is that the cost of evaluating the entire set of contractions has collapsed from being exponential to being a small power in the size of the matrices. And that's, of course, what you need; otherwise, the entire advantage of the method would have disappeared. If you want to look at it graphically, you go up here and you think of it like a zipper. You start here, you first contract these two guys, then you contract into this object, then this one, then this one, and so on and so forth. So you move through the zipper and do one contraction after another. In this case, we know that this is the way to do it. The interesting point is that in higher-dimensional generalizations of these states, the same issue arises: what is the best order of contraction, to get the numerical cost down from something which, done naively, is exponential and therefore useless, to some reasonable power law? And the interesting thing is that in higher dimensions, unlike in one dimension, it's not clear what the optimal way is. There are, of course, strategies for that, which you will find in the literature. But some people actually proved that, from the mathematical perspective, the question of the optimal contraction in two and more dimensions, for more complicated networks, is NP-hard, or NP-complete? Perhaps someone here knows; I have forgotten. But anyway, NP-whatever means that it's one of the problems you don't want to deal with, unless you are extremely smart and have the right intuition for how to do it. Here, in this case, we are on easy ground.

OK, so that's how you do that, and now you have the overlap of two states very efficiently. And now the question comes up: how would I work out, for example, an expectation value? I could also look at matrix elements, writing a phi here; this doesn't make any difference. This is shown in the picture down here, and it's even shown for a two-point correlator. You just sandwich in the local operator at the site where it belongs; say, that would be an S^z_i, and that would be an S^z_j. So this network, which is shown here, would give you the expectation value of S^z_i S^z_j and allow you to work out correlation functions. And the amount of calculational effort is essentially the same as before, just that instead of having one contraction here, you have one contraction here and one contraction there. But these are actually the cheap contractions; the comparatively expensive ones are the ones sitting here. So that's how you would look at two-point correlators, or higher-order correlators, or even simpler quantities, like a local density.
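Computationally, sandwiching the operators is a one-line change to the zipper above. A sketch, under the same assumed index conventions as before:

import numpy as np

def correlator(psi, op_i, op_j, i, j):
    """<psi| O_i O_j |psi> for single-site operators at sites i < j."""
    E = np.ones((1, 1))
    for n, A in enumerate(psi):
        T = np.tensordot(E, A, axes=(1, 0))              # (Dl', d, Dr)
        if n == i:                                       # apply O_i to the ket's physical leg
            T = np.tensordot(op_i, T, axes=(1, 1)).transpose(1, 0, 2)
        elif n == j:                                     # apply O_j likewise
            T = np.tensordot(op_j, T, axes=(1, 1)).transpose(1, 0, 2)
        E = np.tensordot(A.conj(), T, axes=([0, 1], [0, 1]))
    return E[0, 0]

# e.g. S^z at two sites of a spin-1/2 chain:
Sz = 0.5 * np.diag([1.0, -1.0])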
And the question is, if you look at two-point correlators, because they are so important for us: what form will they take? I will not go through the motions of the mathematical proof. But what you can imagine is: assume your system is really, really big; then you would assume that the state has some kind of translational invariance, which may be broken down to some periodicity or so, but let's assume for our intuition that it's a translationally invariant state. So basically these matrices will look the same on each site, ignoring the ends. And then, in the contraction, I can consider this object E here, and the contraction is basically a multiplication of one E after another, which takes us from here to there. That means that what you have here, and I just want to appeal to your intuition, is something which looks like a transfer matrix, which gets you from one site to the next, as you know it from statistical mechanics. This transfer matrix connects your operators, and it will then generate the decay of the correlation. This is the mathematical form; we don't have to go into it. And if you remember the one-dimensional Ising model, which I guess all of you at one point in your life solved by a transfer matrix method, you will remember that transfer matrix methods consistently produce exponential decays, except for cases of degenerate eigenvalues; and it's the same here. With the additional point that the leading eigenvalue of this transfer operator may be one, in which case the correlation will not decay at all. So what matrix product states generically produce as correlations is a superposition of exponentials; this transfer operator is a huge object, it will have many eigenvalues, and each of them drives an exponential decay. The correlators that you get are therefore either long-ranged or superpositions of exponentials.

That's an important point, in the sense that you would also like to look at systems which ultimately are critical, which have power laws. And the point is that a power law you will never get fully, not one hundred percent, which also, by the way, shows that a matrix product state for a critical system in the thermodynamic limit would have infinite dimension, so the compression will not work. But this matters for mathematical proofs; for any finite system size it's OK in practice. Because what you observe in practice is the following. Say this is the log of the correlator you are looking at, say spin-spin, and this is the log of the distance, where the distance d is taken to be much smaller than the entire system size, so that you don't have to worry about finite-size effects, and the system size is of course taken to be large; that can easily be many hundreds. Then what you will find is that these exponential decays conspire, so to say, to produce a power law. So at first it will look like what you expect. At some point, all the fast exponential decays will have died out, only the slowest exponential decay survives (you should say slowest, not strongest), and then it is found out that in reality you are only looking at a superposition of exponentials, and ultimately the curve bends down like this, because it turns into a single exponential. If you now increase the accuracy of your simulation and rerun everything with a larger matrix dimension, you will find that you get the same curve, and it continues for longer until it bends down. If you increase the accuracy even further, it looks like this, and so on; to get further and further you of course pay a steeper and steeper price. But what you can then reliably extract is, say, here: OK, this is the power law, and from this I get my exponent alpha, and that's it.
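The transfer-operator picture is easy to make concrete. A sketch, assuming a single translationally invariant tensor A of shape (D, d, D); the eigenvalue spectrum of E is what drives the superposition of exponentials just described:

import numpy as np

D, d = 8, 2
A = np.random.rand(D, d, D) + 1j * np.random.rand(D, d, D)

# Transfer operator E[(a',a),(b',b)] = sum_s conj(A)[a',s,b'] A[a,s,b]
E = np.einsum('isj,ksl->ikjl', A.conj(), A).reshape(D * D, D * D)

lam = np.linalg.eigvals(E)
lam = lam[np.argsort(-np.abs(lam))]          # sort by magnitude

# Each subleading eigenvalue contributes a decay exp(-r/xi) with
# xi = -1 / log|lam_k / lam_0|; a generic correlator is a sum of such terms.
xi = -1.0 / np.log(np.abs(lam[1] / lam[0]))
print("correlation length of the slowest mode:", xi)

For a random tensor the leading eigenvalue is not 1, but the ratios still set the correlation lengths; a properly normalized state has leading eigenvalue exactly 1.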
So in this way, matrix product states are also able to deal with critical situations. In higher dimensions, actually, it's already built into the structure of the networks themselves. So in practice, one knows how to handle that. Good; we now use this entire machinery to look at a physical example, for which I have to go here. What I want to show you is an idea which is now almost 10 years old: the idea to use ultracold atoms as a dynamical quantum simulator. At that point in time, there had been lots of experiments which were called quantum simulators, but they were essentially about static properties. Then people said: well, think about a flight simulator; that is something dynamical, there's something happening. So what you want is a dynamical quantum simulator, for example to look at issues like relaxation and thermalization of quantum systems, which is a big research topic these days. The big advantage of an experiment with ultracold atoms is that the dynamics stays coherent for comparatively long time scales; solids decohere very fast. The problem at that time was: you want to be able to study what happens to a quantum state as it evolves in time, so you have to be able to prepare it in a controlled fashion, otherwise you don't know what you are talking about, and you have to be able to carry out local measurements. Many of you might know that, in the meantime, there are these quantum gas microscopes in the ultracold-atom business, where you can really look at one lattice site after another. This was developed around 2010, 11, 12, and it's now a widely used technique. But still, the systems where you can apply it are too small to really see emerging thermodynamic behavior. As you probably know from Monte Carlo simulations, to see thermodynamics you often don't need 10^23 sites; often 100 is easily enough to see what's going to happen. Not always, of course. But let's say 10 usually is not. So in some sense, the kind of proposal I am discussing is still the one which you would use most of the time.

The point was that, at that time, there was a first idea of how to do local measurements in a very restricted way, namely by superimposing on these optical lattices, which are just sine-like potentials, another one: a superlattice with period 2, as you can see here. That gives you a bunch of double wells, and by shifting these lattices around, you can also introduce potential biases; that's a staggered double well. The people in Bloch's group, I think he was here, but I don't think he told you about these things, used this to load patterns, in a way which I will not explain right now. So they could arrange, for example, that on each odd site there is an atom and every even site is empty. And by playing around with the staggered potential bias, they were also able to say, not where each individual atom sits, but how many atoms sit on odd lattice sites and how many on even lattice sites. And that gave us the idea to propose an experiment along the following lines. You use their technique to prepare a lattice where you have an atom on every second site; then you switch off the superlattice, and this is what you have. The Hamiltonian that governs this situation is the simple Bose-Hubbard Hamiltonian, and this state is not an eigenstate of it.
So what will happen is that the atoms will start running around, and ultimately they will spread out evenly. They do not double, of course; it's just that the yellow looks the same up here. It should be half as yellowish, because it means that on average you will have a density of 0.5. And then you want to see how this happens. In fact, the experimentalists carried this out. And I always love to show this picture here, because, I mean, most of you are theoreticians here, I guess, and I hope the experimentalists are not insulted: there is nowadays this tendency that in experiments you try to produce one very beautiful figure, with hired artists or whatever (we have a resident artist, actually), because they hope to make it onto the title page of one of these magazines. And that's from the experiment. If you look at that picture, I think you will all agree that no resident artist has been working on it. The reason I show it in this context is that the editor invented the title "bosons chill out", because they relax. We will look at a relaxation experiment; but if you know anything about the statistical physics of this business, you realize that what these guys do while chilling is that they heat up a lot. That's why I like to show it: this relaxation is related to heating up. Maybe in the course of the summer school you came across such phenomena, because this is very generic.

Anyway, what the experimentalists managed was really to look at, say, 45,000 atoms or so, arranged in many one-dimensional wires, each of them containing 100 atoms or something like that. And they have this measurement technique, which I will not explain here, which allows them to measure in time. Say all the even sites are empty in the beginning, following our proposal; then, of course, the atoms start moving, and the density ultimately goes to one half, but there will be an overshoot and a relaxation and so on and so forth. That curve actually is quite boring; we will get more exciting ones. But the big point is that this is really a time-resolved relaxation, a dynamical experiment in a strongly interacting system, where these atoms really talk to each other. That's the Hubbard U of the bosons in that picture; they can actually vary it between, whatever, 1 and 20 or something. So really, you can look at Hubbard physics very nicely here. In this kind of theory-experiment collaboration you have a very weird situation. On the one hand, the limitations of classical computers mean that we will certainly not be able to do as much as one of these analog quantum simulators does in the experiment, because that is a quantum device and our computer program is not. On the other hand, these experiments are very complicated, so the experimentalists are quite desperate for validation that the experiment actually does what they think it should do. This is what you see here. We gauged it on the densities; we used their experimental parameters. This small term down here reflects the fact that they have confining trap potentials; otherwise these atoms would run away out of the optical lattice. And if you take all that into account, let's focus on B and C, and I don't have the time here to explain why you have small deviations here, what you see is that, up to the times we can reach, the agreement is really very good, without using a single free fit parameter.
This is really an ab-initio simulation of the physical situation you encounter here. Nothing has been fitted or fixed or anything; this is raw data against raw data. In that sense, we were very proud of that. But as you see, the time range of the experiment is much larger, which takes us to the topic I will discuss a bit later: how can we perhaps predict, from what we have here, what's going to happen in the future? The reason why we cannot get any further is that, because of these particles moving back and forth, the entanglement in the wave function grows so much that ultimately the simulation collapses relatively fast. The entanglement grows linearly in time, and therefore the bond dimensions, the matrix dimensions, grow exponentially in time. At that time, 2011 or 2012, whenever we did this simulation, we had to stop at around 3,000 or 4,000 as the matrix dimension. Nowadays, I think I could push that to 40,000 or 50,000 easily. But honestly, what that would gain you is probably that you get, let's say, somewhere like here; you get perhaps one oscillation more. That's what exponential laws are like: it's fine until you hit the wall, and then you hit it real fast. It's the same here.

OK, so that was, of course, not what got us so totally excited. What we had also calculated, just for fun, because the experimentalists originally told us they could only measure densities at best, were the nearest-neighbor correlators, using these standard techniques. And what do you see as they evolve in time? Originally, they all start out at zero, because the initial state is a product state of these atoms sitting there. And then, depending on the Hubbard U, the behavior is quite different. If the atoms are extremely strongly interacting or non-interacting, it just stays flat; otherwise it goes up and then saturates to some value, which depends on U. The imaginary part is then the current, which dies off as the experiment settles down. This picture is a little bit confusing, perhaps, so let's extract these long-time converged numbers and plot them as a function of the interaction strength U. Then you see that there is an interesting non-monotonic behavior of the quantum correlations that are built up in this experiment. These atoms meet each other and build up correlations, and there seems to be a U where they do it most strongly; then it dies off again. The one-over-U behavior for large U one can actually understand quite easily by perturbation theory. This maximum here, I may anticipate, still isn't understood; perhaps one of you guys has a great idea where it comes from. Of course, you can say: yeah, the physical mechanism here is different from the one there. Sure. But can we make this more precise?

So we had calculated that for fun. We had also looked at the currents, and that also shows you the advantages and limitations of these methods. Let's focus on the amplitude of the current. They had invented a beautiful technique for extracting currents, and what you see here: the circles and dots are the current they measure, and that's the result of our numerical simulation. The wiggles are there because you have to average over tubes of different size in these arrays of one-dimensional systems which are run at the same time.
But anyway, what you see, given that we have a strongly interacting many-body problem, is that the agreement looks very good. And if you plot it on a log-log scale, you would even think that the current decay might be a power law in this situation. That, interestingly enough, is still an open research question, because this is less than one decade, and you know that for a power law you should have three decades or so. We numerically certainly cannot access that so far, nor can the experimentalists, because the current is simply becoming too weak. So that would be another case where being able to go to longer times would put theory at an advantage again, because for the experiment the amplitude will simply become unmeasurable at some point. The fun thing is, I may anticipate, that the only theoretical explanation I have seen so far that convinces me slightly is unpublished work by Peter Reimann in Bielefeld; he, however, finds that it should be exponential. So this is really an open question, because we see that we are really hitting current limitations. And if I tell you that with present-day methods I could get perhaps up to here, that would not really make a big difference for this prediction.

But anyway, what we can predict, and what shows the extreme power of this method, is the following. At some point the experimentalists said: now we can also measure nearest-neighbor correlations. They could relate them to the visibility in the experiment, and that's what they came up with: this curve, the long-time nearest-neighbor correlator as a function of the interaction strength U. One sees that quantum coherence is being built up from a product state, against potential decoherence mechanisms. Actually, one can show that decoherence in this experiment probably becomes relevant only after 100 to 1,000 oscillations; so really at times far off the graph. I mean, the graph goes in this direction; this one has no time axis, but you know what I mean. It's far off, probably somewhere in the Mediterranean. Okay, so now you look at what we had predicted, and that's what they find, of course in totally different units. We got totally excited when we saw that curve. Then we worked out the prefactors, and there was a little moment of disappointment: that's the theory, and that's the experiment. The experiment shows a much stronger quantum correlation than theory predicts. There is no factor of two in between which we missed, and there is also not the usual argument that the experiment has some noise and therefore the effect is not as strong as predicted by theory, because it's just the other way around. But ultimately it turned out that the discrepancy arose because we were comparing two different things which we had thought would be the same. Namely, as I mentioned, there is this small confining trap potential, which in our original fun calculation we had not put in, because we didn't know what it would look like anyway; we had just put the particles in a box. And in fact, for the density calculations, this makes no difference of any relevance. So we thought it would be the same here: if you look at the nearest-neighbor correlation in the center of the trap, what does it care about what's going on far away? Well, that's not the case. If you build the trap into the simulation, you realize that the result is ultra-sensitive to it. That's what Immanuel and co. measured.
That's what we had predicted as the long-time result, and now I show how this builds up as a function of time. The circles are what they measure, which ultimately ends up here. We of course cannot really go to such long times; but we can predict as far as we can, and this is in fact what we get if we do the full simulation with the full experimental setup. Again without fitting any free parameter, and I think the agreement is really good. This shows that these quantum simulators can be extremely sensitive to details, and that both experiment and theory are nowadays able to trace those. In fact, the physical explanation in this case is extremely simple. The difference between having a box potential and this confining trap is that in the box the particles spread out and then hit the hard wall, whereas in a parabolic trap potential the density basically evolves like this: as there is no hard wall, the particles can move up the potential. That, of course, costs energy, and in an energy-conserving world you have to get it from somewhere. From where? From the kinetic energy; and here is the expression for the kinetic energy in the Hubbard model. The atomic physics guys always write J instead of our t, because they need t for time. What you see is that in order to reduce the kinetic energy, you have to make this correlator bigger. And this is exactly what happens; this is why, in the presence of the trap, the nearest-neighbor correlators evolve to higher values than if you neglect the trap. So, that just as an example of what you can do. I think there might be more analytical and theoretical work coming up in that direction, but it is still an open subject. Actually, we might be revisiting it, because we have some idea how to reach longer times there.

Okay, now I want to motivate what for the solid-state guys in the audience, probably the majority, is in some sense totally natural: that you want to work at finite temperature. For example, this is the structure function of a one-dimensional spin chain obtained by the Broholm group at Johns Hopkins, where you see what you usually know from a spin-1/2 Heisenberg chain: the spinon continuum, which you see here in the absence of a magnetic field. Nowadays this is quite an old picture, and you can see this much more clearly, but the point I want to make is this: of course you are interested in what happens if you expose the whole thing to a magnetic field, and nowadays we have reactors with extremely high neutron flux and very precise instruments. As opposed to the past, where you were mainly interested in asking where the peak sits, you can now very precisely measure the line shape around the peak, and there is lots of information in line shapes, which we can now finally access. Okay, so this calls for precision calculations. But the problem is the size of the field: typically, magnetic fields are not very strong, in the relevant units. 10 or 20 Tesla is something many labs have, more or less, but the most extreme cases are magnets at 80 to 100 Tesla, which are sometimes pulsed or even destructive, and so on and so forth.
So getting really strong magnetic fields is an issue, and the question is, of course: strong compared to what? If you look at it, the experiments are usually at the temperature of liquid helium, and the magnetic energies of these chains are nowadays very often of the order of 10 Kelvin or so. So comparing these scales, we definitely cannot reasonably say that we are doing a zero-temperature calculation. There have been spin chains in the past where the interaction was, say, 1000 Kelvin; then 4 Kelvin is for all practical purposes zero. And why are people no longer just looking at those spin chains with 1000 Kelvin interaction strength? Well, because in order to see interesting physics there have to be competing scales: field and interaction should not be totally different. And as there is this rule of thumb that one Kelvin is one Tesla, you want such compounds, and experiments at these temperatures. So we definitely need algorithms that work at finite temperature. The ultracold atoms, by the way, as soon as you turn to fermions instead of bosons, are not particularly cold either; in fact, compared to, say, metals at room temperature, they are really hot, if you work out what the relevant scales are.

Okay, so the question is what you do about that. There is a technique which goes back several decades in mathematical physics and was brought into the field I am talking about in this PRL from 2004, exactly in the context of the introduction of the matrix product state and tensor network picture. It's an idea called purification, and it is one line; perhaps I do it on the blackboard. Assume you have a mixed quantum state; then it's described by some density operator rho. Let's assume we know it, and that it has the form rho_P = sum_i rho_i |i><i|, where I assume that I am able to express it in its eigenbasis. Well, that's not necessarily the case, but let's pretend it is. Then there is a very simple way of phrasing this in the language of pure quantum states. I give this space a P for "physical"; rho lives on the physical space. And now let's write down a state psi which lives on something I call physical plus auxiliary space: I double my physical system. Say I have a spin chain; then the physical state space is the Hilbert space of that spin chain.
Now I simply put another spin chain next to it and call its Hilbert space the auxiliary space. The states |i>_P live here, and similar states |i>_A live there; I just take them to be copies. And then the claim is: if I take as the state of this bigger system |psi>_PA = sum_i sqrt(rho_i) |i>_P |i>_A, then rho_P is just the trace over the auxiliary degrees of freedom of |psi>_PA <psi|_PA. Basically, you convert the ket into a bra, so you get this double sum, and the trace over the auxiliary states forces the two indices to be equal, so the square root of rho_i gets multiplied by itself and turns into rho_i, and that's it. This idea is called purification. I think it's shown again on the next slide, but perhaps it's better to see it on the blackboard. What it means is that the entire machinery we have developed up to now can also be applied at finite temperature: you just have to be able to find the state psi which encodes the information of your mixed state, because then you can do time evolution, say, on the state psi, which mimics the time evolution of your density operator rho, and so on and so forth. The price you pay is that from, say, one spin chain you go to two spin chains as the object of your simulation, and the like.

OK, here is what I just put on the blackboard, once again. What I want to focus on are the formulae down here. They look a little bit complicated, but in fact they are very simple, because ultimately I want to do two things: expectation values and time evolutions. That's the expectation value with respect to the density operator rho, the famous Tr(rho O) formula. Then rho is the trace over the auxiliary space, which here is unfortunately called Q where on the blackboard I called it A, of the projector |psi><psi|; it's just this expression here. Now, as this operator acts on the physical state space and this one acts on the auxiliary state space, I can invert their order, so I have a trace over the full state space of this object here. Now I use the cyclicity of the trace and move this part to the front; the whole thing turns into a normal expectation value, and the trace disappears. Which means that you don't even have to modify your code: you can recycle your code for expectation values. Same thing for time evolution. That's the von Neumann equation for the evolution of a density operator. I replace the density operator by its trace expression, the one here; then again, this acts on the auxiliary state space and this acts on the physical one, so I can pull the trace out, and I have the trace over Q of e^{-iHt} |psi>, which is simply |psi(t)>, together with the associated bra. So I end up with the same expression for a density operator as before, which means the expectation values work as always, and the thing I have to do, quite intuitively, is once again a time evolution of a pure quantum state; something which we already programmed, that's what we discussed yesterday. That means we can do real-time evolution and expectation values, if, and that's of course the big if, we know the purification of rho_P at some time zero. That we will get in a second, because what you do is to build the temperature into the state. Now, there was a question from the audience: is a temperature actually defined here, does this make equilibrium assumptions? The question is a good one, because
in principle the mixed state can be anything; it need not be a thermal state. But the thermal state is of course the application we are perhaps most interested in, so the question is how you build it. And here is the thing: I said I can do everything if I know rho. Well, if I knew rho, I wouldn't need the computer. The problem is that usually I don't know rho; I can write it down, but that doesn't help me. OK, but for a thermal state it's actually very easy to get from writing it down to actually doing it. The thermal state is of course e^{-beta H}, up to normalization, and I write this in the fancy way e^{-beta H} = e^{-beta H/2} times the identity operator times e^{-beta H/2}. So if I am able to come up with the pure-state representation, the purification, of the object in the middle, the identity operator, then I am back to what I already know: starting from the purification of the identity, it's just a usual time evolution again, but now in imaginary time instead of real time, and only up to beta over two. That's how I construct it. So if you give me this guy here, I am done, because if I can do a real-time evolution, I can also do an imaginary-time evolution. In fact, it turns out imaginary time is much nicer: as opposed to a unitary time evolution, which conserves errors once they are in there, an imaginary-time evolution is in some sense a projection onto the low-energy states, and therefore errors also get partially projected out again. So it's much better behaved than real-time evolution.

But we still have to come up with the purification of the infinite-temperature state, and let me do that on the blackboard, for the simplest possible case. You know the infinite-temperature state is a product state: every state has exactly the same weight. Or, to say it the other way around, the infinite-temperature density operator is simply one over the dimension of the Hilbert space, all along the diagonal. And that means, in the special case of infinite temperature, that the density operator is the tensor product of infinite-temperature density operators for the individual sites. You can work that out; I think it's quite trivial. Physically, you can think of all correlations being killed off, but this is mathematically rigorous. Now take spin one-half. There, rho_infinity on one site is just diag(1/2, 1/2). That's the infinite-temperature density operator. And to purify that is actually very simple: just consider the state (|up>_P |up>_Q + |down>_P |down>_Q)/sqrt(2), where the first spin lives on the physical space and the second on what I call Q, or A, the auxiliary space. If you take that state and trace over Q, what you get is just (1/2)(|up><up| + |down><down|). So this totally mixed state, which is typical for T equals infinity, is just the trace over a maximally entangled state in which each combination has exactly the same weight.
So in general, what you have to write down is a maximally entangled state: every state in P is combined with its partner state in Q, and they all have the same weight. That's it; that's your starting state. Totally mixed comes from maximally entangled, after this tracing process. By the way, the whole thing also works if, say, I pair the up with the down here and the down with the up, or if instead of the plus I put a minus, because the tracing will multiply the prefactor with itself, so any phase will disappear. The singlet state, for example, would have the advantage that it has S^z_total equal to zero and total spin zero, which allows you, if your code has good quantum numbers, to exploit that computational advantage; but conceptually it makes no difference. So this means we are able to write down by hand what the infinite-temperature purified state looks like, and then we are set: we do the imaginary-time evolution up to beta over two, and then we do real-time evolution as much as we want.

In fact, there is also something you should be aware of: as what happens on the auxiliary state space only serves the purpose of giving us a basis of the auxiliary space to trace over, you can do there whatever you want. So what people nowadays do is introduce arbitrary unitary evolutions on Q during the real-time evolution; this helps a lot to bring down the entanglement. Here in this picture, which I will not fully explain, just as a guide for the eye: if you do this smartly, the times you can reach simply increase a lot. If you look at the times here, and the colors are accuracies, you see that here we are talking about times of eight or ten, and then suddenly about times of twenty or so; this is one special application, but the effect is very generic. And the transformation which you usually do, as the first trick, is this: if you have the time evolution e^{-iHt} on the physical space, you do e^{+iHt} on the auxiliary state space. This saves you a lot of computational effort. If anyone is interested, I can discuss this in the break or so, but it is just a trick which you use. So that's what you do for thermal states; I will show pictures after the next topic.
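To make the purification recipe completely concrete, here is a dense, single-site toy version, a minimal sketch rather than an MPS code, assuming a one-site "Hamiltonian" H = h S^z purely for illustration; in a real simulation the same steps run as an imaginary-time MPS evolution acting on the physical legs only.

import numpy as np
from scipy.linalg import expm

h, beta = 1.0, 2.0
Sz = 0.5 * np.diag([1.0, -1.0])
H = h * Sz

# Infinite-temperature purification: maximally entangled physical/auxiliary pair.
psi = np.zeros((2, 2))                  # indices: (physical, auxiliary)
psi[0, 0] = psi[1, 1] = 1.0 / np.sqrt(2)

# Imaginary-time evolution e^{-beta H / 2} acts on the physical index only.
psi = expm(-0.5 * beta * H) @ psi
psi /= np.linalg.norm(psi)

# Tracing out the auxiliary index recovers the thermal density operator.
rho = psi @ psi.conj().T
rho_exact = expm(-beta * H)
rho_exact /= np.trace(rho_exact)
print(np.allclose(rho, rho_exact))      # True

Setting beta = 0 in the sketch reproduces diag(1/2, 1/2), the infinite-temperature state from the blackboard.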
The last thing I will discuss before the break is this. We are now able to do pure-state time evolution; with this trick we are also able to do mixed-state time evolutions; and in the particular case of thermal states, we know how to construct them by an imaginary-time evolution plus this purification technique. So we have everything at hand, but the problem still remains that we do not have very long times at hand, because of the entanglement growth. One situation where this is particularly relevant is that you are very often interested in observables or correlators like S^+ at some site x and time t, times S^- at site zero, say the center of the spin chain, at time zero; similar objects appear in Green's functions. That's what you are interested in, because what you want to do is a double Fourier transform, to arrive at something like S^{+-}(k, omega), the structure function, or a Green's function in frequency-momentum space. These are the standard quantities that experimentalists measure: remember the picture I showed of Collin Broholm's experiment, or what ARPES measures, for the more electron-oriented guys, and so on and so forth. Now the problem is that the resolution in frequency space will not be good if your times are relatively short; that's simply a property of the Fourier transform. Of course, you could do something really awful: say you have the data up to time t, and you continue the data by 0; then the Fourier transform will mainly show wiggles because of this jump. OK, no one is that stupid. So what people do is introduce damping factors e^{-eta t}. I think you all know these factors from field theory or many-body theory, where eta is 0^+ or something. But if your times are short and you need that factor to dampen out your signal, because you can't calculate it for long times, then unfortunately eta is not 0 plus epsilon but appreciably large. And then, in the Fourier transform, what happens is: say you have your line shape, which would look like this; what your Fourier transform gives you, you will see a picture of that in a second, is something like that. Then you are totally unable to extract anything about line shapes. Do you get the peak right? Yes, but the line-shape information is totally lost.

So one idea, which is very helpful in certain cases, is the following. What your computer program gives you, as time evolves, are the values of all these quantities at certain time steps. So you have a time series; I call it x_n. At some point, of course, you stop, and now you would want to know what happens to the time series afterwards. The ansatz which you use, and engineers have been doing much smarter things along these lines for many decades, but this ansatz is very often the one that does it in physics, is this: the next point in the time series is a linear combination, with some coefficients a_i, of the p preceding time steps, x_n = sum_{i=1}^{p} a_i x_{n-i}. So this point here is a linear combination of, say, the five points which preceded it. You can always write something like this; the point is that you assume that the coefficients which allow this prediction of the next point are constants. Because if they are different here, here, here, and here, it's useless. If you make this assumption, which is pretty wild, admittedly, then you can take the raw data which you have and use it to determine the coefficients a_i that match your ansatz best. If you find at that point that it's hard to match the ansatz with a small error, then probably the ansatz is wrong anyway; that gives you a first indication. But let's assume you find that you can do it with a small error. In modern language, you would say you learn the coefficients a_i from the data.
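In code, learning the a_i and extrapolating is only a few lines. A minimal sketch; the order p and the small ridge regularization are illustrative choices of mine, not part of the method itself:

import numpy as np

def linear_predict(x, p, n_future, eps=1e-8):
    """Fit x[n] ~ sum_{i=1..p} a[i] * x[n-i] by least squares, then extrapolate."""
    x = np.asarray(x, dtype=complex)
    # Each row of M holds p consecutive past values; y holds the following value.
    M = np.array([x[n - p:n][::-1] for n in range(p, len(x))])
    y = x[p:]
    # Least-squares fit via the normal equations, with a tiny ridge term for stability.
    a = np.linalg.solve(M.conj().T @ M + eps * np.eye(p), M.conj().T @ y)
    out = list(x)
    for _ in range(n_future):
        out.append(np.dot(a, np.array(out[-p:][::-1])))
    return np.array(out)

# Usage: a damped oscillation is reproduced essentially exactly.
t = np.arange(100)
x = np.exp((-0.02 + 0.3j) * t)
ext = linear_predict(x[:50], p=10, n_future=50)
print(np.max(np.abs(ext[50:] - x[50:])))   # small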
In practice, it's just a least-squared-error fit. Then you can use this ansatz: take the last data points you have from your calculation, use them to predict the next one, then use that to predict the next one again, and so on. Like this you can proceed as far in time as you wish. Of course, it may be total rubbish, depending on the nature of the ansatz; but in this way you can continue the series. OK, what do we get? In fact, the results can be pretty impressive. Here I take the transverse-field Ising model. It is a good test case because you can work it out exactly, and there are very few numerical techniques that fail on the Ising model, so in some sense the predictive power is perhaps not that high, but anyway. What I calculate is S(k, t), which means I have carried out the Fourier transform to momentum space, but not yet the one to frequency space; there I stay in the time domain. The numerical data were taken up to time 10, just for fun; we could probably have gone to much longer times. That data was used to fit the coefficients and predict the future. What you see here, for various k values, is the exact curve and the DMRG results up to here; this was actually at finite temperature, beta equals 10 and 50. And that's the prediction for the future. I must admit I've forgotten whether this picture was for 10 or for 50, but it's essentially the same. What you see is that the agreement is really good. The inset is the error, and we are talking about extremely small errors of this prediction; admittedly, the signal at some point is very small, too. But we can really say that this allows you to extend the time domain of your simulation easily by a factor of 10 or even more, which means that your frequency resolution, on the other hand, gets better by a factor of 10 or more. In that particular case, what you get for these various k values is the line shape as a function of frequency. Let's focus on, say, beta equals 50, which is actually the case here; well, it doesn't matter. This older shape here is what previous methods had given; those had also already worked on the data, it's not a raw-data Fourier transform, some brain power had already gone into producing this curve. And if you use this prediction instead, what you get is this curve, the dark blue one. The dots you see here are the exact results from analytics, which means that this linear prediction technique, in this case, is absolutely able to give you the analytically exact result.

The transverse-field Ising model is perhaps a simple model. So we have used this technique, and other groups have now used it, in many circumstances. I show this picture here because, for analytical reasons that I will explain on the next slide, the case of the spinon continuum in a spin-1/2 chain should be the particularly hard case for this method. This was in collaboration with Bella Lake. Wasn't Bella here? This plot has been shown already? OK, then no need for explanations, and I can cut the story short: that's Bella's data, what you see in black is our stuff, and what you see in blue was the next best theory. I think this doesn't really need much comment; I particularly like this one here, panel E. The agreement is really extremely good. We are currently extending this; actually, I am still waiting. That's a problem.
If you have a PhD student, not even your own, it was in the experimentalists' group, who goes off to work in industry while the paper is not yet finished, then you are in for a long wait. This is a collaboration with Christian Rüegg, whom some of you might know. They have very beautiful experiments on crossover compounds, that is, spin systems which, depending on temperature, show behavior that goes from one-dimensional over two-dimensional to three-dimensional. And they measure that now with extremely high accuracy; it's embarrassingly high. We have never had such a demanding partner for fits of numerical data as Christian, because their data is now so ultra-precise. They can see these crossover phenomena; I think we have simulations from 4 Kelvin up to 40 Kelvin, an entire range of temperatures, and the agreement is absolutely perfect. Unfortunately, they have one peak where there is still a discrepancy of 0.4% between our data and theirs, and they get totally excited about this last data point. I would say: I don't care about 0.4%. But they, for some reason, are excited. And the problem has now actually been resolved; they had a point, I have to admit that. Now we are waiting for this former PhD student to finish one plot. Watch out for that paper; it will have my name on it, and Christian Rüegg's. There you will see that this linear prediction is now ultra-powerful for things like spectral functions.

Now, the last sentence was important, because you may wonder: why did I do this prediction on S(k, t)? I have raw data S(x, t); why don't I apply the linear prediction technique to S(x, t)? Or to the density data which I showed you from the ultracold-atom experiment? If you look at the times, we already had this linear prediction technique back then; why didn't we just use it and tell Immanuel Bloch and these other people what's going to happen? Well, the reason is that it is, of course, an ansatz, and like every ansatz it sometimes works and sometimes doesn't. If you work out the mathematics, what you find is that this ansatz produces time series, and here again I use an index n instead of a bracket (t), because that's the way we have the data, which are, in simple words, superpositions of exponential decays. They may oscillate, so there may be imaginary parts as well as real parts in the exponents, but ultimately it's an exponential decay. If your data has this form, then you are in good shape with linear prediction; otherwise, you have to use other ansatzes. This is, by the way, a totally underdeveloped field in physics; engineers have been doing a lot on it, and it might be interesting for some of you. But if you look, for example, at a Green's function in momentum space, I think most of you have at some point in your studies seen an object like G(k, omega) = 1/(omega - epsilon_k - Sigma(k, omega)): this is the single-particle energy at momentum k, and then you have the infamous self-energy, which corrects it; it contains both the shift in energy and the lifetime. If you take that and Fourier transform it back from frequency into real time, but stay in momentum space, then, if it has a single pole, you get simply a single exponential decay at the position of the pole. And if you have several poles, well, then you get a superposition of such decays. And this is why this method works.
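Spelled out for a single quasiparticle pole, this is the standard textbook step (retarded convention assumed here):

G(k,\omega) = \frac{1}{\omega - \epsilon_k - \Sigma(k,\omega)} \;\approx\; \frac{Z_k}{\omega - \tilde\epsilon_k + i\eta_k},

G(k,t) = \int \frac{d\omega}{2\pi}\, e^{-i\omega t}\, G(k,\omega) \;=\; -\,i\, Z_k\, e^{-i\tilde\epsilon_k t}\, e^{-\eta_k t} \qquad (t > 0),

that is, one oscillating, exponentially damped term per pole; several poles give a superposition of such terms, which is exactly the class of time series the linear-prediction ansatz can reproduce.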
Whenever you have one of these typical Green's-function-like structures, this linear prediction technique will do it for you. And the reason I said the spinons are a complicated case is that this has all been worked out analytically, there are entire books on it: there you don't have a few isolated poles, but an entire branch cut, a continuum, an infinite number of poles. That poses a challenge to this method. But it seems that this branch cut is very well approximated by, say, 10 or 15 poles, which conspire to show what the branch cut does. So this is really a method I recommend to you, because it's so cheap in programming and so cheap in application, given that these time evolutions can be extremely time-consuming, on the order of days to weeks or even months if you are patient; and this then does the rest for you in milliseconds.

OK, so at this point we make a break for five minutes, and then the next thing will be ground states. My present-day application will then probably be a bit short, but I think it's better to know how things work instead of just flashing pictures. OK, thanks for your attention, and if there are questions, ask them now. Now everyone seems happy; there were more questions yesterday. Either you have understood everything or I've totally lost you by now, whatever. Yeah? [Question from the audience.] I don't know; I know it under the name of linear prediction. Actually, you find all this in a chapter of the Numerical Recipes, which it seems people had not read carefully enough beforehand. Well, there are people in physics who studied engineering beforehand, and they tell me that as an engineer you are endlessly tortured with the analysis of time series. What one really should do, and honestly, I haven't found the time to do it, is to nail down one of these engineering professors at, say, the Technical University of Munich, we have a bunch of those, and ask: what do you do? Because engineers very often have extremely complicated machinery which you have no way of really fully calculating, so they measure stuff and then use these time series for extrapolations of some kind, and they have developed a huge toolset for that. It just seems that linear prediction hits on a situation which is extremely important and frequently occurring in physics. But, for example, I applied this, with Steve White, for fun, to the density data which we had produced for these ultracold atoms. And there, basically, what happened was: the density oscillated and went close to 0.5, and then it started oscillating more strongly again; very clearly the ansatz was totally useless at that point, because the data simply doesn't fit the form. Perhaps the field is called time-series analysis; wherever you are now, at whatever university, check whether there are engineers around. They should be able to recognize this and help you along. I once had a set of lecture notes from one of these engineering lectures, but they were handwritten and had so many errors that I didn't get anywhere with them; they were too chaotic. But that was not the fault of the lecture, I guess, but of the notes. Please go for it; I think this is really interesting.
Another interesting aspect which I may mention, because I've been discussing it with many people, like Jörg Schmiedmayer in Vienna and Frank Verstraete in Ghent: in effect, what this linear prediction does is, in some sense, to produce effective theories for a very complicated quantum many-body problem. And in physics, we are generically interested in producing effective theories. Like in high-energy physics: the Standard Model, is this the reality? Well, probably not. It's an effective model of what is really going on in the range of energies we can currently reach. In that sense, all of physics, in some sense, is about building effective theories. And the question is: could one use techniques like this linear prediction, or something similar, in a situation where you have the full microscopic description, say you have your Hamiltonian, or you have extremely precise measurement data, and analyze it with such techniques to build, say, an effective theory from that, an effective field theory, which you extract numerically and then work with? Whether it's useful or not, I don't know. But this is an idea which has been around for several years now, and none of us actually either got around to it or was able to come up with more than this conceptual idea. But there might be something interesting there. OK, but now a five-minute break. So let's now come to ground state searches. Ground state searches I will actually treat at a relatively pictorial level, because the details are, well, not difficult, but they just involve lots of indices. The fundamental idea is much, much simpler and can be very well represented in these pictures. So for a given Hamiltonian, I want to find the matrix product state of some given dimension that minimizes the energy. OK, that's what I'm looking for. I turn this into a Lagrange multiplier problem, which I think you are all familiar with, where the minimum of this is the minimum of that one, and the lambda is then effectively the quotient you are looking for. So in graphical notation, this means that I have to minimize this expression here. And this is psi. This is the MPO for H; let's assume for a second that we already know it, I will get to that point in a moment. And that's, again, psi. So that's psi H psi minus lambda times psi psi. Now, the problem with minimizing this is that all the entries of these matrices are unknown, and they are being multiplied with each other. So this is an extremely highly multilinear problem, and we don't really know how to do this properly. So the technique, and this is actually the thing that was invented long before people thought about it in this way, is that you can do the minimization variationally with respect to just one of these matrices. You take the derivative with respect to one of the matrices and demand that the derivative be 0. And then, of course, you do this with respect to all the matrices and find out what's going on. So as the matrix here enters linearly, taking the derivative with respect to it means, in the picture, that you just remove it. Well, the derivative of 5x is 5; the x has disappeared. And the same thing, of course, in the psi psi. Actually, I take the derivative not with respect to the matrix A, but with respect to the matrix A star.
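In formulas, in standard MPS notation rather than necessarily the slide's exact symbols, the problem just posed reads:

```latex
% Constrained minimization over MPS of fixed bond dimension D:
\min_{|\psi\rangle \in \mathrm{MPS}(D)}
  \Big( \langle\psi|\hat{H}|\psi\rangle \;-\; \lambda\,\langle\psi|\psi\rangle \Big),
% optimized one tensor at a time, treating A and A^* as independent variables:
\frac{\partial}{\partial A^{\sigma_\ell\,*}_{a_{\ell-1} a_\ell}}
  \Big( \langle\psi|\hat{H}|\psi\rangle \;-\; \lambda\,\langle\psi|\psi\rangle \Big) \;=\; 0 .
```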
And that's actually not the same thing as taking it with respect to A, because, as you know, for complex numbers you can either treat real and imaginary parts separately, or you treat C and C star separately. And the latter is the approach one typically uses. So that has to be 0. The unknown object you want to improve is, of course, the complex conjugate of this matrix, which was sitting here. What remains is basically this matrix multiplied by all the other stuff, which you take to be fixed for the time being, minus lambda times, again, this matrix times the stuff around it. Sorry, I have to sneeze. Let me switch off. So what you have here is a generalized eigenvalue problem. This is A times x minus lambda times, let me call this other stuff here N, times x, and I'm looking for x. Now, this is not what you want. A generalized eigenvalue problem is harder to solve than a standard one, and it may depend crucially on the numerical conditioning of this matrix here, the stuff which is multiplied to it. In practice this can be really bad, and then your algorithm will converge very slowly or not at all. What you want is to take it back to a standard eigenvalue problem. And this, again, brings us to the topic I insisted on so much yesterday, which here is absolutely crucial. You want a matrix product state in the mixed normalization, and in that form where, looking at this matrix here, all the guys to the left are type-A matrices and all the ones to the right are type-B matrices. Because, as we saw yesterday, then this structure here will collapse, and this structure here will also totally collapse. What you end up with is some matrix multiplied to this object, minus lambda times this object, equals 0. And this is just a standard eigenvalue problem. This eigenvalue problem you then solve; and because you are interested in the extremum, you are only interested in the lowest eigenvalue that comes out. So you use methods like Lanczos or Jacobi-Davidson, typical large sparse matrix solvers, because it turns out that most of the entries are 0 anyway. And the objects are huge. Think about this object here: you reshape it into a vector. How big is it? In a typical MPS calculation, a matrix dimension might be, say, 1,000. So what you get is 1,000 times 1,000 times the local state space dimension, which for the Hubbard model would be 4. So it would be a vector of length 4 million. And the object multiplying this vector has dimension 4 million times 4 million. Of course, this means there is no way of doing any kind of standard diagonalization. But methods like Lanczos or Jacobi-Davidson work extremely well, because, as I said, there is lots of internal structure here, which means that most of the matrix elements are 0, and the multiplication operation becomes very fast. OK, that's what you do. And here is, again, this operation in full gory detail with numbers. But of course, what I just explained is how to optimize a single matrix. What you are really interested in is optimizing everywhere. So you combine many of these steps. You start out, perhaps, with a random initial state, or you guess something in a smart way. There are methods for doing that.
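To make this concrete before we come to the sweeping: the effective problem is never built as a dense 4-million-squared matrix; one only implements its action on a vector and hands that to a Lanczos-type solver. Here is a minimal sketch, where the environment tensors Lenv, Renv, the MPO tensor W, and their index conventions are assumptions of this sketch, not a fixed library API:

```python
# Solve the local standard eigenvalue problem H_eff x = lambda x, matrix-free.
# Assumed index conventions (illustrative): Lenv[abar, w, a], A[a, s, b],
# W[w, w2, s_out, s_in], Renv[b, w2, bbar].
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

def solve_site(Lenv, W, Renv):
    Dl, d, Dr = Lenv.shape[2], W.shape[3], Renv.shape[0]
    dim = Dl * d * Dr

    def matvec(x):
        A = x.reshape(Dl, d, Dr)
        t = np.tensordot(Lenv, A, axes=([2], [0]))        # -> (Dl, w, d, Dr)
        t = np.tensordot(t, W, axes=([1, 2], [0, 3]))     # -> (Dl, Dr, w2, d)
        t = np.tensordot(t, Renv, axes=([1, 2], [0, 1]))  # -> (Dl, d, Dr)
        return t.reshape(dim)

    H_eff = LinearOperator((dim, dim), matvec=matvec, dtype=Lenv.dtype)
    # Lanczos-type iterative solver; 'SA' = smallest algebraic eigenvalue.
    E, V = eigsh(H_eff, k=1, which='SA')
    return E[0], V[:, 0].reshape(Dl, d, Dr)
```

Each application of matvec costs only a handful of small tensor contractions, roughly cubic in the bond dimension, rather than anything scaling with the square of the 4-million-dimensional vector; this is what makes such bond dimensions feasible at all.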
Actually, most of the time, people nowadays start with something random. And then you sweep this spot where you optimize the matrix forth and back through the chain, because you optimize in the presence of the other matrices, which have not yet been optimized. So you have to do this several times until all the matrices are at their best. In practice, this can take tens of sweeps. You have to monitor this very carefully, because, and I've seen it quite often, this is one of the major sources of error in applications: people at some point say, I'll stop this sweeping procedure of going through all the matrices forth and back, because I'm getting bored. And then the result might not be converged. Or even worse, I've seen cases where people stopped when the result looked like what they wanted to see anyway, and where continuing would have been a little bit of a disappointment. So anyway, in order to stay with the simple eigenvalue problem, you have to do all this in the mixed normalization. That means that while you move through the chain to optimize these matrices, you do not pick them randomly, but you really go forth and back, like a zipper, and at the same time you shift the normalization. That's what we learned yesterday, where we saw how to turn the AAA-BBB structure into the one where one A has been converted into a B, or vice versa. This technique is exactly what you use at this point to always stay within a simple eigenvalue problem. And this is actually why I don't show you explicit formulae, because, as you remember from yesterday, that's not totally easy and trivial to write down. But that's exactly what you need here. So, as I've already mentioned: you optimize the local matrices by solving an eigenvalue problem, and you have to monitor the convergence. There are various ways of doing that. One way is to look at the variance of the energy: the expectation value of H squared minus the square of the expectation value of H. If this goes to zero, you know that you are in an eigenstate. And the hope is that it's actually the ground state; well, that you usually see from other indications. Lanczos and these methods have a very strong pull towards the ground state. But of course, if you have degeneracies or states with an extremely small gap, that might be problematic. Perhaps a last remark: what I have just described would, in old-fashioned language, be an algorithm where you have a block of A matrices on the left and a block of B matrices on the right, and you focus on one matrix in the center. In the old literature, this is called single-site DMRG. What was originally invented was a modification of that method where you basically optimize two sites next to each other, left and right. That, in fact, involves an additional singular value decomposition at the end. That's the so-called two-site DMRG, in case you ever stumble across discussions relating the two. Both of these methods have advantages and disadvantages. Actually, in extremely complicated calculations, like something I'm currently doing with Steve White on the 2D Hubbard model, he uses the two-site DMRG; our code mainly uses the single-site DMRG. I mean, we can both switch, but we each have more experience with our respective method. And we need that desperately to check the convergence of each other's results. Because the two-site method is slower; there is a higher computational cost in it. The single-site method, in principle, is much faster.
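As a toy illustration of that variance criterion, here with a dense two-site Heisenberg Hamiltonian standing in for the MPS machinery, since in a real calculation the expectation value of H squared is evaluated by contracting the MPO twice:

```python
# The variance <H^2> - <H>^2 vanishes in an eigenstate and not otherwise.
import numpy as np

Sz = np.diag([0.5, -0.5])
Sp = np.array([[0., 1.], [0., 0.]])
Sm = Sp.T
H = 0.5 * (np.kron(Sp, Sm) + np.kron(Sm, Sp)) + np.kron(Sz, Sz)

E, V = np.linalg.eigh(H)
ground = V[:, 0]                     # exact singlet ground state, E = -3/4
rng = np.random.default_rng(0)
rand = rng.normal(size=4)
rand /= np.linalg.norm(rand)         # a generic normalized trial state

for label, psi in [("ground", ground), ("random", rand)]:
    e = psi @ H @ psi
    var = psi @ H @ H @ psi - e**2
    print(f"{label}: E = {e:+.6f}, variance = {var:.2e}")
```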
Faster by a factor of, well, it depends, 4 or 5. And the single-site method can be shown mathematically to give you the variational optimum; the two-site method cannot. But, and this is the awful thing, very often being variationally optimal in a numerical algorithm means that it's very unstable, that it's very subtle in its convergence behavior. Because if you have an algorithm that lands you in a local minimum, and you actually want to go to the global minimum: if the algorithm is really working perfectly, then the consequence is that you stay stuck there. If the algorithm is not so perfect, you have a hope that you get out again and reach the real minimum. So quite a lot of effort has gone into developing techniques, which are now very successful, for preventing that in the single-site method. Still, what we find is that the approach to convergence (in one dimension, I would say, don't care, it's all relatively simple, but in these very complicated 2D systems) seems to be faster in the two-site method. So what we sometimes do, because as we approach convergence we make the calculations bigger and bigger to get the last digits, is to start with the two-site method to get relatively quickly to something useful. And then, when the calculation becomes very large, and when we think the convergence issues are mostly resolved, we switch to the single-site algorithm to do the fine remaining details. Or we use them as completely independent checks. So that's a topic which, I would say, in one dimension is not so important; in two dimensions it's a relatively subtle topic, which currently we do not fully understand. So we basically check each other's results to see whether we get the same thing. OK, so that brings me to the last slide of this part of my present... no, unfortunately not. Well, this probably means that, instead of rushing everything through in the last minutes, those of you who want to see this DMFT-DMRG/MPS application will have to wait for another talk, or for the paper, which hopefully comes out soon. I would rather spend the last 10 minutes on explaining a bit more about how to build MPOs. Then we have time for questions, and then you have listened to me long enough anyway. So what I want to do now is to construct a matrix product operator for a Hamiltonian. I don't know whether Miles exposed you to that; I saw one blackboard on the terrace floor where basically the construction I'm going to show you was put up, so some people have at least already discussed it. So, you would think that a Hamiltonian is a relatively complicated object. What we write is something like S_i dot S_{i+1}; that's a typical, simple Heisenberg Hamiltonian. But what this really is, is first of all a sum: (1/2) S+ S- plus (1/2) S- S+ plus Sz Sz. And each of these terms in reality is a string: identity times identity times identity times S+ S- times identity times whatever. And then you have sums of those. So basically, in that Hamiltonian for a chain of length L, you would have of the order of 3L terms of that form. And then you may also have a field term h Sz_i, which gives you yet another bunch of these strings. Now, it's obvious how to construct a single string like this: its matrix product operator is simply a chain of 1 times 1 matrices.
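Written out, with the placement of J and h being a convention assumption of mine since the slide itself is not reproduced here, the decomposition just described reads:

```latex
\hat{H} \;=\; \sum_{i=1}^{L-1} \Big[ \tfrac{J}{2}\,\hat{S}^+_i \hat{S}^-_{i+1}
      + \tfrac{J}{2}\,\hat{S}^-_i \hat{S}^+_{i+1}
      + J\,\hat{S}^z_i \hat{S}^z_{i+1} \Big]
      \;+\; h \sum_{i=1}^{L} \hat{S}^z_i ,
% where each term is really a string over the whole chain, e.g.
\hat{S}^+_i \hat{S}^-_{i+1} \;=\;
  \hat{1}_1 \otimes \cdots \otimes \hat{1}_{i-1} \otimes \hat{S}^+
  \otimes \hat{S}^- \otimes \hat{1}_{i+2} \otimes \cdots \otimes \hat{1}_L .
```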
So the matrix M^{sigma_1 sigma_1'} to produce this would simply be delta_{sigma_1 sigma_1'}, built as a 1 times 1 matrix. And this object here, M^{sigma_i sigma_i'}, would be the matrix element of S+_i between sigma_i and sigma_i'. Of course you can turn that into a matrix; it's just that here the labels of the matrix are put upstairs. So you really decompose each operator into a bunch of matrices, each of which is just a number. In that sense, you see that you can construct each of these strings of operators. But then you have to combine them into one big MPO. One way of doing that would be to say: well, I construct these 3L operator strings, each of them very simple, and then the Hamiltonian is just the sum of these operators, which I have constructed as MPOs. And like with MPSs, the way you do that is basically to take these matrices and build a big one. Actually, don't worry about that too much; it's not the way you should do it. But it would be the most naive way: you put each of these, in this case, 1 times 1 matrices into a big matrix on the diagonal, and you do that on each site. If you multiply them together, this guy will be multiplied with this one, and this one with that one, and so on. So ultimately this gives you exactly the string of matrices being multiplied together for string 1, and so on and so forth. But what that would mean is that the dimension of the MPO becomes really large: in that case, for this Heisenberg model, even without a field, you would have 3L of these operator strings, so that would be the dimension of the MPO. The dimension would be, say, 300 if you have a chain of length 100. And that becomes unmanageably large. In fact, it's a totally useless way of doing it, because, and this is what I will now show on the blackboard, the real answer, the optimal answer, is that the dimension of the matrix product operator for the Heisenberg chain, including magnetic fields (and everything can also be inhomogeneous, with disorder, whatever you want), is just five, totally independent of length. So how do you do that? The best way is to do it pictorially. I make a slight change of notation: I take these matrices with their indices, M^{sigma_i sigma_i'}, which you see here on the right-hand side, and I read them as operators. So the Hamiltonian I'm looking for will have this form. That's just to keep the notation simple. The main idea you have to understand is the following. I think of my construction principle as a little automaton that moves through my chain, say that's my chain, 1 to L, and as it moves, it builds up an operator which is part of the Hamiltonian. And it does so in such a way that ultimately, when the automaton has worked its way through the entire chain, it has constructed the entire Hamiltonian, all the terms that are in there. The way you do that is that the automaton has internal states; there have been papers connecting this to cellular automata and that sort of thing. And I say its first internal state is 1. That's the state it has when the automaton sits here, to the right, a neutral starting state. Then it moves past site L. So what can happen on site L?
The first thing that could happen is that the automaton encounters a term with a magnetic field, so the term h Sz. If that is the case, you are complete, because this is already a complete string. So after it hits an h Sz term, there may only be identity operators as it moves further through the chain, because this is already one of the strings of the Hamiltonian. It then gets into the final state, which, I know already, will be state 5. Once it is in this final state, the only thing that can happen is identity operators until I am at the end of my chain. What might also happen is that the automaton doesn't find anything happening here: if you look at this string here, there are lots of identities before it actually hits an operator. So it may stay in its state 1 if it walks past an identity operator. But what may also happen is that the automaton encounters an Sz, an S+, or an S-. This puts it into new internal states, which I call 2, 3, and 4. Why does it need these internal states? Because what we know from the structure of our Hamiltonian is that an Sz needs to be immediately complemented by the next Sz, and the S+ must be complemented by an S-, and so forth. So we immediately know that in the next step we will get another Sz, say J Sz, because these guys carry an interaction strength. The S+ has to be complemented by J/2 S-, and the S- by J/2 S+, to complete the interaction term of the Heisenberg Hamiltonian. Once you have done that, if you move further through the chain, only identities may occur, and this is again what we have here: you are in the final state. So here is the start, and here is the final state. And what you may do as a little exercise: just take a Heisenberg model on a three-spin chain; one or two sites is a little too trivial, I think. Take that and write down the full Hamiltonian, all the strings that pop up. It's not that many: six without a field, nine with a field. And see whether your automaton reproduces all of them if it goes through this diagram. That's how you can check that this is actually what it does. So now the question is: how do I translate that picture into an MPO? That's shown on the next slide. Let me first discuss the case where I am in the middle of the chain, meaning not on site 1 and not on site L. If I multiply these matrices together to give me my full MPO, I go in here, because I come from the right, and I go out there. So I now label these internal states 1, 2, 3, 4, 5 for the rows and 1, 2, 3, 4, 5 for the columns, meaning: what can happen if I am in state 1? State 1 can take me to state 1; then I have to put the identity operator here. State 1 can take me to state 2. Well, here I have switched the ordering of Sz, S+, S- a little bit compared to over there; I stay now with the picture from the blackboard. So it's Sz, then S+, then S-. And I get into state 5 with h Sz. Now, if I arrive in state 2, the only thing that can happen is J Sz, which takes me into state 5. From 3, it must be J/2 S-. From 4, it must be J/2 S+. And from 5, once you are in the final state, it's just the identity. All the rest is 0. That you get this lower triangular form you can always ensure in this construction.
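Since this matrix is the heart of the construction, here is a minimal numpy sketch of it, together with the boundary pieces described next and a small consistency check; the array layout and variable names are my own choices, not a fixed library API:

```python
# Bulk Heisenberg MPO tensor W[a, b, s, s'] with automaton states
# a (outgoing) and b (incoming), 0-indexed versions of the states 1..5 above.
import numpy as np

d, Dw = 2, 5
Id = np.eye(d)
Sz = np.diag([0.5, -0.5])
Sp = np.array([[0., 1.], [0., 0.]])
Sm = Sp.T
J, h = 1.0, 0.1

W = np.zeros((Dw, Dw, d, d))
W[0, 0] = Id           # 1 -> 1: nothing has happened yet
W[1, 0] = Sz           # 1 -> 2: first half of the Sz Sz term
W[2, 0] = Sp           # 1 -> 3: first half of the S+ S- term
W[3, 0] = Sm           # 1 -> 4: first half of the S- S+ term
W[4, 0] = h * Sz       # 1 -> 5: a complete field string
W[4, 1] = J * Sz       # 2 -> 5: complete the Sz Sz interaction
W[4, 2] = J / 2 * Sm   # 3 -> 5: complete the S+ S- interaction
W[4, 3] = J / 2 * Sp   # 4 -> 5: complete the S- S+ interaction
W[4, 4] = Id           # 5 -> 5: string already finished

# Boundaries: site 1 carries the bottom row, site L the first column.
W_first = W[Dw - 1:Dw, :]    # shape (1, 5, d, d)
W_last = W[:, 0:1]           # shape (5, 1, d, d)

# Check for L = 3: contracting the MPO reproduces the dense Hamiltonian.
def mpo_to_dense(tensors):
    res = tensors[0]
    for Wn in tensors[1:]:
        dim = res.shape[2] * Wn.shape[2]
        res = np.einsum('abst,bcuv->acsutv', res, Wn).reshape(
            res.shape[0], Wn.shape[1], dim, dim)
    return res[0, 0]

bond = 0.5 * J * (np.kron(Sp, Sm) + np.kron(Sm, Sp)) + J * np.kron(Sz, Sz)
H3 = (np.kron(bond, Id) + np.kron(Id, bond)
      + h * (np.kron(np.kron(Sz, Id), Id) + np.kron(np.kron(Id, Sz), Id)
             + np.kron(np.kron(Id, Id), Sz)))
print(np.allclose(mpo_to_dense([W_first, W, W_last]), H3))   # True
```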
So this is the MPO matrix somewhere in the middle of the chain. And because this sometimes creates a little bit of confusion: normally you write something like M^{sigma_i sigma_i'}, and it has indices i, j; that's the notation we introduced. How does that relate to this one? Well, basically you say: of this matrix M_i, I take the entry (i, j), which gives me an operator, because the matrix is operator-valued. And of this operator, the entries are then labeled sigma_i, sigma_i'. But this is a detail; don't get too worried about it. So this is the picture you have. And now the point is: how do you get into and out of the whole thing? Well, in the beginning you always start from state 1, with your automaton sitting to the right. So the matrix on site L is just the first column, the one which goes from the identity down to h Sz. And on site 1, you have to finish in state 5. So what you have on site 1 is the bottom row, the vector from h Sz all the way to the identity, this object down here. And you see, if you multiply these guys together, the row, many of the bulk matrices, and the column, you get exactly the strings of operators as you want to have them in a Hamiltonian. You can work out that it's exactly the ones you want. That's how you can really construct the MPO for a Hamiltonian by hand. And you can now immediately understand what it would look like for a Hubbard model. The Hubbard model at first sight might seem much more complicated. But the complication of having a larger local state space is contained in the definitions of the operators which you put here. For our automaton, it's actually very simple. For the Hubbard model, the automaton might find, well, what do you have? You have c-dagger up, you have c-dagger down, and you have the annihilators c up and c down. And these must be complemented by c up, c down, c-dagger up, and c-dagger down, respectively. These are the four pairings you get in a Hubbard model. I mean, the usual expression you write is t times the sum over sigma of c-dagger_{i,sigma} c_{i+1,sigma} plus Hermitian conjugate. The sigma gives you a factor of 2, and the Hermitian conjugate gives you another factor of 2; that's altogether 4. What does this mean? Instead of these three guys, Sz, S+, S-, you get those four guys. So you need one additional internal state, and the final state will be 6. So the dimension of the MPO for a Hubbard model is just one more than for the Heisenberg model. If you have longer-ranged interactions, there are tricks for keeping this manageable. Because what you have now, if you have, for example, a frustrated model, is typically terms like a nearest-neighbor and a next-nearest-neighbor exchange: say J1 S+_i S-_{i+1} and J2 S+_i S-_{i+2}, with factors of 1/2, which don't really matter here. So you have to make a distinction. Let's say your automaton is taken with an S-_i into the internal state 2. You may either complement it immediately by (J1/2) S+ on site i-1, or you may jump to the further term, which means that you need an identity operator on site i-1, which takes you into a state 2' of this automaton, and then it is complemented by (J2/2) S+ on site i-2. Then we are in the final state.
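As a sketch of the bookkeeping for such a next-nearest-neighbor term: one flavor of it can be written with automaton states 1 (start), 2 (seen S-), 2' (one identity bridged), and a final state; the placement of the 1/2 factors and the coupling conventions are assumptions here:

```latex
\hat{W}_i \;=\;
\begin{pmatrix}
\hat{1} & 0 & 0 & 0 \\
\hat{S}^- & 0 & 0 & 0 \\
0 & \hat{1} & 0 & 0 \\
0 & \tfrac{J_1}{2}\,\hat{S}^+ & \tfrac{J_2}{2}\,\hat{S}^+ & \hat{1}
\end{pmatrix}
```

And for a coupling that decays exponentially with distance, J lambda^(r-1), the same automaton needs no extra states at all: one puts lambda times the identity on the diagonal entry of state 2, so the automaton can idle there at a cost of lambda per site, which is presumably the trick alluded to in the remarks that follow.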
So you see, longer-ranged interactions have the consequence that they pump up the dimension of the matrix product operator, because you need internal states to bridge the identities appearing between the operators. If you had a J3, you would also have to introduce a 2'', because you need two identities, the operators being three sites apart. And then you complete it. So you can do something like this for short-ranged interactions beyond nearest neighbors. But ultimately this becomes quite a nuisance. Without writing it on the blackboard: the technique people use there rests on the observation that you can do exponentially decaying interactions without any additional numerical effort in the MPO representation. And then what people do is use a bunch of exponentially decaying interactions to mimic the true behavior of the interactions. Sometimes this works extremely well. One question? Yes? That's a very good question. Let's put it like this: I can't do it. But this might be a limitation of my brain power. Whether there is a fundamental reason that you can't do it, I don't know. Of course, you will have a problem of multiple connectivity. That might be a reason; one might actually be able to show that it's not possible. But I would be very careful with pronouncing no-go theorems here, in a situation where I honestly haven't really worked it out. What people usually do in that context is that they do not represent H as an MPO; they rather represent local objects for observables, or they represent H by Trotterization, to do imaginary time evolution, to go to ground states, something like that. I would be interested. I must admit, in a different context I tried something similar, and at some point I got totally stuck. And then I was so confused by all the indices that I cannot say whether it was just me or whether it's impossible in general. There might indeed be such a reasoning that it is impossible in higher dimensions. It works, of course, if you look at things like tree tensor networks, anything which doesn't have loops in the network; there I think it should work extremely well. It will just look more complicated, but there is a reasonable ordering you can introduce. OK. And basically, in some sense, we have already entered the question session now, because this brings me, if I look at the time, to the end of my presentation. I could not show the application which would have come next; well, that has to wait for next time. I thank you for your attention over these four hours. And now I'm, of course, open for questions. Thank you very much.