Okay, so we talked about, let's see, we talked about Lanczos, and that was the end of the diagonalization part. Then we talked about product states, which are states that are not entangled, and how it's actually a little bit hard to tell if something is a product state. An entangled state is just something that's not a product state, but we also want some measure of how entangled something is. The other thing we talked about is that we usually think about entanglement with respect to some dividing line in the system: we don't worry about the entanglement within each part of the system, we worry about the entanglement across the dividing line. So you might have two physical objects separated in space, and we wouldn't worry about lots of entanglement inside each object; what matters is whether the two objects are entangled with each other. And so we can define a product state with respect to that dividing line. But then to figure out whether something is entangled, you need the singular value decomposition. We hadn't used it yet, but I first defined what the singular value decomposition is. It's a simple matrix factorization that turns any matrix into a product of three matrices: a unitary times a diagonal times a unitary. The diagonal has non-negative elements along the diagonal, called the singular values, and those are the key to figuring out entanglement. So how do you turn that into a statement about a quantum system? In chapter one or two of a typical quantum information book you'll see the Schmidt decomposition, and it really is just applying the singular value decomposition to a quantum system. So we have a quantum system written in terms of the two parts defined by the dividing line, the left part and the right part.
And the structure of a general quantum mechanical wave function is that it can mix the different parts of the two sides: there's this psi LR, where you sum over all states on the left and the right, and you can have all of these terms coming into the system. If it's got all of these terms, it might be very entangled. So psi LR is just the wave function written in this basis, but it's got two indices, so we can pretend it's a matrix. Usually in quantum mechanics matrices are operators; this is still the wave function, we're just going to look at it as a matrix. So we do an SVD on this psi LR and we get U, D tilde, V, the same ones we had before. I've used the tilde version, which means that U and V are both unitary. So let's see what this gives us. The normalization of the wave function says that the sum over L and R of psi LR squared is one. Because psi looks like a matrix, that turns out to be the trace of psi dagger psi, and we'll use that in a little bit. But now let's take this SVD factorization and use the U and V to define some new states. The U and V are unitary transformations, so if I apply them to the states in this fashion, and the first set is orthonormal, I get a new set of states that's also orthonormal. So I start with the left states L, I apply this unitary transformation, summing over all the L's, and I get a new left state labeled by i, which is a special kind of state. And I do the same thing for the right side: I take the V tilde, sum over all the right states, and change that basis to this i basis. So then I take psi, and I haven't used these two lines yet; I just insert the singular value decomposition matrix product into the expression for psi and expand it out in index notation.
Then I rearrange the sum to put the V next to the R, and I do the sum on R. I find that the R and the V turn into this i-R piece, and the same thing happens with the U and the L, which turn into the i-L piece. And I'm left with an expression for the wave function that's different from the original expression, because it only has one index that gets summed over. So it's a sum over one index i, the D tilde ii is the coefficient, and these are the basis states. So these are special basis states that allow the wave function to be written in diagonal form, which is a huge improvement over a totally off-diagonal form. And this is the Schmidt decomposition. The D tilde ii are just the singular values, which we usually call lambda i. And so here's our wave function. This works for any wave function; it depends on how you cut your system in two. Sometimes there's one natural cut between two separate systems, and sometimes we just have one system and we make imaginary cuts. As long as you separate the Hilbert space into two parts, like one set of spins and another set of spins, that's fine. Okay, so this special basis within the two parts makes the wave function diagonal, and it's got simple coefficients: they're real and non-negative. Okay, so that's the Schmidt decomposition, and it immediately reveals whether the system is entangled. Okay, so let's go back to the normalization. We take this form for the normalization, plug in the SVD for both factors, use the unitarity conditions and the cyclic property of the trace, and the U and the V both disappear. We're left with the trace of the same diagonal matrix squared, and so it turns into the sum over i of lambda i squared equals one.
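Since the course exercises use Julia, take the following as a rough Python/numpy translation of the steps just described: view psi LR as a matrix (a random one here, purely for illustration), SVD it, and read off the Schmidt coefficients.

```python
import numpy as np

# A random normalized wave function psi_{LR}, viewed as a matrix:
# rows index left basis states, columns index right basis states.
rng = np.random.default_rng(0)
psi = rng.standard_normal((4, 4))
psi /= np.linalg.norm(psi)            # normalization: Tr(psi^dag psi) = 1

# SVD: psi = U @ diag(lam) @ Vh.  The singular values lam are the
# Schmidt coefficients lambda_i; U and Vh define the Schmidt bases.
U, lam, Vh = np.linalg.svd(psi)

print(np.allclose(np.sum(lam**2), 1.0))           # True: sum_i lambda_i^2 = 1
print(np.allclose(psi, U @ np.diag(lam) @ Vh))    # True: Schmidt form reproduces psi
```

The columns of U are the Schmidt states on the left, the rows of Vh the Schmidt states on the right.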
Okay, so whenever you see something in physics where you sum it up and get one, you should usually think of it as a probability. So this lambda i squared is the probability of a particular state: the state where the left is in the Schmidt state i-L and the right is in the Schmidt state i-R. And it's got this diagonal representation. Okay, so suppose psi happens to be a product state. Then it's already in a Schmidt-decomposed form, one where lambda one is equal to one and all the rest of the lambdas are equal to zero. Then it just looks like the previous expression for a product wave function, where we just read off that phi is the i equals one left Schmidt state and chi is the i equals one right Schmidt state. Okay, so a product wave function translates to an SVD with only one non-zero singular value, and that tells you whether it's entangled or not. You just do the SVD on this matrix form of the wave function and you immediately see if it's entangled. But you get a lot more. Usually things are entangled, and then we can try to measure the amount of entanglement by looking at these probabilities. The closer it is to all the probability sitting in one state, the less entangled it is; if the probability is totally spread out among lots of different states, it's very entangled. So von Neumann came up with the von Neumann entanglement entropy, and it's just plugging these probabilities into the standard statistical mechanics and information theory formula for the entropy. Okay, so here it is. Oh, I have a good story about von Neumann. It's interesting to read the Wikipedia article on von Neumann, because they make a pretty good case that he was the smartest guy of the 20th century. And around here Enrico Fermi is one of the particularly famous great physicists, and one of our emeritus faculty at Irvine was a postdoc with Enrico Fermi. And so he told me this story about somebody talking to Enrico Fermi about von Neumann.
So there was another postdoc of Enrico Fermi, and he said to Fermi, I hear von Neumann is smarter than anybody else in the world. Is that really true? And Fermi said, well, yes. You know how much smarter I am than you? Well, that's how much smarter he is than me. I don't know if there's anybody else around who still knew Enrico Fermi and could say whether that sounds like him, but this faculty member of ours, now retired, really was his postdoc. So von Neumann defined this entropy, and there are other types of entropy with slightly different formulas that are also useful, but this one measures the entanglement. So let's look at the case where one of the lambdas is one and the rest are zero. That gives you an entropy of zero: the terms with lambda equal to zero contribute nothing, because the lambda squared factor in front kills off the log, and the term with lambda equal to one has log one, which is zero. The most entangled case is when all the lambdas are equal, and then you get something proportional to the number of spins in the system. If the probabilities of all these states are equal, they'll be like one over two to the N; the log turns that into something proportional to N, and so in that case you get an entropy that's extensive, that scales with the size of the system. Now, the ordinary stat mech entropy isn't quite like this. For the ordinary stat mech entropy we think of a bigger system than the one we're looking at: there's a heat bath, and we're really thinking about the entanglement between the heat bath and the system of interest. But this formula just has the system of interest, and we're usually talking about the ground state. You don't have to be, but usually it's the ground state. For that system the ordinary entropy would be zero, because it's sitting in its ground state. But we're looking at something different. We're saying the heat bath is the system itself: it's just the right-hand side versus the left. We cut one system in two, and then this defines what this type of entanglement entropy is. It's sort of a more general thing.
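Here is that formula in code, a minimal Python/numpy sketch (the function name is my own): dropping the near-zero probabilities explicitly is the code version of "the factor in front kills off the log".

```python
import numpy as np

def von_neumann_entropy(lam, eps=1e-12):
    """S = -sum_i lambda_i^2 ln(lambda_i^2), from the singular values."""
    p = np.asarray(lam, dtype=float) ** 2
    p = p[p > eps]              # lambda ~ 0 terms contribute nothing
    return float(-np.sum(p * np.log(p)))

# Product state: one lambda is 1, the rest 0  ->  S = 0
print(abs(von_neumann_entropy([1.0, 0.0, 0.0, 0.0])) < 1e-12)        # True

# Maximally entangled: all lambdas equal  ->  S = ln 4 = 2 ln 2
print(abs(von_neumann_entropy([0.5] * 4) - np.log(4)) < 1e-12)       # True
```

With 2^(N/2) equal Schmidt weights this gives S = (N/2) ln 2, the extensive case described above.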
So let's look at an example of measuring the entanglement. Here are two spin one-halves, and the natural dividing line is of course between one spin and the other. I made up this wave function that has just two pieces, and it's normalized; the formula only works if the wave function is normalized. So this could be one of those exercises: take this wave function and find its von Neumann entanglement entropy. The first thing you have to do is write it in matrix form, remembering that it's not an operator, it's this funny left-right thing. The left and right sides could be of different sizes, so it could be a rectangular matrix. Here it happens to be square, but nothing makes it Hermitian or anything. So here are the two amplitudes: up-up is one over root two, that's this element, and up-down is this other element. And then you do the SVD. Of course you can do it numerically, but here we just have to find some way of writing it in SVD form, and you can do that in this case by a sort of trick. Here I've taken the wave function itself: its top two elements are one over root two and minus one over root two, and then it's got zeros. So I've got the wave function sitting in this matrix, but I added the other piece that would make it unitary, and then I killed that piece off by multiplying by this matrix. The rest of the trick is to put in a U which is just a diagonal matrix, so that's equal to that, and now it's in SVD form. For a 2 by 2 you can always figure out how to do it by hand, and this is a slightly tricky way to do it; or you can do it more systematically: you can form a density matrix, which I'll mention later, and diagonalize it, which is very systematic. So here it is, and it's a product state: it's got one non-zero singular value and the other one is zero. And what are the two product states?
Well, you can see that the left spin is definitely up, so the left Schmidt state is just the up state, and the right Schmidt state is up minus down with a factor of one over root two. So there are the two singular values, one and zero; it's a product state, so you plug into this formula for S and you get zero, and there's the product form written out. Let's make sure... this is the last thing. Any questions on that? Then we'll switch over to the next set of slides. Yes. [Question about how you do the singular value decomposition for a bigger system.] There are two ways to think about the question. First of all, let's suppose we had 100 spins. You don't do the entanglement with each spin; you don't divide it into 100 pieces and do it that way. You have to choose a dividing line, and all that means is you have to tell me which spins are on the left, and then the rest are on the right. Then write the matrix in the psi LR form. The index L will run over a huge number of possibilities, all the possibilities of that set of spins, and so will R. So we have this huge singular value decomposition to do. In most cases, for a big system, it won't be practical; it will be at about the same level of practicality as doing the exact diagonalization. But that's in principle what you'd have to do. DMRG gives you a shortcut to it: you get the entropy as part of the algorithm. So a close cousin of the Schmidt decomposition is the reduced density matrix, and that's part of the name of DMRG: density matrix renormalization group. So here are density matrices, or more properly reduced density matrices. You have the same split between the left and the right side. Oh, by the way, if you haven't seen much of density matrices before, Feynman's lectures on statistical mechanics have a wonderful introduction to them in chapter 2. It talks about examples from polarized light and gives a nice little picture of why you should think about quantum mechanics this way.
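Going back to the two-spin example for a moment, here is a quick numerical check (a Python/numpy sketch; the lecture does it by hand with the unitary trick):

```python
import numpy as np

# psi = (|up,up> - |up,down>)/sqrt(2), as a matrix: rows are the left
# spin (up, down), columns are the right spin (up, down).
psi = np.array([[1/np.sqrt(2), -1/np.sqrt(2)],
                [0.0,           0.0         ]])

lam = np.linalg.svd(psi, compute_uv=False)
print(np.round(lam, 12))        # singular values 1 and 0: a product state

p = lam[lam > 1e-12] ** 2
S = -np.sum(p * np.log(p))
print(np.isclose(S, 0.0))       # True: zero entanglement entropy
```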
Okay, so here I'm just going to connect density matrices to the Schmidt decomposition, so I write the wave function in the same form. Now let's imagine that I want to look at an operator O that only lives on the left-hand side. If I take the expectation value of that operator in the state psi, I get these various pieces, but the R doesn't connect to the O, so it meets the R prime and gives you a Kronecker delta. And so you're left with a simpler piece involving the left, with a sum on R still left over. You absorb the sum on R by defining this reduced density matrix rho, which has two indices for the left, an L prime and an L, and sums over all the R states; we call that tracing out R. Then with this rho there's a simple expression for the expectation value: it only involves rho and the operator, and it's just the trace of rho times O. And we didn't use any properties of O in this derivation, so it works for any operator. Any operator on the left-hand side, anything you want to know about the left-hand side, you can find out by tracing it with rho. So rho has all the information about the left-hand side for any operator. The only thing rho doesn't capture is how the left is entangled to the right-hand side, but it encapsulates everything on the left-hand side. So that's a nice thing; that's a density matrix. There are some other things about density matrices that I don't have time to talk about, like mixed states. Now, rho is Hermitian, so you can diagonalize it. So here's a little analytic exercise. Suppose we diagonalize rho. I can form rho by tracing out the right to get rho left, or I can trace out the left and get rho right. Rho right is not just the transpose of rho left, but if you transpose psi it switches between them, so they're closely related.
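As a sanity check on the trace formula, here is a Python/numpy sketch with a random state and a random Hermitian operator on the left (both made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
psi = rng.standard_normal((4, 4))
psi /= np.linalg.norm(psi)               # normalized state on (left x right)

# Reduced density matrix: rho_{L',L} = sum_R psi_{L',R} psi_{L,R},
# i.e. "tracing out R".
rho = psi @ psi.T

O = rng.standard_normal((4, 4))
O = O + O.T                              # a Hermitian operator on the left only

# <psi| O (x) 1 |psi> computed directly, versus Tr(rho O)
vec = psi.reshape(-1)                    # flatten to a state vector
direct = vec @ np.kron(O, np.eye(4)) @ vec
print(np.allclose(direct, np.trace(rho @ O)))   # True
```

The kron puts O on the left factor and the identity on the right, matching the reshape convention (left index first).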
So the exercise is to show that the eigenvalues of rho L are the same as those of rho R, and that they're just the lambda i squared from the Schmidt decomposition. So these are essentially the same thing: the squared singular values, which we interpreted as probabilities, are the eigenvalues of the density matrix. And the interpretation is the same: lambda i squared gives you the probability of the left part being in the state i-L. In DMRG programs this translates to a choice: you can write the program diagonalizing a density matrix, or you can write it doing an SVD. A typical program might have the option to do either one. They each have some advantages, but it's really a tiny technical difference; it's really just the same thing. Okay, so what do we know now? We know how to calculate the von Neumann entanglement entropy, and we can think about doing it for a spin chain, the same spin chains that you are working to diagonalize. So what would we do? Take a Heisenberg chain of spin one-halves with open boundary conditions, out to size N, and diagonalize it. We have to decide where to split it. If you split it towards the end, it'll give you a small entanglement, so we usually split it down the middle to get the biggest entanglement, to see what's really going on. We divide it in two, and so I'll only consider chains with an even number of sites. Okay, so here's the recipe. First we find the ground state with exact diagonalization. Then we rewrite that ground-state eigenvector, splitting the spins on the left from the spins on the right: we lump all the spins on the left into one index and all the spins on the right into the other index of the matrix. So we rewrite it in terms of left and right basis states so it looks like a matrix, absorbing all the individual spins into a single index on each side.
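The exercise can also be checked numerically (again a rough numpy sketch): the eigenvalues of both reduced density matrices match the squared singular values of psi.

```python
import numpy as np

rng = np.random.default_rng(2)
psi = rng.standard_normal((4, 4))
psi /= np.linalg.norm(psi)

rho_L = psi @ psi.T          # trace out the right side
rho_R = psi.T @ psi          # trace out the left side

lam = np.linalg.svd(psi, compute_uv=False)        # Schmidt coefficients
ev_L = np.sort(np.linalg.eigvalsh(rho_L))[::-1]   # eigenvalues, descending
ev_R = np.sort(np.linalg.eigvalsh(rho_R))[::-1]

print(np.allclose(ev_L, lam**2))   # True
print(np.allclose(ev_R, lam**2))   # True
```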
Then we just call SVD, look at the singular values, square them, add them up with the log, and you get the entanglement entropy. Okay, so this is a program that a lot of you are almost ready to write, once you've finished that general-N diagonalization. And here's what you get. Here's a more advanced exercise for the weeks ahead, if you have time: you can try to verify this table with Julia. Okay, so here are all the even sizes out to 14. With DMRG we can keep extending this table to pretty much any size you want, but here it is with just a simple diagonalization. So here's the entanglement entropy. For size 2, the exact answer is log 2, so this 0.69 is log 2. It's also the maximal possible entanglement entropy for that size: if you make all of the Schmidt states equally probable, that's the most entanglement you can have, and that has a simple formula, N over 2 times log 2. So then we can ask: how big is the entanglement entropy compared to the biggest it could be? The system's getting bigger, the state is getting more and more complicated; maybe it gets more and more entanglement? No, it doesn't. The entanglement is hardly changing here. The maximum is growing with N, up to about 5 here, but we're still around 0.76 after starting out at 0.69. It's actually growing as the log of the system size, very slowly. That's one of the things we've learned in the last 10 or 15 years, exactly how this works. There's also an alternation, which gives us a clue about what the system is doing. These are all even sizes, so why should it alternate big, small, big, small? Well, you can understand that in terms of an RVB picture of the ground state.
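Here is the whole recipe for the smallest nontrivial case, N = 4, as a Python/numpy sketch (the exercise itself suggests Julia): build the open-chain Heisenberg Hamiltonian, diagonalize, reshape the ground-state eigenvector into a (left spins) x (right spins) matrix, SVD, and sum up. For this chain the middle-cut entropy comes out near 0.32.

```python
import numpy as np
from functools import reduce

# Spin-1/2 operators
Sx = np.array([[0, 0.5], [0.5, 0]])
Sy = np.array([[0, -0.5j], [0.5j, 0]])
Sz = np.array([[0.5, 0], [0, -0.5]])
I2 = np.eye(2)

def site_op(op, i, n):
    """Operator acting with `op` on site i of an n-site chain."""
    ops = [I2] * n
    ops[i] = op
    return reduce(np.kron, ops)

n = 4  # open-boundary Heisenberg chain: H = sum_i S_i . S_{i+1}
H = sum(site_op(S, i, n) @ site_op(S, i + 1, n)
        for i in range(n - 1) for S in (Sx, Sy, Sz))

# Ground state from exact diagonalization
evals, evecs = np.linalg.eigh(H)
gs = evecs[:, 0]

# Reshape: lump the left n/2 spins into the row index, the rest into columns
psi = gs.reshape(2 ** (n // 2), 2 ** (n // 2))

# SVD -> singular values -> von Neumann entanglement entropy
lam = np.linalg.svd(psi, compute_uv=False)
p = lam ** 2
p = p[p > 1e-12]
S = -np.sum(p * np.log(p))
print(round(float(S.real), 2))   # 0.32
```

Larger even N just means bigger kron products, until exact diagonalization runs out of memory.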
So if you have two spins, the ground state is this singlet state, which has a nice low energy, and on bigger systems the ground state is very complicated, but it has a lot in common with just putting down these singlets. I've drawn a fat line for each singlet: here are two spins, and it's the exact singlet. For N equals 4 the ground state looks a little bit like a singlet on the left and a singlet on the right. Now, a singlet has log 2 of entanglement in it, but if the singlet sits entirely in the left, it doesn't matter; it's entangled within the left, and we only care about entanglement across the dividing line. So this N equals 4 state should have a very low entanglement, because it looks like two singlets, and that is what we find: we find 0.32, which is not zero, so this is just a cartoon-like picture. Yes. It would always be exactly log 2. The reason is that the state is symmetric between up and down. If you cut it so one side has only one site, which is all you can do off-center for N equals 4, then that one site has two equally probable states, up and down, and that's all that matters. The entanglement entropy is always governed by the smaller subsystem, and so for even numbers of sites, cutting off just one site always gives log 2. Then if you move the cut steadily toward the center, there's a particular form that the entropy takes, and there's an analytic prediction for it in the large-N limit; a lot is known about this. Yes. So RVB is resonating valence bond.
So this singlet is called a valence bond in other language, and one way to represent this type of state is as a superposition of valence bonds in all different places; that's called a resonating valence bond state. The idea dates back to Linus Pauling, in the context of chemical rings, that the quantum mechanical state can look like that, but it really came into physics with Phil Anderson proposing it as a possible ground state for frustrated spin systems. He proposed it for the triangular lattice; it's now a good description of the kagome lattice spin liquid, which I'll show some results for. Okay, so in this picture I'm not actually letting these bonds resonate, because in 1D with open boundary conditions they really can't move around; they're pinned by the ends, so you can only write down one pattern of these bonds, pairing up the sites. For N equals 6, one of the singlets, one of the bonds, is in the middle, so cutting the system in two cuts that bond, and you get the extra entanglement of cutting that singlet in two. So N equals 6 is big, and whether you cut a bond in the middle depends on whether you're a multiple of four or not, and that gives you this alternation. You can have further-neighbor singlets, but the nearest-neighbor singlets are directly in the Hamiltonian. If we put in a J prime it would try to make next-nearest-neighbor singlets, but we didn't put that in. So the system is trying to lower its energy, and it tries to make every neighboring pair a singlet, but it can't: once a spin forms a singlet with one partner, it can't talk to anybody else. So it has to do some combination of fluctuating bonds that's mostly nearest neighbor but fluctuating around, with some weight beyond nearest neighbor; these numbers aren't going down to zero. As you go farther from the ends, the effect of the ends that pins the bonds in one particular location dies off, and the even-odd alternation gets smaller and smaller. So now let's talk about why this entanglement is so small
compared to what it could be. There's something called the area law. Laws in physics are usually things that are usually true, mostly true; theories can be always true, but laws are like Hooke's law for springs, which is just a little Taylor series. The area law is better than Hooke's law, but you have to be very careful about how you define the system to make it true, and there's been some wonderful work proving the area law in certain circumstances, particularly by Hastings. Log corrections are not going to be considered here; they are sometimes present. But the statement is this: if you have the ground state of a system with a division into A plus B, but it's really all one connected system, then the entropy is proportional to the area of the boundary. That's why it's called the area law. "Area" makes you think of three dimensions, where a cut makes an area, so the name is a little misleading in lower dimensions. In 1D, you divide the system in two and ask what's the number of sites on the edge of the cut, and it's really only one, no matter how long you make the system. So the area law says that S is a constant, because the cut only has one site on it. In two dimensions, S would be proportional to the length of the cut line, so for an Lx by Ly system it would be proportional to Ly and wouldn't depend on Lx. And in 3D, which I didn't draw, you have some 3D thing, you cut it in the middle, there's an area, and the entropy should be proportional to that area. And I can use the RVB picture to give a pictorial justification for this, here in 2D. The key observation is that the singlets that are entirely in the left or entirely in the right don't matter. So I draw some sort of singlet pattern; I did it sort of organized, but it could be more random. Then I make a cut and ask how many of those singlets are going to be cut by the line. I lined them up so that all of them on the line are cut, though it could be half of them or something, and you can see that the number that get cut is proportional to the area of the boundary. So that's where the area law comes from: it's just the singlet picture saying that the interior of this part doesn't talk to the interior of that part in a typical ground state. Okay, this sort of picture makes the area law plausible in RVB-type language; it doesn't show that it should always be true, and of course there are log corrections to it. This is a detailed research area that people have thought a lot about; there's a lot to know. Each of the singlets that is cut contributes its log 2 to the entropy, and you just add that up. Okay, so the area law is the thing that makes DMRG work. The biggest possible entropy would be a volume law, and if a volume law applied, DMRG wouldn't work at all. But the entropy is a lot smaller, and that allows us to throw away states. So that's the next section: how do we throw away states if the entanglement is small? Okay, so: truncating low-probability states. The first thing is that if the von Neumann entanglement entropy is small, there have to be a bunch of states with nearly zero probability. So here's a little schematic plot of lambda squared versus the index i, and I made it so that it gets really close to zero beyond the first few. This would be a low-entanglement state, and I'm imagining that the values out here are really small, like 10 to the minus 6 or 10 to the minus 10, and we're willing to make that size of error anyway. Okay, so here's the Schmidt decomposition of our state, and the sum goes all the way up to the full size of the system, 2 to the N over 2. And we say: no, I'm going to throw away all the states to the right of my pointer and just keep, say, m of them, little m. And so I'm going to keep
those, and this is my approximation, and m can be much less than 2 to the N over 2. This is just like the approximation of a matrix directly from a singular value decomposition, where you only keep a certain number of rows and columns: exactly the same. Okay, so this truncation compresses the system exponentially: 2 to the N over 2 goes down to something like 100 or 1000, where N itself is maybe 100 or 1000, so 2 to the N is huge. So this is the basic idea of DMRG: it uses the density matrix, or the Schmidt decomposition, finds the lambdas, and uses them to truncate down the system. But what I've told you so far isn't enough, because in order to get the Schmidt decomposition we need the wave function, but we're trying to get the wave function. It's a chicken-and-egg problem: if you had the wave function you could compress it down to something small, but you don't have it, so how do you compress it? Which came first, the chicken or the egg? Anybody know the answer? Eggs; dinosaurs had eggs. And how did evolution solve the chicken-and-egg problem? It iteratively improved: it started with primitive creatures and crude eggs, and it gradually evolved everything together, self-consistently, until it got to a chicken and a chicken egg. And that's the exact same thing we do in DMRG. We start off with this puzzle: how can you have one without the other? We don't have the wave function, so how can we get the Schmidt decomposition? What you do is build them up together, self-consistently. Now, the term RG comes from the historical background of DMRG. Ken Wilson had a numerical RG method that was used for the Kondo impurity problem and worked great for other impurity problems, and that numerical RG looked much more like a traditional RG, as a numerical implementation of it. DMRG fixes that up by introducing the density matrix, and back then I didn't know
about quantum information; quantum information was just getting started, and so there was a duplication of ideas between DMRG and quantum information happening at the same time. At some point they came together and we learned about each other, and so now I'm describing this in a way I couldn't have described it 20 years ago: I'm using the ideas of entanglement and quantum information, because that's the way we think about it now. Twenty years ago I would have said, oh, this is an RG method, altered from the usual RG method by this density matrix to make it work. It doesn't look quite as much like an RG method anymore; it's drifted away from traditional RG methods, so the name is sort of historical. You can still think about it that way, and RG is still a big part of this whole field, but now we think the best RG of this type is a different tensor network called MERA; that's a bit of a long story. Okay, yes. [Question about the name.] The name is catchy, and DMRG worked, so you should still use it. You have to think about it: if you find something, you need to give it a good name, and some people are not very good at that; it has to have a good sound. This one worked, and it was the way I figured it out. But, let's see: matrix product states are the heart of DMRG, and the other part of DMRG is the way you iteratively optimize them, so an accurate name would have to be something like "iterative optimization of matrix product states via the Lanczos method", something horrible like that. So the solution here involves doing the Schmidt decomposition not just in the middle but everywhere you can. You don't just lump the sites into left and right: you give them an order, like a one-dimensional chain, one through N, and then you draw a dividing line between sites one and two, between two and three, between three and four, and so on. And you think about doing the Schmidt decomposition at every dividing line. Each dividing line gives you a big compression of the wave function, so you do them all, and in the end you're left with something that is
very small, really highly compressed. If you just do it in the middle, the compression you get is like the square root of the full size: you've sort of cut the exponent in half. But if you do it at every dividing line, it completely kills off the exponential and gives you something much smaller. So that's the first step, and we'll have to work up a little bit to it. When we cut at every possible place, this gives us something called a matrix product state, which I'll explain. And then the second step is that you want to optimize: we optimize the matrix product state to minimize the energy. So it's like working with the wave function, but always in compressed form. It's like you have a zipped-up file, and you can access a little bit of it: unzip that part, look at it, zip it back up, move over, and unzip the rest. But if you unzip it all at once, it'll be exponentially large, so you can't do that. That's sort of the way it works. Okay, I'll get to that, but first I have to talk about the diagrams. The algebra of matrix product states gets a little bit messy, and it gets even worse if you do generalizations to higher dimensions. Some of the people who came from quantum information and started working on this were used to quantum circuit diagrams, so they started drawing circuit diagrams for DMRG, and it made everything much easier. We use these all the time. Okay, so how do these diagrams work?
Here's a wave function for four spins. The diagram for it is a box, which represents the wave function, and the external legs that come out each have a spin attached. As for the total size of this thing, each leg contributes a factor of two, so there are two to the fourth numbers in this box, and the diagram just represents psi(s1, ..., s4). Now suppose we had something with only two legs coming out, some dot or box in the center with two external legs: that's a tensor with two indices, in other words a matrix. We're not going to use any of the mathematical properties of tensors in coordinate systems; we'll just call anything with lots of indices a tensor. So a matrix has two legs, a vector has one leg, and here's a three-leg tensor, which we'll use a lot. Those are our basic units. Let's use this to draw a diagram for a simple matrix multiply. I've got a matrix A and a matrix B, and A times B is C. First, in the summation convention, C_ij = sum over k of A_ik B_kj, so k is the internal link that gets summed over. Here's the diagram: the tensor C just has two external legs, i and j, while A times B has a leg that's internal. It's just like Feynman diagrams in QED: the internal legs get summed over (integrated over in field theory, summed over here). So that's a matrix multiply, and the rule is: you contract over all the interior indices, and the external ones label the answer. (Audience question.) Yes, there's different terminology, and I'm not sure of all of it, but there's a term that just gives you the number of indices, and that's the number of legs. (Audience question.) Well, that's like eating one cookie or one potato chip: you can't have just one. You can't do just five minutes on spooky action at a distance. I love to lecture about that in my quantum mechanics class, but we spend a week or so on it, so we can talk about it one on one; I don't want to try to get into it here.
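Since the diagrams are just pictures of index sums, the contraction rule can be sketched in code with NumPy's einsum. This is a toy illustration of mine, not anything from the lecture; the tensors are arbitrary numbers.

```python
import numpy as np

# The matrix-multiply diagram: C_ij = sum_k A_ik B_kj.
# The internal leg k gets contracted; i and j are the
# external legs that label the result.
A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)

# Contract the shared leg k; identical to A @ B.
C = np.einsum('ik,kj->ij', A, B)
assert np.allclose(C, A @ B)

# A three-leg tensor contracted with a one-leg tensor (a vector)
# on its last leg leaves a two-leg tensor (a matrix).
T = np.arange(24.0).reshape(2, 3, 4)   # legs i, j, k
v = np.ones(4)                          # one leg, k
M = np.einsum('ijk,k->ij', T, v)
assert M.shape == (2, 3)
```

The einsum string is essentially the diagram written out: repeated letters are internal legs that get summed, letters after `->` are the external legs.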
Okay, so that's the simple form of these diagrams. But suppose you ask what the Schmidt decomposition looks like. You take one matrix and expand it out, so it's like the reverse of the multiply: there's a diagonal matrix in the middle, which can be absorbed into the left side or the right side, so you can draw a little dot there or not; it doesn't do very much. That's the Schmidt decomposition, and if the index i runs over a very small range, this is a big improvement. Suppose L runs from 1 to a million and R runs from 1 to a million; the whole matrix has 10^12 elements. But if I do the Schmidt decomposition and i runs from 1 to 10, I've reduced it to a 10-million-element matrix plus another 10-million-element matrix, about 10^7 numbers instead of 10^12. That's a big compression. So now we want to take these complicated tensors, cut them in two, insert the SVD, and assume the dimension that runs across the cut is small. The matrix product state is what you get when you do this SVD at every possible cut. Suppose we start with four sites, and here's the wave function for four sites. In the first step I cut it in two, and I get this extra index i2 where I cut. That index is small, so I've cut the size of the problem roughly to its square root. Then I take the left side and cut it with another singular value decomposition, and the right side with another singular value decomposition, and I've got something totally spread out, with only a few legs on each tensor. So this has four tensors; the edge ones have only two indices and the middle ones have three, a line through plus one leg coming down. That's a matrix product state, and we usually just quickly draw it like a comb.
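The compression estimate above can be checked directly on a smaller example. This is a sketch of mine: a 400-by-400 "wave function" matrix built to have rank 5, so a truncated SVD stores it exactly in far fewer numbers.

```python
import numpy as np

# Hypothetical low-rank "wave function" matrix psi_{L,R}:
# built as a rank-5 product, so the truncated SVD is exact.
rng = np.random.default_rng(0)
L, R, chi = 400, 400, 5
psi = rng.normal(size=(L, chi)) @ rng.normal(size=(chi, R))

U, S, Vh = np.linalg.svd(psi, full_matrices=False)

# Keep only the significant singular values.
keep = S > 1e-10 * S[0]
U, S, Vh = U[:, keep], S[keep], Vh[keep, :]

# Storage: L*R numbers before, (L + R)*chi after.
full_size = L * R              # 160000
compressed = (L + R) * len(S)  # 4000

# Reconstruction from the kept pieces matches the original.
psi_approx = U @ np.diag(S) @ Vh
assert np.allclose(psi, psi_approx)
```

Scaling the same arithmetic up to a million-by-million matrix of rank 10 gives the 10^7-versus-10^12 numbers from the lecture.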
Every junction in that comb means there's a tensor living at that place, a rank-three tensor except on the edges. So that's a matrix product state. It's an approximation to a wave function, and it's an approximation that works great if there's low enough entanglement; with high entanglement it totally fails, but with low entanglement it works great. The storage, when it works great: you go from 2^n down to about n tensors, and each looks like a matrix. Here's what each one looks like: the bond indices i run up to m, so it's m by m, times another index that only runs over up and down, so 2m^2 numbers per tensor, and n times m^2 times 2 in total. That's an exponential compression. There's a truncation involved: you decide how much probability you're going to throw away, maybe everything below 10^-8, which is a typical number for DMRG. Then your answer is only good to about eight digits, which is fine. (Audience question.) Is it well behaved? Very well behaved; we understand a lot about it. To be more specific, there are details about how you do the SVD that I don't want to get into; it's like putting the pieces into a gauge as you go along to make it especially well behaved, but it's basically just keeping everything orthonormal on both sides before you do a singular value decomposition, and as long as you do that it works very well. Some of the other tensor network methods should work really well in principle, but they're less controlled numerically than DMRG, so they have some trouble that makes them a little difficult to work with. One more detail about doing the SVD: when I did the first cut and did an SVD, I wanted to treat what I was cutting as a matrix, but the left side had two indices, so I wanted to treat those two indices as one.
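The cut-at-every-bond construction can be sketched as repeated reshape-and-SVD on the full state vector. This is my own minimal version; the function name, conventions, and cutoff are mine, with the 10^-8-style truncation applied at each bond.

```python
import numpy as np

def mps_from_state(psi, n, cutoff=1e-8):
    """Split an n-spin state vector into an MPS by an SVD at every
    bond, discarding singular values below `cutoff` (a sketch)."""
    tensors = []
    chi = 1
    rest = psi.reshape(1, -1)
    for site in range(n - 1):
        # Combine (left bond, this spin) into one row index, then SVD.
        rest = rest.reshape(chi * 2, -1)
        U, S, Vh = np.linalg.svd(rest, full_matrices=False)
        keep = S > cutoff
        U, S, Vh = U[:, keep], S[keep], Vh[keep, :]
        tensors.append(U.reshape(chi, 2, -1))   # (left, spin, right)
        chi = len(S)
        rest = np.diag(S) @ Vh                  # carry the remainder right
    tensors.append(rest.reshape(chi, 2, 1))
    return tensors

# A product state |up up up up> compresses to bond dimension 1.
n = 4
psi = np.zeros(2 ** n); psi[0] = 1.0
mps = mps_from_state(psi, n)
assert all(t.shape[0] == 1 and t.shape[2] == 1 for t in mps)
```

An entangled state pays for its entanglement in bond dimension: a GHZ-type state (equal superposition of all-up and all-down) comes out with bond dimension 2 on every interior bond.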
That's this little three-leg object I call a combiner; it's sometimes called fusion, there are different names for it. It puts two indices together and gives you one index coming out, with no loss and no truncation: if you have a 2 and a 2 going in, it just relabels them as 1 through 4, so it looks like one index. Then you call your SVD in this form, so the object just looks like a matrix: you combine all the indices on the left into one, all the indices on the right into one, and do an ordinary SVD. So DMRG is an MPS wave function optimized by a sweeping Lanczos-plus-Schmidt-decomposition algorithm. It's got all of these pieces we've been working on, but it keeps the state in compressed form and sweeps back and forth until it converges, directly in the compressed matrix product form. I put up some references here. The original PRL was pretty short; I worked pretty hard on the PRB, and people were able to really implement the algorithm from the long PRB. More recently Uli Schollwöck has written two nice reviews: one in Reviews of Modern Physics in 2005, and then, after the matrix product language really took off, an excellent one in Annals of Physics, titled something like "The density-matrix renormalization group in the age of matrix product states," which is a good place to start. Frank Pollmann is coming in a week or so, and he'll also be talking about DMRG, including some more advanced topics that I won't get to; he's an expert in DMRG, and so is Bela Bauer, who may come in the last week. They'll both be here later, and I have to get to another conference tomorrow morning, so you won't see me after tonight. Okay, so the basic steps of DMRG. We have this matrix product state, and we focus on two tensors to improve; it works better to improve two together rather than one at a time, because they talk to each other, and it helps and goes faster.
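The combiner is nothing more than a reshape, as this little sketch of mine shows: fuse legs, do an ordinary matrix SVD, and the inverse reshape recovers everything with no loss.

```python
import numpy as np

# A "combiner" is just index relabeling: two legs of sizes (2, 2)
# fuse into one leg of size 4, with no truncation or loss.
T = np.arange(16.0).reshape(2, 2, 2, 2)   # legs (l1, l2, r1, r2)

# Fuse the two left legs and the two right legs, then do an
# ordinary matrix SVD on the result.
M = T.reshape(4, 4)
U, S, Vh = np.linalg.svd(M)

# Uncombining is the inverse reshape; nothing was thrown away.
assert np.allclose((U @ np.diag(S) @ Vh).reshape(2, 2, 2, 2), T)
```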
Two of them together look like this four-index object. Alpha runs over the interior index on the left, which is really running over Schmidt states for a cut right there, and beta runs over the Schmidt states on the right. These are both of size m, and then you have two spin indices of size two, so it's about a 4m^2-sized object you work with. You do Lanczos on this. It's not in its ground state, so Lanczos lets you lower its energy: you treat this little tensor as a wave function in a reduced basis, just an ordinary wave function, and you lower its energy with an exact diagonalization step. That turns psi into something that looks like a single four-index tensor instead of a product of two pieces; it erases the product structure it had, but it lowers the energy, so psi prime has lower energy than psi. Then you do a Schmidt decomposition to split it back up, turning that one tensor into two, and you put it back into the MPS, because now it looks like two links again, but better ones. Then you shift over by one site and repeat: you work your way from sites one and two to sites two and three, all the way to the end, then reverse and come back. All the time you're reducing the energy while keeping the state in matrix product form; only two sites at a time are ever briefly out of that form. Each back-and-forth pass is called a sweep. You repeat for as many sweeps as you have time for, or until it converges. For a spin chain it might be only three or four sweeps, even two; it converges really fast. For a bigger, challenging system you might do dozens, but usually not many, so it's a nicely converging procedure.
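One two-site update can be sketched in a few lines. This is a toy of mine: a random dense symmetric matrix stands in for the effective Hamiltonian, and a full dense eigensolver stands in for the Lanczos step a real code would use; only the merge, optimize, and split structure is the point.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two neighboring MPS tensors, each (left bond, spin, right bond).
m = 3
A = rng.normal(size=(m, 2, m))
B = rng.normal(size=(m, 2, m))

# 1. Merge them into one four-leg "wave function" tensor,
#    then flatten it to a vector of length m*2*2*m.
theta = np.einsum('aib,bjc->aijc', A, B).reshape(-1)

# 2. Stand-in for the Lanczos step: ground state of a made-up
#    dense symmetric "effective Hamiltonian" of that size.
dim = theta.size
H = rng.normal(size=(dim, dim)); H = H + H.T
w, v = np.linalg.eigh(H)
theta_new = v[:, 0]          # lowest-energy vector

# 3. Split back into two tensors with an SVD (the Schmidt step),
#    truncating back to bond dimension m.
M = theta_new.reshape(m * 2, 2 * m)
U, S, Vh = np.linalg.svd(M, full_matrices=False)
A_new = U[:, :m].reshape(m, 2, m)
B_new = (np.diag(S[:m]) @ Vh[:m, :]).reshape(m, 2, m)
```

The new pair slots back into the MPS exactly where the old pair sat, and the window then shifts one site over.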
What I want to switch to now, since this is the end of what I would do as a blackboard lecture if we had a lot of time, is one of my standard talks, one I gave in June in Brazil. Hopefully with this background you're ready to understand a regular talk, and there will be a number of interesting new things along the way; it'll go a bit faster now, but you'll see a lot more of what we can do. I'm jumping forward, skipping some historical material, and starting with this slide that I love to show, which is about entanglement entropy. It comes from a 12-site exact diagonalization, just like you could do now, or with a little more work in a day or two. I took an eight-site chain and diagonalized it, took a 12-site chain and diagonalized it, and I got not just the ground state but every energy level, and I did the Schmidt decomposition on each of those states. The Schmidt decomposition isn't just for ground states; you can put any state into it and calculate its entanglement entropy. We were just seeing how low the entanglement entropy is for the ground state, but suppose you look at the other states. The first thing you see, of course, is that if I draw all these levels the plot just turns black: the energy spacing between levels becomes exponentially small, and you really can't distinguish the states up there, weird pathological states. But here's the entanglement entropy for all of these states on the 12-site system, plotted versus energy. The ground state has the red circle on it, and you can see its entanglement entropy is low, a little below 0.5, so it's small. Then there's the (N/2) log 2 we saw in the table, the maximum possible, up at the top, and the states in the middle, the black region, go up pretty close to that maximum.
Things like conservation of angular momentum keep them from quite reaching the maximum, but basically they get huge entanglement. Now, these Heisenberg chains: if you ask whether this is a strongly correlated system, people would say oh yes, there's no way you could treat it with a simple mean-field theory; yet in terms of entanglement it's very unentangled, though not near zero. So this is a strongly correlated ground state over here, while at the maximum energy there are states with zero entanglement. Can you tell me what those two states with zero entanglement are? Ferromagnetic. And why aren't they degenerate? I put in a little random field I didn't tell you about, to split degeneracies, just a tiny field to make all the circles separate; otherwise the all-up and all-down states would be at exactly the same energy. But all the states in the middle have huge entanglement. So this is a fundamental difference: DMRG does not work for states in the middle of the spectrum. In an ordinary system their entanglement is not small, it gets big, and nothing works to isolate one state there. In fact there is an exception, which is that if you have strong disorder you can have many-body localized states; you might hear about this later in the school, I'm not sure. For those, every eigenstate has low entanglement, and that's because of the disorder field: instead of the tiny one I put in, make it huge and it will make everything low-entangled. A couple of papers that came out just in the last week or so tell you how to do DMRG in that case, even for high-energy states. Frank Pollmann, who will be here in a week or so, might mention that; it's a bit advanced. Okay, so it's a little different to think about. The states up there, those are at high temperature.
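A small version of that numerical experiment fits in a page. This sketch of mine uses an 8-site chain rather than 12 to keep the dense diagonalization quick; the Hamiltonian builder and entropy function are my own minimal implementations, with entropy in bits across the middle cut.

```python
import numpy as np

def heisenberg(n):
    """Dense S=1/2 Heisenberg chain Hamiltonian (toy exact diag)."""
    sx = np.array([[0, 1], [1, 0]]) / 2
    sy = np.array([[0, -1j], [1j, 0]]) / 2
    sz = np.array([[1, 0], [0, -1]]) / 2
    H = np.zeros((2 ** n, 2 ** n), dtype=complex)
    for i in range(n - 1):               # nearest-neighbor bonds
        for s in (sx, sy, sz):
            op = np.eye(1)
            for j in range(n):
                op = np.kron(op, s if j in (i, i + 1) else np.eye(2))
            H += op
    return H

def entropy(psi, n):
    """Entanglement entropy (bits) across the middle cut."""
    mat = psi.reshape(2 ** (n // 2), -1)
    s = np.linalg.svd(mat, compute_uv=False)
    p = s ** 2
    p = p[p > 1e-12]
    return -np.sum(p * np.log2(p))

n = 8
w, v = np.linalg.eigh(heisenberg(n))
S = np.array([entropy(v[:, k], n) for k in range(2 ** n)])
# S[0] is the ground state's entropy; mid-spectrum states climb
# toward the maximum possible, (n/2)*log2(2) = 4 bits here.
```

Plotting `S` against `w` reproduces the shape of the slide: a low point at the ground state and a dense cloud near the maximum in the middle of the spectrum.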
Well, you don't have to isolate an individual eigenstate to do high temperature; you just have to combine eigenstates in the right way, and we do have a number of ways to do that. I'm going to show you a few slides so you see what you've just been hearing presented the usual way, with a few extra details. Why is the entanglement of the ground state small? I gave you the RVB picture, but there's also a quantum information idea called monogamy of entanglement, which basically says that if you really entangle two things together, neither one can be entangled with anything else. Monogamy means you stay married to one partner: you can't have polygamous marriages in quantum entanglement, it doesn't work. This is a well-developed theory, and it helps explain part of the area law: once this spin is bound to that one, it can't talk to anybody over here. The rest we've talked a little bit about. So here's the Schmidt decomposition the way I usually explain it briefly: cutting the system in two, entanglement entropy, which we've just gone through; exploiting the low entanglement in the 1D case; cutting the system in two and inserting complete (or incomplete) sets of Schmidt states at each point to get the matrix product state. And then this slide has a super-crude little animation that just shows how you move back and forth improving the wave function on two sites. Here it is drawn a little more carefully. Oh, there's one thing I should have mentioned: the other way of writing a matrix product state, instead of jumping straight to the diagrams. You think of A1, A2, ..., An as matrices with an extra label on them, just like the Pauli spin matrices are matrices with an x, y, z label. Here the extra label, in the square brackets, is the value of the spin, so you can think of each site as carrying a pair of matrices.
So the rule for getting the value of the wave function for a particular s1, s2, ..., sn is: you give me particular values of s1, s2, and so on, say all up; the square brackets pick out one particular matrix at each site; you multiply them all together; the first and last are vectors, so it all collapses to one number. What that number is depends on the s values, because there are really two matrices at each place and you had to pick one, and that's how the compression works: it just stores these matrices, but it can give you every possible amplitude when you expand it out. Okay, so how well does this work? Here is a 2000-site spin-1/2 Heisenberg chain, without any extra disorder this time. For the Heisenberg chain there's an exact solution due to Hans Bethe, back in 1931 or so, and it works for finite chains as well as infinite chains; for finite chains you have to write a little program to evaluate the answer, but you can know the exact energy and a few excited states. The left-hand picture shows the energy E, the total energy; the system is 2000 sites, so it's a big number. M is how many Schmidt states I kept while sweeping back and forth, and the index i is where you are in the sweep. Since it's 2000 sites, the sweep only goes to the center and then uses reflection symmetry to replace the right-hand side with the left, and turns around there, a little efficiency trick. So you do m = 10 and get that converged, then increase to m = 15 and the energy goes down, then change to m = 20, and the way you typically do this is to slowly increase the size of the matrices, which is the same as the number of Schmidt states, until you get whatever convergence you want. The stars here are the exact ground state, and after a whole bunch of sweeps the energy just sits there at that scale.
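The pick-a-matrix-and-multiply rule above can be written down directly. This is a sketch with my own conventions: each tensor is stored as (left bond, spin, right bond), and the edge tensors have dummy bonds of size 1 so everything is a matrix multiply.

```python
import numpy as np

def amplitude(A_list, spins):
    """Evaluate psi(s1..sn) = A1[s1] A2[s2] ... An[sn] for an MPS
    stored as (left, spin, right) tensors (my convention)."""
    vec = A_list[0][:, spins[0], :]       # (1, m) row vector
    for A, s in zip(A_list[1:], spins[1:]):
        vec = vec @ A[:, s, :]            # multiply the chosen matrix
    return vec[0, 0]                      # (1, 1) collapses to a number

# Bond-dimension-1 MPS for the product state |up up up>:
# amplitude 1 for spins (0, 0, 0), and 0 for anything else.
up = np.zeros((1, 2, 1)); up[0, 0, 0] = 1.0
mps = [up, up, up]
assert amplitude(mps, (0, 0, 0)) == 1.0
assert amplitude(mps, (0, 1, 0)) == 0.0
```

The compression is visible here: n small tensors encode all 2^n amplitudes, and each amplitude is recovered by one chain of small matrix multiplies.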
The right-hand side shows the deviation of the total energy from the exact Bethe ansatz answer, and as you increase the size of the matrices it falls on this exponential scale; and it's the total energy, not energy per site, so this is a really accurate wave function. There's some curvature at first, but it straightens out once you're below the first excited state. This is a gapless system, a critical system, so it would be considered a hard case: gapless systems have long-range correlations. But because it's not strictly infinite, only 2000 sites, there's a tiny finite-size energy gap, and as soon as you iterate to the point where your energy is below that first excited state, it starts looking like a gapped system; DMRG tends to converge exponentially well for gapped systems, so the curve goes to a straight line. Of course we could push this to more accuracy. How big an M can we do on a desktop? Maybe 5000, something like that, so you can make this essentially exact to all the digits you have in double precision. For 1D systems DMRG gets you what you want very accurately; it's about the best method around. We've also learned how to get other properties of interest in experiment: finite temperature, spectral functions, dynamics, out-of-equilibrium dynamics, things with disorder. Some of these are a lot harder than the ground state, but we know how to do a pretty good job on lots of different things. Okay, so here's my usual introduction to diagrams for matrix product states, which you've seen; here's one where if you connect the ends up it actually gives you a trace, and then they're all matrices. Matrix product states are variational states. I didn't talk much about calculating observables. You have this comb state, this matrix product state, and you want to calculate a property. The way you measure something is that there are two wave functions, a psi on the right and a psi on the left, and you have to do a contraction.
The way it looks in the diagrams: you put the ket on the bottom, the bra on the top, and contract them together, and contracting means summing over all the values of the spins, which links up all the lines. If you just linked everything up with nothing in between, it would be the normalization, and you'd get one. But if you want to measure something, you insert a spin operator at a particular place, or two operators if you want a product of two things, and when you contract over everything it gives you a number, which is that particular property. With this you can calculate correlation functions and lots of other things; it doesn't matter how many pieces you put in, but it has to be in this simple form, just one term. So there are often diagrams that look like this, with things contracted top and bottom and other things in between. There's another construction; let me see if I have it on a slide... I guess I don't, so let me just mention it. You can also put the Hamiltonian in between, and you can write the Hamiltonian as something that looks like a matrix product state, called a matrix product operator. You'll see that a little this afternoon with the ITensor library, because that's the usual way the Hamiltonian is handled in ITensor. Let me just draw a matrix product operator in a diagram. Here's the usual comb for a matrix product state, whose basic unit looks like that: an MPS. A matrix product operator, an MPO, looks like this: instead of one spin leg per site, each tensor has s1 and s1', s2 and s2', and so on. If I drew this as one big box it could represent any operator on those spins, but it turns out you can compress the usual Hamiltonians into this form, with the same sort of tensor structure as the MPS.
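The bra-operator-ket contraction can be sketched site by site. This is a minimal version of mine: it walks down the chain carrying a small "environment" matrix, inserting the operator on one site and identities elsewhere.

```python
import numpy as np

def expect(mps, op, site):
    """<psi|op_site|psi> by contracting bra and ket tensors site by
    site, inserting `op` at one place (my own index conventions)."""
    E = np.ones((1, 1))
    for i, A in enumerate(mps):
        O = op if i == site else np.eye(A.shape[1])
        # Contract: environment, ket tensor, operator, bra tensor.
        E = np.einsum('ab,asc,st,btd->cd', E, A, O, A.conj())
    return E[0, 0]

sz = np.array([[0.5, 0.0], [0.0, -0.5]])

# Product state |up down up>: <Sz> per site is +1/2, -1/2, +1/2.
up = np.zeros((1, 2, 1)); up[0, 0, 0] = 1.0
dn = np.zeros((1, 2, 1)); dn[0, 1, 0] = 1.0
mps = [up, dn, up]
```

Calling `expect(mps, sz, 1)` gives the Sz expectation on the middle site; inserting two operators at two sites in the same loop would give a correlation function, exactly as in the diagram.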
They typically have pretty small bond dimension, a small M like 5, so you can write the whole Hamiltonian in this form, and the fact that it looks like the MPS means it fits into the algorithm really easily. You can write an algorithm that just assumes you've expressed the Hamiltonian this way and do all the programming for a totally arbitrary MPO; then the Hamiltonian of the specific system you're doing enters only one part of the calculation, and everything else looks the same. So MPOs are one of the big advances of the last 10 years or so in terms of organizing DMRG programs and making them easier. I want to skip matrix product bases and the old RG way of thinking about things; this slide compares the old RG way of thinking with the new way, and I'll skip it too. So let me talk a little about doing DMRG for 2D systems; this will just be an overview of some points of interest. The algorithm I presented was all for a one-dimensional chain, and it really used that, because it kept splitting the system at each bond. So suppose we want to do DMRG for 2D systems. If the 2D system is infinite you're out of luck, but you can do a strip, then wider strips, and look at the trend as an approximation to infinite 2D. You can use the scheme sometimes called a snake: here's a 2D grid, only 5 wide here, and I make it into 1D just by connecting the sites up with the blue lines. It then looks like a 1D system, but with longer-range bonds; for instance, the first site is connected very strongly to site number 6 on the path. That's something you don't usually get in a true 1D system: all of these connections take you deep into the other subsystem. As far as the 1D area law is concerned, it's saying that something inside one part is talking to a site inside the other part.
Now, it doesn't go infinitely far across, only about 5 across, so it only messes things up a limited amount, but you have to think of this as a 2D system to figure out how well DMRG is going to work. You make a cut like this and say, the number of sites on the boundary is 5, so the entanglement should be bigger by a factor of 5 than in 1D. Well, a factor of 5 doesn't sound too bad, just run it 5 times as long, except it sits in an exponential: it's a 5th power on the number of states, so it actually gets much harder. The number of states M that you have to keep grows exponentially with the width of the system. At first some people thought you wouldn't be able to do DMRG at all for 2D systems, but it turns out the coefficient in the exponent is fairly small, so you can do a reasonably big system without running out of numerical power. They become big calculations, but if you keep, say, M = 10,000, you can do a system of width 10 or 14 for these spin systems, which is really pretty big. And it can be a cylinder, connected with periodic boundary conditions, which helps reduce the finite-size effects, and that can be enough to get really good answers for some 2D systems. It depends on exactly how much is going on at one site: the easiest case is spin one-half, with only two possibilities per site, where the area-law coefficient is especially small; a Hubbard model has four states on each site, which is about twice as hard, so you can only do half as wide a system. (Audience question.) So I'm assuming the 2D system has nearest-neighbor connections, and I put the numbers in going down... oh, the way this is numbered it goes up to 10, so there's a nearest-neighbor connection that actually connects sites 1 and 10, and that's as long as it gets in this arrangement. I could have made the path go down and then jump back to the top and go down again, and then it would only be 5 or 6 away, so not so far.
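The snake ordering and the resulting long bonds are easy to write out. This sketch of mine builds the path for a small grid and lists the 1D distance of every nearest-neighbor 2D bond; on a 2-column, height-5 strip it reproduces the sites-1-and-10 bond from the discussion.

```python
def snake_order(width, height):
    """Map a width x height grid to a 1D path going down one column
    and up the next (the 'snake'); returns each site's 1D index."""
    order = {}
    k = 0
    for x in range(width):
        ys = range(height) if x % 2 == 0 else range(height - 1, -1, -1)
        for y in ys:
            order[(x, y)] = k
            k += 1
    return order

def bond_lengths(width, height):
    """1D distances of all nearest-neighbor 2D bonds under the snake."""
    o = snake_order(width, height)
    d = []
    for (x, y), i in o.items():
        for dx, dy in ((1, 0), (0, 1)):   # right and down neighbors
            if (x + dx, y + dy) in o:
                d.append(abs(o[(x + dx, y + dy)] - i))
    return d

# Two columns of height 5: vertical bonds stay distance 1, but the
# bond between the two column tops stretches to distance 9 (sites
# numbered 1 and 10 in the lecture's counting).
d = bond_lengths(2, 5)
```

The maximum bond length scales with the strip height, which is the diagrammatic face of the exponential-in-width cost discussed above.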
So a 2D finite strip is the same as a 1D system with long-range interactions, but the entanglement is that of the 2D system; we use the simple 2D picture to estimate the entanglement. They're always equivalent: a strip is a 1D system with long enough bonds. So part of the DMRG algorithm doesn't care that it's 2D, and you write the same code, but then you find it doesn't converge very well with only 10 or 100 states; you have to crank it up to a lot more states to get good accuracy, so in that sense the calculation knows it's really two-dimensional. (Audience question.) The little legs sticking out here are the degrees of freedom of the sites; they're not labeled, but I'm thinking of them as up and down. If it were a Hubbard model, each site would be empty, up electron, down electron, or doubly occupied, so there would be four possibilities; it would be the same picture, but with more entanglement because of the extra degrees of freedom, spin fluctuations and charge fluctuations, roughly twice as much. (Audience question.) Well, I haven't drawn all the bonds; the horizontal bonds are really there in the Hamiltonian, but they look like long-range bonds in the matrix product state snake; I guess that was unclear. Okay. I've talked about how DMRG changed once the people from quantum information came along. In particular, Guifré Vidal found one of the first key connections between quantum information and DMRG; he essentially reinvented part of DMRG without knowing about it. Ignacio Cirac and Frank Verstraete were two of the other key people who started in quantum information, and a lot of the major work improving DMRG has come from them. To do two dimensions: here's the snake I just showed you, but as soon as you start drawing the diagrams, you can draw a better tensor network, like this one, that really directly represents 2D. A basic tensor here has five legs, not three.
It has the one leg coming out, which carries up or down, and then north, south, east, west, the four directions, and there's no snake path: it directly represents 2D, with a tensor leg connecting every nearest neighbor. That matches what the Hamiltonian has, so it's a much more natural representation, and it allows much better compression of the wave function. DMRG is equivalent to this in 1D, with three legs per tensor; in strips, the entropy starts blowing up with the width, so your M has to blow up, whereas with this network the M may not have to change very much, so you might use an M in the tens and get a pretty good answer. It's more difficult because each tensor has more legs, so that's still a lot of coefficients, but it's a much better compression. This has been a very promising algorithm since the early 2000s, about 10 or 15 years ago, when it was proposed. However, it's much more difficult to work with numerically. We talked about estimating the calculation time of an algorithm: diagonalizing a matrix is M cubed, and in DMRG that M cubed is the biggest piece, so the calculation time of DMRG is M cubed times a few factors involving the size of the system. In this case it's much harder to work with, and you get up to something like M to the 12th. That's not exponential, and exponential is the worst, except that depending on the coefficient, M to the 12th is worse than an exponential for quite a while. That's the key difficulty of working with PEPS, and it took quite a while for this to turn into something really useful. Now it is, but there are still just a couple of groups; it's mostly Philippe Corboz who has the nicest work with this sort of PEPS, and they're just very difficult calculations, whereas there are a number of groups now that do two dimensions with DMRG, because it's almost the same sort of coding as in 1D. So these two methods are cousins, both based on this sort of matrix product state or its generalizations.
The whole field is tensor networks, and these are close cousins, just adapted for different things. One of the things you can do with this is make the lattice go out to infinity: you just say, let me make every tensor the same, and then assert that it goes out to infinity, and the question is whether you can work with that and actually evaluate properties. It turns out you can, which is amazing: you can do numerics not with finite-size effects but strictly at infinite size, and that's what Philippe's work does; it goes out to infinity. That's a neat thing, but you still have this M, and the biggest M I've seen is maybe 14 or so, so it's roughly competitive with the best DMRG work. I'm going to save this for later, so we can take any more questions, and otherwise we can break for lunch. (Audience question.) Yes, you have a slightly different PEPS for different lattices. For instance, the kagome lattice, which has an interesting structure, has little triangles that touch at corners, and it took a little while to figure out that putting a tensor on every site wasn't the best thing to do; putting a tensor in the center of each triangle works better. So each lattice has to be figured out a little, these subtle differences that have to be worked out. More questions?
(Audience question.) Yes, so we always use the commuting symmetries: if it's a spin system we keep track of S_z, and if it's a fermion system like the Hubbard model we keep track of two numbers, the number of particles and the total S_z. Other groups use the full SU(2) spin symmetry, which lets you work with a smaller m; it's as if you were working with an m bigger by maybe a factor of 10 or so. That's a nice thing to do when you have that symmetry, but it's also a much more complicated program, and many systems don't have the symmetries that would make it work, so our group doesn't do it; other groups do better than us on some systems because they put in the full SU(2) symmetry. There are other symmetries you can put in, too. It's difficult to put in the spatial symmetries you might want, because as you step through the lattice you're breaking it up, killing the symmetry, so spatial symmetries are much harder; but the local symmetries you can use, and that's convenient. So we'll gather again at 2 in the computer labs.