If you were planning to present a poster, I hope you have everything ready; tomorrow afternoon we'll set up the posters upstairs on the terrace level. Ok, so we can start now with the first lecture on the introduction to tensor networks. Thank you very much. Thanks. Ok, is that better? Ok.

Hi, my name is Miles and I'll be telling you about tensor networks today. I'll introduce them pedagogically, and then at the end of today's second lecture I'll also introduce machine learning, and tomorrow I'll tell you about using tensor networks to do machine learning. So we're kind of exporting ideas from physics to other subjects. But then we'll bring it full circle at the end and apply tensor networks to machine learning on quantum computers, at least mostly theoretically, but I'll show you one experiment that another group did where they actually did this in real life, on actual quantum hardware. So we'll try to bring it full circle at the end.

So who's heard of tensor networks before? Ok, that's great. Who's actually done a calculation involving them before? All right, some of you have. Ok, it's a good number. Awesome. Ok, who works on machine learning in some form, or has used it a little bit, or is interested in it? Ok, great. All right, so there's a lot of people here who have some background already, but many of you don't, so I'll try to go slowly enough, and please stop me at different times if you want to understand something a little better; make me slow down.

Ok, so let me just quickly introduce myself. I'm currently a research scientist at this new place called the Flatiron Institute, and I'll tell you a bit more about that. My specialty is developing software and algorithms to model the many-electron problem, the many-body problem. One thing I'm working on right now is a follow-up on some work I did last year on using DMRG tensor network methods to do quantum chemistry. This is partly just to say that these are very broad methods; I'll have more to say about that. But I've also been working in the last few years on applying them to do machine learning, and I'll tell you a lot more about that.

So what is the Flatiron Institute? This is a new place, so I thought it would be worth spending a couple of slides to tell you about it. It's a kind of philanthropic institute, maybe not so different from ICTP, although the funding all comes privately from one private foundation, the Simons Foundation, and it's in New York City. Basically its mission is to advance scientific research by advancing computational methods. In a lot of areas of science the focus is some topic, like we're going to do chemistry or we're going to do materials science or something, and then we might use computers; here it's like we're focused on advancing computational methods directly. In some centers this involves a lot of data analysis; in our center it's more about modeling and simulation. And one key aspect of our center, or really all these centers, is that we're developing and advancing open source software and then releasing it to all of you and supporting it. The four centers that now occupy this building are the Center for Computational Astrophysics, the Center for Computational Biology, the CCQ, the Center for Computational Quantum Physics, which I'm in, and then the newest one coming online this fall, the Center for Computational Mathematics.
So that one will be involved with all the other three, developing applied mathematics techniques, packages for solving partial differential equations, methods like that. And then I just thought I'd show you a cool picture of New York City and where we are. Sometimes people ask, are we in the Flatiron Building? No. That's the Flatiron Building, the triangular-shaped building; this is Fifth Avenue, that's Central Park, and it runs all the way down. We're in this building over here, and the Simons Foundation is right across the street. So I thought I would flash that up to say, please come visit us sometime; you can check out New York City and talk to a bunch of physicists. It's a really fun place to work. Ok, so let me know if you have any more questions about that later.

So I thought I would start my introduction to tensor networks very broadly with the classic application of tensor networks, which is one you're all interested in: the quantum many-body problem. Some of these slides are old chestnuts, things you've seen before, but I thought I would still mention them. So what is the quantum many-body problem, very broadly? Well, it's a very tough problem, and it's the problem of solving for the behavior of electrons in matter. What makes it so tough? It's a continuum problem: you have electrons moving in a 3D continuum. Well, it's 3D, that's really tough, and it involves strong and important interactions between electrons. They're not always so important, but in many cases they are, and we'd like to deal with them to be precise about the problem.

And the interesting thing philosophically is that this is a problem where we know our theory of everything. We know what we have to do. So this isn't exactly the theory of everything (there are many effects that are important), but for many problems it's sufficient just to solve for the properties of the eigenstates of this Hamiltonian. So this is the Hamiltonian where you just have (that should say dr on the slide) the standard kinetic energy term and some kind of background potential, and then you have the electron-electron interactions, and here the background potential is just coming from static nuclei. So you have nuclei with Z protons, the electrons feel those nuclei, and they also repel each other. If you could solve that, you could do a lot of physics and chemistry very, very precisely.

So we know what to do, but the problem is we don't necessarily have really good ways to do it. There's this famous quote by Dirac where, in a nutshell, he says that we know what to do to do a large part of physics and chemistry, but the equations are much too complicated to be solvable, so it becomes desirable that approximate practical methods should be developed, and we want to describe the main features without too much computation. So that's what's really important, right? We want some way of getting at the most important parts without spending too much computational time.

So how do you start reducing the complexity of this problem? There's lots of different things you can do. One thing is to take the Born-Oppenheimer approximation, say, and treat the nuclei as just classical objects. Another thing that is very common is to project the electronic motion into certain orbitals.
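Just for reference, the Hamiltonian being referred to a moment ago is presumably the standard electronic-structure Hamiltonian (my reconstruction of the usual expression, in atomic units, since the slide itself isn't reproduced here):

$$
\hat H = \sum_i \left( -\tfrac{1}{2}\nabla_i^2 + V(\mathbf r_i) \right) + \sum_{i<j} \frac{1}{|\mathbf r_i - \mathbf r_j|}, \qquad V(\mathbf r) = -\sum_a \frac{Z_a}{|\mathbf r - \mathbf R_a|},
$$

with the kinetic term, the background potential from static nuclei of charge $Z_a$ at positions $\mathbf R_a$, and the electron-electron repulsion, as described.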
You don't always project into orbitals, but most methods do. That means you just say, we're going to chop up the continuum. We can even go as far as to throw the wave function out entirely, and say let's not even involve that; let's use things like density functional theory and the local density approximation. And you can reduce things down to model systems; that's another thing you can do, to try to get the dimensionality of the problem down. So this is just a cartoon drawing of, say, the 1D Hubbard model and what its Hilbert space looks like: you have electrons with spin, and some sites could be filled and some could be empty, with the problem that the Hilbert space still grows exponentially with the number of electrons. So it's still a really tough problem.

So how can we approach this problem? Like I mentioned, one thing you can do is try to throw the wave function away: you can try to use Monte Carlo to sample over electronic configurations. But you can also think about whether we could work with wave functions, whether we could actually reckon with them somehow. So let's start thinking about wave functions. What is the wave function? It's an assignment of an amplitude to every classical state that the electrons can move through. So what are those classical states? Let's say we had some partially filled system, and we have electrons that can be up or down, and they can be on different sites. So really every site can have four states: it could be empty, up, down, or doubly occupied. So we'd have four to the n states, given n sites; I'm thinking in a kind of grand canonical ensemble sense here.

Could we naively store this wave function? As you know, that wouldn't work. And why is that? Just because four to the n is extremely rapid growth. If n is ten, four to the ten is about ten to the sixth; that's already getting painful for classical computers. If n is 20, like 20 sites holding electrons, four to the 20 is about ten to the 12; for 30 sites it's about ten to the 18; and once you get to about 130 sites and try to describe that full Hilbert space, the number of amplitudes is comparable to the number of atoms in the entire known universe. So if you said each atom could carry one of these amplitudes, and we could somehow use the whole universe as a big memory, then you could do maybe 130 sites; but each extra site multiplies the count by four, so at 131 sites you'd need about four universes, and at 132 about sixteen. So it starts getting really problematic. Storing all the amplitudes is not going to work: maybe we can do ten sites, 20 sites, but forget about 130, ever.

But can nature even really be doing this? We say that's what the wave function is, but is that even really right? Walter Kohn (I don't remember the exact quote) tended to think that the idea of a wave function was almost a fictitious concept. Could this really be right? Could this really be what nature is doing, if it takes that much memory even to store the wave function? Are the amplitudes of a realistic wave function completely independent numbers that you really have to store separately? Or is there some kind of simplifying structure going on that makes the wave function a feasible concept to work with and think about? It turns out there is, otherwise I wouldn't be saying this. There's been major progress in the last 30 years understanding that there is structure in quantum wave functions, what that structure is, and how we can exploit it.
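As a quick aside, you can check the counting from a moment ago in a couple of lines (the usual ballpark for atoms in the observable universe is around ten to the 78 to ten to the 80):

```python
# Quick check of the counting above: number of amplitudes for n sites
# with four states per site (empty, up, down, doubly occupied).
for n in (10, 20, 30, 130, 131, 132):
    print(n, f"{4**n:.2e}")
# 130 sites already gives ~1.9e78 amplitudes, on the order of the number
# of atoms in the observable universe; each extra site multiplies by 4.
```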
There's different approaches, and when you think about quantum entanglement between particles, it means that the particles aren't all just doing their own thing; there's some kind of correlations, and you can take advantage of these correlations. When you do this, you are naturally led to the idea of tensor networks. I'll have more to say about what these diagrams mean and what these different tensor networks are. But basically you can think about entanglement patterns in some kind of natural quantum wave function, not just an arbitrary quantum wave function, imparting or stamping a bit of internal structure onto the wave function. Or, the other way around, you can impose some internal structure that limits what correlations the wave function can actually have, but limits it to the right subspace where natural wave functions can be found and manipulated.

And the payoff of working with tensor networks is that you get a set of numerical methods you can apply to strongly correlated systems. So that's really interesting. It's not like starting from a weakly correlated state, maybe a Slater determinant, and then tacking on a few corrections; it's actually almost starting at the other end. In fact, these methods actually work better for strongly correlated systems than they do for weakly correlated systems, so they're kind of complementary in that sense. But they actually work for both weakly and strongly correlated systems, so that's really nice. Their main Achilles' heel right now is dimensionality. They're really good, as many of you know, at 1D systems and 2D systems, and this can include 3D physics. But in terms of their weakness: if you take an infinite 1D system, fine; an infinite 2D system, okay; an infinite 3D system right now is not happening. I shouldn't say it's not in the works; it's not happening yet, but it is in the works, and maybe years from now we'll be doing that.

But tensor networks have other benefits too. Beyond numerical methods, they're a really nice framework for thinking about wave functions, and they've been used very fruitfully for exotic quantum states: things like spin liquids, fractional quantum Hall states, et cetera. And as I'll be motivating throughout the talks today and tomorrow, they can be thought of as a very general approach to applied math problems involving really big tensors. There's a community of mathematicians, including physicists, who are interested in studying and decomposing tensors. But some of the mathematicians think about, okay, you have a vector, a matrix, and then a tensor is one or two steps past that. In physics, we've now been thinking for 30 years about tensors with something like hundreds or thousands of indices, or even an infinite number of indices, and we have this whole machinery. That could be very interesting in applied mathematics, and by applied mathematics I'm including machine learning. So that's a direction I think it's interesting to go.

So here's roughly where I'm going: today I'm going to do some more introduction to tensor networks, mainly motivating them through the lens of matrix product states, and give you a pedagogical introduction to those. Then later today I'll tell you about some basics of doing computations with matrix product states and give you some resources. And then I'll end today by introducing machine learning, though actually I'm going to kick that a bit to tomorrow.
And then show some different things you can do combining these ideas together. All right. So to motivate tensor networks and matrix product states, let's think of the simplest lattice model that we can, and then try to think about how we could tackle and understand the wave function of this kind of lattice model. The simplest one I can think of is the transverse field Ising model, because it just has the simplest terms in the Hamiltonian. Many of you know this model: think about a bunch of sites that each have a spin one-half on them, and they interact through this term, sigma-z sigma-z, that just wants them to point in the same direction. So if we make this other term very weak, that's what they do: they either all point up or they all point down. But then there's this other term that's kind of the spoiler, the transverse field. It doesn't commute with the first term, and that's what makes the model quantum. If that term is very large, it wants the spins all to point in one direction, to the right, say, the way I've written it. So there's a phase transition where this behavior switches over: either you have two ground states, all up or all down with some quantum fluctuations on top, or you have one ground state where all the spins point to the right with some quantum fluctuations. That's just to be concrete, so we have a model to think about.

So what does the wave function of this model look like? Well, it's just a weighted sum of all the basis states. This is very introductory, so it's probably too easy for some of you, but you know what I mean when I say that: it's a tensor product space of all these basis states, and there's two to the n of these things if we have n spins. I'm going to be talking about tensor networks generally, so just for those of you who aren't as used to spin models: this could apply equally well to, say, fermions in some kind of orbital basis or Slater determinant basis. When I say labels s1 through sn, these could be some other states; they could mean orbital states. Here, what I mean by this kind of 00101 thing is that orbitals number 3 and 5 are occupied, so when you actually write this in real space, you'd get a Slater determinant. So this is some kind of second-quantized form of a many-body electronic wave function. I'll be talking in pretty general terms, but just to say: this could apply to spins, this could apply to electrons.

So what we want to do is think about how to tackle the wave function, so that we can try to find the ground state, which is one of these things. Let's write the wave function in some more compact form. That's the full thing written out; let's write it more compactly. Label the basis states by these indices s1 through sn, which could take two values or four values or different numbers of values. In that form, the amplitudes look like a big tensor. The idea is that we have n spins or n electrons or something, and these take a finite number of discrete values, so the coefficients naturally look like a big tensor: they also have n indices. So that's the amplitude tensor, and in some sense that is the wave function. If we can work with this tensor, for some given basis, then we can work with many-body wave functions. So we need tools and techniques to work with tensors with lots of indices.
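In symbols, the model and the amplitude-tensor form being described are presumably (my reconstruction of the standard expressions, since the slides aren't shown here):

$$
H = -J \sum_{j} \sigma^z_j \sigma^z_{j+1} - h \sum_j \sigma^x_j, \qquad |\psi\rangle = \sum_{s_1 s_2 \cdots s_N} \psi^{s_1 s_2 \cdots s_N} \, |s_1\rangle \otimes |s_2\rangle \otimes \cdots \otimes |s_N\rangle,
$$

where the coefficients $\psi^{s_1 s_2 \cdots s_N}$ form the big N-index tensor just discussed.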
So just to motivate what a tensor network is and where it comes from, let's think about this tensor in different ways. What do I mean by this tensor? In this context I don't necessarily mean all the fancy things you can say about tensors in terms of transformations and changes of basis. Here I'm just thinking of a tensor as some map. It's just a rule that says: if you show me one of these spin configurations, up down up up up up, here's the complex number that goes with that configuration; it's just named by that configuration. Here's another one, I give you another complex number, and so on. So we could think of it as a big object full of all these numbers, or we can just think of it as this arrow; it's like a map. Any rule that assigns a complex number to one of these configurations is this tensor. We don't necessarily have to think of it as a bunch of numbers sitting on a hard drive; we can think of it as some rule we can implement.

We can try to actually store this whole thing, but there's this exponential growth issue. People have tried taking it head on, using symmetries and using big computers and just working really hard, but that only gets you to about 50 spins. There's this new postdoc at the Flatiron Institute, Alex Wietek; I think he's perhaps done the largest exact diagonalization calculation ever, with Andreas Läuchli. They used this automatic symmetry-finding software that they developed and got up to 50 spins. So that's the biggest you can do, and five years from now you can do 51 or 52. So that's slow going; this exponential cost is a very serious barrier.

But we can get around it in different ways. The way that leads to matrix product states and tensor networks is to say that we know, from physics considerations, from some theory that's been done, and just from physical intuition, that if we think about this 1D chain of spins, say the one that defines the transverse field Ising model, the correlations between them get weak at longer distances. So if I take, say, this correlator (think of it as a connected correlator if there's an order parameter in the system), then as I take i and j farther and farther apart, this correlation function generically decays. You can show that it always decays, and in fact decays exponentially in 1D, if the system has a gap between the ground state and the first excited state.

So we can use that. We can use it by saying: instead of storing this whole wave-function tensor, let's try to store approximations to it. The simplest approximation you can make (this goes by the name mean field theory; there's different forms of it, and this is one form) is to say, let's neglect the correlations altogether and just chop the wave function up into these pieces. So the wave function is a thing that carries n indices; here I'm just taking n equals 6 so it fits on the slide. And I'll just say it's an outer product of a bunch of vectors. This works, in the sense that you can get local properties okay by putting the correct numbers into these vectors, but you're missing correlations just by construction. And you can repair this in different ways; something that people in, say, quantum chemistry would do is say, okay, that's a Slater determinant, if you properly antisymmetrize it or work in a second-quantized form.
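Here's that outer-product, mean-field form as a minimal sketch in code (my own illustration with assumed sizes, not code from the lecture):

```python
import numpy as np

# A minimal sketch of the product-state (mean-field) form: approximate the
# full n-index amplitude tensor by an outer product of one vector per site,
# psi[s1,...,s6] ~ v1[s1] * v2[s2] * ... * v6[s6].
n = 6
rng = np.random.default_rng(0)
vs = [rng.standard_normal(2) for _ in range(n)]
vs = [v / np.linalg.norm(v) for v in vs]   # normalize each site vector

psi = vs[0]
for v in vs[1:]:
    psi = np.tensordot(psi, v, axes=0)     # outer product adds one index

print(psi.shape)   # (2, 2, 2, 2, 2, 2): 64 amplitudes from only 12 numbers
```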
So we could fix this by saying, well, let's sum up Slater determinants, and that can work really, really well for moderately correlated systems, but it can still break down in a lot of ways. Is your question whether this can capture long-range order? It sort of can. You could make this represent a ferromagnetic state where all the spins are pointing in the same direction, so it can capture long-range order in a sense, but you're still missing the fluctuations on top of that. That's a good question though. So this captures a lot; that's why I put this green check, to say this gets you pretty far, but it's just mean field theory, some form of mean field theory.

There's different ways of fixing this. One is to sum more of these states up, but there's a different way, which is to actually put in fake indices. The problem here, you can view it as a math issue: we want to go from tensors with one index to tensors with more indices, and then maybe work our way back over to this full thing over here. So how does that work? What you can do is put fake indices on these objects and then sum them back up. You can promote this first one from being a vector with one index to a matrix with two indices, s1 and also this fake one, i1, and you can also stick this i1 index on this second object, but then sum over it; so that line is i1. We still have something on the right that has all these indices s1 through s6, but there's more going on inside it now with this i1 index. And we can also put an index i2 on that one, have it match up with the index i2 on this one, and sum over i2, and so on, and so on. The resulting structure is this thing called a matrix product state.

Notice that's not the only way we could have played this game. We didn't have to put the other copy of i1 right here; we could have had some spaghetti web with i1 connecting somewhere far away, i2 connecting somewhere far away. But instead we did it in this chain-like form, so that might make you suspect that this is best for 1D systems, and that is the case. But it actually works pretty well even for 2D systems and other kinds of things that don't necessarily have a 1D structure. We'll unpack this form a bit, and why it's called a matrix product state. You can get local properties extremely accurately, and you can prove, or show, that it actually has exponentially decaying correlations generically. I mentioned there's this theoretical understanding that that's what gapped 1D systems have; well, this has it, so in some sense this is the right form for 1D wave functions of gapped systems. You can argue that this is sort of the correct form of 1D wave functions. Ok, any questions about that?

I'll say some of this in a few moments. You definitely could. That leads to these other ideas like PEPS tensor networks, these 2D tensor networks, for example. And there's other ideas, too, that are sort of in the realm of tensor networks: these things called correlator product states, also called entangled plaquette states, and neural quantum states.
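Before going further with these variants, here's a minimal sketch in code of the fake-index construction from above (my own illustration, with assumed sizes): each site tensor carries its physical index plus bond indices, and summing over the bonds gives back a full n-index tensor.

```python
import numpy as np

# Minimal sketch: promote each site vector to a tensor with extra "fake"
# bond indices i1..i5 and sum over them, giving the matrix product form
# of a 6-index wave-function tensor.
n, d, m = 6, 2, 3   # sites, physical dimension, bond dimension (assumed)
rng = np.random.default_rng(1)
A = ([rng.standard_normal((d, m))]
     + [rng.standard_normal((m, d, m)) for _ in range(n - 2)]
     + [rng.standard_normal((m, d))])

# Contract the shared bond indices back up, left to right.
psi = A[0]                                        # indices (s1, i1)
for T in A[1:-1]:
    psi = np.tensordot(psi, T, axes=([-1], [0]))  # sum over one bond index
psi = np.tensordot(psi, A[-1], axes=([-1], [0]))

print(psi.shape)  # (2, 2, 2, 2, 2, 2): all s indices survive, bonds summed
```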
So there are these other ideas that are a little bit past what I would call a tensor network, like string bond states, where maybe you have not just one of these tensor chains, but you say the wave function is a product of multiple tensors that are each decomposed in a way like this, but with different patterns of the string: one that goes 1, 2, 3, 4, 5, 6, but another that goes 1, 3, 5, 2, 4, 6 or something like that, and then you layer those together. You can get a certain distance with that, although you can start to lose some of the benefits of the simpler approach, so it comes with tradeoffs. Okay, great, yeah. Sorry, what was that? Hmm. That's a good question.

So I think it's worth unpacking this a little bit, starting with s1 and s2. If we ignore s3 through s6 for a minute, what I've done here is say: okay, just look at the s1, s2 part. Actually, how about I write it out: let's say the wave function only had two spins, psi of s1, s2. Then I could approximate that by this (maybe I should call these by a different name; I'll call this one capital Psi). I could do this product form, and we know that's just an approximation. But now, if I introduce this i1 index and sum over it, then if i1 runs over two values, all I'm doing is writing a 2 by 2 matrix as the product of two other 2 by 2 matrices. So that's just to say this can represent any 2 by 2 matrix, because we know any 2 by 2 matrix can always be written as a product of two other 2 by 2 matrices. So that's fine. Where this starts to not work as well is when you introduce more indices. Now it's kind of like saying, if this index only runs over two values, I'm representing some 2 by 2 by 2 thing using products of just 2 by 2 matrices, in a sense, if I cover up one index or another. So it starts to break down. But it turns out you can still always represent any tensor in this form, as long as these i indices run over enough values; maybe I have to bump i up to run over 3 values at some point, or 4 values, as I keep adding more s indices.

Now, to more directly answer your question, you were asking about correlations, and what I just said is that you can very quickly get arbitrary correlations between s1 and s2, and then s2 and s3. But you can also start to get correlations between s1 and s3, or s1 and s4. You can think of them as being carried almost like passengers on a subway. I set the spin s1 to a certain value, and that puts a certain matrix here, and then that matrix comes over here and does something to s2: it drops off some passengers at the s2 station, some of them get off at that station, but some of them stay on the train and go further, to s3, and get off there. So information can keep going arbitrarily far away; it's just that, by the time the train has passed more and more stations, the passengers that got on at station 1 have mostly gotten off. So when I get to station 80, the passengers that got on at station 1 have mostly all gotten off by now; that's the correlations decaying. The influence of site 1 is felt less and less as you go further along. So. For gapless states? Gapped. For gapped states it works very well. For gapless states it also works very well, just not as well.
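Back on the two-site exactness point from a moment ago, here's a quick check in code (my own sketch, not from the lecture): any two-spin amplitude matrix factors exactly into two matrices joined by one summed bond index.

```python
import numpy as np

# Quick check of the claim above: any two-spin amplitude "matrix"
# psi[s1, s2] factors exactly into two tensors joined by one bond index,
# here via the singular value decomposition.
rng = np.random.default_rng(2)
psi = rng.standard_normal((2, 2))   # arbitrary two-spin amplitudes
U, S, Vh = np.linalg.svd(psi)
M1 = U * S                          # absorb singular values into the left factor
M2 = Vh
print(np.allclose(M1 @ M2, psi))    # True: exact matrix product form
```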
So it's interesting, because you'll read some papers that say it doesn't work for gapless states, but if you look, these are people who are trying to sell you other wave functions. These are very good people, but usually the paper is about a different kind of tensor network; they'll say, use MERA, because MPS doesn't work for gapless states. Always be suspicious when someone's trying to sell you a product. MERA is great and you should use it, it's interesting, but that doesn't mean MPS is bad. It turns out matrix product states work extremely well for gapless systems, at least in 1D, and also somewhat in 2D; they just work less well, so you have to work harder, basically. They can represent power-law correlations very accurately out to thousands of sites, before crossing over to exponential decay eventually. So out past 2000 or 5000 sites or something it'll be exponential, but as long as you work hard enough, on a finite-size system, or on an infinite system where you only measure out to certain distances, it can work very well. And it gets very good energies, because the Hamiltonian may only have local support of two sites, so if you can get power-law decays out to thousands of sites, the energy is good. So yes, it's a good question.

Mostly ground states. There's been some work recently on finite temperature, and it works pretty well there. Real time is coming along a bit, but it's not very far along. But you also have to remember what the bar to compare to is. It works okay in 2D at finite temperature, and these methods don't have a sign problem in 2D, so even where it doesn't work that great, it's the only thing that can actually treat certain problems, or maybe it and one other method. So 2D is tough for a lot of methods, and time evolution has problems too, but this is one of the few methods that can do it at all. The bar is just very high in physics for success, and it's interesting to try all these things. And for most of the problems I'm talking about, you can't solve the problem exactly; these are numerical approaches. Whether it's a time-dependent or time-independent Hamiltonian, you can do a lot numerically. You're right that you can't really solve it, but you can learn a lot by numerically simulating it with these techniques, for sure.

Ok, so a little bit more motivation; I'll kind of fly through this part. Everybody's brain works a little differently, so I want to motivate matrix product states from some different perspectives. Thanks for those good questions. In this perspective where we said, ok, it's kind of like a map, you can think of the wave function as a machine that eats spin configurations and spits out complex numbers. So we can just make this map in some way, any way that occurs to us; we can make up any rule. So how about this rule? It seems kind of crazy at first, but it leads right back to the same idea of matrix product states. The rule is: take a spin and associate a matrix to it, and then once we have these matrices, what do we do with them? Multiply them together and get an amplitude. Now, a product of matrices is not a number, it's another matrix, but we'll see how to fix that. Pictorially, what this means is: take an up spin and replace it by an up matrix; take a down spin and replace it by a down matrix. These matrices can be totally different from each other; they can be arbitrary matrices, singular, not singular, whatever matrices you want. The only constraint is that they need to have the same size.
So then you use them by saying: ok, if I want to assign an amplitude to the pattern up down, up up down, I go up down, up up down like that; I just replace the spins with these matrices. Here I'm putting subscripts to say that the two matrices on site 1 could be different from the two matrices on site 2 and site 3, or they could be the same. And the reason you get a number is that it's easy to arrange for the first pair of matrices, the ones for up and down on site 1, to have just one row, and for the last pair to have just one column, so when you're done you get a 1 by 1 matrix at the end, i.e. a number. But all the ones in the middle could be 5 by 3 or whatever sizes you want; the only constraint is that on a given site the two matrices have to be the same size. From this rule it's pretty obvious why this thing is called a matrix product state, and we'll see how it connects back to the other construction. Just to show you how the rule works some more: if I have the pattern up up down down down, I use this pattern of matrices, and the idea is that I just replace spins with matrices and multiply them together, and that gives me amplitudes. It's just some rule to get amplitudes from spin patterns.

Yeah? Oh yeah, well, it could be exactly equal sometimes, but most of the time, in most contexts where you use this, it's approximate, and it's approximate because you typically limit the size of the matrices. If you made the matrices big enough, you could easily show that you can represent any wave function, but that takes exponentially big matrices, so you don't usually want to do that; you don't gain anything if you do. Well, the nice thing is we have algorithms that do that for you, which is really neat. The DMRG algorithm is famous for lots of reasons, but one reason is that it's adaptive: you start with an arbitrary size of matrices, say all 1 by 1, and then you run the algorithm and they can grow and shrink adaptively. But you could also just choose a size, do the problem, then start over with a bigger size and see if you do better, and that can work really well too. Yeah, it's a good question; it can be a problem for other tensor networks how to do that. We have ideas, but it can be tougher for the other ones; for this one we know how to do it very well.

So this is another way of writing that same thing, a matrix product state. Here I've suppressed those i1, i2, i3 indices that I had in the other motivation, but it's the same thing. Now some basic facts about it; I already touched on these in some of the questions. Let's say the typical size of the matrices is m by m; as I mentioned, on a finite-size system the first one might have to be 1 by m, but say the ones in the middle are typically m by m. Then if you do some counting, you can see that this takes the full wave function, which has 2 to the n parameters, down to just 2 times n (because there's 2 matrices on each site, for n sites) times m squared parameters. So that's a massive compression: you've taken the n from the exponent down to out in front, and you can actually get the n out of there entirely if you assume translation invariance. So this is a huge compression, from an exponentially growing cost to a nice polynomially growing one. And you can show that if this m is large enough, you can represent any tensor whatsoever; it's just that m has to be exponentially big, basically 2 to the n over 2, to represent anything, and you don't want to have to go that big, so you try to get away with a smaller m.
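Here's the rule and the parameter counting as a minimal sketch in code (my own illustration, with assumed sizes):

```python
import numpy as np

# Two matrices per site, one for up and one for down; boundary matrices
# are 1 x m and m x 1 so the chain product collapses to a 1 x 1 number.
n, m = 6, 3
rng = np.random.default_rng(3)
M = [{s: rng.standard_normal((1 if i == 0 else m, 1 if i == n - 1 else m))
      for s in ("up", "down")} for i in range(n)]

def amplitude(pattern):
    prod = M[0][pattern[0]]
    for i in range(1, n):
        prod = prod @ M[i][pattern[i]]   # replace each spin by its matrix
    return prod[0, 0]

print(amplitude(("up", "down", "up", "up", "down", "up")))

# The compression being described: 2*n*m^2 numbers instead of 2^n.
print(2 * n * m**2, "parameters versus", 2**n)
```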
Ok, so for the rest of this talk, and today and tomorrow, it'll be really helpful to use this notation called tensor diagram notation, so I want to introduce that right now; it makes it a lot easier to motivate everything else I'm going to say. This notation works as follows. I've been writing tensors in this kind of classical notation, where they're just multi-index arrays of numbers, in the way I'm thinking about them, but it's nice to write them graphically. If you haven't seen the graphical notation before, it's actually pretty simple. All it is is that you have this blob, which is the tensor, and the indices are these lines sticking out of it. There's fewer rules about it than you might think: basically the indices can stick out in any direction, and that's kind of it. People impose conventions on it in different contexts; they might have rules about which way the indices point and things, but generally there's not that many rules. So there's really basically two rules. You can have a line sticking out, and you can put labels on the lines if it helps, but later you can start leaving the labels off. And this just shows you how that looks for different basic tensors: the simplest example of a tensor besides a scalar is a vector, with one index, so that's just a blob with one line; a matrix has two indices; a three-index tensor has three lines. That's rule number one.

Rule number two is: how do you notate sums, how do you notate complicated sums? Rule number two is that joining lines means you sum over the index where you join them. So if you have this tensor, which has two indices i and j, and this tensor, which has one index j, and I join the j lines of the two tensors, that means I'm summing over j, which means the thing on the right. And we can see that that's a matrix-vector product. You can already see some benefits of the notation. One benefit is that even if I didn't know what this was, the result has one index sticking out, so I know the result is a vector; I just showed you that the result of a matrix-vector product is a vector. You can also kind of see where this notation comes from, this idea of joining lines: sometimes, to help yourself with complicated sums, you might put a little line underneath to guide your eye from one index to the next, and you can think of that line becoming this line. This notation keeps you from having to hunt through a bunch of letters, which can be very tedious, having indices with subscripts and blah blah blah. So you can start omitting the names, which is really useful: you can say, here's a thing with two indices connected by one line to another thing with two indices, and I don't have to come up with names for those lines; but if I did, I could write it this way. That's a product of two matrices, and again I can see that the result has two indices, so it's a matrix. This one, though, definitely results in a scalar, because there's no indices sticking out; they're all joined up. That's actually the trace of a product of two matrices. And you can even see graphically how you can permute within a trace: you can take the blue thing, slide it around the ring, around the back, and bring it into the front again. So it's kind of nice to see things like the permutation symmetry of the trace graphically; for matrices, tensors, anything, you can put that in, and that can be useful to do.
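In code, those two diagram rules map directly onto einsum notation (my own illustration, not from the lecture):

```python
import numpy as np

# The two diagram rules in numpy's einsum: a tensor is a blob with one
# letter per line, and joining lines means summing over that index.
rng = np.random.default_rng(4)
A = rng.standard_normal((3, 4))
v = rng.standard_normal(4)
B = rng.standard_normal((4, 3))

Av   = np.einsum("ij,j->i", A, v)     # one free line left: a vector
AB   = np.einsum("ij,jk->ik", A, B)   # two free lines: a matrix
trAB = np.einsum("ij,ji->", A, B)     # no free lines: a scalar, Tr(AB)

print(Av.shape, AB.shape, float(trAB))
```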
Sometimes you do have to be careful about naming the indices, because you may notice, if you're rigorous-minded, that this is a bit of an ambiguous diagram here: is that the first or the second index of that matrix? Often you know from context, or you have some kind of convention, maybe things go left to right, or you just know from the context.

That's the nice thing about the matrix product state; I think that's a good way to think about it. You can think of it as somehow throwing out some of the correlations, or sort of damaging the correlations at very long distances, and concentrating all your effort on preserving the correlations at shorter distances. And I think you can formalize that quite a lot: you can study in detail the structure of correlation functions in matrix product states. You can identify this thing called the transfer matrix of a matrix product state. Let me go ahead and show the next slide, which might help a little bit, since it has the diagrammatic notation. If I take a matrix product state, and let's say it's translation invariant (this is just a comment for those of you who understand it; if not, don't worry about it too much), I can grab one of the tensors out of it, sum over the site index, and think of the result as a matrix from this space to this space, this pair of indices to that pair. And you can actually study that matrix. It's usually very expensive to diagonalize fully, but you can grab the first few eigenvalues, and the ratio of the second eigenvalue to the first gives you an upper bound on the correlation lengths of all correlation functions. So you can get a rigorous upper bound on the correlation length of an MPS from this object. And there are some nice papers, I think there's one from Zauner and Frank Verstraete's group, where they studied the whole spectrum of this thing and found that it has a lot of information: if this is a ground state, you can actually learn a lot about the excited-state structure just from the ground state by studying this object. So it's a really interesting thing to do, actually.

The inverse and the identity? The inverse, I don't think so, unless it's a unitary matrix, so there's no standard symbol. Some people do it; there are papers where people create new symbols, so there are extensions to the notation that are maybe not standard, but you can make them. But the identity, for sure, that's a good question. The identity is just a line. Let's say I have a tensor with four indices and I want to multiply one of them by the identity. You could draw a dot and say that's the identity, but what you usually do is just leave the dot off, because summing with the identity doesn't do anything; it just makes a longer line, and I can shorten the line back. Then if you want to write, say, a product of identity matrices, you just draw parallel lines, and that would be the identity operator on four spins or something like that. So it's a nice notation. Oh yeah, contravariant and covariant. I didn't mention that, but the standard thing there is to use arrows; in the convention I use, out-arrows are contravariant and in-arrows are covariant. Some people flip that around the other way, but arrows are usually the way to do it. And with arrows, it should be thought of as out and in, kind of how contravariant and covariant are sort of geometric ideas.
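Going back to the transfer matrix for a moment, here's a minimal sketch of that correlation-length bound in code (my own illustration, with assumed dimensions):

```python
import numpy as np

# Contract one translation-invariant site tensor A (left bond, physical,
# right bond) with its conjugate over the physical index, then read a
# correlation-length bound off the ratio of the two leading eigenvalues.
rng = np.random.default_rng(5)
d, m = 2, 4                                   # assumed dimensions
A = rng.standard_normal((m, d, m))

# T[(a,c),(b,e)] = sum_s A[a,s,b] * conj(A[c,s,e])
T = np.einsum("asb,cse->acbe", A, A.conj()).reshape(m * m, m * m)
lam = np.sort(np.abs(np.linalg.eigvals(T)))[::-1]

xi = 1.0 / np.log(lam[0] / lam[1])            # upper bound on correlation length
print(xi)
```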
So, okay, yes: how do you choose the matrices? Yeah, I'll say some more about that; it depends on the context. One thing you could do is create them by hand, and that could be a starting point to then do time evolution, by applying unitary operators to this thing in this form. But you can also search for them numerically: you can use an algorithm like DMRG, and that will find optimal matrices for you; I'll tell you how that works. It depends on the context, so there are basically, I'd say, two uses of tensor networks. One is kind of calculational methods, and one is variational methods, maybe, or optimization methods. In the calculational ones, you write down the answer to the problem you want in some form that's difficult to evaluate, like a path integral, and then you say, my job is to compute this thing, and you use tensor network methods to work on it. So you start from the right answer and end with an approximation to the right answer. The other methods are more variational, or optimizational, or whatever; for those, you can start with even random numbers in these tensors, and then you apply gradient steps or solve small eigenvalue problems, one tensor at a time. I'll say more about that in some detail.

So, just to say why we use this diagrammatic notation, I think it should be fairly clear by now. This slide kind of unifies the notation from the two other slides about matrix product states: here's the matrix product form of a matrix product state, and here's the one that shows the i indices, though on this slide I've written them as alpha indices, which is a bit more standard. When you write it out all the way, it looks really bad. I've seen that notation in a lot of papers and it's just terrible, right? I mean, it's clear, but it's very complicated. So we like the diagrammatic notation better, because you can leave off the names: you can leave off the alphas, you can even leave off the s's, and that's just a lot cleaner, once you get used to it, than looking at expressions like that in papers. And it starts to help you have an intuition for these things, and to be a little more creative about extensions you could do with them. So what are some of those extensions, on that note?
So the two most well-known extensions of this matrix product state idea are these two: the PEPS tensor network (the name is not so useful; it stands for projected entangled pair states, but maybe a better name would be tensor grids or something like that), which is just the idea of extending MPS to 2D, and I'll say a bit more about that; and the MERA, which extends matrix product states by making them layered, kind of how there are regular neural networks and deep neural networks; it's like a deep MPS, if you want. I mean, that's just a motivation; I don't know if that's really true. These were introduced and motivated in these works, and I can send my slides around later if people want the references.

So what is the PEPS tensor network? You can think about tackling a 2D problem by saying: okay, matrix product states work well for 1D, which they do, and I'll have some slides explaining the applications a bit later, so maybe we can try that in 2D by putting them along the columns. But then you say, oh wait, what about the rows of the system? We left those out; they're not going to be correlated beyond mean field. You might flip it the other way, same problem. So you do something very simple: you say, what if we just connect them up in the pattern of the actual 2D lattice? And you say, okay, success. There was a lot of enthusiasm when that idea was first proposed, but then it was realized that it isn't quite as easy, because now you have loops, and that makes many of the sums you want to do, even just normalizing one of these things exactly, exponentially hard. But you can do really nice controlled approximations that let you do all the things you need to do, optimization and things like that. Someone who's made extensive use of these and really developed a lot of these approximations is Philippe Corboz, so I encourage you to look at all his papers on PEPS. This is just to give you a flavor; I'm not going to have a lot more to say about them, but we could discuss them in the afternoon if you want.

You can actually address infinite 2D systems with PEPS by taking all the tensors to be the same, or the same within a small unit cell, and you can do these interesting updates where basically you double the network onto itself, summing over most or all of the site indices. Then you think of these corners as semi-infinite contracted tensor networks, and you use techniques called things like TRG or corner transfer matrix; basically they're like matrix product states at the boundary that you bring in to reduce all of that to one big tensor, whose indices you can think of as something like the product of all of those, but compressed down. You can do that for the corners and the sides and get an environment; you iterate all these steps, and then update the one tensor here in the middle by attaching it to this environment, computing one update, and then redoing this over and over and over again. And you can actually get really state-of-the-art results for challenging 2D systems this way.

The MERA is a generalization of matrix product states to have a layered structure, and so why might you want to do that? What's the reason?
Well, the reason is that you can show rigorously that, at least at long distances, a matrix product state only captures exponentially decaying correlations: if I measure two operators separated by a distance x, and take them further and further apart, the correlation decays exponentially. But in MERA, you can show, or at least you can construct MERA such that, if I put the two operators near each other they have some correlation with each other, but as you bring them further and further apart, the correlation function is dominated by these contributions that take the shortest path up into the network and through, like that. So it's kind of like geodesic paths through the network, and the idea is that the length of these geodesic paths only grows very slowly as you drag the two points apart, because the path can go up and through the network in a shortcut. By contrast, if the path had to bump along the bottom, it would have to go through this tensor, then through that one; you'd basically get a contribution from every tensor along the bottom, like a product of however many you went through. That's why correlations decay exponentially in a matrix product state, but here you hit fewer tensors by going up and over. So it's a really nice idea; that's sort of the key point of a MERA.

Just to give you a bit more perspective, I'd say that right now the status of things is that PEPS are starting to be operationalized quite a lot for really tough 2D physics problems, while MERA is more in the space of really nice ideas, maybe connecting even to high-energy physics, but it's not quite as weaponized for attacking challenging problems as PEPS and MPS are. Sorry? What do the triangles mean? I didn't get into that, just for the sake of time, but basically the triangles, on some level, are just tensors, but they're constrained. If you think about the network as a quantum circuit, or as some kind of unitary network, the ones with four indices are all unitary: they just mix two sites together, they change the basis of two sites. The ones with three indices take two sites, one from one side and one from the other, and compress or reduce them; it's like a change of basis plus a projection. It says: take this four-dimensional space and represent it by a two-dimensional space. So it's not a unitary; it's first a unitary, then a projection. It's like an RG step, a renormalization group step, where you merge sites.

Yes, you could; you could try this in momentum space. That unfortunately has not worked out too well, and it gets into the issue (I don't have a lot of slides on this, but we could talk about it more) of the area law of entanglement. It turns out that in momentum space the area law generically doesn't hold, except exactly at non-interacting points, and it doesn't hold even near them: once you introduce even a tiny amount of interactions, you're hit with the volume law of entanglement, like the maximum amount of entanglement scaling, right away, pretty fast. So unfortunately applying tensor networks to momentum-space grids hasn't been much of a success. But fortunately the real-space approach is working quite well, and now we're applying it to things other than lattice models in physics; that's what I'll be talking about a good bit tomorrow. So applying it to lattices, but the lattices might be pixels of images or something like that, not physics lattice models necessarily.
And there, the momentum-space approach has actually worked a good bit better; it turns out images just have different correlations than wave functions do, in some sense, which isn't too surprising, but it's interesting to think about the differences. We could talk a bit more about this. Unfortunately I can't cover everything about tensor networks, because I literally taught an 18-hour, one-week course about them earlier this year; there's a lot to be said going in these different directions, but today is mostly about matrix product states, and we can loop back to some of this. Let me, in the remaining 10 minutes, go through a couple of concrete examples; this gets to the question over there about how we know what to put into these matrices. Let me tell you about two examples of exact matrix product states that we can write down, and then later today we'll talk a bit more about ways you can find them numerically as well.

Ok, so, two examples that are meant to be instructive. One is a singlet. Real simple; the simplest non-trivial wave function we could think of. We could start with a product state, but that one is too easy, so let's do a singlet. Think of two spin-one-halves, and we want to represent this wave function. In a way this could be one of the tougher wave functions to represent, because it's maximally entangled. We can write it suggestively like this: let's factorize it into a kind of vector of states for the first spin and a vector of states for the second spin, and write it as a product, like a dot product, of these two vectors. I think you can see how this gives the same state above. We have one over root two times up times down, but the times here is a funny times: because of the kets, it's like an outer product, just putting them next to each other, while there's a sum running internally over the two entries of each vector. So it's kind of a funny thing. Then the other combination is one over root two times down times minus up, which is this other term. So it's some kind of ket-valued vectors that I'm dotting together, if you'll allow me to do that, but you can make it mathematically rigorous.

So why is that a matrix product state? Maybe it's already fairly obvious to you: this is basically already a valid form of a matrix product state; in fact, the oldest papers on matrix product states started by thinking of them in this form. This is just a one-by-two matrix and this is a two-by-one matrix, if you want, and the s index that sticks out of each matrix is just telling you what combination of up and down you're in for each entry of the matrix: here you're totally in up, here you're totally in down. But let's try to unpack this a bit more (maybe it's more confusing than helpful on this slide). If you want to explode this out and think about these tensors, you can think about it like this: the spin index is the one coming out of the plane, out of the board. On the first spin, if I clamp the spin index to be up, that vector turns into this vector, (1 over root 2, 0); so that's one of the matrices, the M1-up matrix if you want. If I clamp it to be down, I get (0, 1 over root 2); that's the M1-down matrix. And on this other spin, if I clamp it to be up it's (0, minus 1), and if I clamp it to be down it's (1, 0). So that's like saying, if I write it in the form psi(s1, s2) = M1(s1) M2(s2), that M1(up) = (1/sqrt 2, 0), M1(down) = (0, 1/sqrt 2), and similarly for M2, from the other half of the slide.
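As a quick check of those numbers (my own sketch, not from the lecture, using s = 0 for up and s = 1 for down):

```python
import numpy as np

# Check of the singlet example above: psi[s1, s2] = M1[s1] . M2[s2].
M1 = {0: np.array([1 / np.sqrt(2), 0.0]),    # clamp first spin to up
      1: np.array([0.0, 1 / np.sqrt(2)])}    # clamp first spin to down
M2 = {0: np.array([0.0, -1.0]),              # clamp second spin to up
      1: np.array([1.0, 0.0])}               # clamp second spin to down

psi = np.array([[M1[s1] @ M2[s2] for s2 in (0, 1)] for s1 in (0, 1)])
print(psi)   # [[0, 0.707], [-0.707, 0]]: (|up down> - |down up>)/sqrt(2)
```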
Ok, questions about that? So it looks complicated now; it's like taking a really simple wave function and making it look terrible, right? That looks really simple and that looks really complicated. But computers like this form better: they don't know what a ket is, you have to tell them what a ket is and what the numbers are; what they can store is arrays. So that's the advantage, and the real payoff is that you can compress really big wave functions. Down here there was no compression, but you can compress them when they get really big; we'll see that on the next slide.

Ok, so how do you compress a much bigger wave function as a matrix product state? Let's take a very important example. This is just a state with amplitude phi 1 to be on the first site, or in the first orbital, whichever way you want to think about it, amplitude phi 2 to be in the second orbital, phi 3, phi 4, and so on. It's a totally generic one-particle state with amplitude phi j to be on site j, and that's what I've written here in this second-quantized form. So how do we write that as a matrix product state? Let me just show you, and then we'll see why the construction works. Using this ket-valued matrix notation, which is the most compact one I know for writing it on slides: we make the first matrix be the row ( phi 1 |1>, |0> ). Basically that says that if we reach into that matrix and pull out the |1> state, it carries the amplitude phi 1 with it; or we can reach into that matrix and pull out the |0> state, and we don't have any amplitude yet. The second one looks like this: it has the |0> state on the diagonal, phi 2 |1> here on this off-diagonal, and this other entry is just the number zero. That's neither the ket |1> nor the ket |0>; it's just the zero of the vector space, not a vacuum state. And after that, the pattern just repeats: you have the same matrix over and over again, but on site three with phi 3, on site four with phi 4, and so on. It's some kind of transfer-matrix-like construction, but for things that are not the Ising model. That continues all the way up to the next-to-last site, and then on the very last site you have this capping-off vector. If you look at it, the first matrix is actually just the bottom row of the repeated matrix, and this capping-off vector is just the first column of that same matrix, with the correct site number. So it's really the same pattern throughout; you just have to have the right starting and ending conditions, and that's just a way to make sure you get the right result.

So how does this work? One way to look at it is to tell a story. I start in the first row, and here I can either pick phi 1 |1>, which means I leave in the first column (the bond index is now set to the first value, because I picked that entry), so I go into the next matrix in its first row. And in the first row I only have one option, I can't pick the particle entry, so I pick vacuum; but then I'm still leaving in the first column, which means I come into the next matrix in the first row again, so I have to pick vacuum again, and I just get vacuum the rest of the time.
If instead I pick the |0> entry on the first site, something else happens: I leave in the second column and go into the next matrix in the second row, and now I still haven't placed the particle, so I can pick either entry. Maybe I pick vacuum, vacuum, vacuum, and then I pick phi 4 |1>; or say I pick phi 3 |1>: now I've picked up that amplitude, and I'm back to leaving in the first column, so I just get vacuums the rest of the time. So I get these terms where the particle goes in at exactly one place. Or I can just study this by multiplying the matrices out. I can multiply this vector by this matrix, and when I do, I get this combination: the first term is coming from that phi 1 |1> meeting this empty state on site two, and so on. Then this next column is going to come over and do something interesting: it either extends these terms with the vacuum state on site three, or extends this one to have a particle on site three, while this entry just bumps the vacuum along. Same kind of thing again: this column either extends these with vacuum on site four, or extends this one with a particle on site four, and this one again just bumps the vacuum along, and so on. And when we're done, we get the state above.

So anyway, that was the kind of labored explanation of how you write that as a matrix product state, but I think it's interesting that you can take this state, which formally lives in this two-to-the-n-dimensional space, and write it as a product of just two-by-two matrices. That's a massive compression of that state. Not so impressive, because it's a single-particle state, but you can actually do this for many-particle states as well. This particular construction doesn't scale, but there's a really nice paper by Fishman and White that shows how to scale this kind of construction to arbitrary numbers of particles, as long as they're non-interacting.

It's interesting just to note that there are lots of non-trivial wave functions that have exact representations as MPS, not just that single-particle state but all of these other examples as well: the AKLT ground state, if you know what that is; the Majumdar-Ghosh ground state, which is just a product of singlets; the GHZ states and W states, which are of interest to people thinking about quantum information and things like that; cluster states, which are related (these are actually an example of an SPT state, but they're used in this thing called cluster-state quantum computation); and the Kitaev chain ground states, the states of this toy model written down by Kitaev that actually has Majorana edge states. So if you ever find yourself confused about what people mean by edge states or Majorana edge states, read this paper by Kitaev (I can send you the link); it's an exact solution of a model that you can write down and solve yourself and understand in all its details, and it has a matrix product state representation, which is neat to see as well. A lot of these are mentioned in this paper by Perez-Garcia and collaborators, which I'll come back to in a minute.
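Going back to the single-particle construction above, here's a minimal numerical check (my own sketch; the bond-index conventions here are my assumptions, not the slides' exact layout):

```python
import numpy as np

# Bond state 0 = "particle already placed", bond state 1 = "not placed
# yet"; physical index s = 0 empty, s = 1 occupied.
N = 5
rng = np.random.default_rng(6)
phi = rng.standard_normal(N)

def site(j):
    A = np.zeros((2, 2, 2))    # (left bond, physical, right bond)
    A[0, 0, 0] = 1.0           # already placed: carry vacuum along
    A[1, 0, 1] = 1.0           # not yet placed: stay empty, keep waiting
    A[1, 1, 0] = phi[j]        # place the particle here, amplitude phi_j
    return A

def amplitude(config):
    vec = np.array([0.0, 1.0])             # start in "not placed yet"
    for j, s in enumerate(config):
        vec = vec @ site(j)[:, s, :]
    return vec[0]                          # must end in "placed"

for j in range(N):                         # amplitude of |0..1_j..0> is phi_j
    config = [0] * N
    config[j] = 1
    assert np.isclose(amplitude(config), phi[j])
print("MPS of 2x2 matrices reproduces the single-particle state")
```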
So, a few minutes for questions if you have them. Let's see. If you mean constructing them, finding them numerically, then the cost is the linear dimension of the matrices cubed, so it's not so bad; that's a really good scaling. Otherwise it can depend a lot: if you mean constructing them in time, by a time-dependent process, then it can be a problem; that can lead you deep into the exponential part of the Hilbert space, so it can be a lot worse. It just depends on what state you're heading toward. We'll have time for questions in the later lecture; I think I might be giving the next one too. So anyway, thanks for your attention.