So the floor is yours, thanks. Thank you, Diego. Thanks, everyone, for coming back after lunch. Today we have a slightly more intense schedule, but hopefully you can handle one more. I will continue from what we discussed yesterday, but with slightly more emphasis on the applications. Just as a quick review: yesterday we discussed the basic logic behind using neural networks, and how to apply neural networks to data that you get from someone and analyze the outcome. We did a slight tip-toe dip into some unsupervised methods, and hopefully you are at the point where you more or less understand the code that we gave you, at least the basic structure and which line to change if you want to change the number of neurons. Are there any remaining questions from yesterday, or any concepts that you would like me to repeat or talk about more today? Anything that you feel is not 100% clear? Everyone understands everything, that's great. So today we are going to talk about how to incorporate neural networks into an actual workflow, specifically experimental workflows. This question arose yesterday a few times when I was trying to push you on the weak spots of this super smart thing where I don't have to use my brain, the neural network just classifies the phases and everything is solved. One of the things that came up was: you gave us this Monte Carlo data set, but what if we cannot generate that? And we talked about the fact that if I have a nice problem that a basic MCMC Monte Carlo works for, I can just go and generate thousands of configurations and start playing all of these games. But if I can do this, it already means that I knew the solution to the problem in the first place. So it's a nice proof of principle, but the step towards learning something new is sort of missing. How things work in practice is that you don't get a nice, self-contained, labeled data set of Monte Carlo configurations. You get some experimental data, or maybe simulated data from some approximate method we are using to simulate the experiment. So today we are going to take the methods that we learned yesterday. The beautiful thing is that, as you will see in the code, you can take them exactly as they are; it's just that the input, how you prepare the data set, and how it fits into the overall workflow will be slightly different. But again, keep in mind that everything we learned yesterday on those simple, dummy problems will translate just as well. Okay. Then I showed you this constellation of different machine learning and quantum physics techniques, many of which you are going to learn about in more depth during this school. There is one that I want to particularly highlight because it's something that my group cares a lot about, and that's automated control of quantum devices. You may have noticed that nowadays quantum computers are getting bigger and bigger and more and more powerful, and that's a good thing. And sometimes you have a Nature paper saying, oh, we outperform classical computing by this and this, so now the quantum computer is useful. And then usually a week later you have some tensor network simulation that simulates this experiment. And so we are in this dialogue that on one hand is productive, in the sense that it is pushing both fields forward, and I really want to say that this is not a bad thing, right?
Both the experiments and the numerical people are making all of this amazing progress. Where there is a little bit of friction is that if I'm just trying to simulate the experiment that already exists, and not combining things together, this idea that I can maybe enhance a classical machine learning data set with quantum data, or take classical data and enhance some quantum machine learning algorithm, becomes a little bit complicated and prohibitive. On one side you usually have this very specific setup of the quantum experiment. Who has been in a quantum computing lab for a visit before, or is an experimentalist? Oh, really? Oh my God. Okay, so I will say a bit more about the experimental side. You have a dilution refrigerator, then you have this huge rack with the microwave electronics, and then you have some computer next to it that runs Windows. And at that computer sits a PhD student or a postdoc who is changing things and watching how the experiment behaves. And then when you are doing numerics, you just write some advanced sampling algorithm that solves the problem at hand. So the way I get the data from these two things is super, super different. It's true that I can calculate some expectation value that I can compare from both, but I think that just connecting these two things is one of the outstanding challenges in science nowadays. I am talking about this a lot because I think it's particularly important, and I hope you will remember it as you go on with your scientific careers. So, contemporary quantum experiments: we talked about this yesterday, right? The wave function is big, and then there is the experimental data. Here are two examples of very famous chips from academia, two examples of quantum computers that did something impressive. This one is 17 superconducting qubits from Andreas Wallraff's lab at ETH Zurich; this is the first chip where they did a quantum error correction demonstration. And this one is six quantum dots in silicon from Lieven Vandersypen's lab in Delft, and this is the first chip that did universal quantum computing in the semiconductor quantum dot setting. So these are good examples of semi-recent flagship experiments that I want to operate if I want to achieve something useful. To read out these things, if I want to check whether my error correction works, or whether my universal quantum circuit did something useful, I am going to need a hundred thousand to millions of measurements. And so the data is there, right? It's just that the data is sitting in some file on some Windows computer, and it's not labeled or easily transferable as a sample that I could use to train my neural network, unless I send my students to the lab, make them sit there for two months and figure out the workflow. So this is just to say that we have some kind of data. And so I want to bring back this DM that Anna sent me on Twitter: do we actually have data that super large scale machine learning models can be trained on? Because if you think about it numerically, it's a very valid question, right? For the things that I want to study numerically, creating data is super expensive for most of the problems that we care about.
There are a lot of smart people, and you are going to hear from Miles for the rest of this week, telling you how to do this numerically efficiently, but there is still an extreme overhead for the problems that we really care about. And then there is this other side: the people sitting in the basement at your university who have terabytes of samples from quantum states that actually take only nanoseconds or milliseconds to generate. It's just that we don't talk to each other enough to make it useful. All of this prelude was to motivate this field of automated calibration, control and readout of quantum systems. This is not even the final step that I would want, where all the numerical techniques and quantum measurements work perfectly in sync and we have an immediate way to produce mixed experiment-theory data sets. It's even simpler: I just want the experiment to run autonomously, so the data can be collected and labeled autonomously. Because whenever there is a human sitting there thinking, this gate looks a little bit weird, let me randomly turn a voltage two millivolts up, you will not end up with a consistent data set. You will end up with something, but it's not very practical, or you will have to ask that poor human to add, to an already insane workload, some systematic labeling of whatever their intuition tells them they should measure. So in the end, on some abstraction level, what it boils down to is this. You have some complicated optimization landscape where the axes are all the control parameters of your experiment, and we are not yet talking about the J and U and mu of your Hamiltonian like you heard in the first lecture today; we are talking about all the different voltages, all the parameters that shape your pulses, all the knobs that the experimentalists are actually turning when they create the thing that we as physicists want to study. The idea here is that instead of a human sitting there and turning the knobs, you use a neural network to do that. You could of course use a standard algorithmic method, but you may know that some of these quantum experiments are famously unstable and wiggly: you can prepare a particular state or tune up a particular gate, and ten minutes later you have to start over. And there is a lot of this, because the devices I will show in the first part of my talk are in solid state. There are a lot of extra effects that we don't know how to model. I can write a simple Hamiltonian to describe a superconducting qubit or a quantum dot, but those things are sitting in a complicated solid, so the substrate matters, all the electrons around it matter, and that changes the behavior of the device, and the effects of noise are very complex. And then we always say this thing when we talk about machine learning: the generalization power is amazing, it's really quick to evaluate, and it's really robust to noise. So the idea was, why not? Let me explain how this tuning works on a really simple pictorial example, where we don't actually need to understand the physics much and can just concentrate on the image recognition aspect, and then I will give some more elaborate physics examples. Who knows what quantum dots are? Okay, so quantum dots are one of the solid state implementations of a qubit.
Roughly speaking, you trap an electron in a piece of material, and you create the confinement of the electron with these gray electrodes; we call them gates. Here in this plot there are three quantum dots, those are the dashed circles, and you can just think of them as three electrons isolated in all spatial dimensions. If you measure a device like this, this is the I_QPC current, you get a plot like this. The two axes are the voltages of the gates, PG1 and PG2; they are called plunger gates, but that's not important. Those electrodes have voltages, I put them on my X and Y axes, and then the density plot is the current that goes through the device, meaning how my electrons are hopping from dot to dot. The color scale is the derivative of the current. This plot is called the charge stability diagram, and what it tells you is how many electrons I have in each dot. If I start from the lowest corner, I have zero electrons in the first dot and zero electrons in the second dot. If I move over one transition, the blue line, I get a big change in the current: I have an electron hopping into my left dot, and in the right dot there is still nothing. If I go one up, I have one electron in each, and so on. So actually we don't need to understand anything about what is going on in this device, because if we want to look at these charge stability diagrams, it just looks like this: we just need to count the transitions and we can prepare whichever state we want. Preparing these various electronic states has to do with how you choose your basis for a quantum computing problem. But this is one of those challenges where preparing a concrete electronic state was the thing where someone just sits there and tunes all the voltages until it looks good. So what you can do is say: but this is an image recognition problem, right? If I just care about counting whether my transition went like this or like that, I can measure a small subset of this big charge stability diagram and just have a neural network classify: do I have this one? Do I have that one? Do I have both? Or do I have nothing? And based on that, you can embed it into some really simple Python loop that tells you how you should walk, how you should change your voltages. What we did specifically here, you see in the picture down here, is that you first measure bigger windows, so you get to the state (0,0), and then you start measuring smaller windows, you let the neural network classify what kind of transition you have, and then you slowly walk into the state that you want. The classification here is done by just a convolutional neural network like we learned about yesterday. We have a somewhat elaborate labeling procedure, but I will not cover that here; ask me if you are interested. The code that you would use to do this is exactly the one that we used yesterday with the Ising gauge theory. And this is the point, right? If I can do image classification of Star Wars and Star Trek like we did yesterday, I can just as easily do image classification of these windows in a charge stability diagram.
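Just to make this concrete, here is a minimal sketch of what such a classify-and-step loop could look like. This is not our actual code: the window size, the four classes (no transition, left-dot line, right-dot line, both) and the measure_window helper are made-up stand-ins, and the decision logic is a caricature; only the structure, measure a small window, classify, step, is the point.

```python
import torch
import torch.nn as nn

# Small CNN that classifies a measured window of the charge stability diagram
# into 4 classes: 0 = no transition, 1 = left-dot line, 2 = right-dot line, 3 = both.
class WindowClassifier(nn.Module):
    def __init__(self, n_classes=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(16 * 4 * 4, n_classes),
        )

    def forward(self, x):
        return self.net(x)

def tune_to_target(measure_window, model, v1, v2, target=(2, 1), step=2e-3, max_iter=200):
    """Greedy walk in gate-voltage space: measure a small window, classify the
    transitions it contains, update the electron-number estimate and step the
    voltages until the target charge state is reached.
    `measure_window(v1, v2)` is a hypothetical function returning a 2D array."""
    charge = [0, 0]  # start from the (0,0) corner
    for _ in range(max_iter):
        window = torch.tensor(measure_window(v1, v2)).float()[None, None]
        label = model(window).argmax(dim=1).item()
        if label in (1, 3):
            charge[0] += 1          # crossed a left-dot transition (caricature)
        if label in (2, 3):
            charge[1] += 1          # crossed a right-dot transition (caricature)
        if tuple(charge) == target:
            return v1, v2
        # step towards whichever dot still needs more electrons
        v1 += step if charge[0] < target[0] else 0.0
        v2 += step if charge[1] < target[1] else 0.0
    raise RuntimeError("did not reach target charge state")
```

The real labeling and decision rules were more careful than this, but conceptually it really is just yesterday's image classifier wrapped in a simple loop.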
This is not super deep, but you can see the result of this not-super-deep thing here: these are examples of tuning runs on the device, and when you follow the green line you see the steps that the algorithm took to get to the state (2,1), because that was the state the experimental team picked that they wanted to prepare. If you look at this plot, you will notice that there is a lot of white space, and if you are really observant you will probably also notice that the resolution is much, much lower than that of the beautiful plot I showed you at the beginning. That's because if you don't have to rely on a human squinting at it, but only care about some gradient feature in the data, you actually need to measure much less; the neural network sees the thing in a different way than we do. So we managed to take an order of magnitude fewer data points, and mostly we only need to measure the square the neural network is looking at at the moment. We don't need to measure the whole thing, then assign the charge numbers and then count the transitions; we can just measure a small subset like this. This is something we did back in 2020. Back then it didn't work super reliably, we ended up in the correct charge state in about 60% of the runs, but it was an example showing that you can actually measure very few points, outsource the decisions to the network, and the time will be comparable to your best, most experienced tuning postdoc, who can then do something better with their time in the meantime. Yeah, I should have said this at the beginning, because I saw a lot of hands of people saying they don't know what quantum dots are: we care about quantum dots a lot because if you are just trapping electrons in a solid state, that's very compact, and people say this is potentially a super scalable quantum computing platform. The problem you have with, say, superconducting qubits or other platforms is that the millions of qubits we say we need do not fit in the current technology; you cannot put them into the dilution fridge because they are too big. That is not the problem with quantum dots. Then you have other problems, because all the electrodes have to fit on the chip and there is a lot of wiring and yada yada yada; it's still complicated, but at least technically it's something that can scale. You pay for it with all of this moody noise, for which the fixed, let's say algorithmic, solutions usually do not work so great. So this idea of replacing things with neural networks is very popular in the community nowadays, and most leading labs in the world have some neural network embedded somewhere in their workflow. The example I am showing is just one tiny subset of this quantum dot tuning, because you need to do a lot, a lot of steps to prepare your qubits and do the gates, but it is a nice pictorial example of how this works. Where we are as a community right now is that people are building larger and larger quantum dot chips; this is the four-by-four array with 16 quantum dots from Delft. And then you can imagine that the problem of tuning will become even more annoying, because I just showed you an example with two dots, each of them with one plunger gate, and I hid the other parameters from you for the time being, but that's still a two-parameter problem, so I can plot this two-dimensional charge stability diagram, find the transitions, solve it, done, right?
But if I have, let's say, 50 voltages in my system, probably before I tune them all up in pairs everything has drifted and I can just start over. That's a simplification, but it's really not super scalable, not to mention that you are also asking your experimental colleagues to measure so, so much data. Even for this previous experiment I showed you, we actually measured a bunch of those charge stability diagrams and had to cut out all those windows and label them; it was a lot of work, and that was just for two dots. Now let's say I have to redo it for 16 or 50 dots; that's not so easy. So for the generation of experiment-like data we just released this package, it's called QDsim, and basically what the package does is that you define your chip: you put in where your dots are, where your sensors are, where your gates are, and it produces charge stability diagrams on demand, so you can generate your machine learning training set on the go. It works on a laptop, and even for a big device one of these takes a few seconds, so you can reasonably prepare a small training set like this. Of course we are using a very naive model, because we need the computational speed, but the idea is that if you have a way to generate a lot of these, then you can pre-train your models and only fine-tune them with the actual experimental data, which again offloads some of the difficulty of collecting and labeling data. We managed to make it fast because we basically map everything onto a convex optimization problem, and for that there are a lot of known, powerful solvers.
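By the way, the pre-train-then-fine-tune step that such a simulator enables is itself very simple. Here is a minimal sketch (my own illustration, not the package's API), using random tensors as stand-ins for the large simulated set and the small labeled experimental set, and reusing the small CNN classifier sketched above:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def train(model, loader, epochs, lr):
    """Standard supervised loop: cross-entropy on the 4 window classes."""
    opt = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

# Stand-ins for the real data: a big simulated set and a small experimental one.
# In practice the first would come from the simulator and the second from the lab.
sim = TensorDataset(torch.randn(5000, 1, 32, 32), torch.randint(0, 4, (5000,)))
exp = TensorDataset(torch.randn(200, 1, 32, 32), torch.randint(0, 4, (200,)))

model = WindowClassifier()                      # the CNN sketched earlier
train(model, DataLoader(sim, batch_size=64, shuffle=True), epochs=10, lr=1e-3)

# Fine-tune: freeze the convolutional features, retrain only the final layer
# on the small experimental set, at a lower learning rate.
for p in model.net[:-1].parameters():
    p.requires_grad = False
train(model, DataLoader(exp, batch_size=32, shuffle=True), epochs=5, lr=1e-4)
```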
Questions on this? I know that nobody here probably cares much about quantum dots, and we will move to neutral atoms presently, but I wanted to show this because it's a simple example of the success of a neural network under actually realistic experimental conditions. Everything is clear? Then let's just move on. This is something that I care about a lot, because I gave this grandiose introduction about how we need to connect experiment and theory, and that's really something that is very purposeful for me, so in my group we are working on this type of application a lot, across different experimental platforms. So let me give you a couple of examples of more contemporary applications. The first one is actually still going to be simple, but I picked it because you had this amazing cold atoms tutorial this morning, and it's also really at the heart of the topics covered in this school. It's going to be a little bit old school as cold atom applications go, because it was from around 2021, which is like decades in machine learning time, but it again lets us adapt the simple code we learned to use yesterday to an actual physics problem. I will not say anything about this slide, because you had this amazing introduction in the morning, I hear, and I think nothing can beat that. Everyone who came to this knows what a quantum simulator is? Okay, great. So you also know that there are a lot of awesome experiments, that it's really one of the leading platforms in terms of size, and we even had a super cool quantum error-correcting state preparation paper earlier this year, so it's a very complex, very cool, evolving field. Now remember what we were saying first thing yesterday about the scaling of the wave function; I am sure it came up a lot in the morning as well. If I just calculate the vanilla Hilbert space size for the systems that we already have, and 50 qubits in this context is actually really little, you know that you have 200 plus nowadays, so there is actually a question of whether you want to call them qubits or particles, but the Hilbert space of that is going to be 10 to the 15, and we have exploded out of the exact solvability window yet again. So let's take a super, super simple model for this kind of physics problem. Again, you had this in the morning; I arrived a few minutes late, but I saw the Bose-Hubbard Hamiltonian on the slide. If you make an optical lattice and trap some atoms in it, then in the most simple approximation you can describe them by how they hop from site to site and how they interact on a site: the hopping will be J, the onsite interaction, in our case repulsion, will be U, and then you have some chemical potential. So then the question is how to learn the parameters that govern the physics of these large scale simulators, because I cannot diagonalize the problem, and even if I could, how do I know what my toy model has to do with the experiment, right? So we want to solve an inverse problem: we want to take the experimentally accessible information and learn the Hamiltonian, and you already heard from Francesca in the morning that this is very important. I am going to give you a super minimal example of how one can do that on this platform. A common experimental sequence in this setting is the following: you have some initial state that you know how to prepare, for example you can be deep in a Mott phase where you have some Fock state, so you prepare a state that is an eigenstate of some trivial Hamiltonian. Then you do something that people call a quench. Did that come up in your morning lecture? This is like a time evolution experiment. I'm not sure, okay, I see confused faces, I will explain it in detail. Basically you push your system: you vary some of your parameters rapidly, then you let the system evolve in time, and then you take a measurement. And the question is: can I take the measurement as input data and map it onto the Hamiltonian? Now let's take a pause and digest this, because it is a little bit more abstract than what we did before. Before, we just asked: this configuration, is it A or is it B, done. Or we had a picture of this charge stability diagram that maybe you don't care about, but it's just some picture: does it have a line or does it not have a line? But this is now harder. It's again a picture, because the way the measurement works in these systems, or one of the ways you can measure, is that you literally take this occupation picture, and modulo some parity constraints you just get this kind of information: I have an atom on site one, I have one on site two, I don't on site three, I do on site four, I don't on site five. So it's this Fock space kind of image, and then I want to take these and map them onto the Hamiltonian. This is somewhat harder and more abstract than just saying is there a line in this picture; now we are going beyond identifying visual features, we want to construct some kind of complex inverse function, and by an inverse function I mean that I go from the measurement to the Hamiltonian. But we said that neural networks are good approximators of arbitrary functions, so why shouldn't one approximate this function as well? Just a quick comment about this experimental sequence, so we really digest it and understand it. Let's think about having just one qubit. I will start from the state, let's say, zero, and then I do some dynamics where the spin is precessing around an axis; the Hamiltonian, and the unitary it generates, is as written above me, and you can think about it like this. Then I do measurements in the Z basis, and these measurements carry limited information, or let's say a varying amount of information, depending on when I choose to measure. The spin is precessing, and if I'm lucky and I measure in the Z basis when it's here, then I learn the state, right? But if I am unlucky and I time my measurement when it's like this, in the X eigenspace, then I just have a random number generator between one and zero, and I will need a lot, a lot of data to understand what's happening.
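As a tiny sanity check you can run this yourself. The sketch below is my own illustration, not from the experiment: it assumes a spin starting in |0> and precessing about the X axis, so the probability of measuring 0 in the Z basis is cos^2(omega t / 2), and it compares the measurement statistics at a "lucky" and an "unlucky" time.

```python
import numpy as np

rng = np.random.default_rng(0)
omega = 2 * np.pi  # precession frequency, arbitrary units

def p0(t):
    """Probability of measuring 0 in the Z basis for a spin that starts in |0>
    and precesses about the X axis: P(0) = cos^2(omega * t / 2)."""
    return np.cos(omega * t / 2) ** 2

def sample_z(t, shots):
    """Simulate `shots` projective Z-basis measurements at time t (outcomes 0 or 1)."""
    return (rng.random(shots) > p0(t)).astype(int)

for t in [0.5, 0.25]:  # omega*t = pi (lucky timing) and pi/2 (unlucky timing)
    outcomes = sample_z(t, shots=1000)
    print(f"t = {t}: P(0) = {p0(t):.2f}, empirical mean outcome = {outcomes.mean():.2f}")
# At omega*t = pi every shot gives 1, so a single shot is already informative;
# on the equator (omega*t = pi/2) the outcomes are a fair coin, and you need
# many shots just to pin down the probability, let alone the Hamiltonian.
```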
So it's the same idea, but in a more complex system: I have a lot of atoms in an optical lattice evolving with respect to some Hamiltonian that I want to learn. Instead of a single spin I have some initial state, I do some evolution, I do a measurement, and I want to reconstruct the Hamiltonian. So this was the very generic introduction. The collaboration we had with Markus Greiner's group at the time was this setup: they have the optical lattice in what they call a quasi-1D geometry, where it's very long in one dimension but only two sites in the other, so it's this ladder kind of architecture, and we want to learn the Hamiltonian of this type from the out-of-equilibrium time evolution of the system. The Bose-Hubbard Hamiltonian you saw before; I write it in a second quantized basis, and I think there were examples of that in the morning lectures, but just to quickly repeat: J is the hopping amplitude, U is the onsite repulsion, and the mu term is the chemical potential.
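For reference, since the formula itself lives on the slide, this is the standard site-resolved Bose-Hubbard form I have in mind (my transcription; the site-dependent couplings become relevant in a moment):

```latex
H = -\sum_{\langle i,j \rangle} J_{ij}\,\bigl(\hat{a}_i^\dagger \hat{a}_j + \mathrm{h.c.}\bigr)
    + \sum_i \frac{U_i}{2}\,\hat{n}_i\,(\hat{n}_i - 1)
    - \sum_i \mu_i\,\hat{n}_i ,
\qquad \hat{n}_i = \hat{a}_i^\dagger \hat{a}_i .
```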
Now the twist comes. Of course I can write, as a theorist, J is for hopping, U is for repulsion, mu is for chemical potential, and it's done, and then it's a three-parameter problem, easy peasy, right? However, in practice in an optical lattice those are going to differ lattice site by lattice site, because of finite size effects, lattice fluctuations, the fact that it's actually different whether you are in the middle or on the edge, and just general thermal and other instabilities. So the problem they were actually interested in solving suddenly became a high-parameter problem, because if I put only 10 particles on this lattice, which would be very, very few, I already have a Hilbert space of 10 to the 13 with 350 parameters that I need to find. So yeah, that's another great landscape to explore. We started from a super, super minimal subset of this landscape, and that was taking eight sites, four by two, with four particles in it. If you count all the possible J's and all the possible mu's and U's, you have 25 parameters to estimate, so that's still annoying even for this small subset, but let's walk through it; we picked this subset for a reason, and I will show you a little bit later in the talk how to scale it up. So what you would want is this: I did my time evolution, I took my snapshots, those are these blue configurations here, then I want to feed them into the neural network and I want it to predict all my U's, J's and mu's. Actually, from now on I will just drop the chemical potential, so we have two kinds of parameters to think about. Machine-learning-wise this is very similar to exactly what we learned yesterday: I have an input, it's a configuration, now in a different basis, but it's still a configuration; I have a feed-forward neural network; and I want it to tell me a parameter. So, super similar. There is a catch with this, which you also get to explore for yourselves in the exercise notebook if you want: this problem I was talking about, that you need a lot of snapshots to characterize the state, keeps coming back. Remember the precession of the spin I was talking about: one measurement actually tells me nothing, and that's a completely trivial example, so you need to have a lot of them. And then we have a lot of parameters: even if I, for the sake of the argument, drop the chemical potential fluctuations, I still have 18 U's and J's, and if I need to create a data set that has enough representative samples for all the possible combinations, or at least for a sufficient subset of these combinations, it's not going to scale great. Similarly, training a network that has 18 different outputs is also not easy. So we did something really, really drastic, and we made two choices. First, we took these snapshots and calculated different correlators from them; they go up to four-body density correlators in this case, because we said we have at most four particles (if you had more, you would maybe have to include more), and if you go through all these correlator options you come up with 170 of them. And then we did the second thing: we factorized the problem, so we made a separate neural network for each parameter that we want to estimate. Now, this is a really big approximation, I know that, but go with it for now, and we will see how well it works compared to some other methods afterwards. I want to slow down here and stop at this point, because usually when you get experimental data, unless you have a really simple classification problem, like is this a good piece of data or a bad piece of data, and you have representative samples of both, it's probably not going to work to just take the data and start training, if you have this kind of multi-parameter setting. First of all, with quantum measurements you need many shots for every instance, and then the number of instances is not going to scale great either, because you need to take into account all the possible parameter combinations. Are you with me? I know it's after lunch, this is hard on everyone; is this more or less clear? Okay. So this is the type of thing: if you are overwhelmed by your parameter-space complexity or by your data-set complexity, these are the kinds of tricks you want to start thinking about when preparing your data set. In particular, here each of these neural networks looked like this: we had four hidden layers (sorry, not five), and you see that there are quite a few neurons, because even after I simplify and simplify, it's still very complex. The loss function here is slightly different from what we had yesterday, but conceptually similar: here we do a regression, I want the value of U1, U2, U3, J1, J2 and so on, so I am asking just for the L2 difference between the predicted and the correct parameter for each of these configurations.
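Structurally, each of those per-parameter estimators is just a small dense regression network trained with a mean-squared-error loss. A minimal sketch, with made-up layer sizes and random stand-in data; in the real workflow the input is the vector of roughly 170 correlators computed from the snapshots, and you train one such model per J or U:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

n_correlators = 170          # correlator features computed from the snapshots

# One regression network per Hamiltonian parameter (here: a single J or U).
estimator = nn.Sequential(
    nn.Linear(n_correlators, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 32), nn.ReLU(),
    nn.Linear(32, 1),        # predicted value of the parameter
)

# Stand-in training data: correlator vectors and the true parameter value used
# to generate them (in the real workflow these come from simulated quenches).
data = TensorDataset(torch.randn(10_000, n_correlators), torch.randn(10_000, 1))
loader = DataLoader(data, batch_size=128, shuffle=True)

opt = torch.optim.Adam(estimator.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()       # L2 difference between predicted and true parameter
for epoch in range(5):
    for x, y in loader:
        opt.zero_grad()
        loss_fn(estimator(x), y).backward()
        opt.step()
```

Each parameter gets its own independently trained copy of such a network; that is exactly the factorization I just described.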
So if we take 2,500 experimental snapshots, you get a density plot of the precision on each site and this distribution of the estimated J's and estimated U's, and the gray band gives you the one-sigma error. Overall that is less than 0.1% estimation error on these parameters. We had questions about how precise this is going to be, because we made these compromises, right? We first calculated the correlators, and then we made a separate estimation network for each parameter. So we took one of the standard Bayesian estimation algorithms and calculated the same thing on the same number of shots, and there we actually had to take into account the joint distribution, without factorizing things so much. You see here that the neural network distribution for the small number of shots, those are the NN 2,500 and Bayes 2,500 curves, is much narrower, and only when you scale all the way up to 20,000 shots, 20,000 Fock state measurements, does the Bayes distribution start approaching the neural network distribution. We don't have an analytical explanation for that; probably it has to do with good generalization to low amounts of data, but it was very consistent with what the experimentalists told us they observed: previously they were taking measurements on the order of 10 to the 4, and it was super time demanding that this amount of measurement was needed for this kind of estimation. So training this kind of independent neural network per parameter basically takes you one order of magnitude lower. Again, this is supervised learning and all the error bars are done statistically, so there may be fundamental questions you can have about this, and I encourage you to ask me those, but it was still a really nice example that if I really want to push the number of measurements as low as possible, the error distribution that my neural network produces is actually relatively narrow compared to the state-of-the-art non-machine-learning methods. Okay, and now there is this hidden thing that I swept under the carpet at the beginning of the talk, which is that we need to scale to larger system sizes, because doing this for eight sites does not help anybody, although it's funny to remember that even for eight sites with four particles this was as hard as it was. So what to do next? In this particular setting we were lucky to be in a situation where the experimental team is able to shape the big lattice by raising the boundary at a specific site that we pick. So we agreed that eight is a good size, because if I raise this boundary I basically block hopping at site eight and at site one; there will be boundary-site effects on the lattice sites right next to it, but on the square in the middle the estimation should be more or less okay. What we proposed was to indeed raise this boundary at site number eight, site number 16, site number 24, then do the snapshot measurements and run the estimation procedure I just walked you through, but only look at the four sites in the middle, the green ones, because the boundary is of course spoiled, since I need to create this blocking of my lattice. Then what one can do is shift by one, and I get a different four middle squares with a different set of, I don't know exactly how many it is, 16 or so parameters that I need to estimate, and then I shift by one again, and again, and for each of those I do 2,500 shots. So within 10,000 shots in total I am going to cover this lattice of arbitrary length, as long as I can take the Fock picture across the length of my system.
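In pseudocode the sliding-window protocol is nothing more than a loop over boundary positions. The sketch below is only meant to show the bookkeeping (which sites you trust for each window); the helpers and the exact offsets are hypothetical stand-ins, and the real experiment differs in detail.

```python
# Sketch of the sliding-window estimation protocol (hypothetical helper functions).
BLOCK = 8              # sites per isolated block (4 x 2 ladder)
TRUSTED = range(2, 6)  # middle sites of each block, away from the raised boundary
SHOTS = 2500           # Fock-state snapshots taken per boundary configuration

def estimate_lattice(n_sites, raise_boundaries, take_snapshots, estimate_block, offsets):
    """For each boundary offset: cut the lattice into 8-site blocks, take global
    snapshots, estimate every block independently, and keep only the parameters
    of the middle sites, which are not spoiled by the artificial boundary."""
    estimates = {}
    for off in offsets:
        cuts = list(range(off, n_sites, BLOCK))     # e.g. boundaries near sites 8, 16, 24, ...
        raise_boundaries(cuts)
        snapshots = take_snapshots(SHOTS)           # one global Fock image per shot
        for start in range(off, n_sites - BLOCK + 1, BLOCK):
            params = estimate_block(snapshots, start, start + BLOCK)  # site -> (J, U)
            for s in TRUSTED:
                estimates[start + s] = params[s]
    return estimates

# e.g. estimate_lattice(48, raise_boundaries, take_snapshots, estimate_block, offsets=[0, 1, 2, 3])
# uses 4 x 2500 = 10,000 shots in total, one batch per boundary offset.
```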
Of course, this has some limitations, right? And I think Francesca was going in the direction of some of these: this is kind of a vanilla model and I'm only considering nearest-neighbor hopping; the moment you start considering next-nearest-neighbor hopping this breaks, although maybe, if you are sure you only want to go up to next-nearest neighbors, you can repeat this scheme with windows of 16 instead of eight. But the beautiful thing was that even in this kind of limiting setting there is a way to take on your task. For us the constraint was really given: more than 10,000 shots nobody wants to measure, so we knew that 10,000 shots was the maximum, and then there is a way to optimize your data set preparation, your neural networks and your whole algorithm to reach 0.1% precision on J and 0.3% on U within this maximum number of experimental snapshots. I feel like everyone is a little bit sleepy today, but I do have two notebooks for you if you want. One is the idea that you take the snapshots directly and train a model that maps the snapshots onto the parameters of the Hamiltonian. This training is a complete nightmare: you need so many convolutional filters and so many trainable parameters that it would not run in Colab, so we pre-trained the model for you and you can just load it and evaluate it if you like. And then in notebook four I have this workflow with the correlators. We again simplified everything to the maximum, so we only have one J, not N of them, and there is no repulsion or chemical potential estimation, but your fun task is to pick which correlators you want to use and try to maximize the J precision. How are we doing on time? I still have 45 minutes, so maybe let's take 10, 15 minutes so that at least you can open the notebooks, scroll through them, and ask me if you have questions about the content, so that you are at the point where you can use them out in the world, where you don't get these beautiful big Italian lunches. Do I need to write the link again? I didn't put it on the slide, but I know there are a lot of ultracold atom people here, so I hope this kind of notebook helps you if you want to get started on this kind of project. It's snapshot data, not expectation-value data, but the whole workflow from the paper is there in a simplified version that you can run, so hopefully it's nice. You will notice that the neural network is just copy-pasted from what we did yesterday, so it's again just a supervised task with a slightly different loss function, and that's it. Here you notice that we need a convolutional model for the snapshots and it's just super huge, because even after I run it through the first convolutional filters I still have a linear layer of size 64,000, and then I need to go through all of that all the way down to a single neuron that predicts the value of your J. If you compile this model you will see that the number of trainable parameters is completely insane, so we just have you load it. And then in notebook four we didn't put any code, because you just need a dense neural network and the architecture is not super weird: a few fully connected layers with a little bit of dropout should get you to a good J prediction from these snapshots. It's actually not bad at all.
We trained everything for the paper in just about one day on the cluster, all the jobs; it's more the Google Colab side that is limiting, and yeah, that's basically it. It's also a different scale if you do it yourself, because even in Google Colab it will train in about an hour; it just doesn't fit within a one-and-a-half-hour lecture. Thank you for stretching for me, then let's do something harder now. Oh no, I didn't get to show my meme before the break, so now you get to read this masterpiece and then we can continue; this is, by the way, a useful lesson for anyone who is new to training neural network models. Okay, I want to show you one more thing. This goes a little bit back to the solid state stuff; it's something we are currently doing, it uses a fancier machine learning model, and I will try to sweep most of the solid state details under the carpet because I know you are more in an optics and ultracold kind of space. Who has heard about the Kitaev chain before? Okay, we have some people, but not everyone. Who has heard about Majorana fermions before? Okay, better. So basically there is this famous paper by Kitaev that says that if you have a specific type of one-dimensional chain, its edges are going to host these Majorana modes with very specific properties, and people think they will be useful for topological quantum computing; in the end they are topological modes that live in the middle of the gap. This Kitaev chain is a very famous condensed matter physics model that people are trying to realize, among other reasons, to observe these Majorana fermions experimentally and maybe do some useful things with them. The picture you see here is one such realization of the chain. I will not go into the details: it's just two sites, there are two quantum dots, again the dashed circles, and they are not so different from the quantum dots we had before, except that now they are coupled to a superconductor in a very specific way, to create the conditions described by this Hamiltonian. I will not even write the Hamiltonian down, but what we need to know is that the physics in this case is dominated by two parameters: t, which has to do with hopping from dot one to dot two and vice versa, so super similar to what we had before with the quantum dot transitions; and delta, which has to do with Cooper pairs in the superconductor splitting into the quantum dots. What happens in practice is that your charge stability diagram, and this is again the same kind of plot I was showing you before, some current as a density plot with two voltages on the axes, is going to look different depending on which parameter dominates, the hopping or the Cooper pair splitting. If the hopping dominates, you have no charge transition going from (0,1), zero electrons in one dot and one electron in the other, to (1,0), where the electron just hops; that's the odd parity space, so you have no charge transition. But if you go towards the other corner you get an avoided crossing, because you are going into the space with the other parity, the even one, which is governed by this Cooper pair dynamics. So you have an avoided crossing one way, or an avoided crossing the other way. Before, we just had straight lines and we went hop, hop, hop, count the lines, done; now we have slightly different physics, we have these avoided crossings, and then there is a point where both of these physical constants are more or less the same.
There everything becomes degenerate, your avoided crossings will cross, and the point in the middle of the diagram is called the sweet spot. If you write the creation and annihilation operators for that mode, you get what is technically a Majorana fermion. How a Majorana fermion on two sites is useful for anything is a separate issue that I am moving right past, but basically, seeing these topological modes visually translates, for you right now, into seeing a blob in the middle of the charge stability diagram where your avoided crossings meet, and that's the topological mode we want to see. Okay, there is some complicated physics behind it: t is the elastic cotunneling and delta is the Andreev bound-state constant; sorry, I am talking for too long, but you know what I mean, right, it's the Cooper pair splitting thing. Okay then, Vika, I can just show you afterwards. So then: before, we did a simple Hamiltonian learning problem where we took the snapshots in ultracold atoms and mapped them onto some J's and U's of the Hamiltonian. Same thing here: I am going to take these avoided crossings and map them onto the delta and t of the Hamiltonian. Specifically, I am only going to map them onto the delta-to-t ratio, because that's the one I care about: that's the thing I want to be equal to one in order to find my topological mode. Here we did it in a slightly fancier way. Who has heard about GANs before? We have a lot of people, yes, so some people know generative adversarial networks. You have probably heard about this, the thing generating realistic human faces; that's a generative adversarial model. How it works is that you feed some kind of noise vectors into a generator, which is a neural network, and you feed some real data, the data that you want the model to learn to generate, into the discriminator. The generator is learning to produce fake data, and the goal of the discriminator is to distinguish the two. So what you end up with is a kind of minimax problem, where you want your generated data to be as close as possible to the real data, meaning the discriminator should be maximally confused; the moment your discriminator cannot tell whether it is getting real or fake, your generative adversarial network is well trained. The motivation for us to explore these alternative, let's say more complex, machine learning models was the following, something that nobody yelled at me about during the previous part of the lecture, probably because you are sleepy: if I train my network in a supervised way on numerically generated data, which I need to do, how do I then know, when I apply it on the experiment and it tells me your J is 1.01, celebrate, how do I know that I am right? In this kind of setting, putting any kind of realistic error bar on things, or doing verification, is kind of problematic if you don't have an independent experimental way to tell. Everything works well and good when I am comparing to another labeled data set, but the step where I want to generalize and I need to trust that the model is predicting well, that's a huge pitfall of any kind of supervised learning, because it may be that your training data set has some features that your real data set doesn't have, and how do you tell? The beauty of the generative model is that you can generate a lot of data, look at the distributions, and generally enhance the workflow.
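For those who have not met GANs in code, here is a minimal sketch of the minimax training step being described, in a conditional flavor: both networks also receive a condition, which here stands in for the delta-to-t ratio attached to each charge stability diagram. The architectures and sizes are placeholders, not the model from the paper.

```python
import torch
import torch.nn as nn

IMG, COND, NOISE = 32 * 32, 1, 64   # flattened diagram size, condition dim, noise dim

generator = nn.Sequential(           # noise + condition -> fake diagram
    nn.Linear(NOISE + COND, 256), nn.ReLU(),
    nn.Linear(256, IMG), nn.Tanh(),
)
discriminator = nn.Sequential(       # diagram + condition -> probability "real"
    nn.Linear(IMG + COND, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),
)
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCELoss()

def training_step(real_imgs, cond):
    """One minimax step: D learns to separate real from fake, G learns to fool D."""
    b = real_imgs.shape[0]
    noise = torch.randn(b, NOISE)
    fake = generator(torch.cat([noise, cond], dim=1))

    # Discriminator update: real diagrams -> 1, generated diagrams -> 0.
    opt_d.zero_grad()
    d_loss = bce(discriminator(torch.cat([real_imgs, cond], dim=1)), torch.ones(b, 1)) \
           + bce(discriminator(torch.cat([fake.detach(), cond], dim=1)), torch.zeros(b, 1))
    d_loss.backward()
    opt_d.step()

    # Generator update: try to make the discriminator call the fakes real.
    opt_g.zero_grad()
    g_loss = bce(discriminator(torch.cat([fake, cond], dim=1)), torch.ones(b, 1))
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()

# e.g. training_step(torch.randn(16, IMG), torch.rand(16, COND))  # stand-in batch
```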
That was what we were after here. Specifically, we simulated data from this very minimal Kitaev Hamiltonian that has just these two parameters, one for the hopping and one for the Cooper pair splitting, and fed them into the generator together with something we called conditions, specifically the ratio of delta and t for each of those charge stability diagrams. And then, once we have trained the model, we test it on experimental data, meaning we feed the experimental data together with the conditional Hamiltonian parameter into the discriminator, and what you get out is this kind of distribution that tells you how confident the discriminator is that this is a correct data point. So you don't just get one point that tells you J is one; you get a confidence distribution of your discriminator, and what you want is to find the point where the probability that the experimental data point fits the parameter is maximal. I know I am saying this very much at the big-picture level, but the idea is just that your discriminator confidence changes as a function of this conditional parameter. And because you have both a generator and a discriminator, you can first actually do a very simple thing: you just look at what your generator is producing, because that's kind of a backward check to know whether your neural network is actually learning something useful. For example, with these supervised models you may have heard the anecdotal example where a model learned to distinguish between two classes just because one of them had a slightly different color filter, based on how the images were generated, and maybe that's not the feature you want your model to learn; having control over this is kind of fragile, especially if your goal is to put this neural network on the quantum computer and let it run by itself. So first we looked at what the generator is producing, and voila, it's actually producing images that are really, really similar to the experiment. I also included here, on the experimental side, some bad ones, so you see that it's not a hundred percent, but we comfortably got to the point that experimentalists can't easily tell which one is which. And then we use this discriminator rating, this probability function, to decide whether we are approaching the sweet spot, the topological mode, from the left or from the right. So you get these kinds of lines; you will remember them from my first lecture, because what I am plotting here is the ideal linear dependence I was telling you about with confusion learning yesterday. On one axis I have the predicted ratio of the two parameters I'm interested in, and on the other axis a sort of experimental value for those parameters; there is an independent procedure by which you can measure them to check, which is very helpful if you do this kind of project. So what we want is that the GAN model predicts the same thing the experimentalists can measure. The first thing you notice in these images is that the dependence is more or less linear, good. But then you also notice some of these red points: in the upper plot, those are the three points where the conditional generative adversarial model that we built labeled the data correctly and the experimental label is actually wrong. This is funny, right, because you have this experimental way to extract the parameters, but it takes time, it's expensive and it's very noise-prone, whereas this model, it turns out, is actually very accurate in predicting which regime you are in.
It even outperformed a human being in this case; this was the first instance of that I was aware of in a really verifiable way, and we actually got 100% accuracy for the machine learning model here. The second panel is a measurement at a different magnetic field that was very noisy, and there are also some of these dots that got mislabeled by the human, but then we also have the orange points that were basically so noisy that we cannot tell who was right, so we left those undecided; in general, though, this was a very robust thing to see. What we are doing right now is embedding this kind of thing into fully automated tuning runs, where we connect it to a gradient descent that controls the voltages of the experiment, and this is one example of a workflow that gets automated like this. You start all the way in the upper corner from this beautiful avoided crossing, and you see that slowly, slowly, the two charge transitions get closer and closer to each other until you get this topological signature sitting in the middle, and all of this can now be achieved without anyone touching the experiment, which again was one of the tasks that was previously kind of complicated to automate. Yeah, I don't have much time to talk about this project in detail. It's also a fun example of transfer learning, because we actually never trained, not even fine-tuned, on the device we were controlling. We had a whole other device that realizes the Kitaev model on a completely different platform, we trained on that one, and then we applied the trained model to these two quantum dots with the nanowire, and it just works without any further retraining, which was also a lot of fun to see. So there are a lot of things to explore in this space. Ruben is here with an amazing poster and all the technical details if you want to learn more. I know this is not a great audience for solid state qubits, but I wanted to show off this paper because it's a fun machine learning model, it's a great example of transfer learning, and for us it is a state-of-the-art application of this kind of automation in experiments that people actually care to automate. With that I also want to mention Tom, Vinny and Dima, who are here somewhere, yeah, thanks Vinny, who are from my group and can talk to you more about the things that we work on. I was given the task of introducing you to neural networks in general, so I didn't get to talk about a lot of exciting things that we are also doing. So talk to Tom about the computational complexity of neural network quantum states, talk to Dima about classifying topological phases with neural networks, and talk to Vinny more about automating quantum experiments; he's also a quantum machine learning expert. If you want to learn more about these things, we have this edX course running right now. It's very specifically targeted towards semiconductor quantum devices, meaning you are going to see a lot of these charge stability diagrams, but there are a lot of lectures, and many of them come with a pre-filled notebook with some exercises. I think if you want to get more hands-on experience, this is a great tool. Also, with the notebooks here I only got to showcase them to you as "here is the code, enjoy", and I know that doesn't work for everyone because this is a short course, but on that platform we have a TA walking you through every single notebook line by line, if you want to get deeper into this topic. I also want to highlight this amazing book.
It's going to come out with Cambridge University Press; right now it's on arXiv. Anna Dawid put this together. It's a fully PhD-student-led project that grew out of the lectures we gave at the school in Warsaw, and a lot of the co-authors are here in the audience, I think. This is a great example of how you can actually get a textbook from a grassroots collaboration, rather than one senior professor getting all the citations for being the editor. This was a huge success on a community-building level, but beyond that the book is really great, and it has all the topics that will be covered in this school; maybe tensor networks are not heavily featured, but other than that it has all the topics you will learn about this week. So this is a state-of-the-art document that I super recommend, and some of the things I talked about are also going to be there. Okay, that's it. Thank you for listening to me for these two days, and ask me anything.