So, I welcome everyone who is joining this webinar series, a webinar series dedicated to enthusiasts of coherence and the coherence properties of X-ray sources. We are having a nice reaction from people. We already had our first webinar last week, where Pablo Villanueva-Perez introduced the concepts of coherence, and today I'm very pleased to introduce Tomas to you. Tomas Ekeberg is a researcher at the Laboratory of Molecular Biophysics at Uppsala University. His expertise lies in developing algorithms for coherent diffractive imaging for structural studies of viruses and macromolecules. He will bring us into this world of algorithms and routines for making good use of coherence, from data collection to data analysis, and at the end he will also give us some examples from his own research. So at this point I will give the word to Tomas, and thank everyone for being here. Okay, thank you so much, and just to check: someone can confirm that you can hear me okay and see my slides? Yeah, we can hear you all right, thank you. Okay, thank you. I'm really pleased to be here and talk a little bit about my research, but mainly about coherent diffractive imaging and what makes it great. It's sad that I can't be there in person, but I send you my greetings from cold and infected Uppsala.
This talk is going to be quite heavy on methods and algorithms, and I'm going to give you some examples. One reason to talk a lot about the methods, and about the kind of data we get and what we can do with it, is that in my experience this feeds back a lot into experimental design: to really understand what you can do with this method, you have to understand how you do it and what you need in your data. I've tried to cater to a mixed audience, so some things will surely be repetition for some of you, but hopefully everyone can find something. I'd also like to say that if you have any questions, I don't mind at all if you interrupt during the talk. There will probably be time at the end as well, but if it's something you feel you need answered to follow the rest of the explanation, it's better to ask right away.

Okay, so to talk about coherent diffractive imaging, I think we have to start by putting it into context, by relating it to what I call conventional imaging. This is the silliest, simplest picture you can have of an optical microscope: you have a light source, an object and a lens. The light diffracts from the object, and the lens brings it together into a perfect image in what's called the image plane, where you can see a hopefully larger version of the duck. Now, if you want to use this method to study really small things, like this smaller duck, you might know that optical light won't work; you need something with a shorter wavelength, such as an X-ray source, which I've illustrated here as a kind of light bulb. You can see that light diffracts in the same way from the smaller duck, but I haven't drawn the lens, because lenses are problematic when it comes to X-rays. To understand why, compare with a medical X-ray: X-rays go more or less straight through most things, and they will go more or less straight through a lens too. It is possible to make some kind of X-ray lens. For example, there is one type called a zone plate, illustrated here, which essentially works as a lens, and there is a whole field of X-ray microscopy that uses this principle. The problem is that, first of all, as this more honest representation shows, a lot of the X-rays either go straight through or don't end up in the image, so you lose a lot of your X-rays, which means you put extra radiation, and therefore extra damage, into your sample. In addition, if you want to image really small things, like biological molecules, you need close to atomic resolution, which means the zone plate itself has to be manufactured to close to that same resolution. For really small things the zone plate is therefore not super accurate, and you might end up with a blurred version of your object.

So one solution, at least, is to say: forget about the lens, forget about the zone plate, and instead just put a detector where we had the lens, collect the diffraction there, and then use a computer to calculate backwards: what must the object have looked like to create this diffraction pattern? This is essentially the principle behind coherent diffractive imaging. Some people call it coherent diffraction imaging; I grew up
saying "diffractive imaging", so I'll keep saying that today. And why is it called that? Diffraction, by one somewhat shaky definition, is the bending of a wave around an object such that the object behaves like a new source. You can see that this is true here: the light reaching this duck changes direction, and the duck itself behaves like a new source; but this was true also in the case of the optical microscope. And "coherent": we've had lectures on this, so what follows is a much dumbed-down version, but essentially it means the light all having the same wavelength. The reason I'm using this dumbed-down version is how I argue for why you need coherence for diffractive imaging. X-rays, just like the optical light on this very nice album cover, refract by different amounts depending on the wavelength, and the same goes for the diffraction from a sample like this duck. In the top case, the optical microscope, this is fine: if you have lots of different wavelengths, lots of incoherence, in your beam, the light will diffract in different ways, but the lens, which acts as an inverse of the diffraction, brings it all together perfectly in the image plane, so the mix of wavelengths doesn't matter. In the case of coherent diffractive imaging, on the other hand, you just get a mess of many different diffraction patterns corresponding to the different wavelengths, and, at least today, it's impossible to work out which part came from which wavelength. That's why, for this type of imaging to work, we need coherence. In practice these experiments can look vastly different. I'm just going to show you two examples: on the left, an experiment mostly meant to test the source they were building, where they imaged handmade
silicon nitride objects, and on the right an experiment where small biological particles, like viruses, are aerosolized and hit with free-electron laser pulses. Two very different experiments, but the same general setup and the same general principles. The name of the game today is going to be: what can we do with the diffraction patterns you see in both of these images? What does what we see there mean, and how can we try to interpret it?

For the first question, what it means, I'm going to use this formula. I don't know if you recognize it, but it's what's called a Fourier transform, which is actually very common in image processing, and it's very fortunate that it turns out to have such a prominent role in diffraction. I'm not going to derive exactly why the Fourier transform has this role; when I do that for undergraduates it's usually a two-hour lecture. But very briefly, the assumptions that go into it are these. The blue part, the exponential, essentially says that each point source diffracts more or less equally in all directions; down there is an illustration of that, where you can see the wavefronts coming out from one point of the sample. The next part, the density ρ, says that while these point sources all radiate radially, the density determines how strongly each point scatters. Finally, the integral over dr brings it all together: each point in the sample scatters independently, and by integrating over all of them we see how the contributions from the different points interfere. So that's a very brief explanation of why the Fourier transform essentially describes diffraction.

I should mention, before moving forward, that this is a simplification. The two most important approximations we've made are, first, the Born approximation, which essentially says that most of the light that hits the sample just goes straight through, so the diffraction is weak; the reason that's important is that we can then neglect the interaction between the already scattered light and the sample. Second, the Fraunhofer approximation, which more or less says that the distance from the sample to the detector is large, where "large" means large compared to the size of the sample and to the wavelength.

Okay, so I'm going to show you now a few different examples of Fourier transforms. I find this useful because, when looking at diffraction patterns, it really helps to have seen some Fourier transforms and to have an intuitive feeling for what the Fourier transform does to an object. Starting here, I'm showing the sample on the left, which is just a circle, and on the right what the diffraction looks like: I'm showing the amplitude and the phase. As you know, the Fourier transform is a complex transform, so you get complex numbers out, and these are the two numbers that describe them: the amplitude, which is the length of the complex number, and the phase, sometimes called the argument. The amplitude is just the strength of the wave at that point, and the phase corresponds to the shift of the wave. For most of these examples, I think it's most important to keep track of what happens with the amplitude.
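In symbols, the relation I've just described, every point of the density scattering independently and isotropically with strength ρ, can be written down like this. Take it as a sketch: prefactors and sign conventions vary between texts, and q here is the scattering vector.

```latex
% Far-field (Fraunhofer) diffracted amplitude as the Fourier transform
% of the sample density:
%   F(q)   -- scattered amplitude at scattering vector q
%   rho(r) -- (electron) density of the sample
F(\mathbf{q}) = \int \rho(\mathbf{r})\, e^{\,i\,\mathbf{q}\cdot\mathbf{r}}\, \mathrm{d}\mathbf{r},
\qquad
I(\mathbf{q}) \propto \lvert F(\mathbf{q}) \rvert^{2}
```

The second relation anticipates what the detector actually records: not the complex amplitude, but an intensity proportional to its squared modulus.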
So, a circle will have a diffraction pattern that is essentially radially symmetric, more or less, and you can see these wave-like rings going out from the centre, with a big speckle in the middle. It's very characteristic, and it's also what you would expect from, for example, a roughly spherical particle. Before moving on I should mention that the reason it's not perfectly symmetric, why you have these wiggles in the corners, is just that the pixelation of the sample starts to play a role; a perfect circle would give a perfectly symmetric pattern. Now, what happens if you make the sample larger? If we look at a circle twice as large, you can see that the Fourier transform also scales, but scales inversely. This is very characteristic of the Fourier transform: if you make something bigger in real space, in sample space, its Fourier transform shrinks by the same factor, and of course if we go the other way and make the sample smaller, the Fourier transform becomes wider instead. If we take a square, we see the same kind of behaviour, but when we break the radial symmetry in real space we obviously break the symmetry in Fourier space as well. The reason it looks like this is that we have these vertical and horizontal edges: the top and bottom edges interfere with each other and give the strong vertical streak in the pattern, and likewise the other two edges give the horizontal streak. If we scale this one, for example make it wider along x, we get the same kind of inverse behaviour: along the x-axis the pattern now becomes smaller. We can also see how it behaves under rotation: if we rotate the sample, the diffraction rotates in the same way.

The last thing I want to show you, going back to the original square, is what happens if we change the sample a little by adding random structure inside the square. What happens here is actually quite interesting: both the amplitudes and the phases look more or less the same as before in the central part of the diffraction, if I go back and compare, but the outer parts look quite different. What we're seeing is another very important feature of the Fourier transform: the central parts encode lower-resolution information, and the outer parts encode higher-resolution information. When we added this noise we didn't change much in the sample at low resolution, so the central parts remain intact, but we did change a lot at high resolution, so the outer parts of the Fourier transform are completely different. To put this into a formula, the resolution encoded at a particular place in Fourier space is given by d = λ / (2 sin θ), so it depends on the wavelength λ, which is not surprising, but also on the scattering angle, with θ being half the full scattering angle. In the diffraction pattern, a low scattering angle lands close to the centre of the detector, and a high scattering angle lands close to the edge of the detector, which also means far out in Fourier space.
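The inverse-scaling behaviour is easy to check numerically. Here is a small numpy sketch (all names are mine, and the window size is arbitrary): doubling the radius of a disc concentrates the diffracted power closer to the centre of Fourier space.

```python
import numpy as np

def disc(n, radius):
    """Binary disc of the given radius in an n x n array."""
    y, x = np.indices((n, n)) - n // 2
    return (x**2 + y**2 <= radius**2).astype(float)

def central_power_fraction(sample, window=16):
    """Fraction of the diffracted power in a central window of Fourier space."""
    f = np.fft.fftshift(np.fft.fft2(sample))
    power = np.abs(f) ** 2
    c = power.shape[0] // 2
    h = window // 2
    return power[c - h:c + h, c - h:c + h].sum() / power.sum()

small = central_power_fraction(disc(128, 8))
large = central_power_fraction(disc(128, 16))
# The larger disc diffracts into a narrower, more central pattern.
print(small, large)
```

The same fractions read the other way illustrate the resolution argument: the small object pushes more of its signal towards high scattering angles.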
However, let's imagine a more realistic example. Here it's still a simulated object on the left, but I've made something that looks more like it might be a protein, or maybe a virus with some internal structure. You can see that the amplitude is still interesting, it has more features, and the phase definitely carries a lot more information than before. The problem, in the experiments I showed at the beginning, and really in any kind of CDI experiment, is that this is not what we measure. We don't measure amplitude and phase; if we did, this talk would essentially be over now, because the Fourier transform is a very nice transform in that it has a well-defined inverse, so you could just take your amplitudes and phases, inverse Fourier transform them, and get back the electron density of the sample. However, detectors, at least X-ray detectors, generally cannot measure phase, so we're stuck with essentially the amplitudes. And we don't measure the amplitudes directly either: we measure what's called intensity, which is the amplitude squared and essentially corresponds to the energy content of the wave. You can also see that we usually have a limited signal, because there's a finite number of photons deflected by the sample, so a finite number of photons hits our detector, and the amount of information is therefore also limited. So the problem, the name of the game for most of this talk, is the question: starting from the middle picture, these intensities, how do we figure out what the sample looks like? This is the problem we have to solve, because we couldn't use a lens that would otherwise have done it for us. Someone might ask, after having seen the previous images where the phases often look quite uninteresting: do we really need them? How bad is it really? I'm going to show you a classic
example of this, using a Fourier transform of me and an old picture of Lena that I managed to find somewhere online. I've calculated the Fourier transform, essentially the diffraction that we would get, from my face and from Lena's face respectively, and you can see that they look somewhat different but have some similarities. The classic demonstration is: let's switch the phases from one to the other. So now, at the top, I have my amplitudes combined with Lena's phases, and at the bottom, Lena's amplitudes combined with my phases. If it were true that the phases are not that important, then the picture at the top should still look pretty much like me, and the bottom one pretty much like her. But if we look at the inverse transforms, we see that it's actually the other way around: the phases turn out to be the more important part if you want to recover the two faces. Some people use this example to say that the phases carry more information, and to some extent I guess that's true, but what's really going on is that the amplitudes look more similar when you compare the Fourier transforms of two different faces. That means that if you replace the amplitudes with the amplitudes of some other, similar object, you're somewhat okay, but the phases are quite unique even when the objects are of a similar type. This is true for faces, but it turns out to be true also for biological objects, proteins and viruses, and for most things we might want to do these kinds of experiments on. So it's still a problem, even though we can explain why it happens, and we really need a strategy to handle it. What I'm going to show you now is essentially the basic assumption that we use in most of these algorithms to try to solve this problem.
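As an aside, the phase-swap demonstration is easy to reproduce. This sketch uses two synthetic smooth images as stand-ins for the two photographs (all names are mine); the hybrid image, built from the amplitudes of one and the phases of the other, correlates much more strongly with the phase donor.

```python
import numpy as np

rng = np.random.default_rng(0)

def smooth_image(n=64):
    """A random smooth test image, standing in for a photograph."""
    img = rng.random((n, n))
    f = np.fft.fft2(img)
    ky, kx = np.meshgrid(np.fft.fftfreq(n), np.fft.fftfreq(n), indexing="ij")
    return np.real(np.fft.ifft2(f * np.exp(-80.0 * (kx**2 + ky**2))))

def swap_phases(amp_donor, phase_donor):
    """Combine the Fourier amplitudes of one image with the phases of another."""
    amps = np.abs(np.fft.fft2(amp_donor))
    phases = np.angle(np.fft.fft2(phase_donor))
    return np.real(np.fft.ifft2(amps * np.exp(1j * phases)))

def corr(a, b):
    """Normalized correlation between two images."""
    a = a - a.mean()
    b = b - b.mean()
    return (a * b).sum() / np.sqrt((a**2).sum() * (b**2).sum())

me, lena = smooth_image(), smooth_image()
hybrid = swap_phases(amp_donor=lena, phase_donor=me)
# The hybrid resembles the phase donor far more than the amplitude donor.
print(corr(hybrid, me), corr(hybrid, lena))
```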
There are algorithms that do more or less work to solve this problem, and the question is what information we can build them on. We have Fourier space on the left and real space on the right. In Fourier space I'm showing the diffraction of the same object I used before the faces, and you can see that it's a bit noisy, but we roughly know the amplitudes; it's just that for every pixel we're missing one value, the phase. So we have information in Fourier space, but what about real space? You might say that we don't know anything there, and if we don't, the question stops there. But sometimes we can say that we know roughly the shape of the object: we don't know what it looks like, but we know that it lives inside this yellow area. Let's see whether, if we assume that, we can do better. Some of you might be thinking that this seems like a very strong assumption, and that it's quite likely we don't know this; I will try to relax it a bit later, but for now just accept it. It means that we can say that everything in the blue area outside the yellow part has to be zero, and that is actually a seriously strong constraint in real space. So we have two constraints: the Fourier-space constraint, which is that we know the amplitudes, and the real-space constraint, which is that we know there are zeros outside the particle. The yellow area where the particle lives is called the particle support, and I will use that word later. Okay, so, naively, if you would like to construct an
algorithm around this, we might imagine that we start by assigning random phases, as you can see on the right, to every pixel in Fourier space, while keeping the amplitudes as the measured ones. This is very likely to be wrong, but as a starting guess it might be okay. Then we inverse Fourier transform, which we can now do because we do have phases, and of course the result doesn't look anything like the particle that gave rise to the diffraction. But we know something in real space at least, which is that it has to be zero outside the support, so we enforce that by setting the rest to zero. If anyone is confused by the fact that we have phases in real space as well: the Fourier transform doesn't care, it's a complex transform, but it's usually a good sign if the particle turns out to be real, at least for biological particles, which usually don't introduce much of a phase shift. Some samples do introduce phase shifts, and then it's not worrying to see phases in real space too. Okay, so we now have a new guess for what the particle looks like, and we can test it by forward Fourier transforming it and seeing what the diffraction from this object would look like. Obviously it doesn't really match the measured pattern at the top, but our hope is that the phases will at least be better than the random start. So what we do is enforce the Fourier-space constraint: we replace the new amplitudes with the measured ones but keep the phases we got. The phases will look a bit different in the outer parts, but that's only because we have essentially zero signal there, so the phases are poorly defined; everywhere the signal matters we keep the phase from the bottom in the top. Then we just keep going around like this, updating real space and then getting an updated Fourier space, and you can see that it changes ever so slightly. I'm not going to click through the entire reconstruction; instead I'll show you a video. At the top you can see the goal, and at the bottom the evolution of the amplitude and the phase. It changes slowly, and kind of in the right direction, but by now, or very soon, it seems to get more or less stuck and not move much anymore, while we're still quite far away from the true solution up there, even though we gave it a very nice support to start with. This is unfortunate, but quite common if you use this very simple algorithm.

To explain what happens, I'm going to show a different way of thinking about these algorithms. They belong to a class called convex optimization algorithms, and the way to think about them is to imagine the space of all possible solutions, with two sets in it: the top one is all the possible objects that fulfil the Fourier-space constraint, and the bottom one is all the possible objects that fulfil the real-space constraint. Ideally we're looking for the intersection, the small point that fulfils both. What the algorithm I just described actually does is this: we start somewhere that fulfils the Fourier-space constraint, then we project onto the real-space constraint by enforcing it, then project back onto the Fourier-space constraint by enforcing that, and we keep going, getting closer and closer to the solution. But you can also see that the closer we get, the slower we get, which was clearly what was happening before.
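The loop I just walked through, random starting phases, inverse transform, zero everything outside the support, forward transform, put back the measured amplitudes, is only a few lines of numpy. This is a sketch of plain error reduction on a noise-free toy problem, not code from any particular package; all names are mine.

```python
import numpy as np

def error_reduction(measured_amps, support, iterations=200, seed=0):
    """Plain error-reduction phase retrieval (a sketch).

    measured_amps: measured Fourier amplitudes (square root of intensities).
    support: boolean mask, True where the object is allowed to be non-zero.
    """
    rng = np.random.default_rng(seed)
    # Start from the measured amplitudes with random phases.
    f = measured_amps * np.exp(2j * np.pi * rng.random(measured_amps.shape))
    errors = []
    for _ in range(iterations):
        real = np.fft.ifft2(f) * support          # real-space constraint
        f = np.fft.fft2(real)
        # Fourier-space error: distance from the measured amplitudes.
        errors.append(np.linalg.norm(np.abs(f) - measured_amps)
                      / np.linalg.norm(measured_amps))
        f = measured_amps * np.exp(1j * np.angle(f))   # Fourier-space constraint
    return np.fft.ifft2(f) * support, np.array(errors)

# Toy example: recover a small rectangle from its own diffraction amplitudes.
truth = np.zeros((64, 64))
truth[28:36, 26:38] = 1.0
support = truth > 0
amps = np.abs(np.fft.fft2(truth))
recon, errors = error_reduction(amps, support)
```

On this very easy toy the error falls quickly; with a less convenient object it typically stagnates in a local minimum, as in the video.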
But in addition to that, this picture is a bit simplistic. First of all, we do have noise, and maybe our support is not correct, so it might be that these sets don't actually meet; not just might be, in reality it's essentially always like that. And the sets might not be convex. The real-space constraint actually turns out to be convex, and even nicer than that: it's completely flat, a very well-behaved set, even if it is flat in a very high-dimensional space. The Fourier-space constraint, however, is not: it's non-convex, so I'm going to draw it like this. This means that when you run this algorithm, there isn't only one point where you can end up. You can get stuck in a local minimum like this one, which I'm sure is also what happened in the movie I just showed you. This algorithm is called error reduction because the error, which you can define as the length of these projection arrows, essentially the distance between the real-space and Fourier-space constraints at the current solution, always goes down; you can easily prove that. But that also means that if you end up in one of these local minima, you're never going to escape from it. For this reason there are tons of different algorithms that are variants of this scheme, and I'm going to show you results from one of them, called hybrid input-output, or HIO. It's quite an old algorithm, from the 80s, but it's still used a lot today, even in published papers. As you can see from these arrows, the Fourier-space constraint is applied in the same way, but the real-space constraint is not: it often overshoots a bit, and the amount it overshoots by depends on how far away it was in the previous iteration. It sounds a bit complicated, but the effect is that when you keep running it, it starts to get stuck in a local minimum, but if it's stuck in one place for a long time it jumps further and further with every iteration, and given time it ends up far enough away that the actual best solution is closer at the next Fourier-space projection. Running this algorithm looks more random and more energetic, but it means that it can escape from these local minima, and it generally performs much better. I'm going to show you the same example as before, but with hybrid input-output, and I've actually slowed this down by a factor of two compared to the previous one, just so you have any chance to see what's going on. Okay, let's run it: you can see it starts out quite random and energetic, but after a while it really seems to find the correct minimum and focus down quite well into it, and we get a much, much better reconstruction of the object than before. Moving on, I'll just briefly mention, for anyone interested, that I've shown two algorithms: error reduction at the top and hybrid input-output, HIO. There are tons of others that are sometimes used: one is called RAAR, one is called the difference map, and there are more, but these are maybe the most commonly used. Both of the other two behave somewhat similarly to HIO; RAAR is maybe slightly less energetic. If one doesn't work you can switch to another, but in my experience, when you start switching around like this, it's usually a sign that something else isn't working, maybe not the algorithm itself.
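For comparison, here is a minimal sketch of the HIO update (the common support-only version, with the conventional feedback parameter β ≈ 0.9; all names are mine). Inside the support the Fourier-constrained estimate is kept; outside it, the previous iterate is pushed down by β times the estimate, which is what lets the algorithm climb out of local minima.

```python
import numpy as np

def hio(measured_amps, support, iterations=300, beta=0.9, seed=0):
    """Hybrid input-output phase retrieval, support-only sketch."""
    rng = np.random.default_rng(seed)
    x = rng.random(measured_amps.shape)        # real-space input
    for _ in range(iterations):
        f = np.fft.fft2(x)
        f = measured_amps * np.exp(1j * np.angle(f))   # Fourier constraint
        x_new = np.fft.ifft2(f)
        # HIO update: keep the estimate inside the support, damp it outside.
        x = np.where(support, x_new, x - beta * x_new)
    return x * support

# Toy example with an asymmetric object and internal structure.
truth = np.zeros((64, 64))
truth[20:34, 30:42] = 1.0
truth[22:26, 32:36] = 2.0
support = truth > 0
amps = np.abs(np.fft.fft2(truth))
recon = hio(amps, support)
```

In practice one usually finishes a run like this with a few iterations of error reduction, to settle into the bottom of the minimum that HIO has found.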
In addition to that, I didn't mention that people sometimes apply other extra constraints. So far I let the real-space values vary however they wanted, but you can add additional constraints, such as that the density ought to be positive, which is very reasonable, or even that the object has to be real, which is reasonable for many biological objects. You can also add sparseness, which would mean something that tries to find atom positions instead, encouraging empty space around the strongest scattering points, for example. There are many extra constraints like this that might make the algorithm better. And I'll very quickly address a question someone might be wondering about: are we only using these convex-optimization-style algorithms, given that they don't seem quite made for this type of non-convex set? People are experimenting with, for example, gradient descent, and I made an attempt a few years back to try an optimization method called simulated annealing. This is a different object, and in this example it turned out to somewhat work. A nice thing about it is that simulated annealing, being a more general optimization method, can optimize any kind of function, so I can really ask for the object that, on a probabilistic level, best explains the intensities, assuming Poisson statistics and things like that, which is not really possible with HIO. It turned out, and I was actually surprised, that it did work: it did manage to reconstruct the object somewhat. But the solution is actually worse than the HIO one, even though we can use this nice statistical assumption, and in addition I think it runs about a thousand times slower, or something like that, to get a similar kind of reconstruction. So essentially the message is: yes, it is possible to use other methods, but so far I haven't seen anything that comes anywhere near these convex-optimization-type algorithms in terms of performance.

I said earlier: what about if we don't know the support? That assumption seemed like a fairly strong one, and it's true, often we don't know it. It also turns out that if we don't have a tight support, if the support is a bit bigger, like on the right, then very often we can't recover the object well; these algorithms are quite sensitive to the support being nice and tight. As an example, if we try to run the same algorithm as before, but with this much looser support, we can see that even given a lot of time it never converges to a good solution. But we do see something else interesting, which is that we start to see the shape of the object: it gets something right, but it has problems figuring out what the density should be inside versus outside of it, and so on; there is too much freedom for it. The fact that it gets something right is still interesting, though, and based on this realization there is an algorithm called shrinkwrap, which tries to solve the problem by continuously updating the support at regular intervals. In my example here it starts from twenty iterations in and then repeats every twenty iterations. You start by running a few iterations of error reduction, even if that's not the algorithm you're otherwise using, just to make sure you get down to the bottom of the particular local minimum you're in. Then you blur the result, in this case by about three pixels, the reason being that we probably can't trust the high-resolution parts of it, but the blurred shape is maybe better than the support we had.
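The resulting update rule, blur the current estimate and then keep its strongest part as the new support, can be sketched like this. The Gaussian blur is done in Fourier space just to keep the example self-contained in numpy, and the 10% threshold is an illustrative value of mine; all names are mine too.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Gaussian blur implemented as a multiplication in Fourier space."""
    ky, kx = np.meshgrid(np.fft.fftfreq(img.shape[0]),
                         np.fft.fftfreq(img.shape[1]), indexing="ij")
    kernel = np.exp(-2 * np.pi**2 * sigma**2 * (kx**2 + ky**2))
    return np.real(np.fft.ifft2(np.fft.fft2(img) * kernel))

def update_support(current_estimate, sigma=3.0, threshold=0.1):
    """Shrinkwrap-style support update: blur, then keep the strongest part."""
    blurred = gaussian_blur(np.abs(current_estimate), sigma)
    return blurred > threshold * blurred.max()

# Example: even a noisy estimate of a blob yields a compact support.
rng = np.random.default_rng(1)
y, x = np.indices((64, 64)) - 32
estimate = (x**2 + y**2 <= 10**2) + 0.05 * rng.random((64, 64))
support = update_support(estimate)
```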
but maybe the overall shape is better than the support we started from. We then threshold this blurred image, keeping the stronger parts, to define the new support. If I now show you the same reconstruction using this method, you can see that the support updates every 20 iterations and quite quickly shrinks down to something very similar to the support we had before. It even turns out that this support becomes a little tighter than the previous one, which was not perfectly optimized, so you really get a very tight support and definitely a nice reconstruction. But the big benefit is not a slightly better reconstruction; the big benefit is obviously that it is now possible to do these kinds of reconstructions without knowing the support in advance.

Okay, moving on, I want to take a little time, especially for those of you with a crystallography background, to explain the relationship to crystallography, and the reason why direct phasing is possible here. Famously, in crystallography you always need some kind of extra information about the sample to be able to phase the data and recover the structure. So how can it be possible in CDI but not in crystallography? To prepare for that, let us recap what crystal diffraction is. On the left you have the same sample, and on the right its diffraction pattern. If we start to build a crystal, here a very, very small one, you can see that the interference between the positions of these repeated molecules creates an almost double-slit-like fringe pattern, but it only modulates the original diffraction pattern of the single molecule, so the same kind of information is still in there. If you make the crystal larger, you can see that these bright peaks become narrower and much stronger, and as we keep growing the crystal, for a very large crystal we have essentially only information at these peaks and nothing in between. In my simulation you can still see a little intensity between the peaks, partly because I am using a logarithmic scale and partly because the crystal is still very small. These bright peaks mean that in crystallography you only get information at the Bragg peaks, and they happen to be spaced at what is called critical sampling: if you only know the values there, you have just about the information you need to recover the object if you knew the phases, which you do not. So you need to figure out the phases some other way. Another way to see this is to say that in the crystal there is no space for a support: there is no empty region around the object that we can require to be zero. And yet another way to understand it is to look again at the diffraction pattern of the single particle and change the sampling of Fourier space, essentially reducing the number of detector pixels, and see what happens in real space. If I reduce it from the original 1024 pixels on a side to 512, which you can mainly see as the pattern becoming slightly more coarse, then on the left, in real space, we actually still reach the same resolution; what has changed is the field of view, so the empty space around the object is reduced. If you keep making the pattern more coarsely sampled, the space around the object becomes less and less, until we reach critical sampling, the same kind of sampling we would get from the Bragg peaks, and there is no space around the sample at all.
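To make this sampling argument concrete, here is a small NumPy sketch (not from the talk; the array sizes are invented) that computes the oversampling ratio, the total number of pixels divided by the number of pixels inside the support:

```python
import numpy as np

def oversampling_ratio(support):
    """Total pixels divided by pixels inside the support.

    A ratio of 1 is the critical sampling of a large crystal: the
    object fills the whole field of view and no zero border is left
    to act as a real-space constraint.  Direct phasing generally
    needs a ratio of at least 2."""
    return support.size / np.count_nonzero(support)

# A 64x64 object centred in a 256x256 field of view, mimicking a
# finely sampled (oversampled) diffraction pattern.
support = np.zeros((256, 256), dtype=bool)
support[96:160, 96:160] = True
print(oversampling_ratio(support))   # 16.0: comfortably oversampled

# The same object when the detector pixels are so large that the
# field of view shrinks to the object itself.
tight = np.ones((64, 64), dtype=bool)
print(oversampling_ratio(tight))     # 1.0: critical sampling
```

The ratio of at least 2 is the standard oversampling criterion; in practice a tighter field of view than that leaves the algorithms with no zero region to enforce.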
Which means there is nothing we can set to zero, so we have no real-space constraint, and that is why in crystallography you cannot phase the pattern directly. It also means that if your detector has pixels of the size you see on the right, this method does not work either; it only works when you have a high sampling rate. This is usually called oversampling, as you can see in the headline: we need oversampling to be able to do this kind of direct phasing.

Okay, now I want to talk about something I think is very important: one common problem in phase retrieval. I have not seen this presented in many places, I know of one paper that deals with it, and you can find published papers that do not seem to realize the problem. It has to do with missing data. Essentially, I am going to try to explain when missing data in Fourier space, missing data on your detector, is going to be a problem and when it is not. Let us look again at our constraints, the real-space constraint and the Fourier-space constraint. One way of putting it is that in Fourier space we know the amplitudes everywhere outside of this red region in the center. Why is there a hole in the middle, why don't we measure there? Sometimes you do have the data, but often not, because the very strong direct beam going through the sample could damage your detector, which is why it is very common to use a beamstop. You can also have missing regions because your detector is made of modules with gaps in between, and so on. So in this case we do not know the data in this red region, and in real space we have the same kind of situation: inside the red support is where we do not know the values, and everywhere else we know the density is zero. You could say that the blue regions are where we have the constraints, and the red regions, in both real and Fourier space, are where we have none. Then you can ask yourself: if you can create an object whose Fourier transform fits entirely within the red region of Fourier space, and which itself fits within the support, then there is no way of knowing how much of it to add to our solution, and that would be a genuine ambiguity in the reconstruction. So it is a very interesting question whether such objects exist, and the good answer is that, in general, they do not, at least not exactly. Let me show the same thing in one dimension, with Fourier space on the left and real space on the right. Imagine something like a Gaussian: its Fourier transform is another Gaussian, but with inversely proportional width. You can see that neither of the two lives completely inside the unconstrained region, but both of them have only very small tails outside of it, which means the constraining power on this mode is very small. Depending on the exact size of the missing-data region and of the support, that constraining power might be tiny, or not bad at all. How does it look for the reconstruction I showed you before? I have actually calculated the object that fits best within both of the two unconstrained regions for that reconstruction; the yellow outlines show the shape of the support and of the missing-data region, respectively.
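The one-dimensional Gaussian picture can be checked numerically. A sketch with made-up sizes: for a Gaussian mode, we measure how much of its Fourier power falls outside a central missing-data hole of a given half-width, which is roughly the grip the measured amplitudes have on that mode.

```python
import numpy as np

n = 1024
x = np.arange(n) - n // 2

def fourier_power_outside(sigma_real, hole_halfwidth):
    """Fraction of the Fourier power of a real-space Gaussian of
    width sigma_real (pixels) that falls OUTSIDE a central
    missing-data hole, i.e. in the region where measured amplitudes
    actually constrain it."""
    g = np.exp(-x**2 / (2 * sigma_real**2))
    G = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(g)))
    power = np.abs(G)**2
    power /= power.sum()
    return power[np.abs(x) > hole_halfwidth].sum()

# A hole smaller than the central speckle: roughly half the mode's
# power is still seen by the detector, so it stays well constrained.
print(fourier_power_outside(20, 4))

# A hole three times larger: almost all of the power hides inside
# the missing region, so the reconstruction is nearly free to add
# this mode at will, the dangerous case.
print(fourier_power_outside(20, 12))
```

The exact "constraining power" computed in the talk involves both the support and the missing region; this sketch only shows the Fourier-space half of the argument.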
You can see that this mode, or weakly constrained mode as they are sometimes called, actually has a lot of density outside the constraint regions; in particular there is a lot of density outside the support. If you calculate how much, it is about 70%, even though it might not look like that, because the density outside the support is weak but its integrated volume is quite large. So in this reconstruction the missing data is not a problem at all. But just to show how it might look if you had a significantly larger beamstop, three times larger in this case, I have calculated the same thing, and the worst possible mode now has a constraining power of only 0.3%. That is probably going to be problematic: it is very likely that this ambiguity means both that you do not know your result very well and that your reconstruction may wander off in weird directions. In this case it actually gets worse: if I keep calculating the second least constrained mode and so on, roughly the first five all have quite weak constraints, so there is a lot of freedom in the reconstruction that you have no handle on with this kind of setup.

This might sound complicated, and you might be wondering whether you really need to calculate all of these modes to know whether your data can be reconstructed. Doing the calculation is of course the most reliable option, and if you are interested I have code for it, but there is a general rule of thumb: if the missing-data region is on the order of the speckle size in the pattern, you are usually in a good position. On the right, the missing-data region is significantly larger than the speckle size, which is a problem. In general, when the missing-data region reaches to, or beyond, the edge of the central speckle, that is usually when problems start to happen. There is a bit of nostalgia for me here: this is a diffraction pattern that I spent part of my PhD working on, one of the first ones I got, and I worked so hard and could not figure out why all the reconstructions were crappy. Years later I figured out this business with the missing modes, and it turned out the pattern definitely did not fulfill this requirement, so I had huge problems with weakly constrained modes.

Other kinds of ambiguities are usually not as problematic as that one, but still good to know about. At the top I have shown the Fourier transform of the sample, and below it the Fourier transform of the same sample centrosymmetrically inverted. Because of the particular properties of the Fourier transform, the amplitudes of these two are identical; the only difference is in the phases. That means that when we do phase retrieval we could equally well get either of the two. In the same way, a shift of the object also only affects the phases, so our recovered sample may end up translated. The centrosymmetric flip in particular is good to know about, because if you are reconstructing something, especially down at molecular-level resolution, and it does not fit, it might be because you have reconstructed the centrosymmetrically flipped version.
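Both ambiguities are easy to verify numerically. A small NumPy sketch with a toy object (not from the talk): the twin is the centrosymmetrically flipped, complex-conjugated object, and both it and any circularly shifted copy have exactly the same Fourier amplitudes as the original, so the measured data alone cannot distinguish them.

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.random((32, 32)) + 1j * rng.random((32, 32))  # arbitrary complex object
amps = np.abs(np.fft.fft2(f))

# The twin image: centrosymmetric flip about the origin (index 0),
# plus complex conjugation.
twin = np.conj(np.roll(f[::-1, ::-1], (1, 1), axis=(0, 1)))
print(np.allclose(np.abs(np.fft.fft2(twin)), amps))     # True

# A shifted copy: the shift only adds a linear phase ramp.
shifted = np.roll(f, (3, 5), axis=(0, 1))
print(np.allclose(np.abs(np.fft.fft2(shifted)), amps))  # True
```

The `np.roll` by one pixel makes the flip happen about index 0, matching the DFT's origin convention; without it the amplitudes would only match up to a reordering.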
I realize that I do not have that much time, so I am going to show you a few more things before we take questions. One thing I really want to show you is how we validate what we get, because so far I have shown you single reconstructions. Here instead I simply repeat the same reconstruction, the one I showed you earlier in this talk, many times; I have done 10 repeats of it. You can see a few things. First, we do get both centrosymmetric versions, which is expected. One repeat could not decide which version to pick, so it failed, and another one at the bottom failed completely. I have also plotted the Fourier error, which is essentially the distance between the two constraint sets at the solution, as I mentioned earlier; all the standard software reports these kinds of numbers, and encouragingly the failed repeats do show a worse error. But what we really did is phase retrieval, so it is interesting to look at the phases themselves: are they similar across the different reconstructions or not? There is a very common way of measuring this. Take a single pixel, say near the top left, and plot the phase of that pixel from each reconstruction as a point on the unit circle in an Argand diagram. If the phases are fairly reproducible, they all sit at roughly, but not exactly, the same angle, and the average of all those points lands close to the unit circle. If you do the same for a pixel where the phases were not reproducibly recovered, not the same every time, the average lands much closer to the center. So it turns out that the distance of this average from the center is a good measure of how reproducibly the phase of that particular pixel was recovered. Calculating it for every pixel of my repeated reconstructions gives what is called the phase retrieval transfer function, or PRTF, which is essentially a way of saying how good the algorithm was at recovering the phases in different parts of Fourier space. We can also calculate the average real-space reconstruction, which looks okay but not perfect, because the few failed reconstructions are mixed into it; they also make the PRTF slightly worse. What you will usually see if you read a paper on this is the radial average of the PRTF, which is very convenient because, as you remember, further out in Fourier space means higher resolution, so you essentially get a measure of how well the reconstruction worked as a function of resolution. Now, these results were degraded by the failed repeats, so in practice we put a threshold on the error; here I chose 1.3, which nicely sorts out the bad repeats and keeps only the good ones, and if we redo the PRTF it looks much nicer, as does the average image. The nice thing about the average image is that this way we know we keep only the reproducible parts; anything that is just an artifact of the particular random starting state gets washed out. Finally, what we usually do with this PRTF is to apply a threshold and read off a resolution.
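The per-pixel phasor averaging just described can be written down in a few lines. A sketch, assuming the repeats have already been aligned and twin-resolved (names and sizes invented):

```python
import numpy as np

def prtf(recons):
    """Phase retrieval transfer function from a stack of aligned
    real-space reconstructions, shape (n_repeats, ny, nx).

    For every Fourier pixel, average the unit phasors over repeats;
    the magnitude of that average is close to 1 where the phase came
    out the same every time and drops towards 0 where it came out
    random."""
    fts = np.fft.fft2(recons, axes=(-2, -1))
    phasors = fts / (np.abs(fts) + 1e-12)   # guard against division by zero
    return np.abs(phasors.mean(axis=0))

rng = np.random.default_rng(1)

# Identical repeats: perfectly reproducible phases, PRTF close to 1.
obj = rng.random((16, 16))
print(prtf(np.stack([obj] * 8)).min())

# Eight unrelated random objects: the phases are not reproducible,
# yet with only 8 repeats the PRTF averages around 1/sqrt(8), about
# 0.35, rather than 0, so a handful of repeats is not enough to
# distinguish success from noise.
print(prtf(rng.random((8, 16, 16))).mean())
```

In practice the radial average of this 2-D (or 3-D) map is what gets plotted against resolution.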
The convention is the fairly arbitrary threshold of e to the power of minus one: we say that the phases were acceptably recovered up to the point where the PRTF crosses that line, which in this case happens at about eight inverse nanometers, so we could say that we have a resolution of about one over eight nanometers. However, you should be aware when you do this that I only put eight images into this PRTF, and completely random phases, as on the left, will still average to about 0.35 with eight images, which is very close to one over e. For this reason, in practice, and certainly if someone publishes a paper, you should have at least 100 or so repeats for the PRTF to be reliable. This is another very common problem I see in published papers: not enough repeats for the PRTF to mean anything.

A last couple of things I would really like to show you before I let anyone go. So far I have only been talking about two-dimensional Fourier transforms, two-dimensional samples and so on. In reality, the object we have been playing with is a projection; in 3-D it looks something like this, and the 3-D object of course has a three-dimensional Fourier transform, which looks something like this. It obviously has values everywhere, but I have made these three cutouts so you can see parts of it. So what is it that we actually collect on the detector? For a particular detector geometry, the black point is the interaction region, where the sample sits, and the part of Fourier space that we measure retains the same geometry, with the same maximum angle as the maximum scattering angle collected by the detector, but it is a spherical section through Fourier space. This sphere is called the Ewald sphere, and this is sometimes called the Ewald construction. If your scattering angle is small, the section is more or less a plane through Fourier space, and all the assumptions we made for 2-D imaging earlier remain valid. But if you have a very high maximum scattering angle, you cannot really do 2-D imaging very well, which is the case in crystallography and also in many types of tomography where you combine many different diffraction patterns. So what do you normally do? One diffraction pattern gives us only this one section of Fourier space, but if the sample can be rotated, that essentially rotates the part of Fourier space that we measure. With, say, five consecutive exposures we get five consecutive sections of Fourier space, and if we just keep rotating we can fill up the entire space. After doing that, we can solve the phase problem in exactly the same way in 3-D as I showed you in 2-D; if anything, 3-D imaging works a bit more simply. If I show you what you get towards the end of running such a reconstruction, you can see that 3-D imaging also works quite well, and essentially everything I have told you about 2-D imaging carries over perfectly to the 3-D case.

I have talked for an hour now, so I would like to take a break and see if there are any questions. I do have a few more slides, but they are not on methods; they are about my own applications of this using free-electron lasers. So if anyone is
super interested, you can ask for them and I will be happy to talk about them, but I do not think it makes sense to keep everyone around, so I want to open up for questions. And let me know if anyone is interested in looking at the last five to seven slides about free-electron lasers, biology and CDI. Thanks.

Great, thank you Thomas, thanks for a really nice talk. I am taking over after Dina, chairing the end of this session. If anyone has questions you can ask them right away by raising your hand, I can see this, or just type your question into the chat. I actually have a first question from Dina, from before she left: how easy is it to find, or code your own, phase retrieval code, and what does it take to gain expertise in this kind of analysis? Are there any tricks that an expert can teach us beginners?

So, first of all, on coding it: implementing the basic error reduction or HIO is not that many lines of Python, or any code, so it is not hard at all, and if you are getting into the field I think it is a good experience to have. But if you really want to be serious about it, there are more production-grade codes; a colleague of mine has made a very nice version that runs very fast on graphics cards, so you will get a speed-up of maybe more than a factor of 100, probably more like 500, compared to your own code. So for serious analysis I would still suggest checking out an existing package, because especially doing these 100 or more repetitions of the same reconstruction, and especially in 3-D, can take a long time with home-rolled code.
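To back up the claim that the basic loop is only a few lines of Python, here is a minimal error-reduction sketch (a toy, not any particular production code; all names and sizes are invented):

```python
import numpy as np

def error_reduction(amplitudes, support, n_iter=200, seed=0):
    """Minimal error-reduction phasing loop.

    amplitudes: measured Fourier magnitudes (numpy fft layout);
    support: boolean real-space mask.  Returns the final iterate and
    the Fourier error per iteration, which for error reduction can
    never increase."""
    rng = np.random.default_rng(seed)
    g = np.where(support, rng.random(amplitudes.shape), 0)  # random start
    errors = []
    for _ in range(n_iter):
        G = np.fft.fft2(g)
        errors.append(np.linalg.norm(np.abs(G) - amplitudes)
                      / np.linalg.norm(amplitudes))
        G = amplitudes * np.exp(1j * np.angle(G))  # Fourier-magnitude projection
        g = np.where(support, np.fft.ifft2(G), 0)  # support projection
    return g, errors

# Toy problem: phase a small object from its own simulated amplitudes.
obj = np.zeros((64, 64))
obj[28:36, 26:38] = np.random.default_rng(1).random((8, 12))
amps = np.abs(np.fft.fft2(obj))
support = obj > 0            # in a real experiment this is not known exactly
g, errors = error_reduction(amps, support)
print(errors[0], "->", errors[-1])
```

HIO would replace the support projection with the feedback step (keeping the previous iterate minus beta times the new one outside the support), which escapes stagnation better, and shrinkwrap would recompute `support` from a blurred, thresholded iterate every few tens of iterations.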
As for tricks, I think one good thing is simply to gain experience. If I compare myself with someone who is fairly new in the field, the main difference is that I have seen things fail in many different ways before and can figure out what might be wrong. So just getting experience and trying different algorithms is probably the best advice. If you want to do that, I also really suggest not starting from whatever data set you get from your own experiment. Maybe the best advice is actually to ask someone, or look in an online repository, for already published data, so you can repeat a result that is known to reconstruct well; otherwise it can be like my PhD, where it was very hard to realize that I was working on an impossible problem.

Or ask experts for some advice in the beginning, I guess.

Yes, that is good too, maybe as the next step. What can still happen, though, is that the expert either looks at it and the problem is obvious, or it is a problem they have not seen before either, and they will have to start working on it themselves to figure out what is wrong. But I think I have covered in this talk some of the problems that are commonly unknown to people getting into the field, like the problems with missing data, which can cause a lot of headaches if you do not recognize them.

Okay, any more questions? We have a question now: if there is really a lot of missing data, or large inactive areas in the detector, what are the possibilities to retrieve the image? Have you tried combining phase retrieval with holography?

Right. So first of all, if you have a lot of missing
data: one way, maybe nowadays the best way, is, if we go back to this slide, that if you take many different images of the same three-dimensional Fourier space, missing data in one of them might not be a problem, because that region may be covered by another image. If you have something like a gap between two modules in the detector and you only rotate about one axis, the gap might always land in the same place, but if you also have the possibility to rotate the detector, or rotate the sample about several directions, you can make sure the missing data never line up, and that solves the problem. Otherwise, if you can only get one shot from each sample and you have a lot of missing data, it is just really hard; sometimes you cannot do a real reconstruction but can still figure something out with some other type of analysis. But if you can do 3-D, you are usually fine in terms of missing data, as long as your beamstop is not very large, because you will cover all the different angles.

Then, on combining phase retrieval with holography: a little bit. I guess there are two approaches. Either you do direct holographic reconstruction: if you have a very small reference it works well and you essentially do not need any of this phasing machinery, and I think someone may talk later in this series about those techniques. It works very well; the only problem is that the size of the reference determines the kind of resolution you can reach. Or, as I talked about earlier with enforcing additional constraints such as positivity, you can also enforce that we know some part of real space: here we have this holographic reference, which can even be significantly larger than just a point source, and use it inside phase retrieval. I have tried that a little, and mainly seen other people do it, and it seems to work quite well; this extra constraint really helps the recovery. I have never tried to use it to get around the problem of missing data specifically, but my guess is that if the missing data region is not very large, it could work, because the information is also present in the interference in other parts of the pattern. I would really have to try it before saying anything definite, but I think it would be an interesting thing to try. I can mention that in crystallography they often use a similar kind of constraint: they know that the sample is very similar to something else, or they know part of the sample but not, say, the ligand, which is a bit like an extreme version of the holographic constraint, and they can usually do those recoveries even though they have very large amounts of missing information. That is one reason I think it is not unlikely to work.

Nice. All right, thank you. Any more questions? Maybe a last one: what would you say is the major limitation in current experiments? You go to an XFEL to do this; what sets the resolution, and what is the ultimate resolution nowadays?

So, for free-electron lasers, and obviously this would be a completely different answer for other kinds of applications, I think,
funnily enough, the main limitation right now is the background noise. It turns out that even though free-electron lasers are very powerful, the diffraction from a single protein, or even a single virus, is very, very faint, and when we put the sample into the vacuum chamber we need a little bit of carrier gas, and those gas molecules diffract at the same order of magnitude, so we get about as many photons from the gas as we get from the sample. A lot of work now is focused on reducing this amount of gas, and on other ways of suppressing background, so that people can try to study as small objects as possible. From my side, I have a student who works on exactly this: some background will definitely be left, so from the algorithmic side we are trying to figure out how to deal with it in these algorithms. It is mainly in the assembly of the data into three-dimensional Fourier space that we have to think about the background, or at least about the noise being bad, because when doing that assembly we average a lot and the background will hopefully average out somewhat, so for the phase recovery itself the situation is a little better. So I would say that is the main limitation. It could also be addressed by having more powerful XFELs, but right now I think we are closer to getting significantly better by reducing the background.

Okay, thanks a lot for the nice answers. I would like to close this session now; if you have more questions you can ask them directly by email to Thomas, and you can also send feedback or questions to Dina and the organizers. The next meeting will be next week, about ptychography and the extension of CDI into scanning techniques. So thank you for being here today, and have a nice day.

Thanks Thomas.