In this first part of the data analysis lectures, we will be talking about model-free and model-based fitting. There are some other approaches, which Wojtek will introduce later, but these are the first two. I'd like to start with a quick refresher of what we have talked about these last couple of days. Basically, in small-angle neutron scattering we are looking at inhomogeneities in the scattering length density profile. We have the amplitude of the form factor, and we have the scattering length density profile, which is what we actually want to determine, because by knowing this we can extract the structural information of the dispersed particles or mesoscopic structure. As I said yesterday, when we do an experiment we measure an experimental intensity that needs to be corrected to obtain the macroscopic cross-section, which is where this structural information is contained, and therefore where we can get the structural parameters. Because the measured intensity involves the square of the amplitude of the form factor, we cannot just apply an inverse Fourier transform to get this structural information back: the sign of the amplitude, the phase information, is lost. So the inversion is not straightforward; we have to reconstruct the scattering length density profile, and that reconstruction is what we do in the data analysis.

Basically, we can use small-angle scattering to probe structures between one and a few hundred nanometers. It's important to be in the right Q range, as I said yesterday, and there are some important concepts about the composition of our sample, in terms of contrast and deuteration, that we have to keep in mind both when we plan the experiment and when we analyze the data. When it comes to the momentum transfer vector, the scattering vector Q, I just want to remind you that it is a function of the scattering angle and of the wavelength of the radiation used for the measurement. We can use the Q vector to standardize the region of interest of a given experiment: anything we measure, whether with SAXS or SANS, will show its structural features at the same Q value. If we apply Bragg's law to this equation and substitute the scattering angle and the wavelength, we see that the Q vector is a measure in reciprocal space; it is inversely related to distances in real space. This means that if we want to look at small features in our system, we have to go to high Q, and if we want to look at large features, we have to go to low Q. For example, to see an entire orange we would have to go to extremely low Q. Finally, the scattering length density is the parameter used to quantify the scattering from an ensemble of atoms. It's very handy in small-angle scattering analysis because it simplifies the problem. It's important to remember that there is a difference between neutron and X-ray scattering lengths, because each type of radiation interacts with matter in a different way: the X-ray scattering length grows linearly with the number of electrons, whereas the neutron scattering length varies irregularly across the periodic table and is also isotope dependent. We can see that deuterium and protium are very different for neutrons, whereas for X-rays they are exactly the same.
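To make these two reminders concrete, here is a minimal Python sketch of the Q calculation and of a scattering length density calculation. The scattering lengths are the standard tabulated coherent values; the angle and wavelength are just illustrative numbers.

```python
import numpy as np

def q_from_angle(two_theta_deg, wavelength_A):
    """Momentum transfer Q = 4*pi*sin(theta)/lambda (theta = half the scattering angle)."""
    theta = np.radians(two_theta_deg) / 2
    return 4 * np.pi * np.sin(theta) / wavelength_A        # in 1/Angstrom

def sld(sum_b_fm, density_g_cm3, molar_mass_g_mol):
    """Scattering length density = sum of coherent scattering lengths / molecular volume."""
    n = density_g_cm3 * 6.022e23 / molar_mass_g_mol        # molecules per cm^3
    return sum_b_fm * 1e-13 * n * 1e-16                    # fm -> cm, then cm^-2 -> A^-2

Q = q_from_angle(0.5, 6.0)                                 # 0.5 deg scattering angle, 6 A neutrons
print(f"Q = {Q:.4f} 1/A -> d = 2*pi/Q = {2 * np.pi / Q:.0f} A")   # ~690 A, i.e. large structures

# b_D = 6.671 fm, b_H = -3.739 fm, b_O = 5.803 fm (tabulated coherent values)
print(f"SLD D2O = {sld(2 * 6.671 + 5.803, 1.107, 20.02):.2e} A^-2")    # ~ 6.4e-6
print(f"SLD H2O = {sld(2 * -3.739 + 5.803, 0.997, 18.015):.2e} A^-2")  # ~ -5.6e-7
```

The large H/D difference in the last two lines is exactly the isotope effect that contrast variation exploits.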
This was just a brief reminder of what we talked about in the last couple of days. Now we're going to move on to data analysis. I'm going to present a few different things. The first will be the fitting algorithms, basically the engine we use to optimize our model, and the resolution functions that we usually apply to account for the resolution of the measurement. Then we're going to talk about a few different approaches to fitting the data. IFT, the indirect Fourier transform, is something that Wojtek will cover in more detail in the next lecture, so I will not go into it. And empirical models are something that nowadays are not widely used, so I'm just going to present a few of them very briefly so you know that they exist, but it's not an approach that is commonly used these days anymore.

Okay, so when it comes to data analysis, there are different approaches, and basically you have to pick the one that suits your purpose, because each gives a different level of detail but also involves a different level of complexity. If you want to extract lots of detail from your data, that implies a highly complex analysis, which takes time and resources. If you just want to see whether a protein is folded or unfolded, you don't need to go to, for example, simulation-assisted methods: some model-free approaches that are quick and easy to perform will answer that question and save you time. In between, you can find the different approaches that we will treat today.

First, the algorithm for optimizing the model and the goodness of fit. When you implement a model to fit your small-angle scattering data, there is a difference between the model and your experimental data, and you want an algorithm that minimizes that difference. That is the fitting algorithm. Normally we use non-linear least-squares methods, for example the Levenberg-Marquardt algorithm. This also provides a way to account for the statistical weight of each intensity value: if the experimental error on a point is bigger, the algorithm takes this into account and gives less importance to that point, because it is less well defined. To quantify the goodness of fit there are also different approaches, but one of the most common is the chi-square statistical parameter. This takes the intensity values from your model, the theoretical curve, compares them to the intensity values from your experimental data, weights each squared difference by the square of the standard deviation of your measurement (the square of the error bar), and divides by the number of points, minus the number of fitted parameters for the reduced chi-square.
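As an illustration, here is a minimal sketch of this weighted fitting and of the reduced chi-square, using scipy's Levenberg-Marquardt implementation on synthetic power-law data; all the numbers are made up for the example.

```python
import numpy as np
from scipy.optimize import curve_fit

def reduced_chi2(I_exp, I_fit, sigma, n_params):
    """Error-weighted goodness of fit: noisier points count less; a value
    near 1 means the model agrees with the data within the error bars."""
    return np.sum(((I_exp - I_fit) / sigma) ** 2) / (len(I_exp) - n_params)

# synthetic power-law data with 5% errors, just for illustration
rng = np.random.default_rng(0)
Q = np.logspace(-2, -1, 50)
I_true = 1e-4 * Q ** -2.0
sigma = 0.05 * I_true
I_exp = I_true + rng.normal(0.0, sigma)

model = lambda q, A, n: A * q ** -n
# curve_fit runs a Levenberg-Marquardt optimization by default, and the
# sigma argument applies exactly the per-point weighting described above
popt, pcov = curve_fit(model, Q, I_exp, p0=[1e-3, 1.5], sigma=sigma, absolute_sigma=True)
print("fitted exponent:", popt[1],
      " chi2_red:", reduced_chi2(I_exp, model(Q, *popt), sigma, n_params=2))
```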
So the smaller the difference between your experimental data and the model, the lower this reduced chi-square value is. When we are trying to improve our fit, what we do is minimize this chi-square parameter; for a properly weighted fit, a reduced chi-square close to one means the model describes the data within the experimental errors.

Okay, the next thing I wanted to mention is the resolution. This is something I briefly discussed in the last couple of days: yesterday I talked about the relevance of the instrument configuration for the resolution, and today I'm going to talk about how we actually implement these resolution values in the models. As I said yesterday, desmearing the resolution contribution from the experimental data is a complicated procedure. You can do it, for example, for a slit-collimated instrument such as a USANS instrument, which I think Andrew presented yesterday; there you apply a desmearing procedure to deconvolute the resolution contribution from your data. But for pinhole instruments, like a standard SANS configuration, what you normally do instead is smear the fitted model with the experimental resolution. The intensity from your model, the scattering cross-section of your particles, is convoluted with a resolution function, where the nominal Q is the mean Q value of each point and there is some spread of Q values around it. This resolution function is normally approximated by a Gaussian: you have your nominal Q value at the center and some spread delta-Q around it. This is how we obtain dQ/Q, the resolution at each Q point, which we then apply to the theoretical model; in that way we account for the resolution of the measurement.

There was a question a couple of days ago about how we do this for data coming from different techniques or instruments. For example, with SAXS we normally have very good resolution, because CCD detectors have very good spatial resolution and with X-rays you can define the wavelength very precisely, so normally you don't even apply a resolution function to X-ray measurements. But with neutrons you do have to apply a resolution function, because there is a significant difference in resolution between different instruments, and even between different configurations of the same instrument. So when we, for example, combine these two kinds of data sets, we don't apply a resolution function to the X-ray data, but we do apply one when we set up the analysis of the SANS data. In other words, we treat data coming from instruments with different resolution differently; I will come back to this briefly later. This is something to keep in mind when you are fitting small-angle neutron scattering data.
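Here is a minimal sketch of this Gaussian smearing, assuming the model has already been evaluated on the measured Q grid; real implementations evaluate the model on a finer, extended grid so the Gaussian tails are not truncated.

```python
import numpy as np

def smear(Q, I_model, dQ):
    """Smear a model curve with a Gaussian resolution function of width dQ(Q):
    I_smeared(Q_i) = sum_j R_ij * I_model(Q_j), with R_ij a normalized
    Gaussian centred on Q_i."""
    I_s = np.empty_like(I_model)
    for i, (q0, s) in enumerate(zip(Q, dQ)):
        R = np.exp(-0.5 * ((Q - q0) / s) ** 2)
        I_s[i] = np.sum(R * I_model) / np.sum(R)   # normalized weights
    return I_s

# usage: dQ is the per-point resolution column delivered with reduced SANS data
# I_fit = smear(Q, model(Q, *params), dQ)
```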
And as I said, when they send you the reduced data (we're going to see that tomorrow), we very often, though not always, have a column with these resolution values, and when we load it, the software will calculate this resolution function and apply it to your model.

So now I'm going to start presenting the different fitting approaches, beginning with the simplest ones: model-free analysis. These model-free approaches are normally a good starting point for the data analysis because they provide a rapid characterization of the scatterer, and they are pretty useful because they assume nothing about the sample. You can put something in the beamline, measure it, and apply some of these approaches to the data without any prior structural information. This comes in handy when, for example, you need a rapid assessment of your data to see whether particles are aggregating or interacting, and so on.

The first one is the scattering invariant. Here we calculate the integrated scattering cross-section, that is, the total signal that comes out of a measurement. Porod realized many years ago that this total signal is independent of how the inhomogeneities in our system are distributed. For example, if we have a system with an 80% matrix and 20% particles, the total signal is independent of how those density inhomogeneities are arranged. So we can use this approach to calculate the volume fraction of particles. Be aware that this requires data on absolute scale, so it has to be calibrated using the protocols I explained yesterday. It's a very simple approach for calculating the volume fraction of particles without knowing anything about their structure.

The next common approach is the Porod exponent. Porod, a very smart guy, realized that the signal at high Q arises from the interfacial scattering, and he found that for sharp, well-defined interfaces the scattering at high Q is always proportional to Q to the minus four. This is Porod's law. You can then take this idea and generalize it to different types of interfaces, and we have a graphical explanation of this here. If you take the logarithm of both sides of the equation, you get that the logarithm of the intensity equals the logarithm of a prefactor that relates to the surface-to-volume ratio, plus the Porod exponent times the logarithm of Q. So if we plot our high-Q data in this form and calculate the slope, we get the Porod exponent. If this exponent is minus one, we have a 1D scatterer, something very long and very thin. In the same way, minus two corresponds to a 2D object and minus four to a 3D object with a sharp interface. And for different types of polymer networks you will find different exponents that relate to the behavior of that polymer in solution.
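As a sketch, here is how both of these model-free quantities, the invariant-based volume fraction and the Porod exponent, can be computed. Note that real code has to keep track of units (cm^-1 for absolute intensity, 1/A for Q), which is glossed over here.

```python
import numpy as np

def invariant(Q, I_abs):
    """Q* = integral of Q^2 I(Q) dQ (trapezoid rule); needs absolute-scale
    data and, strictly, extrapolation beyond the measured Q range."""
    y = Q ** 2 * I_abs
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(Q))

def volume_fraction(Q, I_abs, delta_rho):
    """Solve Q* = 2 pi^2 (delta_rho)^2 phi (1 - phi), taking the dilute root."""
    c = invariant(Q, I_abs) / (2 * np.pi ** 2 * delta_rho ** 2)
    return (1 - np.sqrt(1 - 4 * c)) / 2       # valid while c <= 0.25

def porod_exponent(Q, I, q_min):
    """Slope of log I vs log Q over the high-Q region Q > q_min."""
    m = Q > q_min
    slope, _ = np.polyfit(np.log(Q[m]), np.log(I[m]), 1)
    return slope                               # e.g. ~ -4 for sharp interfaces
```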
This comes in quite handy for evaluating scattering data from polymer systems or particle systems.

The next one is the Guinier plot. What Guinier realized was that the low-Q data can be used to determine the size of the scattering particle independently of its morphology. This is the Guinier approximation: we have I(0), the scattering signal extrapolated to zero angle, and we have the radius of gyration, and by plotting the natural logarithm of the intensity versus Q squared, we can calculate the radius of gyration of the particles from the slope of that linear region. The validity of the Guinier approximation is restricted to Q times the radius of gyration being smaller than about 1.3. I will ask you why; if someone is brave enough to write it in the chat, that would be welcome, and there is a hint in these two pictures. I'll give you a couple of minutes to think about it.

In the meantime, let me show this with some data: for example, a polymer behaving as a Gaussian coil, which is this red curve, and some spheres, which is this blue curve. You can see that the Guinier region, this one here, where the size information of the particle is contained, is basically the same, even though a polymer behaving as a Gaussian coil has a very different structure from a sphere. This is what Guinier realized: when we put the data in the Guinier form, we see these linear regions overlap, because the radius of gyration of the two particles is the same even though their structures are different.

Okay, I see that no one has dared to write in the chat, so I'm going to give you the answer... ah, there is something in the chat: "Just a guess, but if you have Q squared times Rg squared... oh wait, I was thinking of Q being 1.3, but it's Q times Rg. I was thinking that if I raise it to the fourth power I get about three, then the exponential is below one and I get a decreasing intensity. But I'm just guessing." I see your reasoning, but it's actually much simpler than that. What happens is that Q times Rg gives you the limit at which you are still seeing the entire particle. If you remember the approximation using Bragg's equation in the momentum transfer equation, Q is inversely proportional to the correlation distance in your system. This means that if we are in a regime where Q times Rg is above 1.3, the measurement we are taking does not contain the entire particle: we are taking a picture of something very close to, for example, the interface of the particle, but we are not seeing the whole particle, so we cannot say how big it is, because it doesn't fit in the picture.
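Here is a minimal sketch of such a Guinier fit, with the Q·Rg < 1.3 check applied iteratively, since the valid Q window itself depends on the Rg being fitted.

```python
import numpy as np

def guinier_fit(Q, I, limit=1.3):
    """Fit ln I = ln I0 - (Rg^2 / 3) * Q^2 on low-Q data, then iterate so
    that only points inside the validity range Q * Rg < limit are used."""
    mask = Q < Q[len(Q) // 4]              # start from the lowest-Q quarter
    for _ in range(10):
        slope, intercept = np.polyfit(Q[mask] ** 2, np.log(I[mask]), 1)
        Rg = np.sqrt(-3 * slope)           # slope must be negative (no low-Q upturn)
        new_mask = Q * Rg < limit
        if np.array_equal(new_mask, mask):
            break
        mask = new_mask
    return Rg, np.exp(intercept)           # radius of gyration and I(0)
```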
Okay, so basically Q times Rg below 1.3 tells you that the entire particle is inside your picture, so you can use the approximation. This is the common way of checking that the Guinier approximation is valid, and these are just some boxes showing where the Guinier region is. The Guinier plot is useful for getting the size of your particle, but it is also very useful for a quick assessment of the behavior of your sample. For example, if there is a structure factor contribution, the low-Q data will deviate from linearity in the Guinier plot. When we have repulsive, interacting particles, the intensity drops at low Q, and when we have particles that attract each other, for example aggregating ones, we get an upturn. So if we have a system with particles that are prone to interact, we can use this as a quick check of whether they are in the dilute regime where they don't interact with each other.

Then we have the Kratky plot, which is derived from Porod's ideas. Porod's Q to the minus four dependence applies to sharp interfaces, and this was generalized: a random coil, a Gaussian coil, has a Q to the minus two dependence at high Q. So someone said: if we multiply the intensity by Q squared, a Gaussian coil will show a plateau at high Q. And they realized that you can use this for a quick assessment of how a polymer, or a protein, behaves in solution. If we see a Q to the minus two dependence, we have a Gaussian coil, a random coil, so for example an unfolded protein; and if we see this other, bell-shaped behavior, the object is globular. So you can use this as a qualitative assessment of the folding state of a protein, for example.

The next type of analysis is useful when you have peaks in your data. I have a disclaimer here: small-angle neutron scattering is not great for studying periodic structures, because we have a resolution limitation, and the lower the resolution, the more smeared these peaks will be and the harder it is to identify them and locate their positions. You can still see peaks in SANS, especially on a new instrument or a configuration with very good resolution, but this kind of analysis is normally performed with small-angle X-ray scattering. Basically, as in crystallography, the peak position relates to the d-spacing of the structure. Again using Bragg's law and the Q vector equation, the position Q_max of the first peak, the one at lowest Q and maximum intensity, is inversely proportional to the correlation length of the system, in this case the d-spacing.
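As a quick worked example (the peak position here is hypothetical):

```python
import numpy as np

q1 = 0.05                                      # hypothetical first-peak position, 1/A
print(f"d-spacing = {2 * np.pi / q1:.0f} A")   # ~126 A repeat distance

# expected relative peak positions for common phases:
lamellar  = np.array([1, 2, 3, 4]) * q1
hexagonal = np.array([1, np.sqrt(3), 2, np.sqrt(7)]) * q1
print("lamellar peaks :", np.round(lamellar, 3))
print("hexagonal peaks:", np.round(hexagonal, 3))
```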
Then you can use the Miller indices, as you do in crystallography, to work out the lattice structure in reciprocal space. For example, a lamellar phase gives peaks in the ratio 1:2:3:4. I think this example here is a lamellar phase: the first peak appears at a Q of around 0.05, the second at twice the position of the first, so around 0.1, the third around 0.15, and so on. For a hexagonal phase there is a different sequence of ratios (1 : √3 : 2 : √7), and we can use the lattice structure and the Miller indices to calculate where these peaks will show up. If someone has questions about this type of analysis, I don't have time to go into a lot of detail on each one, but I'll be happy to discuss them further if someone is interested.

Okay, the next part is the empirical models. Again, I will not spend much time on this because I think the next part is more interesting, but I want you to be aware that these exist and that you can use them to extract structural information from your system. Empirical models were developed to describe trends observed in small-angle scattering data. They are relatively simple, not as complex as full model-based analysis, but slightly more complex than the approaches I've presented so far. There are several of them; I have included some slides that you will get, so you can check them later, but I'm not going to spend too much time, I just want you to understand how they work. For example, the correlation length model: people realized that the Porod law was good at describing the high-Q behavior of polymer systems, so by building an equation that includes a Porod-type term with a Porod exponent, they could extract information from, for example, polymer systems. You fit your data to this type of model and you get different parameters: C relates to the surface-to-volume ratio, so the interfacial scattering; the Porod exponent relates to the behavior of the polymer; and the correlation length gives information about the size scale of the system. It's basically a quick way to assess the scattering from polymers. There is also a similar approach for peaks, where we fit the peak positions when we have periodic structures in the data, and a model that sits in between the correlation length model and the Gaussian peak model, for scattering from systems containing interacting particles, so something in between periodic structures and particulate systems. There are several of these, and you will have the slides to look at if you are interested.
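For illustration, here is a sketch of this functional form; it matches the "correlation length" model as implemented in SasView, with parameter names chosen for this example.

```python
import numpy as np

def correlation_length_model(Q, A, n, C, m, xi, bkg):
    """Empirical model: a low-Q power law (Porod-type / clustering term)
    plus a Lorentzian-type term with correlation length xi, plus a flat
    background. Fit with weighted least squares as shown earlier."""
    return A / Q ** n + C / (1 + (Q * xi) ** m) + bkg

# e.g.: popt, pcov = curve_fit(correlation_length_model, Q, I_exp,
#                              p0=[1e-6, 3, 1, 2, 50, 1e-3], sigma=sigma)
```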
But now I'm going to get into the more interesting part, the model-based analysis. Model-based analysis uses mathematical models that have been developed to calculate the scattering from different structures. In the same way that we know the scattering cross-section is related to the Fourier transform of the scattering length density distribution of the system, we can develop mathematical models that calculate the Fourier transform of a given scattering length density distribution, in other words, models that calculate the scattering from different structures. These models contain variables that describe the shape of the scatterer: radii, thicknesses, lengths, whatever is relevant to what you are investigating. There are two different types of models. The first is the form factor, which, as I described yesterday, relates to the shape of the scatterer: a sphere has a given form factor, an ellipsoid has a different one. And then we have the structure factors, which describe the interactions between the different particles. There is something important here: it's good to have some preliminary information about the scatterer, either its size or its shape. Sometimes doing a rough analysis with the model-free approaches, or even with IFT, is useful for selecting the appropriate model, because when you have a sphere it's pretty straightforward, but once you have more complex structures, there are so many parameters that you can fit basically any scattering curve. So it's good to know something about what you actually have in order to select the appropriate model to fit the data.

Just a brief reminder about form factor and structure factor. As I said, the form factor describes the intraparticle scattering and therefore relates to the shape of the particle, whereas the structure factor relates to the interparticle correlations, that is, how the particles are arranged in space. If we look at this scattering equation, which describes the scattering from centrosymmetric, uniform spheres, we have the form factor and the structure factor contributing to the intensity, together with the scattering contrast and some parameters that account for the concentration of particles. This leaves the form factor and the structure factor as the Q-dependent functions. The equation is only valid for spheres, as I said, but it gives you an idea of how the different terms enter: the measured intensity is the product of these two Q-dependent terms, and the idea can be generalized to more complex structures; this is just the simplest possible case.

In terms of form factors, you can calculate the scattering from different structures, as I said. For example, this is how the scattering looks for a cylinder, an ellipsoid, and a sphere. As you can see, the cylinder has two linear regions, one at low Q and one at high Q, because there are two correlation lengths: one related to the length of the cylinder and one related to the size of its cross-section. The sphere, on the other hand, has only one linear region, which correlates with its radius: only one correlation length is needed to describe that shape.
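To make this concrete, here is the classic homogeneous-sphere form factor as a short Python sketch; the conversion of the absolute intensity to cm^-1 is omitted.

```python
import numpy as np

def sphere_pq(Q, R):
    """Normalized sphere form factor P(Q) = [3 (sin x - x cos x) / x^3]^2, x = QR."""
    x = Q * R
    return (3 * (np.sin(x) - x * np.cos(x)) / x ** 3) ** 2

def sphere_intensity(Q, R, phi, delta_rho, bkg=0.0):
    """Dilute spheres: I(Q) = phi * V * (delta_rho)^2 * P(Q) + background."""
    V = 4 / 3 * np.pi * R ** 3
    return phi * V * delta_rho ** 2 * sphere_pq(Q, R) + bkg
```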
As you can see, by playing with the different models you get different scattering signals. Another thing you can obviously do is play with the parameters used to calculate the scattering from a given model. For example, here is a cylinder with different lengths: this one is a short cylinder and this one is a long cylinder, and for the long cylinder we don't even see the plateau at low Q, because the experiment does not reach the Q_min required to see that second linear region. So by playing with the different parameters, you can see how the function changes. Tomorrow we will go into a bit more detail, but I just wanted you to know that there are different types of structural models with different parameters, and that we can build models of different complexity. This is a screenshot from a review by Jan Skov Pedersen on structural models for small-angle scattering data analysis, and it shows just a few of them. As you can see, they range from very simple shapes, like a sphere, to very complex ones, like polydisperse star polymers with Gaussian statistics, and as the complexity increases, the models acquire more and more parameters.

The other type of models are the structure factors: mathematical models that describe the interactions between particles. When you have many particles in solution but they are not interacting, that is, when you are in the dilute regime, there is no interparticle contribution to the scattering, S(Q) is equal to one, and the form factor relates directly to the experimental scattering. But when we increase the concentration, the particles start to sense each other and interact, and some kind of supra-particle arrangement appears; that is when S(Q) becomes different from one. As we can see, this is the form factor, this is the structure factor, and our experimental data is the combination of these two contributions. There are different types of interactions between particles in a matrix: we can have, for example, electrostatic interactions, or hard-sphere, purely excluded-volume interactions, and they affect the data in different ways. Again, this red curve is our form factor. If we apply a hard-sphere interaction, excluded-volume effects, we see one kind of change in the scattering signal, and if the same particles interact through electrostatics, the scattering is affected in a different way. So we need different models for different types of interaction. These examples are for repulsive potentials, but there are also models for attractive potentials that describe, for example, coalescence, aggregation, or sticky interactions. As I said, one of the main contributions to the structure factor is the concentration, because the more particles we have in solution, the more prone they are to interact.
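As a sketch, here is the standard Percus-Yevick expression for the hard-sphere structure factor; in the dilute limit it reduces to S(Q) = 1, as described above.

```python
import numpy as np

def hard_sphere_sq(Q, R_hs, eta):
    """Percus-Yevick structure factor for hard spheres of radius R_hs at
    volume fraction eta (textbook closed-form expression)."""
    A = 2 * Q * R_hs
    a = (1 + 2 * eta) ** 2 / (1 - eta) ** 4
    b = -6 * eta * (1 + eta / 2) ** 2 / (1 - eta) ** 4
    c = eta * a / 2
    G = (a * (np.sin(A) - A * np.cos(A)) / A ** 2
         + b * (2 * A * np.sin(A) + (2 - A ** 2) * np.cos(A) - 2) / A ** 3
         + c * (-A ** 4 * np.cos(A)
                + 4 * ((3 * A ** 2 - 6) * np.cos(A)
                       + (A ** 3 - 6 * A) * np.sin(A) + 6)) / A ** 5)
    return 1 / (1 + 24 * eta * G / A)

# for monodisperse spheres the measured intensity goes as P(Q) * S(Q);
# as eta -> 0, S(Q) -> 1 and only the form factor remains
```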
For example, if we have our form factor here, which is the same for all particles over the whole concentration range, and we apply the same structure factor model, a hard-sphere repulsion, at different volume fractions, we see that in the dilute regime there is no difference between the form factor and the scattering data, a direct relationship, but as the volume fraction increases, the structure factor affects the data more and more: a peak appears at higher Q, and there is a drop in intensity at low Q. This is characteristic of repulsive interaction potentials.

I'm just going to mention very briefly that most of these potentials are derived for spherical particles, because spheres are relatively simple: there are not many ways for them to interact, so to speak. But as soon as you have anisotropy in the system, there are correlations between the orientations and the positions of the particles, everything becomes very complex, and it is very difficult to derive the structure factor analytically. So what we do is apply different types of approximations to calculate the interactions for particles with a certain degree of anisotropy or polydispersity. There are different ones for different purposes: for example, the decoupling approximation is for polydisperse and anisotropic particles, and the random phase approximation is for interacting polymers. Here you can see the mathematical form of the decoupling approximation: the structure factor is modified by this beta term, which is built from the amplitude of the form factor of the particle. It is an approximation we use to simplify the calculation of the scattering arising from interparticle interactions.

The next thing I'm going to mention briefly is that everything I have said so far applies to 1D data analysis. That means that when we look at our scattering data on the detector, there is no anisotropy, because the particles are randomly oriented and we are measuring an average over all orientations in the system; we then calculate the radial average of this isotropic pattern and get the 1D data. What sometimes happens, for example when we have magnetic particles and apply a magnetic field, or when we apply shear to other particles, is that the scattering becomes anisotropic, because the particles start to align. There is then a preferential orientation of the particles, which shows up as anisotropy in the scattering pattern. There is a way to fit this using 2D fitting, and what it does is use form factors that account for the orientation of the particles with respect to the Q vector: we have this angle alpha that describes how the particles are oriented relative to Q.
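Here is a sketch of this idea for a cylinder, whose form-factor amplitude depends explicitly on the angle alpha between its axis and Q; the 1D analysis averages over alpha, while the 2D analysis keeps it. The radius, length, and angles below are arbitrary example values.

```python
import numpy as np
from scipy.special import j1
from scipy.integrate import quad

def cylinder_amp(Q, alpha, R, L):
    """Form-factor amplitude of a cylinder (radius R, length L) whose axis
    makes an angle alpha with the Q vector."""
    x = Q * L * np.cos(alpha) / 2
    y = Q * R * np.sin(alpha)
    return np.sinc(x / np.pi) * 2 * j1(y) / y   # np.sinc(t) = sin(pi t)/(pi t)

def cylinder_pq_1d(Q, R, L):
    """1D case: average over all orientations of randomly oriented cylinders."""
    return quad(lambda a: cylinder_amp(Q, a, R, L) ** 2 * np.sin(a), 1e-6, np.pi / 2)[0]

# 2D case: for aligned cylinders alpha is kept explicit, so the intensity
# differs along vs across the alignment direction
Q = 0.02
print(cylinder_amp(Q, np.radians(5), 20, 400) ** 2,    # Q nearly parallel to the axis
      cylinder_amp(Q, np.radians(85), 20, 400) ** 2)   # Q nearly perpendicular
print(cylinder_pq_1d(Q, 20, 400))                      # the orientational average
```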
So we can use this to analyze the data from anisotropic systems and, for example, look at the alignment of particles under shear or in magnetic fields. This is not very common, so I don't want to spend too much time on it.

In the last part of the lecture, I'm going to give you some tips that I have gathered over the years I've spent analyzing small-angle neutron scattering data. When I was a PhD student, the main expertise of my supervisor was small-angle scattering, but as the PhD student I had to do all the work, so I had to fit all of the data. During my PhD and postdoc, and also in my time as a researcher, I have analyzed a lot of small-angle scattering data, mainly SANS data, and by now I have a strategy that works for me. That doesn't mean it's the best, or that it will work for everyone; it's just what I do. But maybe you can use these tips and suggestions to build your own approach.

The first thing I do when I get some small-angle scattering data is look at the features that appear in the data. For example, I look for Bragg peaks, which tell me that we have periodic structures, some crystalline order on the mesoscopic scale. I look for bumps in the curve, which point to the presence of a structure factor contribution. For example, if I look at this red curve, I'm fairly sure I can analyze it using just a form factor model; but if I see this peak, this bump here, I know I will need some kind of structure factor model combined with the form factor. Another thing I do is look for Guinier regions. As I said before, if we have something spherical or close to spherical, we will see one Guinier region, as in this yellow curve here; but as soon as we have more elongated objects, two different correlation lengths are needed to describe them, and therefore we see two Guinier-like regions: one at high Q related to the cross-section, and one at low Q related to the length of the particle. That already tells me something about the structure, namely that two correlation lengths are needed to describe these particles.

A useful tool is to normalize the data. For example, here the data are normalized to the concentration. Normally the best way is to normalize to the volume fraction of particles, because when we present the data on absolute scale we can divide by the volume fraction: if we know how much particle or protein we put into the system, we can calculate the expected volume fraction and normalize everything. There is information in this. For example, here, where we measured different concentrations, I can see that the curves overlap at high Q and differ at low Q. This tells me that the high-Q feature, in this case the cross-section, does not change with concentration, but that there is a concentration-dependent change in the low-Q signal, in the large-scale features of the system.
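A toy sketch of this normalization; the curves and volume fractions are synthetic, chosen only to show the collapse at high Q and the split at low Q.

```python
import numpy as np
import matplotlib.pyplot as plt

Q = np.logspace(-2.3, -0.7, 120)
P = np.exp(-(Q * 40) ** 2 / 3) + 1e-3            # toy form factor, fixed cross-section
for phi in (0.01, 0.02, 0.05):
    S = 1 / (1 + 10 * phi * np.exp(-(Q * 150) ** 2))   # toy repulsive S(Q): low-Q dip
    plt.loglog(Q, phi * P * S / phi, label=f"phi = {phi}")  # divide out the volume fraction
plt.xlabel("Q (1/A)"); plt.ylabel("I / phi"); plt.legend(); plt.show()
# after dividing by phi the curves overlap at high Q (same cross-section)
# but separate at low Q, where the concentration-dependent S(Q) kicks in
```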
And the same applies to the presence of peaks. Then, something I also normally do is use some standard plots and model-free analysis for a quick evaluation of the data. I normally start with a Porod plot, for example when looking at polymer systems, to get the slope at high Q, or when there is a hierarchical structure, like a cylinder with its cross-section and its elongation: different features in this plot give information about the shape. The Guinier plot, as I mentioned before, is good for determining whether there is a structure factor contribution: with no interactions it is perfectly linear at low Q; with a repulsive potential there is a drop in intensity; with an attractive potential there is an increase. So we can use it to quickly check for these types of interaction. And the Kratky plot, as I said, can be used as a qualitative way of looking at the folding state of a particle, a polymer, or a protein: if it is globular, it shows this bell-shaped curve; if it is unfolded, like a Gaussian chain, it shows this high-Q plateau.

After that I already have some structural information that I can use to select models that might be suitable for analyzing the data. What I do is pick one contrast with full contrast, for example protiated particles in a deuterated matrix, so that we have the largest difference in scattering length density, and I use it to try some simple models and see how well they fit the data. As you can see here, we have three samples with three different structures, and I tried ellipsoids, polydisperse spheres, and cylinders. For this blue signal, for example, you can see that this model is not good at all at low Q, and at high Q you can also see that some models fit better than others. A useful tool for this comparison is to check the chi-square values, because some of these models may look very similar when you inspect them visually, yet show large differences in chi-square, that is, in goodness of fit. So the goodness of fit is normally a good criterion for deciding which model describes the data best.
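As a sketch of this model-comparison step, here two candidate shapes are fitted to the same (synthetic) data and ranked by reduced chi-square; in practice the candidate models would come from a library such as SasView rather than being hand-coded.

```python
import numpy as np
from scipy.optimize import curve_fit

def red_chi2(I, I_fit, sigma, k):
    return np.sum(((I - I_fit) / sigma) ** 2) / (len(I) - k)

def sphere(Q, R, scale):
    x = Q * R
    return scale * (3 * (np.sin(x) - x * np.cos(x)) / x ** 3) ** 2

def coil(Q, Rg, scale):                     # Debye function for a Gaussian coil
    x = (Q * Rg) ** 2
    return scale * 2 * (np.exp(-x) - 1 + x) / x ** 2

# synthetic "experimental" data: spheres of R = 50 A with 5% noise
rng = np.random.default_rng(1)
Q = np.logspace(-2, -0.8, 80)
sigma = 0.05 * sphere(Q, 50, 1.0) + 1e-5
I_exp = sphere(Q, 50, 1.0) + rng.normal(0.0, sigma)

for name, f in [("sphere", sphere), ("Gaussian coil", coil)]:
    popt, _ = curve_fit(f, Q, I_exp, p0=[40.0, 1.0], sigma=sigma, absolute_sigma=True)
    print(name, red_chi2(I_exp, f(Q, *popt), sigma, k=2))   # lower wins
```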
The third step is to try more detailed mathematical models to fit the data. Here we want to think about how our particle behaves in solution. Is it just a nanoparticle that is totally uniform? Does it have some kind of core-shell structure, or several shells? Or does it have a complex interface, in which the scattering length density profile is not a well-defined step but something more diffuse? Then we can pick more complex models to fit the data: core-shell structures, vesicles, things like that. But we have to think about what we expect in terms of structure in order to select these more complex models. It's important, at the beginning and whenever you can, to look at the scattering in the dilute regime, because then we only need a form factor and we can neglect the structure factor contribution. This is not always possible, though, and sometimes we have to include a structure factor to account for the interactions between particles; for example, here I'm including a hard-sphere structure factor to show how it affects the scattering data.

And as I said before, it is often easy to get a model to go through your experimental data, but that doesn't mean it's a good fit: sometimes you have so many parameters that you can get almost any model to fit almost any data. As I said a couple of days ago, a good way around this limitation is to use different contrasts and to combine different techniques, for example SANS and SAXS, or different isotopic labeling in the SANS measurements. For example, here I'm looking at a core-shell structure, and I took three measurements: one resolving the entire particle, one resolving only the shell, and one resolving only the core. By simultaneously fitting these three data sets to the same model, we get a very robust answer about the structure we are actually studying. And this is the last step I follow: once I've decided which form factor fits the data and which structure factor accounts for the interparticle contribution, so that I have a clear idea of what the particle looks like, I try to simultaneously fit all of the data. Sometimes you get a good answer, sometimes you don't, and then you have to go back to the previous step and reconsider the form factor or structure factor. So this is the final step, but there is feedback into the previous analysis steps.
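Before moving on to error analysis, here is a minimal sketch of such a simultaneous fit: a hand-coded core-shell sphere model in which the structural parameters are shared between contrasts, while each data set keeps its own (known) scattering length densities. The dataset list and starting values are placeholders.

```python
import numpy as np
from scipy.optimize import least_squares

def f_sph(x):
    """Sphere amplitude function, 3*(sin x - x cos x)/x^3."""
    return 3 * (np.sin(x) - x * np.cos(x)) / x ** 3

def core_shell(Q, Rc, t, rho_c, rho_s, rho_solv):
    """Core-shell sphere intensity (unnormalized): two concentric spheres,
    each weighted by its contrast against the next layer out."""
    Rt = Rc + t
    Vc, Vt = 4 / 3 * np.pi * Rc ** 3, 4 / 3 * np.pi * Rt ** 3
    F = (rho_c - rho_s) * Vc * f_sph(Q * Rc) + (rho_s - rho_solv) * Vt * f_sph(Q * Rt)
    return F ** 2

def residuals(p, datasets):
    """Global fit: Rc and t are shared across all contrasts; each dataset
    carries its own known SLDs and gets its own scale factor."""
    Rc, t, *scales = p
    res = []
    for (Q, I, sig, rho_c, rho_s, rho_solv), s in zip(datasets, scales):
        res.append((s * core_shell(Q, Rc, t, rho_c, rho_s, rho_solv) - I) / sig)
    return np.concatenate(res)

# datasets = [(Q1, I1, sig1, rc1, rs1, rsolv1), ...]   # one tuple per contrast
# fit = least_squares(residuals, x0=[50, 20, 1, 1, 1], args=(datasets,))
```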
Finally, you have to determine the errors on the fitted parameters. Normally the fitting algorithm will report some errors, but sometimes these are not realistic: they are just what the mathematical algorithm returns, and that doesn't mean the answer is physically meaningful. Imagine that the resolution of a SANS experiment is, say, plus or minus one Ångström, and the fit reports an error of 0.001 Ångström; obviously the real uncertainty on that structural parameter is larger than 0.001, simply because the resolution of the instrument does not allow that level of detail. So be aware that the reported errors are not always representative of what you are actually studying. What you can do is take the parameters you obtained and use different types of algorithms to estimate the errors, from something simple like Levenberg-Marquardt up to more complex error analysis, for example Markov chain Monte Carlo methods, which will give you correlations between parameters and a more robust error analysis.

Okay, this is the last thing. I know there are lots of ideas and concepts here, so if you have any questions you can ask now; for more specific questions you can always email me or ask in the private chat. I just want to leave you with some important take-home messages. The analysis of neutron scattering data can often be complex, but the level of complexity should depend on the purpose of your investigation: there is no need to spend months on simulation-based approaches if you just want to see whether a protein is folded or unfolded under different conditions. Try to avoid overfitting, and don't spend more time fitting the data than you actually need. It's good to establish a fitting strategy that works for you: I have one that I normally follow, but you should build your own, something you feel comfortable with and can optimize until you are proficient at it.

Then there are some useful links you might want to use. For model-based fitting of SANS and SAXS data, you have software like SasView, which is the one we're going to use tomorrow, and other options like SASfit. For data analysis using the IFT method, which Wojtek will present later, you have software like ATSAS and BIOISIS. For simulation-based fitting, there are different platforms like Rosetta and SASSIE that calculate scattering profiles in different ways. Then you have links to SLD calculators, including ones for proteins; the compilation of neutron scattering lengths and cross-sections on the NCNR website; and the SAS toolbox, which is always a good go-to resource for the different aspects of data analysis and also of experimental design; it basically contains everything. That is all from me today. If you have any questions, you can ask them now in the chat, or email me later if they come to mind when you are looking at these slides in the future. Thank you very much for your attention, guys.