Good afternoon, everyone, and welcome to the session. My name is Milica Todorović, I come from the University of Turku in Finland, and I am one of the organizers of this meeting, so I hope you are all enjoying it. The first speaker in this afternoon session is Lars Banko from the Ruhr-University Bochum, and he will be telling us about machine learning and data-driven studies on experimental data. Thank you very much, go ahead, Lars.

Hello everybody. Can you understand me okay? Also in the back? All right. Thank you for the invitation and for the opportunity to present a little bit of our experimental work here. I would like to start with a brief introduction to what we are doing at the Chair for Materials Discovery and Interfaces. This is the chair of Professor Alfred Ludwig, and I am a postdoc in this group. What we do is combinatorial materials science: we use combinatorial magnetron sputtering to create composition gradients and also to synthesize process gradients. We then perform high-throughput characterization; for example, we can measure structural properties using electron microscopy or X-ray diffraction, but also electrical, optical and magnetic properties, and we are also looking into shape memory alloys and thermoelectrics.

Let me briefly introduce magnetron sputtering. On the left you can see such a sputtering chamber from the inside. We have four cathodes here; it could also be more than four, five, six, seven or eight cathodes. On these cathodes you place the base materials from which you want to create a composition gradient, and the substrate sits in the center. As you can see in the video, we sputter from the individual sources simultaneously, and if we keep the substrate static, we create composition gradients. So instead of having a single chemical composition, these materials libraries on a silicon wafer contain continuous gradients of composition. We then put a coordinate grid over them; our standard grid has roughly 340 measurement points that can be characterized using automated high-throughput techniques. So you have the gradients over the substrate area, you can screen them for the property of interest and thereby map chemical composition to materials properties.

The typical way we describe the high-throughput materials innovation cycle is that we start with some kind of hypothesis, for example from theory colleagues. Then we go to the lab, synthesize a set of materials libraries and perform the high-throughput characterization I just mentioned. From these relatively large data sets we then perform data analysis: first visualize the data, look at it, try to identify trends, select a region of interest containing a material with the properties we are looking for, and then perform an in-depth characterization of those regions of interest with more expensive techniques like atom probe tomography or transmission electron microscopy. Based on that we can process the materials libraries further, for example achieve different processing states by annealing or oxidizing them, repeat the cycle, and hopefully identify the new material of the future.
Artificial intelligence offers several opportunities in this materials innovation cycle. First of all, it can speed up the data analysis, but it also helps to visualize large data sets; dimensionality reduction, for example, is really a game changer here, because you can suddenly compare relatively large data sets, which used to be quite complicated. Active learning is a good opportunity to speed up the analysis further, for example when we want to identify synthesis-composition-property relationships, and AI can also guide the experimentation itself. We have been investigating several such use cases over the past years, and I would like to present some of them today.

The first one is identifying the influence of process parameters on the microstructure of thin films. Let me start with the background. The magnetron sputtering process involves rather complicated physics. Inside the reactor we have the cathode with the target, which is the material we want to deposit. At the target there are atomistic processes going on: we accelerate argon ions towards the target, material is removed from it and transported through the plasma, and during transport there are interactions with other plasma species, including collisions in which the sputtered particles lose energy. The particles then land on the substrate and condense there; depending on the process parameters there can be surface diffusion, and finally the particles nucleate and the film grows. The outcome depends on how you choose process parameters like the ion energy, the degree of ionization of the sputtered particles, the substrate temperature and, in general, the chemical composition, but also on impurity elements. The result is a thin film with a particular microstructure and structural properties, for example morphology, roughness, density and grain size, but also crystal-structure-related properties, all of which change with the parameters. The idea is to handle this complexity of the parameter space by bypassing the physical processes inside the plasma reactor with an artificial intelligence model.

What has typically been done in the past are so-called structure zone diagrams, introduced by John Thornton in the 1970s. These are low-dimensional representations of the possible microstructure types you can achieve in a magnetron sputtering process, depending on only two parameters: the pressure and the deposition temperature relative to the melting temperature of the material. They were originally drawn up for refractory metals. They work like this: if you would like a microstructure that is porous and not very dense, you go to high pressure and low temperature; then the diffusion of the adatoms is reduced, the particles lose a lot of energy in the plasma, and you end up with a porous microstructure. If you would like a dense microstructure, you increase the temperature and reduce the pressure, because then you increase the ion energy and the film is densified by the bombarding species; this kind of microstructure is typically columnar. If you increase the temperature even further, you end up with large grains similar to bulk materials.
That was a relatively simple model, but they had to do a lot of experiments to sketch out these structure zone diagrams. Modern plasma reactors have many more parameters that can be varied individually, so the parameter space is much larger. The idea was therefore: can we use an AI model to predict such microstructure zone diagrams? This was a collaboration with Yury Lysogorskiy and Ralf Drautz from ICAMS. We ended up using a conditional generative adversarial network (cGAN), where the process parameters are the conditional inputs, and we trained it on experimental microstructure images from a data set of chromium-aluminum oxynitride ceramic thin films.

As a first check we looked at whether the model is able to predict the kind of microstructure at all. There is a lot of variation in microstructure, and microstructure images are rather complex to describe and to grasp. The first impression was that the conditional GAN reproduced the microstructure images in the data set relatively well. If you have a faceted microstructure, for example, you can clearly see a similar microstructure in the cGAN output; if the grains get smaller, they are also predicted smaller. It becomes a bit problematic when the features get too large, because then the model tends to produce a somewhat finer microstructure than it should.

That was just to see whether it works at all. The next question was whether we can also interpolate or predict within this five-dimensional parameter space and obtain different microstructural outcomes. For that we vary two parameters and keep three constant, which lets us produce microstructure overview maps. It is a little hard to see the differences here, but if you take a close look you can see that the microstructure varies with the deposition temperature on one axis and the aluminum concentration in chromium aluminum nitride on the other. By itself this is a little tough to use, but in principle you could identify the microstructure you would like to achieve and then select the deposition parameters with which you are most likely to achieve it.

So what we then did was to train a convolutional neural network to predict the microstructure type from a microstructure image, as a classification model. Since the GAN has a randomized latent input, it can produce multiple variations of the microstructure for each condition, and the CNN can then classify each of these microstructural representations. This gives you a probabilistic microstructure map. With such a map you can say, for example: if I want a fine-grained microstructure, I should increase the aluminum concentration as far as possible and keep the temperature relatively low; if I want a faceted microstructure, the probability is highest at higher temperature and up to a certain aluminum concentration. You can also find the boundaries that correspond to physical boundaries; for example, if the aluminum concentration is too high you change to a hexagonal structure, which you would like to avoid. You can therefore restrict yourself to this region and use the maps to optimize the process parameters. This work came out of my PhD thesis.
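To make the pipeline just described more concrete, here is a minimal, hypothetical sketch in Python of how a trained conditional generator and a microstructure classifier could be combined into a probabilistic microstructure map. The architectures, names and numbers (N_COND, N_LATENT, the tiny networks, the example condition) are illustrative assumptions, not the implementation used in the study; the models are untrained stand-ins so the snippet runs end to end.

```python
# Hypothetical sketch of the cGAN -> CNN pipeline (not the authors' actual code).
import torch
import torch.nn as nn

N_COND = 5        # process parameters, e.g. temperature, pressure, bias, Al content, ...
N_LATENT = 64     # random latent vector of the GAN
N_CLASSES = 4     # microstructure types, e.g. porous, fine-grained, columnar, faceted
IMG = 64          # edge length of the generated SEM-like image

class Generator(nn.Module):
    """Maps (latent z, process conditions) -> microstructure image."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_LATENT + N_COND, 256), nn.ReLU(),
            nn.Linear(256, IMG * IMG), nn.Tanh(),
        )
    def forward(self, z, cond):
        x = torch.cat([z, cond], dim=1)
        return self.net(x).view(-1, 1, IMG, IMG)

class Classifier(nn.Module):
    """CNN that predicts a microstructure type from an image."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.MaxPool2d(4),
            nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(4),
        )
        self.head = nn.Linear(16 * (IMG // 16) ** 2, N_CLASSES)
    def forward(self, img):
        return self.head(self.features(img).flatten(1))

def microstructure_probabilities(gen, clf, cond, n_samples=100):
    """Sample many latent vectors for one process condition and average the
    classifier's softmax output: one entry of the probabilistic map."""
    z = torch.randn(n_samples, N_LATENT)
    cond = cond.expand(n_samples, -1)
    with torch.no_grad():
        imgs = gen(z, cond)
        probs = torch.softmax(clf(imgs), dim=1)
    return probs.mean(dim=0)  # probability of each microstructure type

gen, clf = Generator(), Classifier()                    # stand-ins for trained models
cond = torch.tensor([[500.0, 0.5, -50.0, 0.3, 0.0]])    # one hypothetical condition
print(microstructure_probabilities(gen, clf, cond))
```

Sweeping two of the condition entries over a grid while keeping the rest fixed would reproduce the kind of probabilistic microstructure map described above.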
The next thing I would like to show is how we use AI agents for X-ray diffraction analysis, although it could be any kind of spectrum-like data. In our approach, X-ray diffraction is used for crystal structure classification. We have a machine that can measure all 340 measurement areas fully automatically and produce X-ray diffractograms, which are characteristic of the crystal structure present. The challenge is that to classify the crystal structure you typically compare the experimental XRD pattern with reference patterns from the literature. But imagine having to do this 340 times, with perhaps 20, 30 or 50 possible phases from the literature to choose from. That is a lot of comparisons, it is a tedious task, and it used to take us months.

The essential problem is that the quality of X-ray diffraction data from thin films is not always very good. Depending on the growth parameters I just explained, the crystals can grow in a preferred direction, so you have texture in your XRD pattern. That means that where you would typically have, say, five characteristic peaks, a film growing in only one direction gives you a single measurable peak and the others are not visible. We therefore lose a lot of information through the film growth and the subsequent measurement. So you basically have a trade-off between the number of possible phases from the reference structures on one side and the data quality of the X-ray diffraction on the other. If you have high data quality, you can probably classify among a large number of possible phases. The worst case is very low quality combined with a high number of possible phases, and this is often the case, especially in superalloys or high-entropy alloys, where the chemical space is relatively large and there are therefore also lots of possible crystal structures. This calls for a model that can handle the uncertainty contained in the X-ray diffraction data, and that is what we tried to achieve in this study.

Let me give you some brief examples of the kinds of aberrations we can have in thin-film patterns. Typically, for a material system like cobalt-nickel-aluminum, you would go to a reference database, look up all possible combinations of cobalt, nickel and aluminum, and check which crystal structures are theoretically possible. You obtain stick patterns, where the individual peaks are plotted at their diffraction angles, and these are what you would normally compare to the experimental pattern. The first challenge is that if the film grows with a preferred orientation, only certain peaks appear and others are completely invisible; the texture changes the relative intensity ratios of the peaks. Another problem is that two different crystal structures can, due to texture and strain, produce essentially the same X-ray diffraction pattern. And of course there is noise: depending on the grain size, small grains give you a lot of noise and a relatively low signal. These are the problems we are dealing with.
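As an illustration of how a reference stick pattern can be expanded into many plausible thin-film patterns with texture, strain, broadening and noise, here is a small hedged sketch. The augmentation ranges, the Gaussian peak shape and the example peak list are assumptions made only for this sketch and are much simpler than the physically correct simulation used in practice.

```python
# Minimal, hypothetical augmentation of a reference "stick" pattern into
# thin-film-like patterns (texture, strain shift, broadening, noise).
import numpy as np

rng = np.random.default_rng(0)
two_theta = np.linspace(20, 80, 1024)                  # diffraction angle grid (deg)

# Hypothetical reference phase: peak positions (deg) and relative intensities
ref_positions = np.array([38.2, 44.4, 64.6, 77.5])
ref_intensities = np.array([1.0, 0.45, 0.25, 0.30])

def simulate_pattern(positions, intensities):
    strain_shift = rng.normal(0.0, 0.15)                     # uniform strain -> peak shift
    texture = rng.lognormal(0.0, 0.8, size=len(positions))   # preferred orientation
    width = rng.uniform(0.1, 0.6)                            # grain-size broadening
    pattern = np.zeros_like(two_theta)
    for pos, inten, tex in zip(positions + strain_shift, intensities, texture):
        pattern += inten * tex * np.exp(-0.5 * ((two_theta - pos) / width) ** 2)
    pattern += rng.normal(0.0, 0.02, size=two_theta.size)    # counting noise
    pattern -= pattern.min()
    return pattern / pattern.max()                           # normalize to [0, 1]

# In the real workflow on the order of 100,000 variations per phase are generated
synthetic = np.stack([simulate_pattern(ref_positions, ref_intensities)
                      for _ in range(1000)])
print(synthetic.shape)
```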
So we need a model that can handle this, and this brings me to the crystallography companion agent (XCA) for autonomous phase identification. The idea is to use a synthetic data set: we simulate physically correct X-ray diffraction patterns that cover all the variations we can observe experimentally, roughly 100,000 patterns per phase in a given material system. We then train an ensemble of neural networks on this synthetic data set to classify the structure, apply it to the real data, and perform the phase classification with the ensemble. The ensemble vote gives you a probabilistic output for the phase classification.

This is how the pipeline looks: we gather the reference structures, for example from the ICSD, simulate the data set, and train the ensemble networks. We then use two kinds of phase probabilities: one based only on the X-ray diffraction pattern, and a second one that also takes the chemical composition into account, so we can perform a kind of multimodal analysis. The idea is that the companion agent only gives a rough estimate of which structure is present, so that the human can decide in the end what the most likely crystal structure is.

What we also observed is that, compared to a single learner that is essentially always 100% certain, the ensemble's certainty scales with the variations in the X-ray diffraction data. If you artificially decrease the quality of the data, you see the certainty of the ensemble networks go down as well, which is exactly what we wanted to achieve. Another thing we investigated is that the model becomes better with increasing data set size, so it really is the large number of physically correct simulations that enables good classification.

Here is the final result on the experimental data set. You see a ternary plot of a nickel-cobalt-aluminum materials library; we were able to cover almost 100% of the composition space with one materials library on a 100-millimeter wafer. These are the typical 340 measurement points at which an X-ray diffraction pattern is taken, and the ternary diagram shows the phase probability, first based only on the X-ray diffraction; this example is for (Ni,Co)3Al. The probability spreads out quite widely, because crystal structures can be similar, so there can be a match based purely on the crystal structure. However, given the chemical composition of this phase, the classification should really be bounded to the region where the chemical composition makes it plausible that the phase exists there. If you include the agreement between the composition of the reference phase and the actually measured composition, you can reduce the classification to a reasonable region of the composition space.
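A hedged sketch of how the two probabilities mentioned above could be combined: an ensemble vote on the diffraction pattern, and a weighting by how plausible each candidate phase is given the measured composition. The phase names, probabilities, compositions and the Gaussian weighting kernel are illustrative assumptions, not the published XCA implementation.

```python
# Illustrative combination of an XRD ensemble vote with a composition weighting.
import numpy as np

phases = ["fcc (Ni,Co)", "B2 NiAl", "(Ni,Co)3Al"]

# Softmax outputs of an ensemble of independently trained networks for ONE
# measured pattern (rows = ensemble members). Numbers are made up.
ensemble_probs = np.array([
    [0.55, 0.05, 0.40],
    [0.60, 0.10, 0.30],
    [0.30, 0.15, 0.55],
])
p_xrd = ensemble_probs.mean(axis=0)            # ensemble vote; spread = uncertainty

# Nominal composition of each reference phase and the measured composition
# (atomic fractions Ni, Co, Al) -- illustrative values only.
phase_comp = np.array([[0.50, 0.45, 0.05],
                       [0.50, 0.00, 0.50],
                       [0.55, 0.20, 0.25]])
measured = np.array([0.60, 0.18, 0.22])

# Composition plausibility: a simple Gaussian kernel on composition distance.
sigma = 0.15
p_comp = np.exp(-np.sum((phase_comp - measured) ** 2, axis=1) / (2 * sigma ** 2))

p_joint = p_xrd * p_comp
p_joint /= p_joint.sum()                        # renormalize to a probability

for name, p in zip(phases, p_joint):
    print(f"{name:>12s}: {p:.2f}")
```

The composition factor is what confines a phase like (Ni,Co)3Al to the part of the ternary diagram where its composition is actually reasonable.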
For comparison, here is the analysis performed by a human expert. My colleague was able to classify more than 20 different phase regions manually, which probably took a very long time, and you can achieve similar results using XCA, the crystallography companion agent. We also benchmarked this approach against other models; there is the physics-informed model by Oviedo et al., which simulates X-ray diffraction patterns in a non-physical way, in the sense that the patterns are unphysical but still somehow representative of the data. We compared using the ensemble method on the data approach from autoXRD, XCA with the data from autoXRD, and the other way around. The take-home message is that only if you use the physically correct simulation together with the ensemble do you achieve relatively high accuracy.

There can still be edge cases, for example a pattern that is not represented in your data set. If you produce a material that has never been synthesized before, it will probably be hard to identify; if you only look for the typical phases, you will never find a reference structure to compare it to. So the question is whether we can find a model that helps us detect out-of-distribution data, thereby informing the scientist: this X-ray diffraction pattern is very unlikely to be classified correctly by your model, because it is not represented in the training data. For this we used a variational autoencoder to learn representations of X-ray diffraction patterns, which turns out to be helpful for a lot of things.

We took the same pipeline: reference structures from the databases, a relatively large simulated synthetic training data set, and trained a variational autoencoder on it. Afterwards we have a representation of crystal structures; shown here is the synthetic data set of FCC, BCC and HCP structures. In this representation you can see the symmetry of the crystals relatively nicely: for an FCC pattern in this angular range there are basically three peaks that occur. The representation is nice because you can use it on the fly for a rough estimate of structural similarity. For example, the red and the blue patterns here are textured along a certain axis and therefore produce relatively similar patterns, and we can observe this directly in the latent space: there is proximity between the two, but they are still somewhat separated because they belong to different classes of materials. This makes it interesting for on-the-fly visualization during the measurement, or for a very quick overview analysis. You can also use this representation to train a classifier; this was just a plain, simple k-nearest-neighbors classifier. Depending on where in this latent space an X-ray diffraction pattern is located, you can make a rough estimate of the crystal structure.
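Here is a minimal, hypothetical sketch of this idea: encode 1D diffraction patterns into a low-dimensional latent space, classify new patterns with a k-nearest-neighbors model acting on the latent coordinates, and also record the reconstruction error, which, as discussed next, can flag patterns the model has never seen. The architecture, sizes and the random stand-in data are assumptions; a real model would of course be trained on the synthetic FCC/BCC/HCP set first.

```python
# Hypothetical VAE latent-space workflow for 1D diffraction patterns.
import torch
import torch.nn as nn
from sklearn.neighbors import KNeighborsClassifier

N_POINTS, N_LATENT = 1024, 2

class XRDVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(N_POINTS, 256), nn.ReLU())
        self.mu = nn.Linear(256, N_LATENT)
        self.logvar = nn.Linear(256, N_LATENT)
        self.dec = nn.Sequential(nn.Linear(N_LATENT, 256), nn.ReLU(),
                                 nn.Linear(256, N_POINTS))
    def encode(self, x):
        h = self.enc(x)
        return self.mu(h), self.logvar(h)
    def forward(self, x):
        mu, logvar = self.encode(x)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.dec(z), mu, logvar

vae = XRDVAE()   # stand-in; in practice trained on the synthetic fcc/bcc/hcp patterns

# Pretend synthetic training patterns and their structure labels
train_x = torch.rand(300, N_POINTS)
train_y = torch.randint(0, 3, (300,))            # 0 = fcc, 1 = bcc, 2 = hcp

with torch.no_grad():
    mu_train, _ = vae.encode(train_x)

# k-NN classifier acting directly in the learned latent space
knn = KNeighborsClassifier(n_neighbors=5).fit(mu_train.numpy(), train_y.numpy())

# New experimental pattern: latent position -> rough structure estimate,
# reconstruction error -> warning that the pattern may not be represented
new = torch.rand(1, N_POINTS)
with torch.no_grad():
    recon, mu_new, _ = vae(new)
print("estimated class:", knn.predict(mu_new.numpy()))
print("reconstruction error:", torch.mean((recon - new) ** 2).item())
```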
But then the question is what happens if we have a pattern that is not represented in the data set. It will still land somewhere in this latent space, even though it is not actually one of these structures. Here you can use the reconstruction error of the variational autoencoder: it is easy to spot that the reconstruction error is almost an order of magnitude larger than for all the other patterns, and that gives you a way to tell that this kind of data is not represented by the model. This is very helpful in experimental X-ray diffraction classification, because you can also have phase mixtures, and that is a really combinatorial problem: if you imagine 20 possible phases plus all kinds of mixtures of these phases, you end up with a data set so large you cannot handle it anymore. In this case we could show that when we artificially create phase mixtures, the reconstruction error increases, so it can serve as a metric indicating that a pattern is, for example, a phase mixture rather than a single phase. Coming back to the experiment: here is again a materials library, in this case cobalt-chromium-nickel-rhenium, mapped on a square grid of 225 X-ray diffraction patterns. If you pass the data set through the autoencoder, it directly highlights the regions where the phase classification is challenging, and we could show that these are mostly phase regions that contain multiple phases instead of a single phase.

Okay, now towards the last parts of the talk. Typically, when we create such a materials library, whether a high-entropy alloy or a ternary system, you sometimes have relatively simple trends across the library. If you measure 350 data points on a very simple trend, that is actually far more than you needed to measure. The idea is that if the measurement device can identify on its own that there is only a simple trend, it does not need to measure so many data points, and this is where we use active learning. On the left you see the original data, the electrical resistance of a high-entropy alloy, and next to it the prediction by a Gaussian process together with its uncertainty, the covariance of the Gaussian process. We initialize randomly with maybe five measurement points, start with an initial fit, and afterwards sample the uncertainty of the model: measurements are only taken where the uncertainty is highest, and thereby the prediction is optimized during the measurement. You can see this here: the machine now decides on its own where to measure, and after only roughly 30 measurements the model converges and the fit is relatively good; there is always some variation in the measurements. This means that a simple trend like this can easily be measured with only 30 data points instead of 350, which is a huge gain in time. This plot shows the same thing, but here you can see where the machine measured and how it moved through the materials library to characterize it.
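The uncertainty-driven measurement loop can be sketched roughly as follows: fit a Gaussian process to a handful of initial points and always measure next where the predictive standard deviation is largest, stopping once the global uncertainty is low. The stand-in measure() function, the kernel choice and the thresholds are assumptions for illustration, not the parameters used on the actual instrument.

```python
# Hedged sketch of uncertainty sampling with a Gaussian process.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(1)
grid = np.linspace(0.0, 1.0, 350).reshape(-1, 1)        # candidate measurement areas

def measure(x):
    """Stand-in for the instrument: a smooth trend plus measurement noise."""
    return np.sin(3 * x[:, 0]) + 0.5 * x[:, 0] + rng.normal(0, 0.02, len(x))

# Start from a handful of points, e.g. five
idx = list(rng.choice(len(grid), size=5, replace=False))
X, y = grid[idx], measure(grid[idx])

gp = GaussianProcessRegressor(kernel=RBF(0.2) + WhiteKernel(1e-3),
                              normalize_y=True)

for step in range(30):                                   # budget of ~30 measurements
    gp.fit(X, y)
    mean, std = gp.predict(grid, return_std=True)
    if std.max() < 0.03:                                 # simple global stopping criterion
        print(f"converged after {len(X)} measurements")
        break
    nxt = int(np.argmax(std))                            # most uncertain point next
    X = np.vstack([X, grid[[nxt]]])
    y = np.concatenate([y, measure(grid[[nxt]])])
```

On a simple trend this kind of loop stops after a few dozen points rather than the full 350, which is the time saving described above.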
Right, then one last topic I would like to address: how can we scale with the chemical complexity? There is an issue associated with it, so let me start with the motivation. High-entropy alloys have been observed to be relatively good electrocatalysts, for example for the hydrogen evolution or the oxygen evolution reaction. This is attributed to the so-called high-entropy effect: in high-entropy alloys you produce statistical distributions of active sites that can act as catalytic centers. It was observed that if you take individual elements like iron, nickel, cobalt and manganese, which by themselves are very poor catalysts, and combine them in a high-entropy alloy nanoparticle, you can reach the performance of a platinum catalyst. This is what is described as the high-entropy effect, and hence it is very interesting to investigate such materials as well.

However, there are two problems. The first is the huge number of possible quinary systems: if you consider roughly 50 chemical elements that could technically be produced and used in our systems, you get more than two million possible quinary systems. The second problem is that while you can produce a ternary system by magnetron sputtering with essentially full compositional coverage, for a quaternary system you only produce a two-dimensional cut through the quaternary space. For a quinary space this becomes even more dramatic: in the end one materials library covers only about 0.5% of the quinary composition space. That is not really high-throughput or impressive anymore; you are basically just measuring a region near the center and missing out on everything else.

The simplest strategy we could think of to overcome this exploits the fact that the geometry now plays a role: when you deposit from five sources onto a two-dimensional substrate, you always obtain a cut through the space, so by permuting the cathodes you produce a different cut through the quinary space. There are 24 possible permutations. We simulated the coverage achieved by each deposition and ranked the permutations such that the permutation producing the largest additional coverage is always synthesized next. With this approach, if you consider only the 5 to 35 atomic percent region of the quinary composition space, we can reach up to 30% coverage of that region with only six permutations. This gives you measured properties throughout the central region of the composition space, from which you could extrapolate and perhaps also apply active learning to further explore these large composition spaces.
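Two small illustrations of the numbers and the strategy just mentioned: choosing 5 elements out of roughly 50 gives math.comb(50, 5) = 2,118,760 possible quinary systems, and the cathode permutations can be ranked greedily by how much new composition-space coverage each one adds. The per-permutation coverage sets below are random stand-ins (in practice they would come from simulating the deposition geometry), and the assumption that fixing one cathode leaves 4! = 24 distinct orders is my reading of the 24 permutations mentioned above.

```python
# Hypothetical greedy ranking of cathode permutations by added coverage.
import math
import itertools
import random

print("possible quinary systems from ~50 elements:", math.comb(50, 5))  # 2,118,760

random.seed(0)
# Assumption: fixing one cathode ("A") removes equivalent rotations -> 4! = 24 orders
perms = [("A",) + p for p in itertools.permutations("BCDE")]

# Stand-in coverage: which of, say, 5000 composition bins in the 5-35 at.% region
# each permutation's two-dimensional compositional cut would reach.
coverage = {p: set(random.sample(range(5000), 400)) for p in perms}

covered, chosen = set(), []
for _ in range(6):                                            # six depositions, as above
    best = max(perms, key=lambda p: len(coverage[p] - covered))   # most new bins
    chosen.append(best)
    covered |= coverage[best]
    print("".join(best), f"-> cumulative coverage {len(covered) / 5000:.1%}")
```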
All right, with this I am already at the end of my talk, and I am happy to answer any questions.

Okay, thank you very much, Lars, for giving us an insight into the many different things that are possible to do with experimental data. Any questions? Don't be shy, Patrick.

Thank you for the very nice talk, very impressive. Can I come back to this X-ray companion agent, where you had the comparison to the human? That was very interesting. You said, one slide before, for the three triangles, that the map in the middle was your best one from the companion, right?

It was basically just showing that by using these different probabilities you can constrain the prediction, so that the phase probability is high only in the region where it should be high physically. For example, it does not make sense for a certain structure to appear in a region where the chemical composition does not support it.

I see. So is the agent only looking for one phase? Because the diagram on the right, by the human, looks a lot richer in color than the others.

Oh no, it is looking for all the phases simultaneously and distributing the probability over all phases; this is just a simple way to represent it. We did not actually plot the full results from the agent, we just measured the accuracy; that plot was produced by a person, and there is only this one region that we compare in the plot, but you could produce the same plot with XCA.

I see, so the machine does identify the other phases as well. Thank you.

Any more questions for Lars? This is good, because I have a ton of questions. Kevin, anything on Zoom? No? Okay, good. Well, I'll start, we have a few minutes before our next talk. One thing I wanted to ask is how large your data sets are: how many individual samples, how many data points can you retrieve? And for the same experiment or the same combination of elements, do you generate multiple samples?

To the first question: it is typically this fixed grid that we produce, so about 350 points, which are then located differently in the composition space depending on whether we go from binary to ternary, quaternary or quinary systems, where the compositional coverage is not as good anymore. So it is always roughly 350 per wafer. And we can reproduce the same wafer relatively well, without a lot of variance, so we can produce the same materials library twice.

That was going to be my next question: how much statistical reproducibility noise do you have, and did you take this into account in the models, did you integrate this noise into the model?

If you performed the same measurement on the same spot twice, then depending on the measurement system you could have some noise and variation there. But we also get statistics from the neighboring points. Typically you can assume that there is some kind of trend of the property with the chemical composition, except when you have something like nickel-titanium, where there is a boundary around 51% at which an effect either appears or not; such cases would probably be missed. But if you assume a continuous gradient, then by looking at the surrounding data points you can see whether the trend is reasonable, and that is also a kind of qualitative metric.

So you do use the continuous gradients to smooth out your data if needed. Yeah. Okay. Any further questions? If not, maybe I can... Patrick?

Yeah. I saw that your Bayesian optimization also samples the edges of the wafer a lot, and this makes sense, because that is where it is most uncertain, right? It does not know how the function continues outside the boundary. If you go to the slide after the animation, yeah, there. A lot of the measurements are actually on the boundary. We also observed this, and it seems to be something that is very hard to get rid of in Bayesian optimization if you do not have continuous variables.
Because the model pushes the uncertainty towards the boundaries. Have you noticed this, and do you have ways to get rid of it, to be even more efficient in the sampling, if you need it?

So far no; we have only done a first implementation of this, and we have a measurement setup that can do this kind of sampling. But, and that is actually also the point, we have to gain trust in these methods, because in this case I knew the data and how it is represented. You need some way of stopping the model, where it says: okay, the global uncertainty is now relatively low, I stop the measurement. So we are now trying to set up an automated benchmarking platform, so that for each materials library that is measured with this machine everything is stored, and afterwards the Gaussian process is rerun with different parameters; over time we would like to see what kind of stable parameters we can use for this kind of autonomous measurement.

You could throw in a few measurement points as a test set and then always compare your model against it. Yeah, something like this; that is what we are trying. Thank you.

Thank you, everybody, and thanks again, Lars, for your talk. Our next speaker joins us from Aalto University in Finland, and while they are preparing I wanted to make an announcement.