Now, at the top of this figure you can see the efficiencies of different solar cell technologies and how they have changed over the past 20 years. At the top are the traditional silicon solar cells, whose efficiency has stagnated at around 25%. Then, in red, we have the perovskite solar cells, and you can see that their efficiency has improved significantly during the past 10 years. Based on this, we can be fairly confident that perovskites have the potential to overtake silicon, and this is why we are interested in them and why I'm talking about them today.

So what are perovskites? Basically, any material with the ABX3 atomic structure that you can see on the right is a perovskite. The most efficient perovskites that you saw on the previous slide, however, are actually much more complicated and look more like what we have here: they have ionic mixing on multiple sites and also contain organic molecules such as formamidinium and methylammonium. Even though their efficiencies are very high, they haven't been commercialized yet because of problems with instability and toxicity. We know that compositional engineering can help with these problems, but due to the complexity of this material space, using traditional computational methods such as DFT alone is not feasible. This is why our objective is to use machine learning to accelerate the computations and, through that, to optimize perovskite properties by compositional engineering.

In this presentation I'm going to show you the first couple of steps towards this goal: I'll present our development of a machine learning framework that learns from density functional theory calculations to predict perovskite properties quickly, and then the application of that framework to finding stable alloying fractions for the inorganic perovskite material that you can see here. The reason we are using this material is that, since
it's inorganic and has ionic mixing on only one of the three sites, it's simple enough for a first test case. Although this material is not very interesting on its own from the solar cell perspective, it is used in blue LEDs.

Here is a small figure outlining the machine learning approach we are going to take. We start with data generation: we build a dataset of atomic structures for which we run DFT calculations to obtain the DFT energies. We then fit a machine learning model that maps the atomic structures to energies, and we differentiate this model so that we can use it for structure optimization. Finally, we combine those structure optimizations with Monte Carlo sampling to find the convex hull of this perovskite material.

First, about the data. We generated a dataset of about 18,000 atomic structures that look a bit like this one, which is one example from the dataset. We included both single-point structures that were generated algorithmically and structure snapshots taken from DFT relaxations, and to increase the diversity of the data we included four different lattice types, or phases, that this perovskite material can take: Pm-3m, P4/mbm, I4/mcm, and Pnma. For all 18,000 atomic structures we then calculated the DFT total energies using FHI-aims with the PBE functional. The next step of our approach is to use machine learning to replace the DFT calculations, so that we can quickly predict the total energy from the atomic structure. Our machine learning model has two parts.
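As a minimal illustration of the compositional side of such data generation, one could sample random halide configurations per concentration level along these lines. The helper below and the 2x2x2 supercell example are my own illustrative assumptions, not the actual generation code, which also built the lattice geometries and included DFT relaxation snapshots:

```python
import random

def random_halide_configuration(n_formula_units, cl_fraction, seed=0):
    """Assign Cl/Br labels to the 3*n halide sites of an ABX3 supercell.

    Returns a list of element symbols for the X sites. The structure
    generation itself (lattice vectors, atomic positions, perturbations)
    is omitted; this only illustrates the compositional sampling.
    """
    n_sites = 3 * n_formula_units          # three X sites per formula unit
    n_cl = round(cl_fraction * n_sites)    # Cl count at this concentration
    sites = ["Cl"] * n_cl + ["Br"] * (n_sites - n_cl)
    random.Random(seed).shuffle(sites)     # one random configuration
    return sites

# e.g. a 2x2x2 supercell (8 formula units, 24 halide sites) at 1/3 Cl
config = random_halide_configuration(8, 1 / 3)
```

Repeating this over seeds and concentration levels gives the kind of configurational variety described above.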
First, we use the many-body tensor representation (MBTR) to represent the atomic structures in vector form. MBTR is a global descriptor that, in general, considers the elemental contents, the interatomic distances, and the bonding angles in a structure to form the vectors, but we are actually only using the second-order term, the k2 term. So we compute distributions of interatomic distances according to this formula for each pair of elements in the data, and we then concatenate the contributions from the different element pairs to form the full vectors that you can see at the bottom. The second part of our machine learning model maps the MBTR vectors to energy values using kernel ridge regression (KRR). The kernel function we use is the Gaussian kernel, where the distance between two structures is defined as the Euclidean distance between the MBTR vectors we just calculated. Both MBTR and KRR have hyperparameters, which we optimized with Bayesian optimization.

Okay, so now we have a model that can map an atomic structure to an energy, but in order to use that model for structure optimization we need to differentiate it. Because our energy predictions are a composite function of KRR and MBTR, we need to differentiate both of those components separately, and this is exactly what we have done: we derived and implemented the derivatives of the model with respect to the atomic positions and the strain components. This gives our model the ability to predict atomic forces and stress components according to this formula, and we can then combine these quick derivative predictions with the BFGS algorithm to optimize the atomic positions and lattice parameters of the structures.

Okay, so before we are ready to compute the convex hull we still need to solve one problem, which is finding the optimal chlorine-bromine configurations at each concentration level. For that we use Monte Carlo simulated annealing, and we initialize this sampling
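procedure in a specific way.

Before the sampling details, the two-part model just described can be sketched in toy form: a Gaussian-broadened pair-distance distribution standing in for the MBTR k2 term, plus closed-form Gaussian-kernel ridge regression. The function names, the distance grid, and the sigma, gamma, and lambda values below are illustrative placeholders, not the ones used in the work:

```python
import numpy as np

def k2_descriptor(distances, grid=None, sigma=0.2):
    """Gaussian-broadened distribution of interatomic distances for one
    element pair (a toy stand-in for the MBTR k2 term)."""
    if grid is None:
        grid = np.linspace(0.0, 8.0, 50)       # distance grid in angstrom
    d = np.asarray(distances, float)[:, None]  # shape (n_pairs, 1)
    return np.exp(-((grid[None, :] - d) ** 2) / (2 * sigma**2)).sum(axis=0)

def krr_fit(X, y, gamma=0.1, lam=1e-6):
    """Kernel ridge regression, closed form: alpha = (K + lam*I)^-1 y."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * sq)                    # Gaussian kernel matrix
    return np.linalg.solve(K + lam * np.eye(len(X)), y)

def krr_predict(X_train, alpha, X_new, gamma=0.1):
    """Predict energies for new descriptor vectors."""
    sq = ((X_new[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq) @ alpha
```

Because both the descriptor and the kernel are smooth functions, the composite energy model is differentiable with respect to the atomic positions by the chain rule, which is what makes the force and stress predictions described above possible. Now, back to the sampling: we initialize the annealing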
algorithm with a Pnma structure that has a random chlorine-bromine configuration. We use Pnma here because, during our data generation, we found that it is the most stable of the four phases for this material. Our algorithm then samples more structures by swapping random pairs of chlorine and bromine atoms, and because the exact equilibrium atomic positions depend on the configuration, we use the machine learning model to relax all the sampled structures, from which we then obtain the change in total energy due to the swap we made. This change in total energy determines the probability of accepting the change in structure, according to the formula you can see here, where T is the simulated temperature, which we decrease linearly towards zero kelvin on each step.

Okay, so now we have all the building blocks for computing the convex hull, and we do that by sampling 1,000 different configurations of chlorine and bromine atoms per concentration level. From these we obtain a set of minimum-energy structures that we can use to draw the convex hull. At this point I think it's a good idea to look at convex hulls in general. They are a common tool in alloy physics, and I've pulled one example here on the right from the literature, where the convex hull was computed for an alloy of aluminum and niobium. In the figure, all the black circles are formation energies of different configurations of atoms, and the red line is the convex hull, which goes below all the energies. A structure is stable if it lies on the convex hull, so for this example the stable structures would be here and here, but everything else is unstable. For example, if you tried to synthesize a material with a concentration between these two points, it would phase-separate into the stable phases on the convex hull on both sides of that concentration. All right, but that's enough
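of the general picture.

As an aside, the lower convex hull over (concentration, formation energy) points, which the red line in such figures represents, can be computed in a few lines. This is a generic sketch (the lower half of Andrew's monotone-chain algorithm, with my own toy data), not the actual analysis code:

```python
def lower_hull(points):
    """Indices of the points on the lower convex hull of (x, energy) pairs.

    Points on this hull are the stable compositions; any point above it
    phase-separates into the two hull points bracketing its concentration.
    """
    order = sorted(range(len(points)), key=lambda i: points[i])
    hull = []
    for i in order:
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = points[hull[-2]], points[hull[-1]]
            x3, y3 = points[i]
            # pop the last point if it lies on or above the chord 1->3
            if (x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1) <= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    return hull

# toy data: endpoints, one stable mixture at x=1/3, one unstable point
pts = [(0.0, 0.0), (1 / 3, -0.05), (0.5, 0.02), (1.0, 0.0)]
print(lower_hull(pts))  # -> [0, 1, 3]: x=1/2 lies above the hull, unstable
```

All right, but that's enough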
for the theory. Let's look at our results. On the left you can see the energy learning curves of our model for the four different phases, and for all of the phases we reach energy prediction errors of less than one millielectronvolt (meV) per formula unit, that is, per five atoms. On the right you can see a scatter plot comparing the DFT energies to the machine learning predictions, and you can see that they fall quite nicely on the diagonal; our MAE is only 0.69 meV per formula unit for the fully trained model. And the best thing about all of this is that the predictions are over four orders of magnitude faster than DFT.

Here on the left you can see the force prediction learning curves, and this time we reach, for all the phases, force component prediction errors of less than 20 meV per ångström. On the right you can see some results of our relaxation tests: on the x-axis we have the DFT-optimized energies, and on the y-axis the machine-learning-relaxed energies. I've only included the Pnma structures here, because those are what we are mainly interested in and what we will use in the convex hull computations. For the relaxations our error is only 1.32 meV per formula unit.

Then here you can see our Monte Carlo sampling results. Each blue point here corresponds to a different configuration that we relaxed with the machine learning model and for which we then predicted the enthalpy of mixing, and I've drawn a blue line that connects the minimum-energy configurations. On this slide the blue line is still the same, so those are the results from the machine learning optimizations, but I've taken all the configurations on that line and also relaxed them with DFT; that's the orange line. Although you can see that the machine learning model systematically underestimates the energies of these minimum-energy structures,
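the overall shapes agree.

The simulated-annealing loop that generated these configurations can be sketched roughly as follows; the toy chain energy and all numerical settings are illustrative stand-ins, since in the real workflow each trial energy comes from an ML-driven relaxation:

```python
import math
import random

def anneal_swaps(sites, energy, n_steps=2000, t_start=300.0, seed=0):
    """Metropolis simulated annealing over random Cl/Br swap moves.

    `sites` is a list of "Cl"/"Br" labels and `energy` maps a configuration
    to a scalar in eV. The temperature decreases linearly toward 0 K, as
    described in the talk.
    """
    k_b = 8.617e-5                              # Boltzmann constant, eV/K
    rng = random.Random(seed)
    cur, e_cur = list(sites), energy(sites)
    best, e_best = list(cur), e_cur
    for step in range(n_steps):
        t = t_start * (1.0 - step / n_steps)    # linear cooling schedule
        i, j = rng.randrange(len(cur)), rng.randrange(len(cur))
        if cur[i] == cur[j]:
            continue                            # swap must change the config
        cur[i], cur[j] = cur[j], cur[i]
        de = energy(cur) - e_cur
        if de <= 0 or rng.random() < math.exp(-de / (k_b * t)):
            e_cur += de                         # accept the swap
            if e_cur < e_best:
                best, e_best = list(cur), e_cur
        else:
            cur[i], cur[j] = cur[j], cur[i]     # reject: undo the swap
    return best, e_best

def toy_energy(cfg, j=0.01):
    """Toy energy: unlike nearest-neighbor pairs on a periodic chain (eV)."""
    return j * sum(cfg[k] != cfg[(k + 1) % len(cfg)] for k in range(len(cfg)))
```

Note that swap moves conserve the composition, so each annealing run explores a single concentration level, exactly as needed for building the hull point by point. Even with this systematic offset, though,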
the shapes of the two lines are still the same. I've then used the DFT relaxation results to draw the convex hull, and you can see that we have stable structures at two different concentration levels: one here at one-sixth chlorine concentration and another at one-third. We can actually look at what these structures look like, and we can see that there is a layered ordering of the chlorine and bromine atoms: at one-sixth chlorine concentration we have one layer here, and at one-third we have this double-layer structure here and here. In our approach we made no assumptions about the regularity of the structure, so all of this order arises out of the randomness of the Monte Carlo sampling.

Then let's conclude. Our research objective was to develop a machine learning framework for quick and accurate property predictions for perovskite materials, and we managed to do that, at least for this dataset: we achieved very high accuracies, and the predictions were also really quick. We managed to do structural optimization using the model, and we then applied all of the tools we developed to finding these two stable alloying fractions for this inorganic perovskite material. A little bit about the future: the next step of our research, which we are already working on, is to synthesize this perovskite alloy and confirm our results that way. After that we will try to extend the machine learning framework to more complex perovskite materials, so we will include ionic mixing on multiple sites and also the organic components, and that will allow us to look for better solar cell materials. Thank you for your attention; if we have time for questions, I'll take them.

Thank you, Jarno, and also for being on time, so we have plenty of time for questions. How about it? Oh, Patrik, there's a question on Zoom. Yes, so the question is whether you have already considered other perovskite structures, the ABX3 structures, and how
does the band gap evolve over the material space. So we haven't tested other structures yet; maybe the question was asked before my outlook section here, but that is the next thing we are going to do: test this framework on different perovskite materials. As for the band gaps, we haven't looked at them extensively yet, but we are also working on using the same prediction framework for predicting band gaps, and so far it seems to be working well. I don't have any figures here for the band gap, though, and I don't remember off the top of my head how it evolves over the concentration range.

Okay, thank you. Any more questions for Jarno? If not, maybe I have one. It was quite interesting to see this work after Zak's talk, because he was specifically asking how to do more work with forces and how we can get accurate forces from these machine learning models. I understood that here you trained on energies only, but your force accuracy seems to be reasonable, in the milli-eV per ångström range, and certainly your machine-learned optimizations produce results very similar to the DFT optimizations. So why do you think your forces here were so good, given that you didn't train on forces, which is something people tend to do, training on both energies and forces?

Yes, so there is of course a huge difference between the datasets: what Zak was introducing were huge datasets with a lot of chemical variety, whereas our dataset is very specific, and that probably helps. I also think we would benefit from training on both energies and forces, so we might get even better results that way.

Okay, that sounds reasonable. Any further questions? If not, let's thank Jarno again. Thank you very much. Okay, let's welcome our next speaker. It's Jonathan Schmidt from the group of Miguel Marques at Halle, and he'll be telling us about machine learning for
stable materials. Go ahead, Jonathan.