in Indonesia, but most of us here and most of us in this field are called, imagine, to build a universal, what are called the right-state yam integration. Is this real? The words are, when, is that, to all the database, because there is a communication on database, and it's very, very useful before it's too late to adjust the common tools, and to map it, organizing a second workshop on this in June.

Thanks a lot, Nicola, and thanks also to the organizers for inviting me here. It's really a pleasure to be here and to tell you a bit about the work that we've been doing on Materials Cloud. I'm going to briefly revisit some of the practical challenges that have driven the creation of AiiDA and Materials Cloud and give a brief overview of the architecture of both tools, and then I'm going to spend most of my time showing you some examples of what you can do with these tools today.

So the first challenge has already been introduced by Joffrey, and thanks for that. It's high throughput: how do you organize large numbers of calculations in a consistent way? You all know this problem.

The second problem is maybe what comes afterwards. You have done your calculations, not necessarily high throughput, but there's a lot of data sitting around. That's the story of the PhD student, Bob, who knows exactly how he did his calculations and where he put the data, but then he leaves the group and the data is still sitting there. The next PhD student, Alice, has to pick it up and she gets lost in the data jungle. So how do we improve on that situation?

Then there's the related open science problem. Say I've done my calculations and I've written a draft for my paper: how do I open access to all the data that I have in a meaningful way? Not many people do this nowadays because it is an enormous amount of work, and I think the key challenge here is how we can make this process easier.
And finally, there's what I've called knowledge transfer, or what you might call dissemination. What I'm thinking about is this: say I'm collaborating with an experimental group, I've done some calculations for them, and now I would like to share and communicate this data set to them. Of course, they don't have the software I have to view the data. I've often been in this situation, and I felt that making a few plots and putting them into a PDF was really not an optimal way of communicating to them what I wanted them to see.

All right, this brings me to the second part, and this is the architecture diagram for AiiDA. AiiDA in essence is a collection of Python packages that you install on your local machine, so a workstation or laptop, that allows you to set up, submit, and monitor materials science calculations. AiiDA takes care of transferring the input files from your machine to the supercomputer, submitting the calculation at the supercomputer, monitoring the queue, and, when the calculation is done, retrieving the results back to your local machine.

What sets AiiDA apart from other similar approaches is that you have a local database. This is a database sitting locally on the computer where you have installed AiiDA, and it is essentially a history of everything that happened. AiiDA stores there, for every calculation, which inputs were used, where it was submitted, who submitted it, when it was submitted, which outputs were produced, et cetera. This data is stored in a directed graph, and it is this graph that helps us with the open science and the data jungle problems.

AiiDA is free and open source. It's developed in this GitHub repository. There are a number of plugins available for different materials science codes, and let me just say: if you want to start developing an AiiDA plugin for your code, you just copy the plugin template that we provide and start from there.
You can do that completely independently. It's a very modular setup. All we ask is that you please register your plugin on the AiiDA plugin registry. It takes just a little bit of information, it is very helpful for you to reserve the name for the plugin, and it's also helpful for the AiiDA community to know that you're working on it. And who knows? Somebody might come and help you.

Now to the architecture of Materials Cloud. Materials Cloud, in essence, is an AngularJS web app that can communicate with an AiiDA database through a REST API. This is what we call the Explore section, which is about exploring a database, an AiiDA graph, through the web browser. And there are four other sections. There's Discover, which is about providing user-defined entry points to an AiiDA database. There's Work, which is about launching and running calculations in the browser. In Learn, you'll find recordings of lectures and tutorials together with slides. And finally, there's the Archive, which is a place where you can upload your research data, get a digital object identifier (DOI) for it, and then link to this data in your paper.

But this is all very abstract, so what I really would like to do is to show you a few examples of what this actually means. The first example is the SSSP that was already introduced by Nicola. This, again, is a collection of pseudopotential libraries, and for each element in the periodic table, it essentially tells you which pseudopotential gets the job done with the lowest cutoff. So we go on Materials Cloud, beta.materialscloud.org, click on Discover, SSSP, and we're presented with a periodic table. The color indicates which pseudopotential library is suggested for this particular element. The number is the cutoff in Rydberg. What I didn't say is that there are two different versions of the SSSP, one intended for precise materials modeling and one intended for high-throughput screening.
So now we're in the efficiency version, and for carbon it suggests, in this case, the pseudopotential from pslibrary 1.0 with a cutoff of 45 Rydberg. That is nice and fine, but how did it come to this conclusion? I can click on the element, and I get an overview of different properties that were computed for the different pseudopotentials. I'm not going into that in detail. There's an equation of state versus an all-electron calculation, again for the different pseudopotentials, and a few other plots, like the cohesive energy as a function of wave function cutoff. What I'm going to look at here is the maximum phonon frequency at a particular q-point, again for the different pseudopotentials.

This is all nice and fine. If I'm mainly interested in converged phonon properties, this plot can help me decide which pseudopotential to use. But the problem with these user-defined interfaces is that no matter how many plots you make, you will never be able to expose all the data you have. And this is where AiiDA now comes in.

In this plot, all of these data points actually link to an AiiDA node. When you click on a data point, you get to the node in the database. In this case, this is some post-processed data from an AiiDA calculation. You can see who did it. And then here, using this provenance browser, you can go up the AiiDA graph and see where this data came from. If you jump over the Python function that did the post-processing, you eventually get to the node that represents the calculation for the phonons. You can look at the input file that was used, or any of the outputs, for example if you want to download the dynamical matrix. And here, again, in the provenance graph, you see all the different inputs and outputs that went into this calculation. Of course, one of the inputs is, for example, an SCF calculation that was done with the PW code. So you can go further up the provenance tree.
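Walking up the provenance graph from a data point, as described above, can be illustrated with a toy directed graph. This is only a minimal sketch in plain Python, not AiiDA's actual API: the class `ProvenanceGraph`, the node names, and the creators `user_a` and `user_b` are all hypothetical, and the chain of nodes is simplified compared to a real AiiDA database.

```python
from collections import defaultdict


class ProvenanceGraph:
    """Toy directed graph; links point from an input to what was produced from it."""

    def __init__(self):
        self.parents = defaultdict(list)  # node name -> names of nodes it came from
        self.info = {}                    # node name -> metadata (kind, creator)

    def add_node(self, name, kind, creator):
        self.info[name] = {"kind": kind, "creator": creator}

    def add_link(self, source, target):
        self.parents[target].append(source)

    def ancestors(self, name):
        """Return all nodes reachable by walking up from `name`, nearest first."""
        seen, stack, order = set(), [name], []
        while stack:
            node = stack.pop()
            for parent in self.parents[node]:
                if parent not in seen:
                    seen.add(parent)
                    order.append(parent)
                    stack.append(parent)
        return order


# Hypothetical, simplified chain: plot data point <- post-processing <- phonons <- SCF.
graph = ProvenanceGraph()
for name, kind, creator in [
    ("structure", "data", "user_a"),
    ("pseudopotential", "data", "user_a"),
    ("scf_calculation", "calculation", "user_a"),
    ("phonon_calculation", "calculation", "user_a"),
    ("dynamical_matrix", "data", "user_a"),
    ("postprocess_function", "function", "user_b"),
    ("max_phonon_frequency", "data", "user_b"),
]:
    graph.add_node(name, kind, creator)

for source, target in [
    ("structure", "scf_calculation"),
    ("pseudopotential", "scf_calculation"),
    ("scf_calculation", "phonon_calculation"),
    ("phonon_calculation", "dynamical_matrix"),
    ("dynamical_matrix", "postprocess_function"),
    ("postprocess_function", "max_phonon_frequency"),
]:
    graph.add_link(source, target)

# Start from the data point shown in the plot and walk up the graph.
lineage = graph.ancestors("max_phonon_frequency")
for node in lineage:
    print(node, graph.info[node])
```

Because every node records who created it, the same walk also shows that the post-processing and the underlying calculations can come from different people.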
By the way, note that this one was not done by Gianluca Prandini: Gianluca Prandini did the post-processing of calculations that were done by Nicolas Mounet. And, right, here we are now at the Quantum ESPRESSO calculation. One of the things that you can see here is the structure. The structure, for example, is something that you did not get from the plots. It's not included in our interface because we didn't think about including it. But you might want to know the structure that was used to do these convergence calculations. Or say you have a new pseudopotential that is not in our list, and you want to redo the convergence test using exactly the same parameters as we did, to make them comparable. So you download the input files and you redo the calculations.

And then, if you're not happy with downloading individual files but you want to have everything (sorry, this is not what I wanted), you go to the Archive section and you find the entry for the SSSP there. There's a DOI that will be cited, I guess, in the arXiv paper. You can download the whole AiiDA database, you can download an archive of all the plots, and you can basically import it into your AiiDA instance and start your calculations from there.

There's also a section for the 2D materials work that Nicola showed earlier on. I don't have time to show all of that, but you're welcome to have a look on materialscloud.org. So I will jump over that to the phonon workflow, which is another example of what Nicola showed, but now we're going to look at it in a slightly different way. This is part of the Work section of Materials Cloud.
So we go to Materials Cloud, go to Work and click on the Jupyter button. What happens now is that at CSCS a Docker instance is started for me, and in this Docker instance there's a Jupyter notebook server. What I see now is nothing but a Jupyter notebook that is rendered as a web app. Down here we have the phonon workflow that Nicola talked about. While it is true that it is very easy to use, what Nicola didn't tell us is that, of course, you still have to set up the machine on which this workflow can run. To set up this machine, you need to know a little bit about how to install AiiDA, how to install Jupyter, etc. This is something that Materials Cloud takes away. You log into Materials Cloud, you go to your page and you start the phonon workflow.

This brings me to another key message that I would like to give. The reason we use Jupyter notebooks and Python here is not that they give you the fanciest user interface. They do not. It is that we firmly believe this is the simplest way of writing these apps. So what we're going to do now is edit this app live and add another example structure. At the moment we have four input structures; let's add silicon.

We click on edit app. All of this is now the source code of this app. Everything is editable, including the text, which is simply Markdown. Now we simply search the source code for the first occurrence of diamond, which was the first element in the list. This is the list that generates the dropdown. This is essentially what I did when I modified the app, because I didn't write this app, it was Giovanni Pizzi and Nicola, and I thought, let's make an example of how to add a new structure. One thing is the dropdown list; another thing is that you need to actually generate the crystal structure to give to AiiDA.
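The two pieces mentioned above, the dropdown list and the structure generation, could look roughly like the following. This is a sketch under stated assumptions, not the source code of the actual app: the names `EXAMPLE_STRUCTURES`, `add_cubic_variant`, `dropdown_options` and `build_structure` are hypothetical, only diamond is listed out of the app's existing structures, and the lattice parameters are the usual experimental values in ångström. The call to `crystal` from `ase.spacegroup` needs ASE installed, so it is imported only inside the function that uses it.

```python
# Hypothetical table of example structures (the real app's names are not shown in the talk).
EXAMPLE_STRUCTURES = {
    "Diamond": {
        "symbol": "C",                 # element
        "basis": [(0.0, 0.0, 0.0)],    # position in the unit cell
        "spacegroup": 227,             # Fd-3m
        "a": 3.567,                    # cubic lattice parameter in angstrom
    },
}


def add_cubic_variant(structures, name, template, symbol, a):
    """Add a structure that shares the template's space group and basis."""
    entry = dict(structures[template])
    entry.update(symbol=symbol, a=a)
    structures[name] = entry


# Silicon has the same space group as diamond, so only element and lattice parameter change.
add_cubic_variant(EXAMPLE_STRUCTURES, "Silicon", "Diamond", symbol="Si", a=5.431)


def dropdown_options(structures):
    """Names to show in the dropdown, in insertion order."""
    return list(structures)


def build_structure(name, structures=EXAMPLE_STRUCTURES):
    """Build an ASE Atoms object for the chosen entry (requires ASE)."""
    from ase.spacegroup import crystal

    p = structures[name]
    a = p["a"]
    return crystal(
        p["symbol"],
        basis=p["basis"],
        spacegroup=p["spacegroup"],
        cellpar=[a, a, a, 90, 90, 90],
    )


print(dropdown_options(EXAMPLE_STRUCTURES))
```

With ASE available, `build_structure("Silicon")` would return the conventional cubic cell, which could then be handed on to AiiDA.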
This is just four lines here, and the reason is that it's using the crystal function from the Atomic Simulation Environment (ASE), which just needs the element, the position in the unit cell, the space group and the cell parameters. Of course, silicon has the same space group as diamond, so we essentially just need to change the lattice parameter and a few variable names, and that's it. When we're done, we click on save and we go back into app mode. Now we should have added silicon to the input structures, and we can submit the phonon calculation with silicon. This was not sped up, that is real time. The message that I would like to give is: if you know how to program and if you know a little bit of AiiDA, then you can write such a workflow.

I'll skip over the results because I want to show another example, which is from another group. This is from the nanotech@surfaces laboratory at Empa in Switzerland. This laboratory is a mixed theoretical and experimental laboratory, and one of their main lines of research is the synthesis of atomically precise graphene nanoribbons. Every now and then the experimentalists ask the theory guys, could you please investigate this structure, because we want to synthesize it. What they did is program, inside Jupyter, both workflows and interactive ways to browse the results of the calculations they ran using AiiDA.

So this is an example of browsing their graphene nanoribbon database. It is a very specific user interface for this use case. You see a structure of the nanoribbon and a few properties, and you can interactively explore some of the properties of these nanoribbons. For example, you can compare the band gap of these two nanoribbons. You click compare, and it starts a small workflow, a small calculation that runs, I think in this case, directly on the virtual machine, and which simply plots the two band structures side by side together with the Fermi level.
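As a rough illustration of what such a comparison involves, the band gap can be read off as the distance between the highest eigenvalue below the Fermi level and the lowest one above it. The sketch below is not the laboratory's code: the function `band_gap` is hypothetical, it assumes the Fermi level lies inside the gap, and the eigenvalues are invented numbers chosen only to make the example run.

```python
def band_gap(eigenvalues, fermi_level):
    """Gap between the highest occupied and lowest empty eigenvalue (same units as input)."""
    occupied = [e for e in eigenvalues if e <= fermi_level]
    empty = [e for e in eigenvalues if e > fermi_level]
    if not occupied or not empty:
        raise ValueError("need eigenvalues on both sides of the Fermi level")
    return min(empty) - max(occupied)


# Invented eigenvalues in eV, collected over all k-points; not results from any calculation.
ribbon_a = {"eigenvalues": [-2.0, -0.5, 1.0, 2.5], "fermi_level": 0.0}
ribbon_b = {"eigenvalues": [-1.5, -0.25, 0.25, 3.0], "fermi_level": 0.0}

gap_a = band_gap(**ribbon_a)
gap_b = band_gap(**ribbon_b)
print(f"ribbon A gap: {gap_a} eV, ribbon B gap: {gap_b} eV")
```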
Another thing you can do is show the projected density of states. This might be interesting if you are designing the edge of the nanoribbon: you will want to know which atoms are particularly important for the density of states near the Fermi level. By default it shows the full density of states for both spin up and spin down, and then you can select one atom and, if you wait a little bit, it shows you the density of states projected on this particular atom. Again, this is something that experimentalists might want to do.

Finally, you can have a look at the local density of states. This is very important for this group because they do scanning probe microscopy, in particular scanning tunneling spectroscopy. You can select a particular point in the band structure, and what it shows you here is the local density of states on top of the ribbon at a particular height for this particular k-point. This is what you would get if you do, for example, a constant-height scanning tunneling spectroscopy scan over the ribbon, and it is something that experimentalists can directly compare to experiment. Below here you have the spin density, which is another thing that they are interested in. They can download the picture and put it into their presentation.

In the beginning, the experimentalists were quite skeptical about getting this work started. It sounded like a lot of work, and they said, okay, I'm not sure whether this is going to be useful. But they have since changed their mind. This is from the head of the experimental laboratory: he can use this interface for comparing band structures and for looking at the properties of these ribbons, and, well, I would say they have been won over.

Let me mention one last thing, and that is Quantum Mobile. So what is Quantum Mobile? Quantum Mobile is simply a virtual machine image that you can load into the VirtualBox software, which runs on any operating system, so Windows, Linux, macOS. What this takes away from you is the necessity to set up AiiDA and Jupyter. Inside VirtualBox you have AiiDA and Jupyter pre-installed, and you have all of the MaX codes pre-installed, plus CP2K, as well as their AiiDA plugins, so they are ready to be used through AiiDA. If you have a lecture where you would like to use AiiDA, this is something that you might want to use.

The last point I would like to emphasize is that this is, by design, a very modular setup. We don't just have the virtual machine image; on GitHub we also have the scripts to set up the virtual machine. If you need some additional tools for your lecture that are maybe not so easy to install, or you need some other modifications, you simply go there, clone the GitHub repository, make a few adjustments and produce your own Quantum Mobile.

So let me conclude. AiiDA is our way of dealing with the high-throughput problem, and in particular the AiiDA graph is our map to the data jungle. What I can promise you today is that if you do your calculations with AiiDA now, they are ready for open science. Materials Cloud is a younger project. It's now in public beta, but the Archive is already accepting submissions from our partners, so if you are affiliated with MaX, for example, you're very welcome to check it out. About the Jupyter part, my main message is that these apps are simple to write. The workflows are also shared on GitHub, so you can run them wherever you like: on your own machine, on Quantum Mobile, or in Materials Cloud.

I have to acknowledge the many, many people who are working tirelessly to make Materials Cloud and AiiDA better and better, both at EPFL and elsewhere. With this, I thank you for your attention.