Hello everyone. My name is Michał Maciejewski, and today I have the great pleasure to present to you my talk titled "When Models Query Models". Let me start by expressing my gratitude to the organizers for allowing me to participate remotely; I wish I could join you in person, but some personal issues prevent me from that. This work was carried out at ETH Zürich together with my supervisors Jasmin Smajic and Bernhard Auchmann, with Douglas Martins from the Paul Scherrer Institute and Giorgio Vallone from Lawrence Berkeley National Laboratory. I will start by mentioning a nice coincidence: today is almost the 10th anniversary of the Higgs boson discovery at CERN. It was made possible by a great effort of engineers and scientists who built and operated the Large Hadron Collider along with its four detectors. Here we can see an image of a collision of particle beams taking place in one of the detectors; by reconstructing that collision and matching it to physical models, scientists could identify this blip here as a Higgs boson being created, which proved a theory formulated 50 years earlier. That's quite incredible, and a big step for particle physics, but also for our understanding of matter and the Universe as a whole. Now, how does it work? Let me start with a very brief accelerator principle, which to me is like a swing on a kids' playground. We have a charged particle which is kept on a circular trajectory by a dipole magnet: just north and south poles put on top of each other, so that a positively charged particle traveling perpendicular to that field experiences a force that bends it. As the particle goes out of the screen, it is steered towards the middle of the circle, turning at every point.
There is a radio-frequency cavity that gives the particle an electric kick that accelerates it as it goes round and round the circle. And there is just this one equation, the Lorentz force, which you may remember from high-school physics, that governs the movement of the particles to a good approximation. What's really interesting here is that these magnets are super powerful: they operate at 8.3 tesla; just for comparison, an MRI magnet runs at about 1.5 tesla. And they need to be cooled down to very low temperatures, 1.9 kelvin, which is even colder than outer space. That's quite impressive. And we need quite a big circle, 27 kilometers in circumference, because the larger the circle and the field, the more powerful the collisions, and the more powerful the collisions, the more we can learn about the behavior of particles. With superconducting magnets, whose design is the objective of my project, we can reach a compact design with high fields that allow for building better accelerators. And it's actually not only the Large Hadron Collider, one ring to rule them all; there is an entire complex of accelerators which provides particles, pre-accelerates them before they go to the next stage, and feeds different experiments at CERN. What's really striking is that, for instance, the Proton Synchrotron was built in 1959 and is operated until today. This shows that design is a really important part that has to be done right, and that we have to maintain knowledge about the systems and the information that is relevant to keep them operating for a long time after they were put into operation. So the motivation for my project, which is the design of superconducting magnets for particle accelerators, comes from the fact that these are multi-stage design processes that involve various disciplines, feature changing teams of experts and software tools that are themselves subject to change and replacement, and may last, as you can see, from years to several decades.
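The governing equation mentioned above, the Lorentz force, can be written out explicitly, together with the bending radius it implies for a particle on a circular orbit in a dipole field:

```latex
% Lorentz force on a charge q moving with velocity \vec{v}
% in an electric field \vec{E} and a magnetic field \vec{B}:
\vec{F} = q\,\left(\vec{E} + \vec{v} \times \vec{B}\right)
% For a particle of momentum p on a circular orbit in a dipole field B,
% equating the magnetic and centripetal forces gives the bending radius:
\rho = \frac{p}{qB}
```

The second relation is why both a larger ring (larger \(\rho\)) and a stronger field \(B\) allow higher-momentum beams and hence more powerful collisions.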
So we are aiming at a consistent, sustainable and reproducible organization of numerical models, together with construction and validation data and the tools used in the process. As you can see here, alongside the ring of the Large Hadron Collider, the Future Circular Collider would be about four times bigger than the LHC today; that design and engineering effort is taking place right now and I'm contributing to it. The inspiration for this project, which is called PyMBSE, a Python implementation of certain MBSE concepts, comes from MBSE itself: model-based systems engineering, the formalized application of modeling to support system requirements, design, analysis, verification and validation activities, beginning in the conceptual design phase and continuing throughout development and later life-cycle phases. One of the concepts brought by MBSE is the Design Structure Matrix. We try to decompose a bigger system into subsystems, just as we decompose our software solutions into modules with interfaces between them. Here it is quite similar: this matrix shows the connections and couplings between systems, and also the definition of interfaces, i.e. what data flows between systems as they are put together. If we look into one box, the superconducting magnets, we can see that it is in turn composed of several sub-models that also have certain dependencies and exchange information between them. When we design the magnet we want to put these models together, but we don't want to duplicate information; of course, that's not good practice in software or in modeling. Instead we want specialized models that are good on their own, and we want to provide query mechanisms such that these models can access data from each other; that's what the MBSE framework is all about. It's especially important because we have many models at different scales: it could be an accelerator composed of many magnets, a magnet itself, or a cable that is used to create that
magnet, or even a strand of that cable. So models exist at different scales, and we somehow have to build, first of all, a dependency tree of these models, and then a query mechanism to put them together. What we do is rely on model-based systems engineering, a concept that first of all introduces a quite generic notion of a model: a simplified version of some entity. It can be a graphical, mathematical, machine-learning, deep-learning or even physical representation like measurements, that abstracts reality to eliminate some complexity, such that we can focus on one particular aspect of a system and its design. With MBSE we shift systems engineering from documents to interconnected models, so that models can be queried, generate views, and are traceable and reproducible. If you look at a typical project management pipeline, we have initialization, then a study that finishes with a conceptual design report telling you what the suitable solutions are if you want to realize that project. Then you move to the design of a particular solution, which ends with a technical design report; then we build it, with some documentation, commission it, and finalize the project, and there are always procedures and reports. What's quite common in this community, but I would say in science in general, is that when we write a report we have some models, some measurements, some Excel spreadsheets, some analysis scripts, and all of that is put together in the report, but typically this link is only available to the creator of the document; once the document is created, one cannot trace back where this data comes from.
So if we have to re-run an analysis, some of this information might be missing and we need to bring it back and recreate it, which sometimes isn't even possible when people leave or software changes, and we can no longer find the particular setup that was used to create that information. So instead of these static documents we propose to introduce models, and with these models be able to always retrieve information, trace it back, and put it together; that's exactly what the PyMBSE framework is all about. I will talk about its microservice architecture, and here I'd like to really note the similarity between system and software architecture. As you could see, a system architecture is represented as subsystems along with interfaces between them, and it's actually pretty close to a software architecture, in our case microservices and the interfaces between them. Given this similarity, in PyMBSE we tried to leverage practices from software development and use them in systems engineering to improve the design process. The framework itself is composed of several components. From the user perspective, there is a configuration that tells which models are involved in a particular design. This forms a model dependency graph, which has to be a directed acyclic graph, so that there are no loops and we can always execute it from start to end. As we execute these models, which typically live in notebooks or scripts, running for instance the magnetic analysis would perform a model query: find its dependent models, execute them, and get that information back. When a model is executed, be it a notebook or a script, it calls an electromagnetic solver to solve for the particular physical quantity that we're interested in. As you can imagine, we might have redundant calls if two models depend on one, and if the input information is the same we can reuse it immediately by accessing a cache database; all model executions are cached.
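A minimal sketch of the caching idea just described, with made-up model names and toy physics rather than the actual PyMBSE API: each model query first resolves its dependencies, builds a cache key from the model name, its inputs and its dependencies' outputs, and reruns the model only on a cache miss.

```python
# Toy execution of a model dependency DAG with caching (not the PyMBSE API).
import hashlib
import json

MODELS = {  # hypothetical graph: model -> (dependencies, function)
    "geometry": ([], lambda deps, p: {"radius_mm": p["radius_mm"]}),
    "magnetic": (["geometry"], lambda deps, p: {"field_T": 0.1 * deps["geometry"]["radius_mm"]}),
    "mechanical": (["geometry", "magnetic"], lambda deps, p: {"stress_MPa": 2 * deps["magnetic"]["field_T"]}),
}

cache = {}  # content hash -> figures of merit

def execute(name, params):
    deps, fn = MODELS[name]
    dep_results = {d: execute(d, params) for d in deps}  # resolve dependencies first
    # The cache key covers the model name, its inputs, and its dependencies'
    # outputs, so a change anywhere upstream invalidates the entry.
    key = hashlib.sha256(
        json.dumps([name, params, dep_results], sort_keys=True).encode()
    ).hexdigest()
    if key not in cache:
        cache[key] = fn(dep_results, params)
    return cache[key]

result = execute("mechanical", {"radius_mm": 50})  # geometry is computed once, then reused
```

Note that `geometry` is queried twice (directly by `mechanical` and again via `magnetic`) but executed only once, which is exactly the redundant-call case the cache is meant to catch.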
We take certain snapshots such that we can reuse that information, and after a study we can also run some analytics, like which model was the fastest or which gave us the biggest margins; this is all still available. We chose notebooks because after the execution of each node we can create an HTML report and then build a book from those HTML reports, such that they form and document a bigger report. So what are the three pillars behind this initial overview of the project? First of all, containers for numerical models, and in general for reproducible environments: whatever we use, we put it into a container and expose a generic interface, so that we know how to use the particular solvers. Once we have that, there is the model query mechanism with two components: the dependency tree, which indicates the dependencies between different models and how a change would propagate across this model tree, and the cache database, so that when a cached result is available we just return that information; otherwise we need to run the model again. The third pillar is model views: auto-generated documents, where we can profit from notebooks and build on demand a representation of our design, such that it's also available to decision makers and managers in a convenient form. They don't run any code; they have a view, and it's actually quite a cheap way of generating that view. With that we also keep track of information for reproducibility, like the versions of the particular packages we used, the version of the container, and the version of the software in general, so that we can always redo an analysis later. So, starting with containers for numerical solvers.
So we have a solver which solves a particular physical problem, magnetic, mechanical, thermal. We then encapsulate it in a container, so that it has all its dependencies and running environment, and we provide a generic REST API to interface with that solver. It has four methods: initialize, upload files, run those files, and download results. We already created some such containers; some are available, provided by other companies. Once we have that numerical-solver API, we can enable model queries, i.e. getting information between different models. For that we have the PyMBSE class, which builds an execution query based on the config, the source model, the target model and some inputs, and we can get either reports, some figures of merit, or artifacts if these are larger files. As I mentioned already, in case a query is executed again with the same inputs and neither the underlying model nor its dependencies changed, we return the output from the cache database. The cache database is a document store, implemented in MongoDB.
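The four-method container interface could be sketched as follows. The class, method names and dummy "solver" are illustrative assumptions, not the actual wrapper; in the real framework these are REST endpoints (served, as mentioned later, with FastAPI), but plain methods make the flow easy to follow.

```python
# Sketch of the generic four-method solver interface (hypothetical names).
import pathlib
import tempfile

class SolverContainer:
    def initialize(self):
        """Prepare a clean working directory for one solver run."""
        self.workdir = pathlib.Path(tempfile.mkdtemp())
        return {"status": "ready"}

    def upload(self, name, content):
        """Upload an input file into the container."""
        (self.workdir / name).write_text(content)
        return {"uploaded": name}

    def run(self, input_name):
        """Run the solver on the uploaded input (dummy: sum the numbers in the file)."""
        numbers = [float(x) for x in (self.workdir / input_name).read_text().split()]
        (self.workdir / "result.txt").write_text(str(sum(numbers)))
        return {"status": "done"}

    def download(self, name="result.txt"):
        """Download a result file produced by the run."""
        return (self.workdir / name).read_text()

solver = SolverContainer()
solver.initialize()
solver.upload("input.txt", "1 2 3")
solver.run("input.txt")
result = solver.download()
```

Because every containerized solver exposes the same four operations, the query layer never needs to know which physics code is actually running inside.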
So we have here a schema with the name of a model, its path, execution time, last-modification timestamp, a hash of its content and of the contents of all dependent models, input parameters, and output figures of merit or artifacts that we may want to return immediately once the model has been executed. We chose MongoDB because we have freedom in the dictionary fields, which we don't know a priori and which could have different structures, so that was quite a convenient choice. We have an index based on the model hash, which has high cardinality, and that's actually the only collection we need to store in our cache, so we don't really need a relational database here. One important thing to note: once we have our model dependency tree, we realize that, for instance, the mechanical model depends on the geometry but also on the magnetic model, which itself depends on the geometry. So if we want to find the shortest path of execution through the dependencies, we need to perform a tree linearization, which is actually inspired by the method resolution order of Python. Once we have a model that we want to execute, we need to check its state: whether it changed with respect to what's in the cache, but also whether any of its dependencies changed. This we do by just going through the linearized path of the tree, so we don't waste time. Again, if we have a cache hit we return what we had, but if one of the dependencies or the model itself changed, we need to rerun it so that we have a current result. With that tree linearization, we only rerun those models that changed, so that's a big gain in computation as well. And one important aspect is change propagation. What this dependency tree and its linearization allow is that if, for instance, one model changes, let's say the electro-thermal model, and you want to rerun the particle accelerator study again,
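The MRO inspiration can be illustrated directly in Python: if we mirror the geometry/magnetic/mechanical dependencies as a (hypothetical) class hierarchy, Python's built-in C3 linearization yields a duplicate-free order in which every model comes after all of its dependencies.

```python
# Dependency linearization via Python's method resolution order (illustrative).
class Geometry: pass
class Magnetic(Geometry): pass              # magnetic model depends on geometry
class Mechanical(Magnetic, Geometry): pass  # mechanical depends on both

# Reversing the MRO (and dropping `object`) gives an execution order where
# geometry appears exactly once, before everything that depends on it.
order = [cls.__name__ for cls in reversed(Mechanical.mro()) if cls is not object]
```

Walking this linearized order is what lets the framework check (and rerun) each model at most once, even when it appears several times in the raw dependency tree.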
you would check whether its content and input files and parameters changed, and if not, check all the dependencies. Here we see that only something in the magnet changed, so we go into the magnet and check the dependencies of the magnet; the model that changed is the only one we rerun, while the others return their cached results, which we can then put back together into the entire model and run the particle accelerator study on a change of one particular model, updating only those that changed. And this brings us to the third pillar, which is model views: once we perform a model re-execution, a report is auto-generated, and we can see it here. I'd like to open the link now. If we click here, we have a CDR example, a conceptual design report of a magnet design: we have some introduction, the geometry with the code, so that we can see what was used, with versions of the API and the time of execution, and a table of parameters. On top of that we still have an interactive view of the model results, which is pretty convenient: we can zoom, we can look at what we got. The same goes for other model results, where we have information about the geometry again, some input files that we can print out, so it's all available, and again some physical results, in this case the magnetic field, which we can still read and check.
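The change check just described might look like the following sketch, with a hypothetical helper rather than the PyMBSE implementation: a model is considered unchanged when the hash of its own content and inputs matches the snapshot stored in the cache database.

```python
# Hypothetical change-detection helper: hash the model content and its inputs.
import hashlib

def model_hash(content: str, inputs: dict) -> str:
    payload = content + repr(sorted(inputs.items()))
    return hashlib.sha256(payload.encode()).hexdigest()

cached = model_hash("solve_thermal()", {"current_A": 11000})
# Re-running with identical content and inputs gives a cache hit ...
hit = model_hash("solve_thermal()", {"current_A": 11000}) == cached
# ... while any change to the content or inputs forces a rerun.
miss = model_hash("solve_thermal()", {"current_A": 12000}) == cached
```

The same comparison is applied down the linearized dependency path, so a change deep in the tree propagates upward while everything untouched is served from the cache.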
So that's quite convenient for everyone on the team, but also for people from other groups and other systems: they can quickly check what, for instance, the peak field or the stored energy was, and that can inform their design. This information is cross-referenceable and people can access it quickly. Okay, so just to summarize this part: we have the PyMBSE microservice architecture. The first step is that we have containers for numerical solvers; we also allow for command-line interface, REST API, or RPC calls, but for those tools that we provide, we package them with Docker and represent the numerical solver calls with four endpoints in a REST API. Then there are the PyMBSE query calls that allow models to exchange information and get data from each other. This all goes into the PyMBSE cache, which stores information on model execution so that, on the one hand, we can retrieve it quickly if we run a model again with the same parameters, but we can also do some analytics afterwards, check our models, see how they changed, and track that information. Then we support two execution modes. One is local execution, where designers play with notebooks, do some analysis, make plots, and get the data into the right shape; that is, a Jupyter server and Python with a virtual environment, relying on a local machine on Docker and a MongoDB instance. It produces notebook output, which we can then put together into a book and a report that people can communicate outside. The other is distributed execution. Of course, we can imagine that once we have a certain design and we want to change a parameter,
we don't do it manually; we can do on-demand compute. Here we, in a way, abuse GitHub CI by making it run on demand with certain parameters, and for that we use papermill to execute notebooks programmatically. We rely either on GitHub runners or on OpenShift instances to run these Docker containers and allow the computation to be done in the cloud. Now some implementation details. As I mentioned: MongoDB for the cache database; Docker, run locally, and OpenShift for distributed execution; FastAPI for the REST API development, which really is fast development; Poetry for the virtual environment and dependencies; papermill and scrapbook for notebook execution; Jupyter Book to create the books; and eventually GitHub for versioning and CI/CD pipelines, which is actually quite neat with the Poetry-based sample pipeline that we use for our packages. And now I'd like to come to one application to show how we use the tool in practice. Here is an example of an optimization where the objective is to obtain a certain field quality, so that the particles are really focused and travel on the desired trajectory in the accelerator. Of course we need a certain safety margin; we don't want to degrade the material too much, nor its insulation. This takes place by optimizing the cross-section of the magnet, where we place certain cables around the aperture in the xy-plane and adjust the positioning, angle, inclination, and the number of conductors. For this optimization we have four models, geometry, electromagnetic, mechanical, thermal, that run in a loop until, with a genetic optimizer, we find a set of parameters that minimizes all the objectives. And the optimization execution is actually quite simple: we execute models via the PyMBSE app, so we need one line per model to retrieve figures of merit, and they are returned as dictionaries.
So we put them together, collect also some artifacts that we may use for further processing, and we calculate a score: one value that represents the design. We then take the figures of merit, the score, and the artifacts and return them as the result. Once we have that, we can do a run, and the run is very similar to what we've seen so far: we have different solvers for different models, with their model API calls. The optimization node performs PyMBSE query calls to each of the models, combines them into one, computes the score, and that score is then stored in the cache database, so that after an optimization run we can retrieve it for visualization in the cockpit of the optimizer, which is a Jupyter notebook. That's the cockpit where we can see the progress of the optimization. If we click any dot, we can see the cross-section of the magnet that was used in this optimization and was the best individual in the genetic optimizer, we can see some parameters for that model, and we can also see how the design variables compare to the best individual so far. This gives us two pieces of information.
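The "one line per model" pattern and the score computation could be sketched like this; the model names, figures of merit and weights are all made up for illustration.

```python
# Sketch of combining per-model figures of merit into one optimization score.
def query(model: str) -> dict:
    # Stand-in for a PyMBSE query call returning that model's figures of merit.
    return {
        "magnetic": {"b3_units": 2.0},          # a field-quality harmonic
        "mechanical": {"peak_stress_MPa": 150.0},
        "thermal": {"hotspot_K": 300.0},
    }[model]

# One query per model, merged into a single dictionary of figures of merit.
foms = {}
for model in ("magnetic", "mechanical", "thermal"):
    foms.update(query(model))

# Weighted sum reducing all objectives to one scalar for the genetic optimizer
# (the weights here are invented, not tuned values).
weights = {"b3_units": 10.0, "peak_stress_MPa": 0.1, "hotspot_K": 0.05}
score = sum(weights[k] * v for k, v in foms.items())
```

In the real setup the dictionaries, artifacts and the score are all written to the cache database, which is what makes the cockpit's post-hoc visualization possible.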
First of all, whether we are hitting the limits and maybe should expand the design space in that variable, or whether we are somewhere in between and still exploring the design space. So, putting it all together: with PyMBSE we could provide a query mechanism for models, so that the physical connections and physical interfaces that exist in a particular system are represented in the design phase, as they are when we use computer-aided design, by queries. That gives a very clear interface between these tools: different components of the accelerator can query each other with this tool, and also within one system different models can query one another. A particle accelerator is a system of systems, implemented as multiple projects with remote queries for data exchange, while one system, like a superconducting magnet, is a system of components implemented in a single project with local queries for data exchange. One important part of a design is its versioning. Here we rely on GitFlow, a standard way of keeping track of changes, where we have the main branch of our design but also certain sub-branches for particular versions of the design, which can still be updated, and we can keep track of that or put tags to mark the main versions of our design. For each of these dots we can run a continuous-integration pipeline and at the end produce the report, so that for every change we can keep track of what was done, and also maintain that report as an artifact in GitLab so that anyone can access it rather quickly. Okay, so to conclude: the PyMBSE framework, which is a Python implementation of model-based systems engineering concepts, is quite generic.
It's not only for superconducting magnets, but can be used in other engineering fields. It is a set of containers with numerical models, a model query mechanism with a cache database, and model views based on Jupyter notebooks and Jupyter Book for documentation. On top of that we added a set of optimization algorithms and an interactive cockpit for multi-objective, multimodal optimization. One important thing is that we rely on a standard open-source software technology stack and really pay attention to testing and documenting, so that this solution will last for years, as needed. So thank you very much for your attention; it was a great pleasure to deliver this talk. If you have any questions, I would be very happy to answer. We have a few minutes for questions. I'd like to ask if you could expand a little bit on the motivation for this project: I'm curious how expensive these model API calls were, to justify this caching approach, and secondly, why did you develop PyMBSE; was there no other package available, for example? On the first point: yes, some models may take a few hours to run; for the most advanced ones, if you look at the entire accelerator study, this can be a few days, and sometimes the system did not change, so in that case we would just be rerunning something we already know the output for. So here there is a big gain. As for an already-existing package or library for this: we of course did a search to see what's out there, and we did not find a tool that would exactly match our requirements. There is a tool developed in Java, but that would require us to go via Py4J, which would complicate our stack a bit; plus it did not natively support notebooks, and relying on notebooks was one of our requirements. Okay, thanks. We have one more question in the room. Thank you for your talk.
It's really nice to see how this works in action. The examples we've seen have a very strict order in which the models are run. Are there more exotic examples where you iterate over two models that feed into each other and you try to get them to converge, or where you have multiple versions of a model side by side? Thank you, that's a good question. When two models require one another, this is called co-simulation. To resolve that, we would have a model which is the co-simulation, calling the two internally; that way we can run the two and iterate until the loop is resolved. So you encapsulate them in another model? Exactly, right, and run until convergence. Thank you. Okay, so it looks like we have no more questions. So once again, thank you, Michał, for your very interesting talk.
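The co-simulation pattern from this last answer can be sketched as a fixed-point iteration over two toy, mutually dependent models; all names and formulas below are illustrative, not taken from PyMBSE.

```python
# Two mutually dependent toy models wrapped in one enclosing "co-simulation" model.
def thermal(field: float) -> float:
    # Dummy thermal model: the temperature depends on the magnetic field.
    return 300.0 + 0.5 * field

def magnetic(temperature: float) -> float:
    # Dummy magnetic model: the field depends on the temperature.
    return 10.0 - 0.01 * temperature

def cosimulation(tol: float = 1e-9, max_iter: int = 100) -> tuple:
    """Enclosing model: iterate the two models until they converge."""
    field = 0.0
    for _ in range(max_iter):
        temperature = thermal(field)
        new_field = magnetic(temperature)
        if abs(new_field - field) < tol:
            return temperature, new_field
        field = new_field
    raise RuntimeError("co-simulation did not converge")

temperature, field = cosimulation()
```

From the framework's point of view, `cosimulation` is just one more node in the dependency DAG, so the acyclic-graph requirement is preserved even though the two inner models feed each other.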