Okay, so, hi everybody. My name is Massimiliano and I currently work in a small group, SCITAS, at EPFL, which manages all the HPC clusters at our university. My main task there is to take care of software deployment on the clusters, which brings me directly to the topic of today: Spack, a package manager for scientific software, and a well-known competitor to EasyBuild, I would say. Incidentally, I'm also a Spack core contributor. The plan for today's talk is to give people who are not familiar with the tool a ten-minute introduction, then give you three reasons why Spack is different from EasyBuild, and finally go into the latest features that were advertised at Supercomputing 2018 and the roadmap for this year — or, let's say, the tentative roadmap.

So let's get started: Spack in a nutshell. Spack is a package manager, and in my opinion this is already a difference with EasyBuild, which is a build and installation framework. What do I mean by package manager? I mean that Spack tracks the metadata of everything it installs and can reuse it later for any purpose, like querying or generating additional files. It supports multiple configurations of the same software installed alongside each other at the same time. It has support for supercomputers, namely Cray and BlueGene/Q, as well as Linux and macOS, and this covers a wide spectrum that goes from a developer's workstation up to Tier-0 machines. Like EasyBuild, it builds from sources, but it also has support for binary packaging: already today, whatever you build from source you can package, sign, and reuse later if you need to redeploy exactly the same software. It can be used for deployment — that's what we do — for development, if you are developing an application, or for QA. What do I mean by that? If you are somebody taking care of the quality assurance of an application, you can use Spack to build it in a million different configurations and check that everything works. For those of you who want to know more, you'll find a link to our homepage.

So who can profit from Spack? What are the target users? I identified four categories. One is end users of HPC software: people who SSH into a machine, don't see the software they need, and can just download Spack and build it in their own directory, or wherever they'd like. The second category is HPC application teams: departments that are not in charge of the facility but maintain a set of libraries or applications for a group of people — say, an entire department developing applications related to one field. Then package developers: people who just want to package software and distribute it to somebody else. And finally the category I'm part of, user support teams: people who do software deployment for an entire facility.

A common point between the two tools is that Spack is also open source and a community project. It was started by Todd Gamblin from Livermore, who is still the lead developer, and I must say we have a very active and engaged community, like you do, guys. To give you an example, we recently moved to permissive licensing: we now have a dual license, Apache 2.0 or MIT, and the user chooses. We switched from LGPL mainly because we want to be enterprise friendly: we want enterprises to be able to embrace Spack, buy in, and contribute back.
To do that, we asked all our contributors for their consent to the change, and we got, I think, more or less 300 consents in less than one month. That's the level at which the community is engaged in the project. The project is developed on GitHub, and we try to follow the usual GitFlow approach. It's used worldwide — I hope that's clear from the slide: this is the same kind of image Kenneth was showing for EasyBuild, sessions on Spack's Read the Docs over one month. The only noticeable difference is that this time the center of interest, if you want to put it like that, is the US — most hits come from around Livermore, I guess — but you can see it's also used in Europe, in Asia, in Australia; anywhere in the world.

I said it's open source and a community project, and it's committed to stay that way; I wanted to show you this graph as proof of that. These are the current lines of code for packages, so the recipes for installations, and as you can see we have a lot of different institutions contributing. Livermore is not the first contributor for packages anymore. The same goes for lines of code in the core framework: there Livermore is still the first contributor, but we are getting a diversification of contributions. So Spack is not a single-institution tool; it's really a community project, and it's going even more in that direction.

Another thing I think is good to stress is that whenever we develop a feature or a new command, we do it with simplicity in mind, and you can see that from how easy it is to start using the tool. Basically, you clone the repository, source a shell script, and then you're ready to go: spack install hdf5 will install all the dependencies and HDF5 itself. I'll dwell later on the details of what happens behind the scenes.

Packages for us are just Python classes, and as you can see they are different from easyconfigs, because each one is a template for a single software: we have only one class, and this class has all the information you need to build that software under many different configurations. To stress that, I put two versions on the slide, but I guess you get the point. We have a description that will be reused by many commands, a homepage, a URL, and different configuration options.

Can you go back one slide? Sure. What if the URL changes between versions? How do you handle that? If the URL changes between versions, you have more than one option to deal with it. If it changes just for a single specific version, you can use a url keyword in the version directive, and that will be the URL for that version. Otherwise, if you need a lot more flexibility, which happens sometimes, you can define a method — but that's rarely the case — called url_for_version: it gets the version and gives back the URL. So usually it's that simple; as you see on the slide, like 95% of the cases are like that, because people embrace sane versioning schemes. Sometimes you have people who go crazy with that: they use underscores in one part of the URL, dots in another part, and then decide to do the opposite for the next release. In that case you need to use all the hooks we provide. Okay, does that answer it? Yeah.
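To make that concrete, here is a minimal, hypothetical recipe sketch — the package name, URLs and checksums are made up for illustration — showing both mechanisms just mentioned: a url keyword on a single version directive, and a url_for_version method.

```python
# Hypothetical recipe: names, URLs and checksums are placeholders, not a real package.
from spack import *


class Foolib(AutotoolsPackage):
    """Example library used to illustrate version-specific URLs."""

    homepage = "https://example.org/foolib"
    url = "https://example.org/downloads/foolib-2.0.tar.gz"

    # Normal case: the default 'url' above is expanded for each version.
    version('2.0', sha256='<checksum>')

    # One odd release hosted somewhere else: override the URL just for it.
    version('1.9', sha256='<checksum>',
            url='https://legacy.example.org/old/foolib_1_9.tgz')

    def url_for_version(self, version):
        # Fallback for projects that changed their URL scheme entirely:
        # compute the download URL from the version object.
        if version < Version('1.5'):
            return 'https://example.org/archive/foolib-{0}.tar.bz2'.format(version)
        return 'https://example.org/downloads/foolib-{0}.tar.gz'.format(version)
```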
Finally, to conclude this ten-minute tour: we also try to make it easy for packagers — and I know many of you are packagers — to work with recipes. We provide tools that automatically create stub implementations for new packages: you just give the URL of a tarball or of a repository, we download that thing, try to analyze it a bit — is it CMake? is it Autotools? — and we create a stub package with indications of what you need to fill in to have a full-blown recipe. If you have to modify an existing package, just set your preferred editor and then say spack edit and the name of the software you want to edit. And we provide commands like spack versions that are useful to update packages: this one in particular scrapes the web for versions, assuming they follow a regular versioning scheme, and reports all the ones it found that are not yet part of the package. And that concludes the ten-minute tour.

Now, three reasons why Spack is different from EasyBuild. Reason number one: we fight combinatorics with combinatorics. What do I mean by that? I think this image will resonate with most of you. It is meant to depict the complexity of a typical HPC system, where user support teams need to deploy a lot of applications, because users have diverse needs, and a lot of configurations of a single application. You might have heterogeneous nodes, so you have to deploy for different targets; you might need to use different components, each with its own advantages, and so on. As far as I understood, what you do is take the product space of those things directly, sample it, and write an easyconfig for each point in that sample space. What we do in Spack is different: we use three different ingredients, which I'll show next, to construct this combinatorial space from exactly these items.

The first ingredient is user facing: it's the spec syntax, the syntax we employ to let the user of Spack specify what they need. Take a look at this line: it says, I want to install HDF5 at a particular version. With the @ sign we specify versions, or version ranges in some contexts. We have similar symbols or conventions in the spec syntax for each part of the combinatorial space: with % we specify which compiler we want to use; with + or ~ whether we want an option to be active or not; we can inject flags into the compiler — I want to compile this and ensure that -O3 and -floop-block are used — and we can even target different systems. So on the one hand you have a powerful syntax that is as modular as the reality we are facing: each different component gets a different piece of syntax, and you can compose them to cover the combinatorial space.

On the other hand, we have the same thing for packages: directives. As I said, for us a package is a sort of template that tells how to build one software under many configurations, and we do that using what we call directives, which are those function calls within the class definition. Different directives extend the combinatorial space in a multiplicative way, let's say: when I add a boolean variant, I multiply by two the number of combinations I now have for OpenBLAS. Versions, of course, just add up, and that's fine. And we have ways, with a conflicts directive, to mark a single point in the combinatorial space — or an entire slab of it — as not viable, as something that doesn't make sense.
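As a rough illustration of how such directives span a combinatorial space, here is a simplified, hypothetical snippet in the style of an OpenBLAS recipe — it is not the actual package.py; the versions, checksums and the exact conflict are only examples.

```python
# Simplified, hypothetical recipe fragment; not the real OpenBLAS package.py.
from spack import *


class Openblas(MakefilePackage):
    """OpenBLAS: an optimized BLAS library."""

    homepage = "https://www.openblas.net"
    url = "https://github.com/xianyi/OpenBLAS/archive/v0.3.5.tar.gz"

    # Versions add up: two installable versions.
    version('0.3.5', sha256='<checksum>')
    version('0.2.20', sha256='<checksum>')

    # Each boolean variant multiplies the number of configurations by two.
    variant('shared', default=True, description='Build shared libraries')
    variant('ilp64', default=False, description='Use the 64-bit integer interface')

    # Dependencies pull in their own sub-DAGs.
    depends_on('perl', type='build')

    # A conflict removes a point (or a whole slab) of the space as not viable,
    # e.g. "this version cannot be built with this compiler range".
    conflicts('%intel@16.0:16.9', when='@0.2.20',
              msg='OpenBLAS 0.2.20 does not build with Intel 16')
```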
For instance, OpenBLAS here cannot be compiled with Intel 16 when it is in this version range. So, two ingredients: one from the recipe side, one from the user-facing side. Those two ingredients get injected into the third part, the concretizer. The concretizer is a core part of Spack that takes as input what the user asked for — what the user cares about, in this case mpileaks, which depends on callpath, and so on, with all the details the user cares about — and, on the other end, all the recipes, the descriptions of the packages, and fills in the missing details. What it gives as output is a full-blown DAG, a directed acyclic graph, with all the details you need to go from what we call an abstract specification, what the user needs, to a concrete configuration that can be installed. And that's basically one of the differences, in my opinion, with EasyBuild.

On top of that, once we have this full-blown graph, we can associate a unique ID with it: we hash the graph, and that hash is used in what I'll be calling, in this presentation, the Spack store. I don't know if it's a term Todd would endorse, but let's pretend it is. We use this hash to uniquely identify its prefix, its configuration. So it's easy to see the mechanism by which we allow different versions of the same software to be installed alongside each other: each version will have a different hash and will get a different prefix for the installation, and all the prefixes reside within the Spack store. Finally, I wanted to point out that before installing stuff you can also query what will happen when you give some abstract specification. This is an example output of asking Spack what will happen if I ask to install HDF5 with MPI, depending on MPICH, and the reply is that I will get this concretized spec: HDF5 depending on exactly that version, and you can see here a short version of the hash of that particular sub-DAG.

So that was reason number one: we fight combinatorics with combinatorics, with these three ingredients — specs, packages with directives, and the concretizer that merges the two. Let me know if there are questions on that.

Just a small question: the concretizer, what are the rules for it? I mean, you have two different versions of the same package and you don't specify anything — are you going to get the newest, the oldest? Usually we take, for instance for versions, whatever is the newest, unless you mark some version as preferred. So if you have a use case where you know that the latest version, for some reason, should not be the one we pick by default, and a user doesn't specify a version, you can mark an older one as preferred, and that will be the one chosen. For variants, right now, if you don't specify them they take their default values. This is going to change next year because — well, I'll say it later, but right now we are not doing a SAT solve with full backtracking, and we want to move to that, meaning we want to take into account all the constraints we have from packages and put into the mix the binaries we have already installed, which is a thing we don't do right now. Does this answer your question? Yeah, it does.

Just to see if I understood everything: if you have a case like HDF5, for instance, you have a package that depends on HDF5, but you can have HDF5 with or without MPI, and that can change massively what the graph underneath looks like.
So, in that case, let's say you have HDF5 with MPI installed and you request a package that needs HDF5 but doesn't care whether it's with MPI or without. Then it goes to the default, and if the default is not installed, it installs it — right now it works like that. What we want to do within this year is to take into consideration what is installed: if the constraint is just hdf5, we say, OK, hdf5+mpi matches hdf5, and then we reuse what's already installed. That's the idea. Right now, instead, it says: you didn't specify anything, so I'll go with the default for the package, because I don't know what you installed and currently I don't care.

With that change you're going to make, will you be able, for a specific package, to say that you are not allowed to do that — that you must go with the default? Oh yeah, we already do this; it's just that right now such constraints result in an error. At that point they will result in constraints that are simply not satisfiable, and the SAT solver will choose something different. Because HDF5 with MPI is a perfect example: some codes will not work with the MPI build, they must use the other version. OK, but in that case, if they don't work with an MPI build, in their recipe they should say: I depend on HDF5 without MPI. But if the user doesn't really know this and just specifies "build this with HDF5", the manager of the system should be able to say that it should never pick the MPI version of HDF5 — that it should always fall back to the default, even if HDF5 with MPI is installed, when it searches for it. With the new change — when you said this was going to find HDF5 installed with some version — no, of course, but when it finds it installed (and that's basically reason number two) we have all the metadata and we know that it is with MPI. So if your software requires something that is not MPI, we say: hey, we have this thing, but it doesn't match, because it's with MPI and you require explicitly that this software is built without; so I'm going to build a new HDF5 without MPI. But for a user who just wants to build with HDF5 — yeah, OK, if you have the dependency on top, maybe that solves it — there might still be reasons not to let it pick the installed version, if it's a specific version. I think — the work is still ongoing — but the idea is that we could fall back to the current behavior if somebody wants the current behavior, so not taking installed software into account. Yeah, probably; I'll note that down.

So with the new behavior, the order in which you install software matters, right? Because if I first install HDF5 without MPI and then I install my code that depends on HDF5, it will take that one. That's why we will provide some handles, let's say, because objections can be made both ways. On the one side, you can say the current behavior is completely reproducible, because the algorithm doesn't take into account what you have installed — you have full control, let's say. Well, you will always have full control; the point, in my opinion, is how verbose you need to be. Even right now you have full control over what you want installed, but sometimes you have to specify a very long spec. If instead you don't care, many users have objected that we should be looking at what's installed and do a full SAT solve, in the sense that if I say "install this software", and the software says "I depend on HDF5", and there is any HDF5 installed, and the software doesn't impose any more constraints, then we should pick what's installed instead of reinstalling a new one.
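To tie this discussion back to what a recipe can actually express, here is a tiny, hypothetical fragment — the package and its variant are made up, while the hdf5 constraints use ordinary spec syntax. A package that genuinely cannot use a parallel HDF5 can say so in its dependency directive, and the constraint can also be made conditional on the package's own variants.

```python
# Hypothetical fragment showing how a recipe constrains its HDF5 dependency.
from spack import *


class Mycode(CMakePackage):
    """Made-up application used to illustrate dependency constraints."""

    homepage = "https://example.org/mycode"
    url = "https://example.org/downloads/mycode-1.0.tar.gz"

    version('1.0', sha256='<checksum>')

    variant('parallel', default=True, description='Enable MPI support')

    # Serial build: state explicitly that a parallel HDF5 is not acceptable,
    # so the concretizer will never pick (or reuse) hdf5+mpi here.
    depends_on('hdf5~mpi', when='~parallel')

    # Parallel build: require the MPI-enabled HDF5 and an MPI provider.
    depends_on('hdf5+mpi', when='+parallel')
    depends_on('mpi', when='+parallel')
```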
I'll repeat the question, if you want. Okay, so the question, or the comment, was that a good example could be MPI, because every site may install MPI in a custom way and then want users to pick up that MPI. That's true. That's also something we can currently do via a feature I won't show here, because otherwise it takes too long, which is external packages. So you know it: Spack permits interfacing with things that are already installed via external packages. You basically declare that something is already installed; it will be a leaf in the DAG, because everything below it is your responsibility — Spack has no way of knowing how it was built — but you can already do that. And if the package provides MPI — that's another thing I didn't mention — we have the concept of virtual dependencies, which basically model APIs. Packages can declare that they depend on mpi, and other packages can declare that they provide mpi, and the resolution of this dependency is also done by the concretizer — and will be done by different rules when the new concretizer is in place.

There's a question from the chat as well: can you go back to the slide that has the hashes — this one, or the other one? So the question was: what is included in the hashes, how is the hash computed? The hash takes care of all the information that makes the entire graph unique. For instance, this hash covers all the other hashes, plus the version, variants, compiler, OS and all the other details of HDF5. So each hash basically identifies a directed acyclic graph on its own, and you can refer to installed software by hash if you have the need: if you want exactly this one installed, you can say spack install and give that hash. We'll see this later, but yeah, since we're at it.

A follow-up question on that, and this one is my own question. One of the ongoing issues I've seen people complain about — you know where I'm going — is that the hashes change because of changes made in Spack itself, or, I think, even when you change one package: for HDF5, because of changes in the package, the hashes start changing and Spack isn't seeing the installations anymore. No — Spack sees the installations, but it will currently reinstall stuff; this is part of the reason I want the new concretizer. And it comes from two things, basically: people living on develop and doing git pull to get updates on recipes, so not living on, let's say, a stable version of Spack. I say that because I know it's, let's say, a debated argument even in the community, but what we do at our site when we go into production is freeze a set of recipes, and if we install a software — I mean, unless there is a big problem, in which case we could go back and review the installation — we always keep that recipe for that production installation, and we update Spack for the next production installation. That's the idea. Many people instead live on develop: they install software and continuously get updates, day by day, and in that case what you say is true. So you have a different, or updated, recipe — just to be very clear, let's say a variant was added to OpenBLAS because I updated it. That will change all the hashes for OpenBLAS, because now it's a new package: it's a package that has an additional option, and the algorithm sees it; before, it didn't have the option, so the hash was different.
Anything that depends on OpenBLAS, if I just have an abstract specification, will then be recomputed against the new package and will be reinstalled. That's what happens currently if you live on develop and do production installations while doing git pull frequently.

That doesn't only happen when you're living on develop. Spack hasn't been doing very frequent tagged releases, so even just between two tagged releases you will have the same problem. Say you're using — I don't know what the latest version of Spack is, 0.12 I think — so say you were using 0.11 and you switch to 0.12: you're going to hit this problem as well. Exactly, because the recipes will change. But the idea is that you choose, for instance, 0.12 and do your production installation with that — or at least that's what we do at EPFL — and if you chose 0.11 and did your installations with that, those will be fully stable. As soon as you modify a package, you're also modifying the hash of the thing, because now the DAG has different properties, and the hash reflects those properties.

One question that came up in the chat as well — it's a bit of a weird question, and I can say that because it's my colleague asking it: what is the typical number of hashes for the same package, on average? What does it mean — how many hashes can a package have? In principle you can count: here you have two versions; a boolean variant, which means four configurations; another boolean variant, which means eight configurations; then you have some compilers, let's say three, and whatever else, and you do the multiplications — those are the possible hashes. And you can even specify compiler flags. Exactly — with compiler flags I would say it's, in principle, infinite.

If I change something in the package file that's not related to the way the software is compiled — just the way I write the recipe — would that mean a new hash? No, not right now, I think. But we want to somehow keep the hash of package.py as well — Todd can correct me on that; I think the feature is not there yet, but there were discussions about hashing the recipe itself and making it part of the hash. Because if you change the recipe — well, you could just change something aesthetic, or you could add a version to it, like here. If you add a version, that changes things only if you select that version; if you add a variant, for instance, it will change for sure; if you select the old version and don't touch the various options, it will always have the same hash. I'm saying, for example: in this one I don't change anything, but I have a dummy variable that I forgot at some point and then I remove it. I think that right now that won't change the hash — but there were discussions to also hash the recipe, so don't take me 100% on that.

I think the talk will be much longer than 45 minutes — maybe we should continue, and we can do more questions at the end unless it's very relevant. Continue.
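Before moving on, a toy sketch to make the hash discussion above concrete. This is not Spack's actual hashing code, only an illustration of the idea: the identifier of a node covers its own configuration plus the hashes of its dependencies, so a change anywhere in the sub-DAG propagates upward.

```python
# Toy illustration of the idea, not Spack's real implementation: a node's hash
# covers its own metadata plus the hashes of its dependencies.
import hashlib
import json


def node_hash(name, version, variants, compiler, dep_hashes):
    payload = json.dumps(
        {
            "name": name,
            "version": version,
            "variants": sorted(variants.items()),
            "compiler": compiler,
            "dependencies": sorted(dep_hashes),
        },
        sort_keys=True,
    )
    return hashlib.sha1(payload.encode()).hexdigest()[:7]


zlib = node_hash("zlib", "1.2.11", {}, "gcc@8.2.0", [])
hdf5 = node_hash("hdf5", "1.10.4", {"mpi": True}, "gcc@8.2.0", [zlib])

# Flipping a variant (or changing anything else in the sub-DAG) gives a new
# hash, hence a new prefix in the store:
hdf5_serial = node_hash("hdf5", "1.10.4", {"mpi": False}, "gcc@8.2.0", [zlib])
print(hdf5, hdf5_serial)
```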
Reason number two: Spack keeps the metadata of the software it installed. I briefly touched on that before, but whenever we install a software we have a central place, which is also within the Spack store, where we basically dump all the metadata of what was installed: the full hash; whether it was an explicit request of a user or just a dependency that was pulled in; the installation time, meaning the time since the epoch at which the installation completed successfully; the reference count, which is the number of other concrete configurations that depend on this one; and so on. That's a difference with EasyBuild, which doesn't have this kind of database.

Another thing we do to keep track of all the metadata is that whenever you install something, within the prefix of the installation we have a .spack directory where we store anything related to the provenance of that software. In this case we have an Autotools package, so we archive the config.log for later inspection; a dump of the environment during the build, so the same environment can, let's say, be recreated manually later if need be; the complete log of the build; all the recipes that were involved in the installation, archived at installation time; and, finally, data for the graph of this installation. As an aside, should anything happen to the central DB I showed before, with this distributed metadata we can regenerate or sanitize any broken DB. It's not frequent, but it has happened, at least to us, due to some file system malfunction or things like that, that the database was not really in sync with what was on disk, and I was always able to regenerate the central one from the metadata stored in the prefix of each package.

When you have metadata like that, you can start to build tools around it: tools to query what's installed and when you installed things, tools to generate module files, to activate Python packages, and so on. I'll just give a brief overview. Whenever you install something, you can query what's installed: you give a constraint — in this case I ask for zlib — and I get back all the versions of zlib that I installed. I could make an anonymous query, saying spack find %gcc, and that would show me anything compiled with GCC, and, you name it, you can abuse the spec syntax to make your own queries. You can also query by date; I found that useful for a sort of automated message of the day: you can ask what software was installed in the last month and you get the answer, just by inspecting the installation times. Likewise, you can uninstall anything you installed in the same precise way: you have a hash, you know what's installed. By default we do it, I would say, in a safe way, so we ask for confirmation. If there are questions on this I can take them, but I'll try to hurry up a bit so I can get — I don't want to say on time, but almost — to the new stuff. But yeah, you can uninstall things easily.

You can also generate module files on top of the installations. For us, module files are just a byproduct of an installation. They are highly customizable using YAML files, and they are generated where you specify. Everything is powered by Jinja2, so when I say highly customizable I mean you can even go to the point of providing your own template for just one specific installation, and we support both Tcl module files and hierarchical Lua module files. If you want to know more, I advise you to go through the tutorial on module files, but just to show you how easy it is to configure, I reported here a simple example: a simple configuration for a hierarchical set of modules.
You just need to say that whatever is compiled with this version of GCC is to be considered core. Then you declare your hierarchy: of course you always have the compiler hierarchy, so any other compiler will sit on top of that, and here I say that in my hierarchy I also want MPI and LAPACK, so I have a double hierarchy. Then you can modify the naming scheme as you want, like setting the hash_length — which is something the user usually doesn't want to see — to zero, and distinguishing between the different configurations of a software by suffix. OK, I put the hash length to zero, so at this point I cannot distinguish between the same version of HDF5 compiled with or without MPI; so I add a suffix that says: if the thing was compiled with MPI, add -mpi to the name, and the same for OpenMP.

And the thing you will probably find interesting is that we can activate Python modules on a fine-grained basis. For us, Python modules are standalone packages: whenever you install a Python module it gets installed into its own prefix, but then you can, let's say, activate it, meaning you symlink the Python module and all its dependencies into the Python installation it was built for, and it appears to the user as if it were site-installed. So here I install python, py-scipy and py-numpy; the installation succeeds, and Python and the two modules end up in three different directories. When I say activate py-numpy, I'm creating symlinks within this Python so that I can import numpy without doing anything else, as if I had site-installed NumPy directly into that interpreter. And as you can see, I didn't do the same for py-scipy, and that module is not found. We employ this procedure at EPFL to provide all the optimized Python software: our users are advised to use virtualenvs if they don't find software on our machines, but all the software we care about we just activate, and magically they can import numpy, scipy or whatever.
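Mechanically, activation boils down to symlinking one prefix into another. The following is only a rough sketch of that idea — not Spack's actual implementation, and the paths are hypothetical — to show what "it appears site-installed" means.

```python
# Rough sketch of the idea behind activation, not Spack's actual code:
# symlink an extension's site-packages entries into the target Python prefix.
import os


def activate(extension_site_packages, python_site_packages):
    for entry in os.listdir(extension_site_packages):
        source = os.path.join(extension_site_packages, entry)
        target = os.path.join(python_site_packages, entry)
        if not os.path.exists(target):
            os.symlink(source, target)


# Hypothetical prefixes inside the Spack store:
# activate('/spack/opt/linux/py-numpy-1.16.0-abc1234/lib/python2.7/site-packages',
#          '/spack/opt/linux/python-2.7.15-def5678/lib/python2.7/site-packages')
```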
Oh, finally, I added this slide because Kenneth asked what we can do about parsing logs. We have a command, which is quite useful for people maintaining a whole facility installation, that uses the same regular expressions CMake does to scrape build output for errors or warnings. In this case I went back to one of our build outputs: if I ask for the warnings, of course I get a lot of them for HDF5. That's a useful tool, especially if you get reports that something doesn't work really long after the fact, long after you installed it; that's one way we do error reporting, and we employ the same mechanism for failed builds, to extract the errors and show them on standard output to the user while they are actually doing the installation. This was the second reason: we keep track of all the metadata of the things we install.

The third one will be quicker, because I'm really running over time: we can fine-tune compiler options to users' needs. This is just a single slide — I hope you can see it, because the blue was not the best color choice — that depicts what happens when you install a software with Spack. Briefly: we fork a new process for each software that needs to be installed in the graph; we set up an environment so that the build is isolated; and whenever the compiler has to be called, the compilation is mediated by a compiler wrapper. The compiler wrapper analyzes what's on the command line: it can get rid of things we don't want, it adds RPATHs, it can inject compiler flags, and so on and so forth. So that's another difference: we can inject compiler flags, and with that we can do a lot of stuff. And that concludes the three reasons why I think Spack is different from EasyBuild.

If there are no more questions, I'll go to the cool new feature, which is Spack environments. Those of you who tried Spack for doing a lot of installations might have noticed that it copes well with this combinatorial space, but at some point it becomes inconvenient to look at something which has, like, 2,000 combinations of software installed, because most likely you installed those 2,000 combinations to serve different needs, and whenever you focus on one single need you would like to see only the software that pertains to that domain. That's where Spack environments come into play. If I have to sell it in a few words, I would say it's like virtualenvs, but not just for Python — for anything. With this feature you create a virtual instance of Spack on top of the real one; you can see the commands here. Whenever you create a new environment you start out blank — and, a funny thing, during development we received the feedback to add this message telling the user "you are in an environment", because when you have installed 3,000 packages and suddenly you don't see any software installed you might freak out. So it basically says: you are in an environment, and the software is still there. Now, when you install software in this environment you basically reuse what's already in the Spack store, but you have, let's say, a dedicated view for your specific purpose, which could be developing an application or serving a group of users. You can install things directly in an active environment: here you can see activating the environment, querying where you are, and seeing which software is installed. And here you can see what happens when you install new software: in this case I install zlib, which was already in the store, so it was reused;
HDF5 was not there, so it has been installed — in the real Spack instance underneath. You can also set up groups of packages at once: you add your requests to the environment and then trigger the entire installation just by saying spack install. When you add things, you're basically just saying "I would like to have HDF5 without MPI, and Python", but nothing happens yet; at the moment you install, all your abstract requests are concretized and checked against the store, and whatever is not there gets installed.

Now you might be asking how this is implemented and why it is extremely useful, and the point is that you can fine-tune this configuration. Whenever you create an environment, you're basically creating something like a manifest file. The configuration you see here resides in a file called spack.yaml, and this file is not bound to stay inside Spack: it can be bundled within the repository of an application, to describe your development environment, or it can be bundled by the people of a department to install just the software they need, and it can be manipulated directly by Spack if it is in the current working directory. So you can use spack.yaml to distribute environments around. When you actually install things, instead, another file is created, called spack.lock — the lock file: what you really installed at that moment. Depending on your needs, to reproduce an environment you can either reproduce the logical environment in spack.yaml — the one with the abstract requests, which will be concretized on the spot on the new machine or in the new place you are in — or, if it is a viable choice, use the spack.lock that somebody else gave you to reproduce exactly what they did, down to the concrete specification of each installed application. So using either spack.yaml or spack.lock you can recreate an environment and redistribute it according to your needs.

Coming soon, because it's not there yet: we want to have views for Spack environments. Everything in Spack is driven by YAML configuration, so if in the environment file you say the simple thing "I want to have a view within this environment", we'll make sure you don't have conflicting applications and we can create a sort of fake root, with the usual include, lib and bin directories and all your software symlinked in. That's an option you might choose to use, and that feature is currently under review.

So, finally, the next big thing for 2019 is to build a CI infrastructure, for both source and binary packages, during this year. Currently we are at this point: each user has the capability of creating source mirrors themselves, or binary packages, and signing them. What we want to do, for instance for sources, is to have some common place somewhere where, whenever a pull request, let's say, adds a new version, we fetch it through continuous integration and archive the tarball somewhere as a backup mirror. I don't know if you have ever experienced it — and here I'm looking at SourceForge — but when an upstream site is down it can be a hassle: you can't build packages because you can't reach the place where the source is stored. And that's, let's say, the easy part of it.
The other part is that we could trigger, via GitLab CI, a build of a few configurations — of course not every possible one, but a few relevant ones, and what is relevant is still to be decided, I guess — and create binary packages that are stored somewhere else, in binary mirrors. We would go through the cloud for common architectures, like the ones we are using at EPFL, but there is work being done to be able to get GitLab CI instances on Tier-0 machines and get binary packages for those machines as well. The idea is that, once this is done, a typical user could search those two mirrors — or, without even knowing it, will search those two mirrors — and if you ask for something for which a binary package is already available, it will be used, and the binary package will be fully optimized for your architecture. So that's big feature number one.

Second one: the syntax here is tentative, but we would like to come up with a YAML format to describe an entire site installation — a format where you can say which software you want installed and which toolchains you want, for instance which compiler and which MPI, provide substitutes for those, and go through all the combinatorial explosion this implies. The idea is that once you have that file, it takes just a single command to deploy your entire stack for your facility, and you can share it easily with others. The syntax is still being worked on, but I think the idea is quite clear: being able to have a good description of all the combinatorial space we want to cover at a facility.

OK, this one might seem a bit complicated; in the community we call it Spack chains. Let's say an entire facility deploys software with Spack, trying to serve the most common needs of its users. Sometimes you still have users that want something different, and they want to be in control. What we say right now is: well, download Spack and do it on your own, because it's possible — but then you will recompile and reinstall everything that was already available at the facility level and that you could have reused. Spack chains are a way for Spack to register other instances of Spack as upstream and make their databases read-only; when you search for software, you first look, in order, into the chained databases, and if the software is already there, you just use it. The idea is that you can have a hierarchy at this point, and that's a feature for HPC facilities: you can have the entire facility using a central Spack instance that provides the most common software for everybody; then you can have, for instance, departments, which care about their own users, building on top of that just the things that are different, reusing whatever they can from the central facility; and the single user of that department who needs to do his own strange things will do the same with both the department software and the facility software.
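A toy sketch of the lookup order this implies — purely illustrative, not Spack's code: the local store is writable, the upstream stores are read-only, and a query walks the chain before deciding to build locally.

```python
# Purely illustrative sketch of the "Spack chains" lookup idea.
def find_installed(spec_hash, local_store, upstream_stores):
    if spec_hash in local_store:          # my own installations (writable)
        return local_store[spec_hash]
    for upstream in upstream_stores:      # e.g. department first, then facility (read-only)
        if spec_hash in upstream:
            return upstream[spec_hash]
    return None                           # nowhere in the chain: build it locally


facility = {"abc1234": "/facility/spack/opt/hdf5-1.10.4-abc1234"}
department = {"def5678": "/dept/spack/opt/boost-1.68.0-def5678"}
mine = {}

print(find_installed("abc1234", mine, [department, facility]))   # reuse the facility install
print(find_installed("0000000", mine, [department, facility]))   # None: install locally
```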
So, one objection — I think Kenneth made it about something like that, about an installation of modules — is that then the facility has no way of knowing who depends on it, and if it removes something, that will break things downstream. That's true. What we decided to do is: well, it's true, and I don't think any facility will actually remove stuff, but let's pretend they do. What will happen is that this single user will have broken software; they will ask the database why it's broken, and the database still knows what needs to be built but can't find it, and it will say: well, this software, which I perfectly know in its full configuration and am able to reinstall, is not there anymore. And we can then say: OK, it was in the facility database, so please sanitize this installation, and do it at the user level, at the level where I can write. So there is always the possibility to sanitize a broken environment in case somebody upstream removes software you depend on. That was the solution we went for.

Does that mean that if a single user installs something that depends on, let's say, A, you're pulling in all the metadata you need to reproduce it? Exactly: you still have, in the user database, that installation A is such and such, and whenever the facility removes installation A, the user won't be able to use the software, of course, but could query Spack, which will show that A is missing, and the user can say: OK, sanitize it in my space and refer to that. That's not just the hash, right? Because that doesn't give you enough. No — well, the hash comes with all the details that are in the database. Yeah, but from just the hash you cannot derive them. No, you cannot reverse the hash; you have to keep all the details. Yeah, exactly — the comment was that if you could reverse the hash we would have a problem in the security world.

And, OK, I will be very brief on this one, because the concept is trivial: we would like to have automatic detection of the target and automatic injection of optimization options, so that, given a target — which could be Skylake, or whatever — we would automatically add -march=something, -mcpu=something, and that's it. And finally, I think this was already discussed: we would like to improve the concretizer to add these two steps, which looks like tiny work but I think is gigantic — it basically involves doing a full SAT solve of the constraints — and it is probably the most wanted feature in Spack right now. And with that I'm really finished, so thanks for listening, and I'll take more questions here.

Any more questions for Massimiliano? So, on this concept of Spack all the way down: if the facility gives me, I don't know, a GCC version, and it was installed with Spack version A, and the user has a version of Spack that's different from the one the facility used, then he will install GCC again, right? Because he won't be able to find it — in the current behavior. No, because compilers are currently treated differently: compilers are registered. If you refer to... yeah, don't think about compilers then; think about a library. If I have a package that was installed by the facility, with a given version of Spack, then I may have to install it again, right? Currently yes; with the other thing that is coming for the concretizer, no. OK, so how do you enforce that? I mean, do you suggest that users always use the same version?
Yeah — in our facility, at the moment, we would suggest using our release version, the same one. Currently the users don't even have to know we are using Spack; from next year, having this feature, we will suggest that users use the same version we used to install the software. That's it.

Any other questions? Out of curiosity: is anyone here — except for you guys, perhaps, on an experimental basis — using a hybrid approach with EasyBuild and Spack, and could you report on the experience? Well, currently the issue we face is how to cope with both, because on the sysadmin side we provide everything with EasyBuild, but the developer teams want to use Spack for their own software business; still, they would like to rely on the things we provide with EasyBuild, so we need to think about how to glue the two systems together. So, no real answer. Yeah, I would say: let the kids do what they want — but they still expect us to do the work for compilers and MPIs.

Just one small curiosity, if I got it right: you have a lot of stuff that relies on CI, where you want to have binary mirrors. Is that intended to be shared across sites? Yeah, that's the entire idea. You can already build your own binary mirror, because we have commands to package whatever you installed into a binary form and sign it with GPG keys and whatever, but then it will be your binary package, your binary mirror, and that's it. The idea is that we could build an infrastructure trusted among sites for those kinds of things. OK. I guess that as you keep adding software to the mirror, the size can get big, right? Yeah, that could be the case, and I guess that — the details, I don't know if Todd has more information than that, and maybe it's worth asking him — but I think the procedure by which we could decide what goes in there and what doesn't is not decided in each and every detail, because of course you cannot sample the entire space; we said it's infinite if you take the compiler flags into account. So probably there should be some sort of community agreement on what should go there, and those things will be the ones shared in those mirrors. I mean, I think it's a neat idea and it makes sense, but I would be concerned about the space for the mirror getting very, very big, and if you rely on Amazon S3, then who's going to pay for that?
DOE, I think. Okay — so if that was the question, I think the answer is three letters. As a way, maybe, to offset that a little bit — and, for the record, I'm not part of DOE, this is not a commitment for DOE — you could have, I guess, the same thing but distributed across different mirrors, so that not a single site or a single cloud service has to bear it all. That's true; I think one could still do that and have a central one which is sort of the official, or most used, one. If DOE agrees. Yeah, exactly. That's really the point, because to me, when I saw that, it was like: yeah, it looks like a good idea, but the space is going to be massive — who is going to pay for that? DOE. There are legal issues there as well that you need to think about. Okay, that I'm not aware of. You cannot just redistribute any software. Oh no, that's true — you'll have to deal with that. Since we're talking about that: for instance, at our site, many commercial software packages — I mean, a few of them — live in a private repository, because we don't have recipes for them; not because they are difficult to make, but because it's not clear whether you may share them. Exactly. So — Spack can be extended with custom repositories anywhere you want, you can tweak whatever, and what we do at EPFL for those packages is basically keep a private setup of things you don't share, not because you don't want to share them, but just because you don't know whether legally you can.

I have two more questions as well. Can you go to the OpenBLAS package that you showed? Okay, so that was — it was one of the early ones... probably this one. Yeah, this works. So the version statements you have in there are basically function calls. Yeah. What's going on behind the scenes there? Oh, you mean the implementation? Okay — thank you very much for your question, that's something that I'm responsible for; I mean, not the first implementation. So, what's going on behind the scenes. Initially Todd implemented it with a stack-frame trick: we were basically jumping on the Python stack frames, knowing that version() had been called some two frames above, and changing things two frames above. That worked until we factored out build systems. Until three or four years ago we just had plain packages, and you had to implement everything yourself; now, as you see, we have for instance a Makefile package class, which aggregates all the common stuff you have to do for Makefile packages, and likewise we have CMake packages and Autotools packages. I proposed that refactoring, and when I did it I was kind of fighting with this stack-frame trick, because you want to attach directives, like variants, to the base classes — think of the CMake build type: a build type is something CMake almost mandates, there are four standard build types plus custom ones, and you want that variant in the base class and never think about it anymore; you don't want it repeated in each and every package. So what I did was use the dreaded metaclasses. A directive is a function call that, as a side effect, puts a closure into, let's say, a list; and the metaclass, when it reaches the end of the class definition, looks at that list of closures — they are all such that they take just a single argument, the class being defined — and finishes the process. So basically, here we are just capturing the information; what is missing is the class itself, which we don't have yet, and we get it via the metaclass. Basically, like creating a factory for class definitions, we make it so that a function called during the class definition affects the class definition itself while it is being defined.
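Here is a heavily simplified, self-contained sketch of that mechanism — not Spack's actual code, just the pattern described: a directive called inside a class body records a closure, and the metaclass applies all pending closures to the class once its definition is complete.

```python
# Heavily simplified sketch of the directive mechanism described above;
# not Spack's actual implementation.

_pending = []          # closures collected while a class body is executing


def variant(name, default=False):
    # Called during class definition; the class doesn't exist yet, so defer
    # the real work to a closure that will receive the class as its argument.
    def _apply(cls):
        cls.variants = dict(getattr(cls, 'variants', {}))   # copy, don't share with base
        cls.variants[name] = default
    _pending.append(_apply)


class PackageMeta(type):
    def __new__(mcs, name, bases, attrs):
        cls = super().__new__(mcs, name, bases, attrs)
        while _pending:                   # finish the deferred directives
            _pending.pop(0)(cls)
        return cls


class Package(metaclass=PackageMeta):
    variants = {}


class CMakePackage(Package):
    # Attached once in the base class, never repeated in every recipe:
    variant('build_type', default='RelWithDebInfo')


class Mylib(CMakePackage):
    variant('shared', default=True)       # adds to (and could override) inherited ones


print(Mylib.variants)   # {'build_type': 'RelWithDebInfo', 'shared': True}
```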
It seems complicated when you say it, but when you see it, it is not: you just write version(...) or variant(...), it's declarative, and you don't really care what's behind it. So that was the mechanism: using metaclasses and, I think, the __new__ method — I'd need to review all the protocols, but the __new__ method of the metaclass — to receive the class being created, look into the list of closures we created, and finish whatever needs to be done. And we have a similar implementation for a thing that I didn't show here, which is pipelines for recipes. When you have, for instance, a CMake package, it's cmake, build and install, or configure, make and make install, something like that. There are cases — odd cases — where you need to do something in between, like tweaking the package, and I'm pretty sure you know that, because you deal with the same packages we deal with. To do that we basically have a very nice pipeline that hooks in with decorators: you can write whatever function you want, give it whatever name you want within the class, so you have a function definition here, and you decorate it with something that says run_before('install'): the decorator is run_before, you give as a string the name of the phase you want it to run before, and that function will magically be run before that phase, in order of declaration. The answer to the question, I think, was this trick with metaclasses.

The other question I have: when you showed the overview of all the combinations you can make with versions and compilers and dependencies — this one — what we do with easyconfig files is basically take a couple of snapshots in this space and say: we tested this, with this dependency version, this configuration works for me; open a pull request, put it in the repository, other people test it, and if it seems to work it goes in, and that gives you some guarantee that it will work. Now, what happens with Spack is that you're probably building something that nobody has ever built before, based just on what you specify to Spack, which exact Spack version you're using, and, in the future, also on what is already installed when it picks up those installations. So how confident are you that it will actually work? That will be solved, somehow, project-wise, by the CI infrastructure. What is going on right now is that each site somehow has its own CI: we have a CI going on for our production releases, where we test the software via Jenkins, like CSCS does, and we know it's working; and before cutting a release we basically have, I would say, word of mouth — for packages that are important, like MPI packages or LAPACK packages, before cutting a release people around the world try things and report whether they were successful or not. But of course not for every package and not for every configuration. So you're saying the CI that you're hoping to get working this year is going to solve this — but the CI, it's a big cache, right? It's not only a big cache: it can also run whatever smoke tests you want. Sure, okay, fine, but Spack will only pull something from this cloud if you ask for it, so you have to ask the right question to actually get an installation; if you ask for something slightly different, it will say "it's not in the cache, I'll just start building it". So how does that help with getting something that has been, like, pre-approved or tested by several people? Okay, I see what you mean. I think that most of the, let's say, default requests will be validated; if you ask for something that is really exotic, like the variant that nobody uses, then yeah,
you're probably on your own in those extreme cases. But I think that's more or less what happens anyway if you do something like "hey, try this with this build" and nobody else ever did it — again, we don't cover, we don't certify, the entire combinatorial space. Exactly, and we don't plan to certify the entire space. I guess this is sort of why we have easyconfigs: they're like a way to communicate to people "this probably works, because it works for me", you know. So you have easyconfigs, which work — that was my main point — which already work once you have taken the product of all the subspaces. What we do instead, or what we could do, is use this to do the same thing you do with easyconfigs: I can certify this file, and I can tell people around, hey, this thing has been certified, this thing works, these toolchains work. It's kind of related to what I'm about to ask, which is: how do people communicate back failures, so stuff that doesn't work? Because you can build it and it might fail — just check the number of issues with build errors — but, I mean, there's nothing automated: if something fails, people just try something else. But it's useful to know that something will fail, right, so that other people don't... because I assume these exclude statements mean something, right? Oh yeah. This — so, for instance, this is a tentative syntax, but this means the toolchain is based on a compiler and an MPI; for compilers I have these two choices, for MPI I also have these two choices, but not this one with this one — exclude exactly that point. Yeah, and I guess — I'm not too familiar with Spack — but in the package files you have some exclusions too, right? Not this with that. And that will generate an error right now; I think you've shown something like that. But I'm just wondering: do you rely on people to report this, to report "not this with that", like "don't build this with that version of that"? Usually it's based on the documentation of the software itself: if the OpenBLAS maintainers report "look, this doesn't build with Intel 16 in this version range", we just take the information from there; and as the recipes evolve, people can bring up good references. Usually — here, for the sake of having it on a slide, I removed every comment and whatever — you get references for where a conflict, or a particularly convoluted set of dependencies, comes from, so you can either refer to the PR or go back to the original reference for why something is there. But my thought on this is: say you have, like, a million combinations — it's infinite in practice, but say you have a million. What we do is sample ten that work; what you're trying to do is define everything that doesn't work — ideally; you're not doing that in practice — but this list of conflicts could get very, very long, right? Don't try this, don't try this, don't try this... Yeah, but we also have means to delete whole slabs at once: I'm not picking just one single point, I could say "this conflicts with Intel", and then whenever I try to compile this thing, whatever version, whatever variants, with Intel, it will say: hey, this cannot be compiled with Intel. What does that do — it just says it doesn't work? Right now it will say something like "OpenBLAS cannot be compiled with Intel", and you can also put a human-readable message there, and it will be displayed in the error, like a reference. My question is mostly about the discussion you had before, about the implementation:
I was wondering why you went through metaclasses and this kind of thing just to support this syntax, because... I guess, what did you gain, or what would you gain, compared to a more traditional approach, like the one we have in ReFrame, for example? In ReFrame we have very similar concepts, but we actually use self, and then we have descriptors for attribute validation. We could do that as well, but I'm wondering what the benefit would be. So, in my opinion that's partly a matter of taste, but in my opinion you gain, first of all, a declarative syntax: you're essentially creating a domain-specific language for writing packages. It's implemented in Python, it's valid Python, but it's a domain-specific language, and the syntactic sugar is there for the packagers. We did that work so that packagers don't have to repeat things that, for them, don't make sense: if you are a Python programmer, writing self makes sense to you, but for a packager who needs to say self all the time — what he wants is to just state what is relevant to him. So we tried to do that; that was one thing. The other reason to go with metaclasses is to permit a few other operations. Right now you can derive from OpenBLAS, reuse whatever is there, and if in the derived class you declare a variant with, let's say, the same name, you can override it — it will take over — and that is done through this metaclass mechanism. Now, for a boolean variant, which can take just true and false, this doesn't make much sense, but we also support multi-valued variants — variants which could in principle be infinite, like an integer-valued variant, or which are just a set of strings. In those cases, think for instance of the build types for CMake: CMake says there are four build types — Release, Debug, RelWithDebInfo and MinSizeRel, or something like that — plus custom ones. There are a lot of projects out there that just love to change that slightly, and in those cases you still have a CMake package, and you can override the build_type variant for that package to reflect what the developers wanted to do. That mechanism, too, was achieved with metaclasses. Okay, thank you. Any other questions? Thank you very much.