So I'm gonna talk about what's new in Spack. I think most people here are familiar with Spack, but I'll give a little background. I'm gonna start with community stuff, then some background, then current technical developments and some roadmap items. I don't think I need to tell everyone here that Spack is another tool for software distribution for HPC. Like EasyBuild, it automates the building and installation of scientific software. The main difference that people probably know about is that packages are parametrized in Spack, so you can easily tweak and tune the configuration, even on the command line. With EasyBuild, you write an easyconfig for every concrete installation you do; in Spack, there's one templated package file, and you can install lots and lots of versions of the same package from that one parametrized package. It's designed to have very few dependencies, so you can just clone it and install something, and it has a syntax for complex installs that I'll talk about later: essentially, you can customize the version, the compiler, the options, the target, and anything about the dependencies as well. It has this CLI for installing things, but it also generates modules for users like EasyBuild does. One difference there is that Spack does not require modules. We don't rely on modules for the build or anything; they're not a dependency, so you can actually use Spack without modules, and it has its own mechanisms for getting things into your environment, like conda and virtualenv-like environments. I'll talk a little bit about that later. It also has a lot of DevOps features in addition to just installation features. You can manage the packages: it keeps a database of what's installed, the metadata is there, there's CLI container generation, and lots of other features built into Spack, and I'll talk about some of those and how we've been building them on top of Spack environments later. It's widely used.
We recently surpassed 5,200 packages in the mainline, and we've had 730 contributors. That's contributors ever, not sustained contributors, but we have a pretty strong base of regular contributors. It's used all over the place, and I get this map from the same place Kenneth does. I actually stole this idea from him, and I think it shows that we have a lot of usage around the world. This is our best estimate of the number of Spack users: how many people are actively using the documentation site. You can see that back in 2017, when we started these stats, we were just under 1,000 users on the documentation site every month. The top line is 28-day active users, so monthly active users. It's been going up steadily, and we broke 3,400 monthly active users this year. So the community is getting a lot bigger at a pretty steady rate. Kenneth and I share access to the web logs, so I've compared EasyBuild here. It looks like we hit around 3,400 this year, whereas EasyBuild is down here at just under 2,000 monthly active users. I don't know if that indicates the size of the community, or that EasyBuilders don't read the docs; I'm not sure, but that's the comparison. One month in Spack is pretty busy. This is a plot from GitHub, from I guess October of last year, before our last release, and we were merging a lot of pull requests regularly: 504 that month, including both core and packages. And like I said, we have a pretty strong base of users who contribute to the packages, so it's not just me. In fact, this particular month I was working on the new concretizer; in a normal month I would not be one of the top contributors anymore. Contributions to Spack continue to grow, and we get them from a lot of different places. This is contributions to core over here. The core is still mostly Livermore and Max, who used to be at EPFL; I just haven't changed his designation. We contract with him now.
He works with us, but we get a fair number of contributions from the community to the core of Spack as well. And then down here, this is lines of code in all the packages over time. You can see that Livermore itself is a pretty small contributor to the packages; we get most of our input here from the broader community, and that's growing over time, as you'd expect. Spack is used on some of the fastest machines in the world, including Fugaku, which was unveiled this year. We've been working with the Fugaku team for a year or two on getting Spack ready to deploy things on ARM, and they've been super helpful in contributing a bunch of patches to the packages to support A64FX and ARM in general. So it's been nice for the larger community in addition to Fugaku. We also run on Summit and Sierra and a number of other machines in the Top500. Spack is a central part of the US Exascale Computing Project's mission. I don't know if people realize this, but if you look at ECP's mission, it's actually in the mission statement that they want to create a robust, capable exascale software ecosystem, and Spack is a big part of that. I think Spack was fairly timely in that, shortly after ECP started, they realized: hey, we have all this software, we need something to glue it together. They needed a package manager to put in the middle of all that. And so you can see here, ECP actually maintains this dependency database, which is outside Spack; it's not our metadata. It's basically the dependencies between the projects, or products, in ECP that are being maintained. And you can see that Spack is actually the most depended-upon project in ECP, with all 80 or so software products depending on it. It's not the most critically depended-upon project in ECP; the blue stuff is critical dependence, so MPI is the most critically depended-upon thing in ECP. But all the packages are using Spack to get their software out to users and to deploy on the machines. So Spack's over here.
There's a project called E4S within ECP that I'll talk a bit more about. E4S is the Extreme-scale Scientific Software Stack, so you can read about it up there. Essentially, it's a distribution built on top of Spack. The folks at the University of Oregon collaborate with other software teams across ECP on E4S; essentially, groups of related products get together and do regular releases, and those releases roll up into E4S releases. The way E4S is distributed right now is that they have containers you can download that have Spack in them, and you can install products from there. It's a set of Spack environments, and I'll get into that in a second, and it's also a set of binary packages. You can learn more about it at e4s.io, and I'll talk a little more about it later as well. We did our first Spack user survey this year, for 2020. We hadn't done this before, but I liked what the EasyBuild folks were doing with their survey, so I made a somewhat similar survey for Spack. We got 169 responses, which is pretty cool, and people are generally happy with Spack. Although, if you break it down by parts: people really like the community, they really like Spack the tool, and then the packages and documentation got slightly lower scores, because the quality there is not perceived to be as good as the quality of the tool and the community themselves. But overall, these are pretty high scores, and people are happy with the product, so that was a good takeaway. You can read all of the results here; it's a pretty detailed write-up. If you go to spack.io, I think it's still the top article there. There are a lot of stats on how the community breaks down, what features people want, and so on. We did separate out the results between ECP and the rest of the Spack community in our survey, and you can see here that the Spack community is about 36% ECP. And I looked at the GPU and compiler questions that we asked.
Ours were posed a little differently from EasyBuild's, in that we asked what are you going to use within the next year, not what are you using now. But you can see that in the broader community, at least in Spack, nearly all of them are expecting to deploy something on NVIDIA, and an interesting stat is that over half of them are expecting to deploy things with AMD GPUs. Within ECP, it's like 75% or 80%. So that's pretty cool. Intel GPUs: there's only one machine within ECP that has them, but just over half of the ECP customers are planning to deploy on the Intel GPUs on Aurora. And I was surprised to see that the community at large was interested in Intel GPUs. I guess that might just be the fraction of the community that is ECP; well, it's a little more than just that fraction, so that's kind of cool. By and large, the ECP folks want to use more exotic compilers and architectures than the rest of the community, but I think you'd expect that. And there are more results like this in the survey. One thing we've seen this year with Spack is an increase in industry contributions. Like I said, Fujitsu and RIKEN contributed a bunch of packages for ARM. We also got some contributions from folks working with Linaro: there are some people at HiSilicon who deploy ARM servers, and they've done over 400 pull requests to patch things for ARM. AMD is contributing the ROCm packages and compiler support to Spack. They did the support for HIP, for all of HIP's dependencies, and for optimized versions of things like BLIS and other math libraries for the AMD GPUs, so it's cool to have them contributing directly. We're trying to work with them to figure out things like schedules; sometimes they want to have things land on certain dates, and we wanna understand how we can best review their pull requests when they may not want them to be public before a product launch or something like that.
Intel is contributing oneAPI support; we have Robert Cohn from Intel doing a lot of the support for the Intel products. One nice thing about this is that we can talk to them about how their products are laid out. The old Intel package was sort of a superclass of Parallel Studio, the Intel compilers, and so on; it was like 7,000 lines of code to understand all the different ways the Intel products are packaged. With the oneAPI stuff, they've really broken it down into components better, and the oneAPI package is much simpler than the old Intel product package, so I think that's pretty cool. NVIDIA is contributing NVHPC compiler support and some other features. And then we have this collaboration with AWS where we wanna have a build farm in the cloud making optimized binaries for ParallelCluster. We did a joint Spack tutorial with AWS last year that got over 125 participants, so I think that's been pretty successful; people like using Spack in the cloud to bootstrap their software on a new ParallelCluster instance. Before I get into technical details, I wanted to give a little background, for people who are used to EasyBuild, on how Spack works. This may be a review for people who are already familiar with the tool, but for those who aren't, it'll set up the rest of the talk. In Spack, we have this spec syntax that describes customized installations. We use it both in the packages, to describe things like dependencies, but also on the command line, so that users can say how they want something installed if they wanna tweak it slightly from the recipe. So here, you can say `spack install mpileaks`; that's unconstrained, it just says get this thing installed. You can add a version to say what version you want, and that can affect what dependencies and constraints there are on the package. You can say what compiler you want. You can add options.
Packages can say: I have a variant called threads, and it can be on or off. You can have string-valued variants and so on. And you can inject flags into the build, so if you wanna build 60 different versions with different compiler flags, you can do that by just specifying them on the command line; if you wanna benchmark something, it's fairly simple. We developed the archspec library, which I think you heard about earlier this week. It's essentially a library for reasoning about microarchitectures, and Spack uses it natively: all the installs are done for a particular target, so that we know where the binary can be used, and you can specify that on the command line if you wanna build for a different target than what your host supports. We have the microarchitecture names in there, so you can use what I think is a familiar name for the different machines. And then all of this is recursive for dependencies: if you want to specify constraints on some dependency of the thing you're installing, you can do that. The packages themselves are in this DSL that we developed. It's based somewhat on how Homebrew packages look. Every package is a Python class, which is cool because it allows you to extend other Python classes; it's easy to make superclasses for these things with added functionality. We have a number of those for different build systems and other types of features: there's a CUDA package that you can mix in for GPU packages, ROCm packages, and so on. This stuff up here is just metadata. This is a comment; this tells you where the homepage is. The URL is used to figure out what to download; we can extrapolate from it and figure out the URL for different versions. These are the checksums for the different versions of this particular package. You can specify a hard URL for every one of them if they're really different, but usually Spack can extrapolate from the one URL here for each of these versions.
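To recap the spec syntax from above, here are a few illustrative command lines (package names, versions, and targets are just examples):

```console
$ spack install mpileaks                          # unconstrained: just install it
$ spack install mpileaks@3.3                      # pick a version
$ spack install mpileaks@3.3 %gcc@10.2.1          # pick a compiler
$ spack install mpileaks@3.3 +threads             # turn a variant on
$ spack install mpileaks@3.3 cppflags="-O3 -g"    # inject compiler flags
$ spack install mpileaks@3.3 target=skylake       # build for a specific microarchitecture
$ spack install mpileaks@3.3 ^mpich@3.2           # constrain a dependency
```

Each constraint is optional, and the same syntax appears inside package files to express dependencies.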
And then there are options defined here, and you can see that the dependencies are actually conditional, or they can be. This package depends on MPI when the MPI option is enabled; it doesn't have to do that all the time. If you build Kripke without MPI, it just won't depend on an MPI installation, and the build will take care of that. You can have version constraints: this says I need CMake at 3.0 or higher, and that can be conditional too, so you can say, when I'm at a particular version, I need a particular version of a dependency. And actually, you can impose really any type of constraint on a dependency. If the dependency defines variants, like this package does, you can say I depend on this package `+mpi` when I am `+mpi`, and so on. So the stuff up here is really defining the space of possible builds: all the options, all the different variations of the package that you can make. And it's combinatorial, so you can choose true, true, true, false, false, true, false, false, and so on for these variants. The actual instance methods on the package are just recipes. We concretize the package with the information up here: we come up with one configuration, with one value for every variant, and we pass that down to the install method as a spec. Then you can query: how am I supposed to build this? Am I supposed to build this with MPI on or off, or what? You can see that being done here: if `+openmp` is in `self.spec`, then you pass true to the CMake flag for that, and likewise for MPI. We have a dependency resolver; we call it a concretizer because it does more than what your typical package manager's dependency resolver does. It handles all of these constraints, takes what is essentially a loosely constrained graph, turns it into a graph with everything filled in, and then saves the provenance on disk when that's done.
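Pulling those pieces together, here's a condensed sketch of what such a package file looks like, shaped like the Kripke example from the slides. This is illustrative, not the real recipe: versions, checksums, URLs, and CMake flag names are placeholders, and DSL details vary a bit by Spack version.

```python
# package.py -- condensed, illustrative sketch of a Spack package (not the real Kripke recipe)
from spack import *

class Kripke(CMakePackage):
    """Kripke is a 3D deterministic particle transport proxy app."""

    homepage = "https://computing.llnl.gov/projects/co-design/kripke"
    url      = "https://computing.llnl.gov/projects/co-design/download/kripke-openmp-1.1.tar.gz"

    version("1.2.3", sha256="<checksum>")   # one checksum per downloadable version
    version("1.1",   sha256="<checksum>")

    variant("mpi",    default=True, description="Build with MPI")
    variant("openmp", default=True, description="Build with OpenMP")

    depends_on("mpi", when="+mpi")          # conditional dependency
    depends_on("cmake@3.0:", type="build")  # version-constrained build dependency

    def cmake_args(self):
        # by the time this runs, the spec is concrete: one value per variant
        args = []
        args.append("-DENABLE_OPENMP=%s" % ("ON" if "+openmp" in self.spec else "OFF"))
        args.append("-DENABLE_MPI=%s" % ("ON" if "+mpi" in self.spec else "OFF"))
        return args
```

Everything above `cmake_args` declares the combinatorial build space; `cmake_args` is the recipe that consumes one concrete point in that space.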
So really, reproducibility in Spack is about having the recipes that you built with, which are actually stored with the package too, plus this spec.yaml configuration, and then rebuilding with both of those. The spec.yaml is essentially the parameters to the package, and you can reinstantiate a particular configuration of the package from that concrete spec. We take those concrete specs, we make hashes out of them for each configuration, and that's how we determine where things get installed. That's how we handle combinatorial software installs in Spack. Everything is RPATH'd, so the thing in this Python directory will know where to find its dependencies, because the libraries and executables there have RPATHs out to their dependencies. That's nice because you don't have to remember what module something was built with to run it; you can just run it straight out of the directory and it'll work, and users don't really need to set LD_LIBRARY_PATH. It works the way that you built it, which is handy when maybe you weren't even the one who built the thing you're using. Like I said, we have this notion of environments, and you could really think of environments as analogous to specs. If I say `spack install mpileaks`, that just says get me mpileaks. An environment is sort of a list of those, where you might want to install HDF5, libelf, and OpenMPI, for example. You can have some configuration bundled along with that in a simple YAML file. The whole environment is then concretized and saved out to a spack.lock file, which is basically like that spec.yaml, but for the whole environment, so you can reproduce the concrete environment from the lock file. And if you want to re-resolve the same set of specs and configuration on a new platform, you can pretty easily do that by just shipping around the YAML file, and you can re-concretize things in new places.
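A minimal environment along those lines might look like this (the spec list is just an example):

```yaml
# spack.yaml -- the abstract requirements for an environment;
# concretizing it writes a spack.lock alongside with everything pinned
spack:
  specs:
    - hdf5
    - libelf
    - openmpi
```

The spack.yaml is what you ship around and re-resolve on new machines; the spack.lock is what makes a particular resolution reproducible.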
And that's kind of nice, because it basically says: these are the requirements up here, this is what you need to build something. If you want to resolve it on a different machine, maybe some of the dependencies are different, or maybe some of the options are different; that's fine. You can still expose these sort of abstract requirements to the users and the people who consume the environment, but underneath, we may resolve it to something different, and that's still reproducible because we have the lock file. Okay. On top of environments, we're building a lot of other functionality. Essentially, with this list of specs in an environment, we have an abstraction on which we can build features: if we want to do something with that list of specs, you can add additional configuration to the environment for other types of commands. And it's nice because you can version that config in a repo. Really, any environment can be turned into a container: you can run `spack containerize` within an environment, and it'll build you a Docker recipe that clones Spack, installs all the packages from the environment, and so on. If you want to tweak how that's done, you can add a container section to adjust, say, what distro the container is built for. But you don't have to: you can run `spack containerize` for any environment, and we'll spit out a container recipe. This has been in there for a while. We also have this notion of stacks, which are combinatorial environments. You don't have to specify a flat list of specs in your environment; you can specify a matrix. So you can say: I want all of these packages built with all of these MPIs and all of these compilers, or rather, all the combinations of them. And that looks like this, where you say: this is the matrix, it's the cross product of these three lists. That makes it fairly easy to express facility deployments, and to version those in a repo with a config file like this.
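Reconstructed, that kind of stack config looks something like this (the package, MPI, and compiler choices are just examples; check the Spack environments docs for the exact schema):

```yaml
spack:
  definitions:
    - packages: [hdf5, petsc, zlib]
    - mpis: [^openmpi, ^mvapich2]
    - compilers: ['%gcc@8.3.0', '%clang@10.0.0']
  specs:
    - matrix:            # cross product: every package x every MPI x every compiler
        - [$packages]
        - [$mpis]
        - [$compilers]
```

That one matrix expands to 3 x 2 x 2 = 12 concrete builds, which is why it's handy for facility-wide deployments.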
And a stack is really just an environment; it works just like a regular environment, so if you want to do the other things that we do with environments, you can. It's really just the matrix keyword that differentiates it. We have GitLab CI integration. This is what we're using for the cloud CI that I mentioned, the one that's actually building binary packages. This up here is just a regular old environment, or a stack, and this down here, this gitlab-ci section, all it's doing is mapping the concrete specs that come out of the environment resolution to particular runners. So here, we said to build this package with this compiler for these two operating systems, and down here we said: if the spec matches this criteria, if it has this OS, then run it on a runner with this Kubernetes attribute and this container image. You don't have to use containers; you can go bare metal and tag your runners according to the attributes of your bare-metal hardware. For the University of Oregon, when we work with them, we just tag their machines based on the machine properties, as opposed to specifying a particular container image. So you can differentiate across all the different axes in a Spack spec. What this does is, when you run a `spack ci` command, it takes this description and generates a graph of all the different builds that have to be done to build all the things in this environment, and it stages it so that GitLab CI understands it. It actually doesn't require staged parallelism anymore; GitLab CI has support for DAGs now, and it just builds them in parallel. Each stage uses the binary packages from the previous stage, so we're actually getting binary package testing in the pipeline as well, which is nice. E4S is actually just a Spack environment. The way they distribute the distribution for ECP is they have an environment like this; it's available online.
You can look at it in the spack-configs repo. It's just a list of packages with a little bit of configuration for being built in different places, and the facilities actually take this and build versions of it for their sites. NERSC, thanks to Shahzeb and other folks, just got E4S online for the NERSC facility using this, collaborating with the E4S team. You can either install this on the command line (take the environment, say `spack install`, and run that in parallel on the cluster), or you can run it through CI and have it built on CI runners. So there are lots of different ways to do it. The E4S team takes these environments, runs them through CI, and generates a binary cache. The binary cache they have right now, you can see it at e4s.io; the URL is right there, and there are also links from the main e4s.io site. You can go and search it and see all the binary packages they have for different OSs, different architectures, and so on. We have 27,000 binaries in there right now. And this is pretty cool, because it has enabled some ECP teams to really speed up their CI. Some projects that really couldn't build all their dependencies in cloud CI, because it would just take too long, have E4S set up with this build cache, and it's 10 to 100x faster: they can basically just download the binary packages for all their dependencies, like you would with APT or some other package manager, and build on top of that. You can also use it in a facility deployment scenario by building your packages in advance, and then you can redeploy really fast on a new machine without rebuilding. So it's a nice feature, and this is a public binary cache, but we wanna make binaries kind of the default in Spack eventually. We're not as confident as we could be in the binary packages yet, because we don't have relocation testing in the pipeline, but we're working on that right now.
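Consuming a build cache like E4S's from the client side looks roughly like this (the mirror URL and exact commands are as I recall them from the e4s.io instructions, and may vary by Spack version):

```console
$ spack mirror add E4S https://cache.e4s.io     # register the binary mirror
$ spack buildcache keys --install --trust       # import and trust the signing keys
$ spack install hdf5                            # fetches a matching signed binary if one exists
```

If no binary matches your concrete spec, Spack just falls back to building from source.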
And I think once that's done, we will start having binary packages just out there as the default thing to use for Spack installs. To support that, we're expanding the CI support to include every pull request. Historically, we have not built all the different packages in the Spack distribution; now we're building at least the E4S subset on every PR, or at least the packages that changed. Spack contributions come in through GitHub; we have gitlab.spack.io hooked up to GitHub, there's a mirror, and we run CI out of the mirror. What that means is that we have this spack.yaml versioned in our repo, and when we see changes to the main Spack repository, we run `spack ci`, generate a pipeline, and run it in AWS. That includes, I think, about 300 packages that could potentially be built with every change. There's also a pipeline at Livermore that we're standing up; we're trying to stand up an open cluster so that we can run potentially untrusted code from pull requests to test this kind of stuff. And the University of Oregon also has machines that they're gonna contribute, and they have a pretty impressive menagerie of systems there. They have nodes from things you wouldn't expect: they have some Mac nodes, so we're hoping to have Mac binaries; they have NEC vector machines; they have POWER machines; they have AMD GPUs; and so on. I think we can use that as a build farm for the more exotic stuff, beyond the vanilla things we're building in AWS. So yeah, we build changed packages on every pull request. The `spack ci` command handles tweaking the pipeline so that it only generates what changed. And we're planning to do different compilers: we've been working with Intel, and we actually have an Intel compiler license now that we can use in CI. These binaries go back into the mainline, and then more Spack contributions come in, and the cycle repeats.
We've also developed some new security support for contributions from forks. What that means is that a PR could come from anywhere, so we build the binary packages for it in a sandboxed build cache; every PR gets its own sandbox. The PR uses the mainline cache too, but new artifacts from PRs are pushed to the sandbox. So if you build untrusted code from a fork, the binary package is never exposed to the public: we don't put that one out there, we stick it in the per-PR build cache so that the PR can iterate. But once some human looks at the PR, approves it, and it goes back into the mainline, we rebuild everything in our secure, signed environment. So the stuff that's getting into the final binary cache has always been approved by some maintainer. And once the PR is done, you can see the status on the GitHub project for Spack. We released Spack 0.16 in November; it's probably one of the bigger releases we've done. The biggest feature in this release is that we finally got the new concretizer into Spack, and I'll talk a little bit about what that means in a second. We also have some testing support now, so you can actually bundle tests with your Spack package. We're not trying to subsume the role of things like ReFrame and buildtest; we just want smoke tests in Spack packages, but we needed a way to run something simple and arbitrary after a package was installed. And we wanted a way to run it after the install was done: we don't just run it at install time, we can run it whenever. We have some developer support. Parallel environment builds have made builds way faster; we can build a whole environment in parallel, like I showed before. And there are some other things that I'll talk about in a minute. You can get the release on GitHub.
The concretizer was probably one of the biggest changes. As I explained, the concretizer takes these abstract user specs, plus configuration that you may have set up in your environment or for Spack itself, and it looks at the package files, of which there are a lot; a package can have a lot of possible dependencies. It figures out what the dependencies should be and what the constraints on the different packages should be; it essentially does a solve across the whole graph and comes up with a solution that meets all the requirements. That's gotten to be fairly complicated, and we didn't have the best support for backtracking in the original concretizer. You have to backtrack, because these problems aren't solvable through direct derivation: not everything just follows, you have to search for a solution. So we really needed a backtracking search, and an efficient one, and this boils down to SAT with optimization; by SAT, I mean the Boolean satisfiability problem. So we looked around for a solution, and what we came up with was using answer set programming for our solver. Answer set programming is called ASP, which I always mix up with Active Server Pages, so keep that in mind. It looks like Prolog: you write in first-order logic, but it boils the problem down to satisfiability on the backend, with some optimization in the solve. So instead of doing what we were doing before, which was iteratively constructing a graph in Python (evaluating conditions, adding nodes, and so on), now we generate a whole bunch of facts from our packages. Basically, we look at all the possible dependencies of the package and generate facts, and we send those to a solver. Something like 20 or 30,000 facts per package is pretty typical; I think this slide shows around 30,000 for HDF5.
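To make "generating facts" concrete, here's a toy sketch in plain Python (this is not Spack's actual code; the fact names and the metadata are invented for illustration). The idea is just to flatten package metadata into ground logic facts that an ASP solver like clingo can consume:

```python
# Toy illustration of ASP fact generation for dependency solving (not Spack's real code).
def package_facts(name, versions, variants, deps):
    """Flatten package metadata into ASP-style ground facts."""
    facts = [f'node("{name}").']
    for v in versions:
        facts.append(f'version_possible("{name}", "{v}").')
    for variant, values in variants.items():
        for val in values:
            facts.append(f'variant_value("{name}", "{variant}", "{val}").')
    for dep, condition in deps:
        facts.append(f'depends_on("{name}", "{dep}", "{condition}").')
    return facts

facts = package_facts(
    "kripke",
    versions=["1.2.3", "1.1"],
    variants={"mpi": ["true", "false"], "openmp": ["true", "false"]},
    deps=[("mpi", "+mpi"), ("cmake", "build")],
)
print(len(facts))  # a real package expands to tens of thousands of such facts
```

A small, fixed logic program then encodes the actual semantics (one version per node, variants need values, dependency conditions, and so on), and the solver searches for a stable model over these facts.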
The solver basically boils that down to SAT; it instantiates all these first-order constraints, and by the time you're done, you have something like 300,000 facts in the final SAT solve. We run that with a small logic program that has the real semantics of how Spack dependency resolution works, and it spits out solutions. What the solver does is solve for a stable model. It's like a fixed point: it finds a solution where all the optimization criteria are optimal and where all the rules are idempotent, basically; you can't apply a rule and have the solution change. That's what a stable model is. Once we get that back, we build it into a spec, and that's what actually produces the spec.yaml. This has really simplified the implementation of the concretizer in terms of what we have to do on the Spack side: now we just iterate over everything and emit facts. And the way we can formulate different constraints, things like optimizing for multiple criteria (say, which target should I build for, and which compilers support that target), is way simpler now, and I think it will enable us to do more. What I'm working on right now is optimizing for reusing as many installed packages as possible, and we can finally express that with this new paradigm, which is pretty cool. We're currently working on vendoring it: right now it's optional, and you have to `spack install clingo`, the solver, yourself if you wanna use this. But in 0.17, it'll be automatically installed from public binaries, and we'll have a way to easily bootstrap it on an air-gapped network, so it'll be the official concretizer in the next release. Let's see. Okay. One of the other features we added in 0.16 was testing. Like I said, these packages are just Python classes, and this works the same way in a regular Spack package.
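For instance, a package might bundle a smoke test along these lines. This is a sketch of the 0.16-era testing hooks; the package, file names, and expected output are illustrative, and the exact method names may differ between Spack versions:

```python
# Illustrative smoke test inside a Spack package (API as of ~0.16; names are examples)
class Mylib(CMakePackage):
    # ... versions, variants, dependencies elided ...

    @run_after("install")
    def cache_test_sources(self):
        # keep a small example source tree around after install, for post-install testing
        self.cache_extra_test_sources(["examples"])

    def test(self):
        # compile a tiny program against the installed library, then run it
        src = join_path(self.install_test_root, "examples", "hello.c")
        self.run_test("cc", options=[src, "-o", "hello"])
        self.run_test("hello", expected=["Hello"])
```

The point is just that a package can carry a cheap, arbitrary check that can be run at any time against an existing installation, not only at install time.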
You can specify stuff that you wanna keep around from the build directory, if you need something after the install in case you wanna run the tests later. And you can define a test method that, say, goes and compiles something simple; this is a really simple example that doesn't use a build system or anything. You can run programs with this. So this lets you specify simple smoke tests. We're working on adding these for all the E4S packages, so that we can easily validate an existing installation: we'd compile and run a test for every package and just make sure it still works. What we're hoping is that we can associate things like ReFrame tests and Pavilion tests with Spack packages, so that we can farm out the work for more complex tests to a system like that. We don't really wanna be in the business of doing combinatorial runs of things; we think that's better left to ReFrame or Pavilion, so we're thinking about how to express interfaces between this and some of the more popular testing frameworks. But at the moment, we wanna leverage the fact that we have a large contributor base, and we'd like to get at least some simple validation tests into the Spack packages. So we have this. We also added some features for developers in 0.16. Developers say that Spack is more of a deployment tool, so we've been trying to support them in getting up and running in new environments rapidly. One of the things we do in Spack frequently is reuse existing packages on the machine, but in the past that has taken a lot of configuration: if you wanted to use an external package of some sort, you had to write some YAML for it. What we have now is that you can add a method to your package that tells Spack how to auto-detect it.
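Roughly, such a detection method looks like this; this sketch is modeled on the CMake case discussed in the talk, and exact attribute and helper names may differ by Spack version:

```python
# Sketch of package-level external detection (modeled on the CMake case; API may vary)
import re

class Cmake(Package):
    executables = [r"^cmake$"]          # regexes matched against executables on PATH

    @classmethod
    def determine_version(cls, exe):
        # run `cmake --version` and parse the version number out of the output
        output = Executable(exe)("--version", output=str, error=str)
        match = re.search(r"cmake.*version\s+(\S+)", output)
        return match.group(1) if match else None
```

Whatever version (and any other metadata) this returns is what Spack writes into the generated YAML entry for the external package.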
And so essentially, if you've used Spack before, you know that you can say spack compiler find and it'll go out and find all the compilers in your environment and register them, so that you don't have to bootstrap your compilers. This is the same thing for arbitrary packages. So for CMake, you can say: hey, there's a cmake executable, find it in the path, and here's a method that tells you how to query it for its version, plus any other metadata that you wanna stick on the spec, and it'll generate the YAML for the external specification. For developers, this is great, because they can start a new Spack environment, activate it, run spack external find, and then they basically get all the dependencies that they need from the system registered in their environment. And that enables them to avoid building things like Perl that are probably already on the system. We want every dependency specified in Spack, but oftentimes it's available on the system, and we don't want using it to be the default, because it's not exactly reproducible; it's kind of up to you to guarantee the compatibility at that point. But we want developers to be able to set that stuff up rapidly, and this lets them do that. They can get on a new system, get MPI and their compiler set up, plus some system dependencies, and get building on the packages that they care about really fast. So this has been great. And the way that we've structured it enables contributions from the community, so we're hoping that people will expand this capability for more packages in Spack. (Keeps going backwards when I click that.) Okay, the other thing we've been doing is working with some of our code teams on how to do multi-package development environments with Spack. We have a lot of teams that work on codes with 60 to 90 packages that they need to work with, and they may need to work on multiple of those packages at the same time.
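Back to spack external find for a moment: in spirit, the detection hook is "find an executable, run it, parse a version out of what it prints." A self-contained analogue (probing the running Python interpreter rather than cmake so it works anywhere; in Spack proper this logic lives in a method on the package class, and the output becomes an externals entry in YAML):

```python
import re
import subprocess
import sys

def detect(exe, version_arg="--version", pattern=r"(\d+(?:\.\d+)+)"):
    """Run `exe --version` and pull the first dotted version out of it."""
    proc = subprocess.run([exe, version_arg], capture_output=True, text=True)
    match = re.search(pattern, proc.stdout + proc.stderr)
    return match.group(1) if match else None

# For cmake you'd search PATH for names matching '^cmake$'; here we just
# probe the interpreter we're running under.
version = detect(sys.executable)
print(f"external: python@{version} prefix={sys.prefix}")
```

The package supplies the search names and the version-query logic; Spack supplies the PATH walking and the YAML generation, which is what makes this easy for the community to extend package by package.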
And so we're working on a method that lets you basically automatically check out packages from your environment and have the environment built with those developed versions of your packages. So you can say spack env activate if you have a spack.yaml in a directory, and you can add your application to that environment. That says: put this in the spec list, this is the thing I wanna build. And then you can specify dependencies of your application, or even your application itself, that you wanna check out at particular versions to work on. So here it's saying I wanna work on Axom and MFEM; those are two Livermore packages. And what you'll get, this being the spack.yaml for your environment, is two development directories here that you can just cd into and start working on. Then when you spack install this, or run the build via spack install, it will basically take the source from those directories for the development build. So this makes it easy to work on complex packages. And you can see that that's specified with another section in the environment, develop, which gives the specs of the packages that you wanna work on. We're working on expanding this. What we really wanna have is Git version integration, so that we understand essentially any version or commit that's in a Git repo for these things. Right now, working out how to compare that to the versions in Spack packages is a bit tricky, and knowing when to clone the repo, so that that doesn't happen every time you wanna compare two versions, is something we're working on. The hope, and this is a milestone for this fiscal year, is that we'll have one of our code teams using this in their development environment by September. So keep an eye on this. And we've had some other folks in the external community who've started using this, and they're having some good results. Under ECP, on the roadmap, what we are currently working on is just support for all the different exascale systems.
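For reference, the environment file for that workflow looks roughly like this. This is a hand-written sketch from memory of the schema, so the exact keys and the application name are illustrative, not copied from the slide:

```yaml
spack:
  specs:
  - myapp@develop          # hypothetical application at the root of the build
  develop:
    axom:
      spec: axom@develop   # source checked out into ./axom for editing
    mfem:
      spec: mfem@develop   # source checked out into ./mfem
```

Everything listed under develop gets built from the local checkout instead of a downloaded tarball, which is what makes the edit-rebuild loop work across multiple packages at once.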
So I think folks know that Aurora is gonna be an Intel CPU and Intel GPU system, and Frontier and El Capitan are AMD CPU and AMD GPU systems. And then the Perlmutter system at NERSC is an AMD processor and NVIDIA GPU system. So we pretty much have most of the architectures you would see; the only one maybe not represented here is Arm. And it's been nice to have vendor support for these things, to have the vendors contributing. There is an effort under ECP to get tests working for all of E4S; Tammy Dahlgren is working on that. And better support for GPUs and GPU compilers is a major thing that we're working on, and it's fairly complicated. We have a collaboration with HPE on tighter integration with the Cray environment. There are two things that we asked for: one was make it easy for us to detect what's in the Cray environment, and two was make the Cray environment more vanilla. So make it look like normal packages, instead of the wrapper stuff that they do. And so in new versions of the PE there's sort of an install tree like you'd expect on a normal system, where each package is installed into its own directory. And we've had them generate this sort of spack.json file that includes the packages that are in the programming environment on the Cray system, as well as their dependency relationships. So it ends up looking a lot like a Spack database, and we get a lot of information out of it about how the packages are configured on the Cray machine. The cool thing about this is, like with spack external find, we'd be able to get on the machine and detect things automatically. We'll get more information out of this, because they have things like the architecture it was built for, the GPU support, the different options and so on. And so we're hoping that essentially the Spack experience on Cray machines is pretty turnkey out of the box.
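I haven't seen the final file format, but the kind of spack.json manifest being described would contain entries shaped something like this. To be clear, this is purely illustrative; the field names and paths here are invented, not taken from the actual HPE deliverable:

```json
{
  "specs": [
    {
      "name": "cray-mpich",
      "version": "8.1.4",
      "prefix": "/opt/cray/pe/mpich/8.1.4",
      "dependencies": ["cray-libsci"],
      "rpm": "cray-mpich-8.1.4.rpm"
    }
  ]
}
```

The point is that each entry carries enough metadata (version, install prefix, dependency edges, originating RPM) for Spack to register the PE contents the way it registers its own database entries.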
You get on a Cray machine, you detect this thing, and then you can go and build against the programming environment without having to do any configuration, because the Cray machine just tells you. Let's see. It also tells us what RPM each thing came from. So, I mean, these are not public RPMs, but the cool thing about that is we could potentially make reproducible container images out of things that are in the Cray environment. We could make a recipe that says which RPMs to install. And yeah, this is an ongoing collaboration. There's still work to do to get their environment to work in a vanilla way, but there has been a lot of progress, with both this packages-installed-to-a-single-prefix thing and the unbutchering of weird patches that are on certain packages. And then they've added standalone versions of all their compilers that we can use without going through the wrappers. So there's craycc now, in addition to just cc that changes with the modules. I think that'll help us have much more deterministic builds on Cray systems. We're making some changes to the permissions and the directory structure in 0.17. In particular, we're taking some features from Sandia for sharing a Spack instance. We already have support for this upstream capability, where you can point Spack at another Spack instance and say: use the dependencies from there. Essentially what this does is add an upstream to Spack out of the box. So the global installs that go into the Spack directory are for everyone on the system, and then by default users build in their home directory and can use the global installations as dependencies. So we're trying to make that better out of the box, with permissions and by just making the configuration easier. We're gonna get rid of the configuration in the home directory and move it into the Spack installation. The rationale here is that for most users, the configuration in the home directory is sort of an unwanted global.
It affects all of your Spack installations, and what you'd rather have is a per-instance configuration. So for each clone of Spack that you have, you can put the configuration in there. If you're using a shared facility install, you really want the facility to be able to set the configuration, so that installs by users are consistent. And then users can put configuration into environments if they really want to customize something. But what we don't want is a global in the home directory. So we're probably going to get rid of that and move to something where essentially a fresh clone of Spack relies only on itself. And that's good for reproducibility and for CI environments where we want to test Spack itself. So these are some fairly large changes coming down the pike, but I think they'll be good overall and will help a lot of the workflows that people want to use Spack for. The other thing we're adding in 0.17, hopefully, this may slip to 0.18, is compilers as dependencies. Right now compilers are sort of attributes on Spack nodes. We really want to get deeper into this and have virtual packages, so that you can say: I depend on C, I depend on Fortran, I depend on Rust. And have compilers recognized that way, along with their runtime libraries. The motivation for this is partly to simplify things like conflict specification in packages. Right now packages say: I only work with this version of this compiler, and this version of that compiler, and so on. What we want them to say is: I need C++17, or I need this feature from C++17. So the specification should get simpler. And we should be able to manage the dependencies on particular runtime libraries, particularly libstdc++, better, so that we can come up with ABI-compatible configurations. So we're working on that. We have a prototype of the dependency resolution for this that Max developed.
And we are working on integrating it with the new concretizer and getting all the semantics for that straight. The final thing I wanted to mention, and you can find a whole lot more about this in our talk on dependency management in the dev room on Sunday, is that we have a new research project at Livermore called BUILD. That stands for Binary Understanding and Integration Logic for Dependencies. Essentially, we don't really want humans to have to maintain the constraints in packages. And moreover, we'd like to be able to reuse system libraries or binary packages that we find without having to have a package specification for them. We wanna understand compatibility with those things without having someone maintain the specification. Version ranges and conflicts in packages are not precise; we rely on humans to get those things right. But really what we care about is the ABI between different packages. And so what we're aiming to do with this project is understand software compatibility better at the ABI level. We wanna come up with models for what's in an ABI: what functions, what data types, what parameters of different function calls are in there. How do we extract those from debug information for libraries? We're looking at libabigail and Dyninst to do that. And how do we integrate that into solvers? Ultimately, the goal would be to take these lengthy human-generated constraints that we have in Spack packages, get rid of them, and use models of libraries that take into account the entry and exit calls of the libraries and their data types. And if we had this, you could imagine that every package in the graph is not just a version constraint; it's like a puzzle piece, and you have to match up the different functions going in and out of the different libraries. And we could do something like a type check on that to make sure that they're compatible.
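The puzzle-piece idea can be sketched as a type check over imported and exported symbols. This toy models each library as a table of function signatures (standing in for what you'd extract from DWARF debug info with tools like libabigail or Dyninst) and checks that every symbol a consumer imports is exported somewhere with a matching signature. A real ABI model would also have to cover data layout, calling conventions, versioned symbols, and much more:

```python
# Each library: {symbol: (return type, parameter types)}.
exports = {
    "libm": {"sqrt": ("double", ("double",))},
    "libz": {"compress": ("int", ("bytes", "int"))},
}

def abi_mismatches(imports, libraries):
    """Return imported symbols whose signature no library exports."""
    table = {}
    for symbols in libraries.values():
        table.update(symbols)
    return [sym for sym, sig in imports.items() if table.get(sym) != sig]

app_imports = {"sqrt": ("double", ("double",)),
               "compress": ("int", ("bytes", "int"))}
print(abi_mismatches(app_imports, exports))  # [] means the pieces fit
```

An empty result is the "guaranteed to link and resolve" property; a solver could search over candidate binary packages until every import is satisfied this way.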
The other kind of cool thing is, if we saw some system libraries on a machine and we extracted their ABI, we could solve around that and find a set of binary packages that are compatible with what's on the system. So that's really the goal: to make binary packages more useful and more widely applicable, so that we can give you a graph that's at least guaranteed to link and resolve dependencies. We're not getting into the behavior of the libraries here, but I think there's a lot of low-hanging fruit, or at least helpful stuff, that we can do with ABI. So this BUILD project, it's three years; it's a fairly large research project. They call it a strategic initiative at Livermore, for these large LDRDs. But come to our talk on Sunday in the FOSDEM dependency management dev room if you wanna hear more about this. And that's it. So I'll take questions, thanks. Right, thank you very much, Todd. I'm sure there will be questions. Let's start with Victor. So Todd, just two questions. One question is about your support for the ECP packages. You listed two very interesting packages there: one is called PAPI and the other one is Darshan, right? And in general, these packages are built to be injected as dependencies of others, right? If I want to get, say, GROMACS compiled with Darshan, I need to do gromacs +darshan, right? So that Darshan gets injected as a dependency. And currently, as far as I know, there is no generic mechanism in Spack where I can just say: now I want to compile GROMACS with Darshan. I have to go to the package, introduce the darshan option, and then, voila, I get the possibility to create GROMACS with Darshan or with PAPI, right? Do you have a vision for a mechanism that's generic enough that you can do this dynamic injection of dependencies, or some sort of that?
So I'm asking about this because at CSCS we also depend on getting performance tools in, right? So how are you gonna do that? So, we have actually talked about that a little bit. We don't have a near-term plan for it, but one of the researchers in CASC, which is a research division at Livermore, was gonna investigate that. I think the mechanics are pretty clear for what we would need to do to enable, I'd call it a tool dependency, where, like you said, you could inject the library on the link line, or you could inject some compile flags into a package. That's actually not that hard for us to do from the build system perspective, because every Spack package builds with a compiler wrapper. And so we could inject the flags into the compiler wrapper, we could inject additional libraries into the compiler wrapper, and cause really any package to be built with the tool. And to some extent you could do that on the command line right now; you just have to do the flags yourself. What I think would be cool is if a package itself could say: I am a potential tool dependency, and here is how you would set flags for me, and here is how you would add libraries to things that depend on me as a tool dependency. And then that would be dynamically injected into a build. The other use case, which I think our code teams are more interested in right now, is memory sanitizers. For a memory sanitizer, you have to build everything in your package tree with some special flags for the compiler, and so we could potentially do that with this mechanism. So yeah, I think it's possible. I know how I would implement it; it's just a matter of getting it implemented. Do you have any roadmap for that? Sorry, what? Do you have a roadmap for that? Like, it's gonna be... I have an implementation plan; I don't have a roadmap item for it yet.
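For illustration, the wrapper-injection mechanics being described amount to rewriting the compile or link command line. A minimal sketch (the flag and library names are made up for the example; Spack's real wrapper does much more, like argument reordering and RPATH handling):

```python
def inject_tool(argv, extra_cflags=(), extra_libs=()):
    """Return a command with a tool's flags and libraries spliced in."""
    compiler, *rest = argv
    # Flags go right after the compiler; libraries go at the end, where
    # they'll be seen on the link line.
    return [compiler, *extra_cflags, *rest, *extra_libs]

cmd = inject_tool(
    ["cc", "foo.o", "-o", "foo"],
    extra_cflags=["-finstrument-functions"],  # e.g. for a profiling hook
    extra_libs=["-ldarshan"],                 # hypothetical tool library
)
print(" ".join(cmd))
```

Because every Spack build already goes through the wrapper, a "tool dependency" could contribute these extras declaratively and have them applied to any package in the tree, which is exactly what the sanitizer use case needs.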
But I mean, given that we're working with a code team this year on improving their build process, I'd say that's probably a stretch goal for the milestone I mentioned for September. We could, if we decided to do that for the code team as we're porting over their build system. Okay, thanks. So, if you allow me a second question, I don't know if we have time, Kenneth? Sure. So it's about Cray support, right? So Spack has been running on Cray, but we have a lot of issues, at least at CSCS, because only recently you added support for cray-mpich, which was not there, and you've now also added support for cray-fftw; all of those are used by CSCS, right? Before, everything was being mapped to mpich. But we still have issues with cray-trilinos, cray-hdf5, everything else that Cray supports. So what's the real roadmap for real support on Cray? Because currently we have to patch a lot of packages to be able to compile anything with cray-trilinos. Well, I mean, that's the integration that I mentioned, that we're working with them on for CORAL-2. So I mean, I think that would all be exposed in that spack.json that I mentioned, at least what's installed. I don't know that we will introduce special versions of every Cray package, because, I mean, Trilinos is one of their third-party packages. I don't think they do anything special for their Trilinos beyond how they compile it. So I think what you'll see is that we will be able to generate a package, or get the dependency spec from somewhere else, for all the stuff that's on the Cray machine, without the configuration that you have to do right now. So, I mean, I think it'll be more automatic soon, like within the next few PE releases. But, you know, we're still iterating with them. And to some extent that depends on the speed of Cray, which is variable and special.
So, yeah, we're pestering them for updates right now. And, you know, I think they're being pretty darn responsive in working with us. We have a working group that meets; I don't know if you've seen it, there's the Compass Packaging Working Group that you guys are on with Sir Lane. And then, on intervening weeks, there's a special packaging working group for CORAL-2 where we work with HPE, and we can't invite other folks to that because it's under NDA. On that same path, right: currently you cannot do spack compiler find on a Shasta system, because it's a completely different mechanism. You don't have PrgEnv-gnu as you guys expect, right? So you have to provide the compilers.yaml file by hand, by yourself. So is that gonna go away with this JSON file, or when you change compilers from first-class things to packages? Yeah, so you can't find PrgEnv-gnu from a module right now, but there is this module restore thing that they have in the newer version, so we're adding some work around that. It's not done. But right now, everything is in flux. Okay, okay. And it's somewhat frustrating, because the behavior differs between the old Shasta EX PE, the new Shasta EX PE, and the CS PE, because they have all these different products, right? The programming environment on their so-called cluster systems is different from the programming environment on Shasta EX. And so we're having to work around that. What we're hoping is that that'll converge soon. I just have to be more patient, basically, right? I think so. I mean, I think there's a pretty hard deadline in the fall, because we gotta get this working for Frontier, right? Because we're working with Oak Ridge. And then it obviously has to work for us for El Capitan. So there are real machines that need this, that we have to make it work for. And the PE itself is in flux right now.
So that's the hard part. Okay, thanks. Yeah, I'll switch to a question from the Slack channel first. So, Maxime from Compute Canada is asking: why do you, or why do people, need or want to have multiple clones of Spack at the same time? So, they might want to develop something. In CI it's usually the case that you clone a fresh version of Spack every time. Some of our users have fresh clones of Spack because two different code teams may each deploy a common Spack for their team. Like, at Los Alamos, there might be a Spack installation for one code team and another Spack installation for another code team, where they have all their stuff configured, and that's how they chose to deploy it. So, I mean, we want that scenario to work. I don't know that a typical user would have that. But if you have the usual HPC center hierarchy of the facility maintaining one Spack instance, and then some code teams chaining against that, and then some users using all of those, you can get a number of different Spack configurations. And what we're trying to do is give the owners of the installation more agency over their configuration, and not have the user config interfere with that. So we're sort of matching the config to the ownership in these chained configurations. Does that make sense? But that's partially because of how the Spack configuration and the Spack installation relate; with EasyBuild, you can mostly use one installation and have it configured differently for different use cases, by using a different configuration file or setting environment variables differently. Does that not work with Spack? Yeah, that does work with Spack. It's just that they have chosen to have their own Spack fork, right, with their own package tree and their own Spack executable and their own stuff, customized for different use cases. So that's why they have different copies.
Yeah, and you can have different environments in Spack with different config; literally any configuration parameter can be different in different environments. It's like an instance of Spack, but that's not how the teams wanna own their Spack instance. They wanna control what version of Spack they're running and stuff like that. So, you know, we wanna make that work. It's not what I would recommend people do, but it's what they do. Okay, that's clear. Shezeb has a question too. Go ahead, Shezeb. Oh, okay, can you hear me now? Yes. Okay, yeah. I have a question on the GitLab CI. I think it was on slide 19; it showed an example of the GitLab runner mapping. Yeah, over here. Well, this is maybe an old slide. So I had a question: in the mapping section, the spack-cloud-ubuntu entry, is that a runner, or what is that property? So, your GitLab instance has runners, right? And the runners can have tags. And essentially the way that GitLab matches these CI jobs to runners is by tags. Yeah. Is that the tags property, the spack-k8s? Yeah. And so essentially, when you concretize this environment, every concrete spec up here, every node in the graph, becomes a job in the GitLab CI specification. And the tags down here are saying: for all the specs up here in your concretized environment that match this, make them require this runner tag, right? So yeah. Yeah, I got the match part, because I remember that's the matching of the specs, but what is the spack-cloud-ubuntu? What is that property? Like, how is that organized? That's just a name for this particular dictionary under here. That's all. I see. So is it actually building on a different image, like an Ubuntu image? Okay, I get it. Yeah, the jobs that match this will have these attributes that their runners have to have, right?
So like, if the job needs to run on a Docker runner, it'll have this container image. And it says: I have this requirement, so I have to run on the Kubernetes instance, for our cloud stuff that requires Kubernetes. For a facility, you would probably tag this like Haswell or KNL or something like that, right? And then you would say the jobs that have that target need to run on the runners at the facility that have that KNL tag. And that's just because GitLab doesn't have something like archspec built into it, right? It doesn't let you ask: what architecture is that runner? You have to tag the runner and then map the build to the runner that way. And so this lets you do that at arbitrary facilities. Yeah. I seem to recall that when we did the Spack CI pipeline on Cori, we didn't have to specify the tags in this gitlab-ci section of the spack.yaml, because it was defined in the GitLab CI file and it was automatically inherited. So I would think it's not really needed, or maybe the feature may change. The reason it's needed here is because in this configuration we're building things that have different hardware and OS configurations, right? So we need to say some of the specs need to build with this image, and some of the specs need to build with that image. Okay. And if you had a heterogeneous cluster, or a machine where you had multiple architectures, you'd need to say the specs with this target need to build on the runners that have this tag, right? Because if you don't do that, then GitLab will let any job be picked up by any runner, and a runner will pick up a Spack job with a spec that's not compatible. And it will say: you don't have that OS on this machine, I can't build for that. Or: you don't have that target on this machine, I can't build for that. Does that make sense? Yeah, yeah, no, it makes sense.
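The section being discussed looks roughly like this in the environment's spack.yaml. This is reconstructed from memory of the schema from that era, so the key names and values are indicative rather than exact:

```yaml
gitlab-ci:
  mappings:
  - match:
    - os=ubuntu18.04              # applies to concrete specs matching this
    runner-attributes:
      image: spack/ubuntu-bionic  # container image the build job runs in
      tags: [spack-k8s]           # routes the job to runners carrying this tag
```

Each concretized spec that matches a mapping's match list gets that mapping's runner-attributes stamped onto its generated job, which is how the spec-to-runner routing happens.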
Are there any plans for having custom rules, I know this may be a little bit out of scope, like GitLab CI has? Like, I'm sure you can do this in the GitLab CI file: rules that say, I wanna trigger the Spack CI on a PR, or on a commit to develop. I guess the spack ci generation job is responsible for that; on the child jobs it doesn't really matter, because then everything will get built. Yeah, you can define preambles and other code to be run inside the jobs, so you can customize what gets done on the runner. Keep in mind that there are actually two pipelines: there's the gitlab-ci.yml that runs and generates the pipeline, and then there's the child pipeline, right? So if you wanna put something in the parent, you just stick it in your gitlab-ci.yml. If you wanna stick something in the child, you can actually write it in here; there are some other attributes in the gitlab-ci section that you can put that in, and you can basically add extra stuff to run in the child pipeline. Yeah, I think last time I looked at the documentation, it was kind of hard to process all the key-value attributes in this. It would be nice if you could expose the JSON schema through something like markdown pages, similar to what I've done, just to understand what the key-values are. Yeah, we can improve the documentation for sure. I mean, I assume that this is all in the JSON schema, right? It is in the JSON schema, although the JSON schema doesn't have descriptions of what the different fields mean. I don't know, that's a little low-level for the documentation, right? We should just improve the GitLab CI documentation if you can't find what you need. Yeah. Okay, Shezeb, anything else? Nope. Nope? Okay. There's one more question related to that, actually. So if you can go back to the same slide, 19, that would make sense.
So Fotis is asking on Slack whether the matrix facility is equivalent to what GitLab has. Is it fed as-is to GitLab pipelines, or how does it work? And are you happy with the pipeline generation in terms of parallelization? Okay, that's a long question. So as far as I know, GitLab doesn't actually have a matrix. GitHub does, or at least Travis has a matrix, and we've used that. So does GitHub, yeah. Yeah, GitHub actually does too, but GitLab doesn't, or at least last I checked it doesn't. So, yeah, I would say the matrix is like that, but it's a little different in that it's taking a cross product on Spack specs. It's not necessarily a full matrix, because if you stick, say, a variant in there that doesn't apply to all the packages in the cross product, then it won't generate jobs for things where it doesn't make sense, if that makes sense. So it's a little sparser than a full matrix. But yeah, it's the same notion; it's definitely a cross product. And then for parallelism in the GitLab environment: yeah, so when we started this, the only parallelism in GitLab was stages. And so originally the GitLab CI logic would basically take the Spack DAG and levelize it, and basically say: here's all the things you can build; once you've built all of them, you can build these. And so you would get builds that would get held up by bottlenecks in the middle. Like, if you built an R graph, there would be this one R node in the middle of it, after all the native dependencies but before all the R dependencies. But now they have this DAG functionality, and so we've been using that. Essentially, behind the scenes, we add this needs attribute to the different GitLab CI jobs. And that's basically a dependency in GitLab-speak; it lets one job run after the completion of another.
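On the matrix point from earlier in that answer, the "sparser than a full matrix" expansion can be sketched like this: take the cross product of packages and options, but drop option values that don't apply to a given package (the package names and variants here are just for illustration):

```python
from itertools import product

# package -> set of variants that package actually defines
packages = {"hdf5": {"mpi"}, "zlib": set()}
compilers = ["gcc", "clang"]
variants = ["+mpi", "~mpi"]

jobs = []
for pkg, comp in product(packages, compilers):
    # keep only variant values the package defines; otherwise build it plain
    applicable = [v for v in variants if v.lstrip("+~") in packages[pkg]]
    for v in applicable or [""]:
        jobs.append((pkg, comp, v))

# hdf5 gets 2 compilers x 2 variants = 4 jobs; zlib, which has no 'mpi'
# variant, gets just 2, so 6 total instead of the full 2*2*2 = 8.
print(len(jobs), jobs)
```

That's the sense in which it's a cross product like Travis's or GitHub's matrix, but pruned by what each spec actually supports.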
I don't know that it's the best fleshed-out system I've seen, because their notion of a DAG is kind of weird. I think the DAG stuff was introduced for cases where you might want to let the test job run ahead of the rest of the pipeline, or something like that; they considered small cases. Whereas ours is: we really want to build a DAG, we have to. So one of the problems that we've had was passing artifacts. If you do that naively and you say all of my binary packages live in this directory, then one job comes along and builds a binary and uploads it to GitLab, and all the jobs that depend on it will download all of their dependencies and then re-upload the whole directory with all the dependencies. So we got this quadratic explosion of bandwidth in some of the early versions of this, for uploading artifacts, and we had to tweak that so that we don't overload the GitLab server with artifact passing. But for our cloud pipeline now, we're finding that using GitLab artifacts for passing binaries around isn't actually working; it's too much. And so we're starting to use S3 or a local file system for passing binaries around, and we're making that customizable. So for a common case, like a project's CI or something like that, you can use GitLab's native artifact support for the binary passing. But for something like our cloud setup, you need to have some external storage, like S3, to upload things to; otherwise you're just gonna overload your GitLab instance. So I don't know if that answers your question, but that's most of the parallelism stuff. It is fully parallel now. So it appears in stages, but some of them can run ahead, and the dependencies are expressed in the GitLab CI yaml. So, yeah. Fotis said that GitLab does have support for matrix. Maybe that's recent. Oh, maybe it's new. Yeah, okay, cool.
I did have a question myself as well; I don't see anyone else raising further questions. So, the E4S stuff that you mentioned, especially the part about the build cache, the second time you covered E4S. Yeah. So here you mentioned that it speeds up the CI for lots of projects, because they can just install binary packages; they don't have to build dependencies from source. So of course that's a lot faster, but these are generic binaries, right? They're just x86_64, not specifically built for any architecture, so they will work on Haswell or the latest Intel, or AMD Rome. Doesn't that affect, let's say, the quality of the CI? If they're not building for a specific architecture, they may be missing stuff that could cause their changes to fail if they were using AVX2 vector instructions or things like that. So in general, the optimization model doesn't affect the ABI that much. So like it's... Well, what about the instructions that you use? Well, okay, not the ABI of course, but I guess it depends what they're testing. If they're just testing whether they can call the functions that they're calling, then yeah, that makes sense. But in terms of actually getting things to run, and getting things to spit out results that are not garbage, there it may be relevant. So, like, I guess it depends on what you're trying to do with your CI, right? If you're trying to vet that your package still builds with all of its dependencies, then this is a good thing to do, right? If you're really trying to test whether you've lost numerical precision when using AVX2, then yeah, you should test with AVX2. Okay. And then, I guess to answer your earlier question: yeah, right now they are generic binaries that are in E4S, but that's going to change. The build farm for E4S, well, what they've managed to amass at the University of Oregon is pretty freaking impressive. They have like 20 or more systems with lots of different architectures.
The plan is that we're going to put those all up as runners on gitlab.spack.io and start doing optimized binaries. There's a question of what level we optimize at, right? I don't think we want to ship binaries for every single Intel architecture that ever existed. It doesn't make sense to build for, say, Skylake, Cannon Lake, Cascade Lake, and Tiger Lake; you don't get that much out of it. So there are two things that need to happen. Right now, if you want to use an optimized binary, the matching is that you basically have to force Spack to concretize to the right hash to match what's in the build cache. One of the things that will be in 0.17 is for the concretizer to consider what's in the build cache in advance, so we'll optimize for reuse. What that basically means is that if you give Spack a build cache with things that can work with your spec, it'll pull them in. The other thing is that we already have compatibility checking in the concretization algorithm. Essentially, for a dependency we can evaluate compatibility with your host, and we can also say: don't let a package have an architecture higher than the root's, so that if you can run the root, you can run all the dependencies. There's a question of whether you should do that, because you might not always want it. But anyway, we'll have architecture checking in the concretizer so that we can pull in binaries that are maybe not for exactly the architecture we're building for, but close. And so then the question is: what do you build for? I don't know if you've seen this, but glibc recently introduced these x86-64 microarchitecture levels, and ARM already has levels like ARMv8, ARMv8.1, and so on.
I think those are probably better targets for a binary distribution than the specific builds we do now. So we could do, say, x86-64-v4 and x86-64-v3 builds, where one of them has AVX-512 and one of them doesn't, and say: those are our binaries, that's what you get. The concretizer would pull in the ones that were compatible with whatever architecture you had. So if you installed on a Broadwell, you wouldn't get the AVX-512 build; if you installed on something newer, you'd get the highest available architecture under that. So yeah, it's not super optimized; you can always build from source if you want that, but the speedup is real, and I think people like it a lot for CI. Does that answer your question? Yeah, yeah, I think it does. It clarifies why these things are useful even though they're generic binaries. It depends what you're testing for, like you're saying. I mean, so far the cost has been getting binary builds working and getting the relocation testing working, and that's why the binaries in E4S are all generic right now. They will be optimized by the end of all this; it's just work. Once you start doing binaries, it's a whole other dimension to your testing matrix. Yeah, and the step towards different microarchitectures is also a big one, right? If you start having not just generic x86 but five or ten or God knows how many different generations of Intel and AMD, it becomes tricky very fast, and you'd better make sure you get it right. We want to avoid the pip TensorFlow problem. I don't know if you've seen that with the pip wheels for TensorFlow: Google's like, oh yeah, they only work on Ubuntu. And I guess that's fine for Google because they just don't care, but we'd like this to be smart enough to pick something compatible with the OS. Yeah, that's another thing, right? You'll have different OSes as well. Well, I guess most of the...
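The "highest compatible level" selection described above can be sketched in a few lines. This is a minimal illustration, not Spack's actual concretizer code, assuming binaries are published only at the glibc x86-64 microarchitecture levels:

```python
# Illustrative sketch (not Spack's real code): pick the best prebuilt binary
# for a host when binaries exist only at the glibc x86-64 levels.
# Each level is a superset of the one below it (v3 adds AVX/AVX2, v4 adds AVX-512).
LEVELS = ["x86_64", "x86_64_v2", "x86_64_v3", "x86_64_v4"]

def best_compatible(host_level, available):
    """Return the highest available level that is not above the host's level,
    or None if no published binary can run on this host."""
    host_rank = LEVELS.index(host_level)
    candidates = [b for b in available if LEVELS.index(b) <= host_rank]
    return max(candidates, key=LEVELS.index) if candidates else None

# A Broadwell-class host supports AVX2 (v3) but not AVX-512 (v4), so with
# v3 and v4 binaries on offer it gets the v3 build.
print(best_compatible("x86_64_v3", ["x86_64_v3", "x86_64_v4"]))  # x86_64_v3
```

The same lookup generalizes to the ARMv8.x levels; the point is that a handful of ordered levels keeps the matrix small while still letting the concretizer prefer the fastest binary the host can run.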
I actually don't know how aligned the operating systems are across the big systems in the US, but there are going to be some differences for sure. You will have at least two or three different operating systems, so even for, let's say, a Tiger Lake system, you may still have to build three different binaries just because of the OS. Yeah, so I didn't mention this, but one of the other things that we're working on under the research projects right now is loosening up the way Spack specs are constructed. You can think of Spack provenance right now as a Merkle DAG of the build configuration. What that means is that you always deploy with the things you built with: if I have a particular hash, the hash embeds the dependency hashes and so on, so I have to deploy the exact version of zlib I built with. One thing we're doing, to enable the binary stuff and also for ABI checking in the research project, is adding a deploy hash as well, which will use a slightly different graph. What that would mean is that you could build a version of HDF5 that is deployed with a different version of MPI than it was built with. You would still have the full build provenance; the existing Spack spec would be in there. You could say, hey, this was built with MPICH 3.1, but you would see that it's deployed against MVAPICH2 or another version, like the host MPI or something like that. What I think that will enable us to do is deploy against system dependencies much more easily from binaries, because we can actually say: okay, we know this was built this way, and it's compatible with this version of the system library. And we can do checking up and down the DAG for compatibility. And containers are one place where that comes up all the time, right?
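That "the hash embeds the dependency hashes" property is the Merkle part. Here is a toy illustration, not Spack's real hashing scheme, of why a single build hash forces you to deploy exactly the dependencies you built with, and hence why a separate deploy hash over a looser compatibility graph is needed:

```python
import hashlib

def build_hash(spec, dep_hashes):
    """Toy Merkle-style hash: a package's hash covers its own spec string plus
    the hashes of everything below it in the DAG, so a change anywhere in the
    dependency graph changes every hash above it."""
    h = hashlib.sha256(spec.encode())
    for d in sorted(dep_hashes):
        h.update(d.encode())
    return h.hexdigest()[:10]

zlib_a = build_hash("zlib@1.2.11", [])
zlib_b = build_hash("zlib@1.2.12", [])

# Same hdf5 spec, different zlib underneath -> different hash. A binary keyed
# only on its build hash therefore can never match a deployment with a swapped
# dependency, which is what a looser deploy hash would allow.
hdf5_a = build_hash("hdf5@1.10", [zlib_a])
hdf5_b = build_hash("hdf5@1.10", [zlib_b])
assert hdf5_a != hdf5_b
```

A deploy hash would simply hash over a coarser description (say, an ABI-compatible version range for MPI instead of an exact dependency hash), so one built binary can legitimately match several deployment configurations.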
Because if you want to bind-mount an MPI into a container, you're essentially relying on the MPI's ABI being compatible. And you're also relying on the dependencies of that MPI, because you're pulling something into your container and it has to find its dependencies. So we want to be able to reason about that, and we're working on a model to do it. It's not done yet. And then maybe the final question, unless there are others, but I don't see any popping up. You mentioned in the GitLab CI stuff that you're essentially going to build a whole bunch of things for every commit into Spack. And that could be pretty expensive, especially if you're not just going to build generic binaries. If you're going to populate your build cache for every commit to Spack, and you showed 500-plus merges a month, that's going to be a lot of builds. Yeah, so the AWS bill right now, with builds for every single pull request into Spack, at least for the subset of packages that we're covering, is like $2,500 a month, and the bulk of that charge is actually not the builds. And I think it's way more builds than that. The bulk of the cost, and this is what makes this somewhat optimistic, is the infrastructure: the GitLab server and all the other stuff run on a Kubernetes cluster that costs a lot more than any of the builds so far. And we're not building on super fancy instances right now, but CPU time is pretty cheap these days. If you think about what enables GitHub and Travis to offer free builds to people, it's that CPU time is pretty cheap, at least on cloud nodes. And with the build farm at Oregon, I'm hoping we'd have enough resources to do CI for that. Right now we have like 300 packages in there, and every one of them generates a job.
And if a job does nothing, that's fine, but it still costs us a little bit of CPU time on an instance. So we're working on pruning the DAG and reducing that cost to scale this out. I think it's feasible. And in the end, it's money, right? $2,500 a month for a $300 million project is not that much. So with DOE supporting this, I think we can do something pretty cool and scale it out. If you think about the cost of the machines they're deploying, those are even more, and this is pennies compared to that. So I think it's feasible. Besides the CPU time there's also the storage, of course, because over time that will become big. Well, okay, that's something we're actually already dealing with: the build caches for the PRs are transient. What we want for Spack develop is a rolling release. So what I think we'd end up doing is keeping around the frontier of packages on develop for the past few months, so that if you're tracking develop you can get those binaries, and for every release we would keep around release tarballs for some subset of the builds you can generate from that release. We would definitely not store everything, because then, yeah, we would run out of space. Yeah, at some point it becomes infeasible, but you can start cleaning up, especially for releases. And yeah, like you said, the last few months of develop. But there's actually an interesting use case there too. We had some researchers talk to us: there are people using Spack to address the binary clone problem. They want to find versions of certain chunks of binary code that correspond to a particular source code, and they're using Spack to generate thousands and thousands of binaries to do that.
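The retention policy sketched above, keeping pinned release builds plus the recent frontier of develop and pruning the rest, could look something like this. This is a hypothetical policy function for illustration, not anything Spack ships:

```python
from datetime import datetime, timedelta

def prunable(cache, release_pins, now, keep_days=90):
    """Return cache entries safe to delete: anything not pinned by a release
    tarball and not rebuilt on develop within the last keep_days."""
    cutoff = now - timedelta(days=keep_days)
    return sorted(name for name, last_built in cache.items()
                  if name not in release_pins and last_built < cutoff)

now = datetime(2021, 6, 1)
cache = {
    "hdf5-abc123": datetime(2021, 5, 20),   # recent develop build: keep
    "zlib-old999": datetime(2020, 11, 2),   # stale and unpinned: prune
    "mpich-rel001": datetime(2020, 1, 15),  # old but pinned to a release: keep
}
print(prunable(cache, release_pins={"mpich-rel001"}, now=now))  # ['zlib-old999']
```

The two knobs, the pin set and the retention window, correspond directly to "tarballs for every release" and "the last few months of develop."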
So there's a use case for large corpora of binaries too. I don't know that we're going to keep all of it around, but people can use it for that, and it's cool to have a system that will do it. Okay, yeah, I don't have any other questions right now, and I don't see any in the chat or on Zoom either. So I guess we can wrap it up here. Thank you very much, Todd. Thanks.