Okay, thank you, Kenneth. This presentation is a bit of a regular now for Compute Canada. I'm presenting this slide deck in cooperation with my colleague Maxime Boissonneault, and Ryan Taylor has also contributed to these slides in the past, so I'm mentioning him as well, but he's probably sleeping because he's in British Columbia, where it is around 6:15 a.m. at the moment. We're speaking on behalf of the Compute Canada Research Support National Team, a team that is mostly responsible for installing software but also handles some other tasks, namely the ticketing system and the user documentation.

Our presentation outline: where we were and where we are in Compute Canada. There are supposedly a lot of changes coming, in that we're going to be reorganized in the coming years, but that's a completely separate topic. Then our software installation, our goal and design overview, and the tools that we use, which are a combination of CVMFS, Gentoo Prefix (which used to be Nix), EasyBuild of course, Lmod and Python; then a little bit about monitoring and documentation. We don't have time for a demo this time; I accidentally left that in the outline.

So what is Compute Canada? You could see Compute Canada as a collaboration between different regions and sites within Canada that administers the academic supercomputers here. There are other supercomputers too, of course, that are outside the scope of Compute Canada. For instance, Environment Canada, the weather forecasting organization, has big supercomputers of its own that also show up in the TOP500, but we're looking at the academic side here. Compute Canada itself is composed of four regional organizations, some with sub-organizations as well: out west we have WestGrid, we have Compute Ontario for the province of Ontario, for the province of Quebec we have Calcul Québec, and then in the eastern Atlantic provinces there is ACENET. There are about 200 technical staff distributed among many universities, and more than 15,000 user accounts: professors, postdocs, students, you name it, and some people in government organizations as well, like research institutes. They come from all kinds of disciplines; the slide shows a diagram of where they come from. There's a lot of growth in non-traditional disciplines: HPC traditionally is a lot of physics, computational fluid dynamics and things like that, but we've got a lot of growth now in bioinformatics, in the health sciences, and in the digital humanities as well.

So the ARC platform has these four organizations, and there are some national clusters and some regional clusters as well. The national clusters are accessible by all researchers in Canada; we have four HPC clusters, Cedar, Graham, Niagara and Béluga, and a cloud system called Arbutus. This was also discussed in the previous talk by Félix-Antoine. When we were putting these national systems in place, replacing what we now call legacy clusters, we wanted a system that was as consistent and easy to use as possible across all sites, so that no matter where users log in, they get a very similar interface, whether they log into Graham, Cedar or Béluga. Niagara is slightly different because it's not a general-purpose system; it's a CPU-only system aimed at larger MPI jobs, so it has a somewhat separate stack and wants to go its own way, but our stack is also available on that cluster.
So we want to have all software accessible on every site; it should be reliable, performant, and independent of the underlying OS, so whether the system runs CentOS 7 or CentOS 8, Ubuntu or some other Linux distribution, it should still work. We want to automate as much as possible, and we want to make it easy to use this large and evolving software stack. So we need a distribution mechanism; we chose early on to use CVMFS, which has been discussed extensively already, so it doesn't need much more introduction. We have an independent compatibility layer which now uses Gentoo and used to be Nix; this was also introduced in the EESSI presentation by Bob Dröge, and we had a presentation about Gentoo Prefix itself as well. We have automated installation with EasyBuild, and we use Lmod like many others. Luckily all of these things were already discussed during the EasyBuild User Meetings, so I can go quickly over these slides.

We have a process that's mostly user-driven: except for the basics, a software package is usually only installed if a user asks for it. So we get a ticket, "Can you please install software X for me?", and then we create an internal software installation ticket and install it on a build node. There's a group of about 10 to 15 people who regularly install software and who share two build nodes, which are cloud VMs. On the build node we install in our home directory and test, and when it's working okay we install it centrally and push it into CVMFS. We also have a development CVMFS instance and a production CVMFS instance, so we can publish first on the development instance, test it on a test node or on the clusters directly via a helper utility called proot, then publish it to production and inform the user: hey, it's available on the clusters, please let us know if it works for you, and then we're done. Sometimes there's software that's too specific, or a user asks, "Can you install the latest snapshot?", which is just a master checkout of some Git repository, and then we say no way, we're not installing unstable software globally, you can install it yourselves. Then either we install it in the user's account ourselves manually, or the user can sometimes use EasyBuild themselves, because we have it all set up to install into the user's home directory as well, and then it just shows up as a custom module.

The software stack is continuously growing. Maxime has been so kind as to update this graph; we started this whole process in 2017, and as you can see it goes all the way to mid-January 2021, where we are now. There have been some growth spurts when we added new architectures like AVX-512. In terms of numbers, bioinformatics is the king of the module count, which shouldn't come as a surprise given the enormous diversity of packages used by bioinformaticians. So you can see that we're now at nearly 8,000 software packages, including all the architecture variants and so on.

Then our design overview. It must look familiar, because you've seen a very similar overview in the EESSI presentation by Bob Dröge; that's no coincidence, because they used our overview as a template.
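Going back to the user-side installs mentioned a moment ago: as a rough sketch (not something from the slides), installing a package into your own home directory with EasyBuild on one of our clusters looks roughly like the following. The package name is hypothetical, and it assumes the site configuration redirects user installs into the home directory:

    # Rough sketch of a user-side EasyBuild install (hypothetical package name).
    module load StdEnv/2020          # load the standard environment first
    eb --robot somepkg-1.2.3.eb      # build the package and any missing dependencies
    module load somepkg/1.2.3        # afterwards it shows up as a custom module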
So what we've done this year is to move from the 2017 base stack with Nix to a Gentoo base stack labelled 2020, because that's when that particular Gentoo snapshot was taken; you can't call it exactly a release, since Gentoo is a rolling release, so we just took a date in May 2020. At the top we have the EasyBuild layer, with multiple architectures, and we put that in a particular path under modules and software labelled 2020; the older modules are under 2017. We used to have a middle layer of EasyBuild-generated modules wrapped around Nix profiles, which looked like a good idea at the time, but we got rid of it, which makes the system a bit simpler: all the things that were compiled with Nix and then had EasyBuild-generated modules around them were moved into pure EasyBuild, so packages like GCC, Qt, Perl and Python are all EasyBuild packages now. We used to use Nix for the compatibility layer, but just like EESSI we're now using Gentoo; we'll go into more detail on the next slide about why we changed from Nix to Gentoo. Under the compatibility layer there's a grey area, which we also put in Gentoo but which can be overridden using LD_LIBRARY_PATH if you want; that includes things like the Lustre client libraries and the low-level client libraries for InfiniBand and Omni-Path, such as libibverbs. Packages like UCX and libfabric, which are a little higher level than libibverbs, are compiled using EasyBuild, much like in upstream EasyBuild. And then at the bottom we have things that are local: things that are very intertwined with kernel modules or the kernel itself, and some software with legal restrictions that we are not allowed to distribute even in a restricted repository.

So the next part is why we did that: what are the differences between the old Nix stack and Gentoo Prefix? As was introduced in the Gentoo Prefix talk, Gentoo uses recipes much like EasyBuild, except they're written in a bash-like syntax and they're called ebuilds; otherwise there are a lot of similarities between a Gentoo ebuild and an easyconfig file. We use them to provide dependencies for scientific applications. We provide glibc itself, but in the compatibility layer we also provide many things that EasyBuild normally provides and that are kind of boring and not scientifically interesting, things like libxml2 and many, many others, which EasyBuilders provide as modules because what the OS provides is often too old. We put those in the compatibility layer and add them to EasyBuild's filtered dependencies. We provide the abstraction layer as a module: when you install Gentoo Prefix you get a bash script that allows you to enter the prefix environment, but we use a module instead, which sets mostly the same environment variables. And then we have most of the dependencies of the scientific software stack in there.

So our Gentoo Prefix layer is basically a checkout of the rolling Gentoo distribution, namely a snapshot that I downloaded on May 4th, 2020, and then we froze it: there are no version updates, except for a couple of optimizations and customizations that we put in a special Gentoo overlay repository, where we sometimes have to make changes; sometimes there are bugs in Gentoo, sometimes they are just customizations that are specific to us.
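To give a feel for the ebuild format mentioned above, here is a minimal, hypothetical recipe (a generic sketch, not one from our overlay); the bash-like structure is what makes it comparable to an easyconfig:

    # Hypothetical ebuild for an example library, roughly analogous to an easyconfig.
    EAPI=7

    DESCRIPTION="Example compression library"
    HOMEPAGE="https://example.org"
    SRC_URI="https://example.org/${P}.tar.gz"

    LICENSE="ZLIB"
    SLOT="0"
    KEYWORDS="amd64"

    src_configure() {
        econf --enable-shared    # configure step, like configopts in an easyconfig
    }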
So swapping the module from the older Nix layer to Gentoo feels like a Linux distribution upgrade: it just makes everything newer, like doing a yum upgrade on a Fedora system, and boom, all the basic tools are suddenly newer. The big change is a new glibc version: in the older Nix layer we had 2.24, and we now have 2.30, and that actually has some implications for HPC, because a lot of changes went into the libm library in glibc. I put a link in the slide deck; you can click on it later when I make the slides public, I'll send them to Kenneth. One thing is that a lot of effort went into optimized math functions like exponential, log, power and so on. The older libm was extremely precise, up to one half unit in the last place, which is the most correct you can get, but it paid for that by switching to multi-precision arithmetic when needed, which meant that for some inputs the exponential function ran hundreds of times slower than for other inputs, and that confused many people. Now they've made it faster, at a slight cost of precision for those edge cases. There's one example from Damian, who said there was a big difference between the speed of GNU's libm and Intel's libimf, and those differences more or less went away with the newer libm; so everybody benefits just from having this newer libm compared to what we had before. And then many other things changed: we have a newer bash, a newer git, a newer vim, emacs, et cetera.

So why did we switch to Gentoo instead of Nix? This is also part of the reason why EESSI uses Gentoo. In Nix you have a very self-consistent system, but when you put things on top of Nix, the whole system breaks down. Nix is very beautiful, much like Guix, but you have to stay within that environment to make it work properly. The reason is that what you get in Nix is a symlink forest: you have a profile and you get things out of that profile, but the profile itself is a symlink; the version points to a generation, in this case 523, which in turn links to a hashed path, and in that hashed path you have the actual binaries, which in turn point to the hashed store location of the package, in this case coreutils for ls. Now, we wrapped ld, the linker, so that all paths point to the Nix user profile's lib directory, the first location here. But when you upgrade a package, because of some bug fix or similar, the package gets installed into a completely new Nix store location with a different hash, and sometimes these hashes leak through the environment: even though we only advertise the top-level link, software packages sometimes resolve symlinks, and those store locations leak into user makefiles and the like. Then, when we upgrade a package and garbage-collect the old location, the old location is no longer there and things break down at the user level, and that's fairly hard to control. Gentoo Prefix, on the other hand, doesn't have any symlinks; it just installs directly into its location, and that largely eliminates the problem. So Gentoo Prefix is the minimal solution for what we need, whereas Nix has a lot of extra machinery, which is useful in principle but sometimes hurts us.
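A quick way to see the symlink chain just described on a Nix-based system is simply to resolve the links; the paths and hash below are invented for illustration:

    # On a Nix-based stack, resolving a command exposes the hashed store path:
    which ls
    #   ~/.nix-profile/bin/ls
    readlink -f "$(which ls)"
    #   /nix/store/abc123...-coreutils-8.29/bin/ls
    # If that resolved path leaks into a user's Makefile and the package is later
    # garbage-collected, the build breaks; Gentoo Prefix installs into a plain,
    # stable path instead, so there is nothing to resolve and leak.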
Now, on top of the Gentoo layer we have what we call standard software environments; I put a link here, they're explained on our wiki. A certain combination of modules is loaded by default on the clusters, so that people can use GCC and the Intel compiler as soon as they log in, without having to load any modules themselves before using a compiler. There is some disagreement about that: some people prefer that no modules are loaded by default, so everyone gets a minimal environment and has to load whatever they need, but a majority said we should load something by default, which makes it a little more user friendly. We call those standard environments. It's possible to go outside that structure; we also have Intel 2017 and Intel 2019 installed, for example, but the standard environment is the default, it's what most users use, and it's what we compile the most packages for. For the other compiler versions we install only a small subset of packages, for people who need a different compiler version.

The workhorse for a few years has been the Intel 2016 compiler and GCC 5.4 on the clusters Cedar and Graham, which went into production in 2017. On the newer Béluga cluster, and partly on Niagara as well, we use GCC 7.3 and Intel 2018 with Open MPI 3, and that still uses the same Nix stack as the 2016.4 standard environment. Now we're working on the newer Gentoo-based environment that we're making the default in April this year, and that corresponds to upstream EasyBuild's iomkl 2020a, so it's kind of a mix between foss and intel: it's mostly the Intel toolchain, except that we use Open MPI instead of Intel MPI, because we find that the runtime works a lot better with Open MPI than with Intel MPI. So we have Gentoo, GCC 9.3, Intel 2020.1 and Open MPI 4.0.3. We have multiple flavours for different architectures, except for the system toolchain, which is architecture-independent and uses generic SSE3.

One thing we've also changed compared to the 2017 stack, that is the 2016 and 2018 standard environments, is that we now build fat binaries using the Intel compilers. That helps on clusters that are heterogeneous: in particular the Cedar cluster has login nodes which are Haswell or Broadwell and a lot of newer worker nodes that are Skylake, and we want to make sure the binaries are optimal. Sometimes users submit jobs without knowing whether they will end up on Skylake or Haswell processors, so we want them to get optimal performance no matter where they land.

Some other things we do a little differently from upstream: we have a lot of modules at the core level. We compile them with GCCcore 9.3.0 and they end up in the core part of the module hierarchy, and we even do that for things like R, Julia and many bioinformatics tools; basically anything that doesn't use MPI, Boost, Fortran, HDF5 or FFTW, and isn't heavily vectorized code that would benefit from the Intel compiler, we compile with GCCcore, and it ends up at the core level of the module hierarchy. That's different from upstream, because upstream has a separate hierarchy level for GCCcore with a version. We can make that work by having a central libstdc++ library and libgfortran library; by doing that, we take advantage of the fact that libstdc++ has versioned symbols, so it is backwards compatible.
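As an illustration of the fat-binary idea: the usual mechanism with the Intel classic compilers is a baseline code path plus extra dispatched code paths, along these lines (these particular flags are the generic Intel compiler way of doing it, not necessarily the exact flags in our build hooks):

    # Baseline AVX2 code path plus an additional AVX-512 path, chosen at run time.
    icc -O2 -xCORE-AVX2 -axCORE-AVX512 -o mycode mycode.c
    # The same binary then runs well on both the Broadwell login nodes and the
    # Skylake compute nodes of a heterogeneous cluster like Cedar.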
This makes it possible to collapse different GCCcore versions into a single location. It's a little tricky, but it helps: users who compile their own software will always use the same libstdc++, so we don't have things accidentally trying to load different libstdc++ versions from different locations because they were compiled at different times. Another thing we do differently from upstream is that we install Intel MKL at the core level, which means we get all of its functionality except for the MPI parts such as the FFTW MPI wrappers. We even have a toolchain that I borrowed from Jülich, called gcccoremkl, which combines GCCcore with MKL: because MKL is installed at the core level we can combine the two, for R packages in particular.

We use RPATH, but we don't use EasyBuild's RPATH support; we use our own linker wrapper. It's also used when users compile their own software: the wrapper checks whether the binary links to things that were installed via EasyBuild, and if so it injects the RPATH into the binary. We now use RPATH instead of RUNPATH, so we can no longer override it with LD_LIBRARY_PATH; however, the advantage is that RPATH is inherited by runtime plugins. It's a bit complicated, but RUNPATH is sometimes not inherited, which means the information is lost in shared objects, and that gives very annoying bugs, so RPATH has that advantage.

So with StdEnv/2020 we have a bit of catching up to do: we had about 800 different software packages installed in the older stacks, so we had to recompile pretty much everything. But we decided to set priorities: using the module load logs we identified which packages were used most often, made a top-800 list, and worked from the top down to decide which ones needed priority recompilation. Often users ask for a software package, use it for a while, and then graduate or disappear, so there's no reason to reinstall everything; we only need the ones that users actually use.

We have some unresolved issues that are mostly about software simply being upgraded; that's what happens over time. We found some bugs, for example a code that used a tag number of about one billion in MPI_Send and MPI_Recv, which suddenly didn't work anymore with the newer Open MPI. That tag was actually not permitted by the MPI standard, it was out of spec, but it happened to work with the older Open MPI using openib, and with UCX it doesn't. Anyway, these are the kinds of things we keep running into: when software is upgraded, people find that their code no longer works for some reason; it's just what happens as software gets upgraded. We still have some things to work on too: parallel I/O doesn't work optimally for some people, which is always a complex issue; there's a library that can speed up collective communications, Mellanox's HCOLL, which we installed in the compatibility layer, but we found some issues with it, so we're working on that; and we also have VirtualGL in the compatibility layer for using GPUs on worker nodes. So those are the minor things we're still working on.
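To make the RPATH versus RUNPATH point concrete, here are the generic commands involved (this is not our wrapper itself; binary and path names are placeholders):

    # Inspect what a binary was linked with: DT_RPATH cannot be overridden by
    # LD_LIBRARY_PATH, DT_RUNPATH can, and RUNPATH does not propagate to the
    # dependencies of dlopen'ed plugins, which is where the annoying bugs come from.
    readelf -d ./myprog | grep -E 'RPATH|RUNPATH'

    # Forcing classic RPATH (instead of RUNPATH) at link time with the GNU linker:
    gcc -o myprog myprog.c -Wl,--disable-new-dtags -Wl,-rpath,/path/to/libs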
Another thing we use a lot more in the newer stack is Python extensions with multi_deps: if a package has Python support, we install it for Python 3.6, 3.7 and 3.8. So it's not just one Python 3 version plus Python 2.7, as in upstream EasyBuild; we install for multiple minor versions of Python 3, in this case 3.6, 3.7 and 3.8. We don't have 3.9 yet, because nobody has asked for it, but I'm sure that will happen in the future. Examples are things like Boost and OpenCV, and many other packages. So what people do is load, for instance, the qt/5 module, and they get PyQt installed as a Python binding. We just make it easier for the user: if you want to load a module and use its Python functionality, you don't have to load another module on top, it's all in a single module. Some people disagree with that approach; we've found it helps our users.

However, for many other Python packages that are pure Python and don't need another module, we have an extensive Python wheel infrastructure with our own wheelhouse, so the standard way we distribute Python packages is our own wheels. Because our environment is a little different, the binary packages, the wheels you can install from PyPI, often don't work on our clusters, so we provide our own wheelhouse and users install those instead. For most Python packages, like TensorFlow, users set up their own virtual environment and use pip install, for example pip install tensorflow; it gets installed in their own virtual environment and then they can use it. That gives a lot of flexibility in terms of which Python packages people want to use. We do have some Python packages via modules, like the extensions mentioned above, and we also have a SciPy-stack module for the most common scientific packages, things like NumPy, SciPy and matplotlib. This is an example of the wheels: at the moment we have more than 8,000 wheels installed, quite a lot of Python packages, and you can use the command avail_wheels to see which ones are available. Here you see TensorFlow, for instance: we have 2.3.0 installed for Python 3.8, 3.7 and 3.6.

So what modules are users actually using? We send the module loads to syslog, and we have a nice graphical interface in Grafana where we can see what people are using. This slide shows which modules were used, as a snapshot from a couple of years ago; we can exclude certain users or modules, and see, given the description, which modules are used. This is one of the things we used to identify which modules were used most often in the old stack, to decide what needed to be compiled first for the new standard environment.

So this brings me to documentation and resources. We have our list of modules on the user-facing wiki, but last year we also made the technical documentation public, so there's now GitHub-based technical documentation about the Compute Canada software stack. I know the EESSI people refer to it quite a bit; that was also one of the reasons we made it public. Basically everything that's not confidential, that we don't have to keep in our internal Google Docs based documentation, has been put in there, so if you want to know more about the internals of our software stack, it should all be there; if not, please let us know.
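The typical user workflow with the wheelhouse looks roughly like this; the module version is just an example, and the exact commands should be checked against our wiki:

    # Typical Python workflow on a cluster where the wheelhouse is available:
    module load python/3.8              # example version
    avail_wheels tensorflow             # list the TensorFlow wheels we provide
    virtualenv --no-download ~/env      # create a virtual environment
    source ~/env/bin/activate
    pip install --no-index tensorflow   # --no-index: use our wheelhouse, not PyPI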
You can also make use of the public side of our software stack yourselves, via the wiki page on accessing CVMFS. That allows you to mount it, much like Bob showed in his talk for EESSI: you can access our CVMFS as well, even on Windows with WSL 2, and then use all the software. So that's the end of my talk; I hope I didn't go too much over time, maybe a little bit.

Thank you, Bart. Does anybody have any questions for Bart? You have a question, so I will pass it across to him.

Thanks for a very interesting talk. One quick question: you mentioned that you had problems with Nix and Python virtual environments. Do you happen to know if Anaconda has a similar problem, if you're using Anaconda and Nix together?

I'm sure we've seen many issues with Anaconda, so if you combine a Python from Nix with Anaconda, I'm sure you'll find all kinds of weird issues going on there. Basically the underlying issue is that when you put something on top of Nix, you're getting outside the Nix infrastructure, so Nix doesn't know what happens on top of it. So whether you use EasyBuild, Anaconda, Spack, whatever, in combination with Nix, it's likely to break down somewhere.

Thank you, that was my suspicion as well, from the problem you described with the Python environments. Very helpful.

Maxime has also raised his hand.

I wanted to add to this: the breakage we had with Python is that virtualenv copies the Python binary, which came from Nix and included an RPATH into the Nix store, so when we updated something and garbage-collected the Nix store, those RPATHs would stay in the user's environment because the binary had been copied in there. I'm not sure whether this would happen with Anaconda, but mixing Anaconda with anything will give you headaches.

Yeah, if you go to our wiki page about Python on Compute Canada, we basically tell users: please don't use Anaconda, it's entirely at your own risk, because we've had so many issues with people installing Anaconda.

I have to say thank you for that page, because we send our users to your page about not using Anaconda.

I see that somebody is typing, so I'm wondering whether we're about to get a question in the Slack. The question is: for the Python packages in the wheelhouse, do you have multiple copies built with different architecture optimizations?

Yes, we do. We have various subdirectories: the great majority of the wheels in the wheelhouse are architecture-independent, pure Python, so they live in a generic directory, and there's a small subset in avx2 or avx512 subfolders, and it is automatically detected which directory a wheel should be picked up from.
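For reference, the public CVMFS access mentioned at the start of this last part boils down to a CVMFS client configuration along these lines; the repository name is from the public documentation, but please check the "Accessing CVMFS" wiki page for the current instructions and the required configuration package and keys:

    # Sketch of a CVMFS client setup for the Compute Canada software stack.
    sudo yum install cvmfs                          # or apt install cvmfs
    # Also install the Compute Canada CVMFS configuration package / keys,
    # as described on the wiki page.
    cat <<'EOF' | sudo tee /etc/cvmfs/default.local
    CVMFS_HTTP_PROXY=DIRECT
    CVMFS_REPOSITORIES=soft.computecanada.ca
    EOF
    cvmfs_config setup
    ls /cvmfs/soft.computecanada.ca                 # the stack appears here once mounted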