Yeah, let's get started. Welcome everyone to the fifth EasyBuild User Meeting, and to my opening talk, which gives an overview of what we did with EasyBuild in 2019 and also a little bit of where we're going in the coming year.

We've had four EasyBuild user meetings before. We forgot to take pictures at the first two, so we only have 2018 and 2019. In terms of attendance, it's growing significantly. It says 53 here for today in Barcelona, but there are actually a couple of people who forgot to register or who joined very late, so it's actually more than 53. There's a big mix of people from all over Europe, with lots of local attendance, of course, from Spain and HPCNow!, but also two people flying in from the US, somebody from Cyprus, and also other non-European countries like the UK. So it's very nice to have this good mix of people here: thirteen different countries, not including HPCNow!, which is not a country yet, and three different companies. So that's very cool.

In terms of agenda for this week: I'll start with an overview of EasyBuild in the last year and where we're going. Then John, who flew in all the way from Seattle, from Fred Hutch, will give an overview of how EasyBuild is used at Fred Hutch, and also give some updates on the easy_update tool that he wrote, which is very helpful, and maybe where we can go with that in the future. Then Shahzeb, there's a typo there on the slide,
sorry, Shahzeb will talk about building an EasyBuild container library in the Sylabs Cloud, so some ideas there on how that can be done. Then we'll have a remote talk by Maxime, who will be calling in from Canada and talking about how they are using EasyBuild and some other tools combined to build a big software stack for the HPC sites in Canada, how they're sharing that, and how all that works. And then Massimiliano really wanted to join the EasyBuild User Meeting again after last year, and will give us an update on Spack as well.

Then tonight we'll do a group dinner in one of the best tapas bars in Barcelona. You have to check with David how things are going to work practically. We'll go there as a group; everyone who wants to join can join, but tonight it's at your own cost. So definitely let us know at the registration, and I think we have to set up some upfront payment as well for that, but more on that later.

And then tomorrow we have a busy agenda. Shahzeb will be talking about his buildtest tool, and Vasileios from CSCS will be talking about ReFrame. Then we'll have several site presentations.
The people from HPCNow! will talk about how HPCNow! is using EasyBuild in their consultancy work. Simon from Birmingham will talk about the use of EasyBuild at the University of Birmingham, and Åke about Umeå, and maybe Sweden as well. Then Ümit will talk about the HPC-on-OpenStack setup that they have at the Vienna BioCenter. Then a few more site presentations: Luca from CSCS and Caspar from SURFsara in the Netherlands.

Then in the afternoon, Sam is going to give a small tutorial on how you can contribute back to EasyBuild. Sam is one of the EasyBuild maintainers attending the meeting this week, and you'll see the EasyBuild maintainers have a red name tag. So if you have any burning questions on EasyBuild and you bump into somebody with a red name tag, you can definitely ask them, and they can hopefully help you out. And then to close the talks on Thursday, we'll have another remote talk by Robert McLay, who will be calling in from Austin to give an update on Lmod and XALT. And then of course, since we're in Spain, we have to wrap up the day in a nice restaurant, so we'll go to a very nice place, Can Cortala, and the best news here is that this will be sponsored by HPC-UGent, so you won't have to pay for all the good food you get there.

Then on Friday morning we will start very early. There's a guided tour organized at the Barcelona Supercomputing Center on Friday morning, and it was very difficult to find a slot for a group as big as this, so the people from HPCNow! did a very good job at finding one. The downside is it's quite early in the day.
So there will be a bus leaving at the EBS hotel, where lots of the people attending are staying, and the bus will come here, right in front of the technological park, and leave at quarter to eight. The tour will start at half past eight, and then we should be back here by around 10 for the coffee break. So there will be lots of coffee, so you can wake up for the rest of the talks on Friday, where Luca from CSCS will be talking about the Sarus container runtime they have at CSCS. Then two more site presentations, by Yin Zhang, who flew in all the way from Australia to Barcelona for this meeting, and Alexander from the University of Dresden in Germany. And then I'll close with a more informal talk, maybe "10 things you didn't know yet about EasyBuild". I haven't finished my slides yet for that, so if you have any tips, let me know.

As you've noticed, we've been setting up the live stream, so people can also join the meeting remotely and ask questions via Slack and all of that. We have a pretty good track record of recording all the talks we've had up until now at the EasyBuild user meetings, and all of them are posted on the EasyBuild YouTube channel, which we have had since last year as well. So thanks a lot to Alan for all the effort in setting all that up.

Before we dive into EasyBuild itself, a little bit about me. You can find me on GitHub, Slack, Twitter. I lose a lot of time on Twitter; maybe I shouldn't. I got my computer science degree at Ghent University in Belgium. I was working with machine learning before it was cool, so in the 2005 era. And then when I joined the HPC team in Ghent in October 2010, I was quickly assigned to user support and training. A big part of that is doing software installations for the scientists. EasyBuild was already there at that time in the team, but it was shoved into my lap, and then we took it a step further by releasing it publicly, and things exploded from there. I like a lot of things, including beer.
So if you want to buy me a beer, I will not stop you. If you have some nice stickers, I can probably pull off another one and put your sticker on my laptop. There are also lots of things I don't like, which are mostly the result of using or implementing EasyBuild.

The HPC team at Ghent University is the central team at the IT department of the university, so we help pretty much anyone at the university who wants to use our HPC infrastructure. We do training and support. We mostly just buy the hardware and install and configure everything ourselves. We have a modest infrastructure. It's not terribly big, but it's big enough that it takes a lot of work and a lot of manpower to keep it running. We're also a member of the Flemish Supercomputer Center, which is a collaboration between the Flemish universities, so we share accounts, we get access to each other's infrastructure, and so on.

In preparation for the meeting this week, we did the user survey. This is the third time we've organized this. It's an anonymous survey through SurveyMonkey, which is a very good tool for organizing surveys. We basically do it to get more insight into the EasyBuild community, and to get some feedback as well in an anonymous way, so people won't have to be scared or shy to say that things are broken or that they are not happy with certain aspects. It's a pretty long survey, 39 questions, but a lot of people participated in it. We have 91 responses this year, which is pretty good. The assumption here is that it gives a pretty good view of the community, but we don't really make any big decisions based solely on the survey, of course. I'll go through the results of the survey in this presentation, mixed with some other things.

The first couple of questions in the survey were: what kind of profile do you have? What do you consider yourself to be?
Over half of the people consider themselves to be a sysadmin, and then another quarter user support, and then you have a bit of a splintered view for the rest. So there's a clear bias in the EasyBuild community towards user support and system administration, which is not a big surprise.

In terms of type of organization, it's a big mix: central computing centers at universities, research groups, national computing centers, public and private research institutes, companies, and so on. Most of the EasyBuild community seems to be in Europe, which is not a surprise either, but we do get a bit of coverage in North America, in the US, as well, and then small bits and pieces throughout the rest of the world: Australia, Asia, Africa. We don't see any big shifts here compared to previous years.

How long have you been using EasyBuild was another question we asked. Here we do see a bit of a shift towards people who have been using EasyBuild for a long time, either longer than five years or two to five years. I'm not sure how to interpret this, but it seems like people who start using EasyBuild stick with it; otherwise we wouldn't have a big ratio like this. But we do still see new people coming in as well, so that's good, I guess.

How did people first learn about EasyBuild? Most of this is basically word of mouth: presentations, articles, people showing it, or people trying to convince someone else that this is a good tool and that they should try it. That's the best advertising you can get, I guess. We do see a big increase in people saying it was already in use: people who changed jobs, EasyBuild was already being used at that site, and they rolled into the tool that way.

Then the most convincing part of EasyBuild, why people pick up on it. It's basically the core functionality.
So that's the EasyBuild framework and what it can do: generate modules, automate software installations fully autonomously, and so on. And also a bit the supported software: people run into things like OpenFOAM or TensorFlow that they want to install from source, and if you do that manually you'll get depressed pretty quickly. That's how people find EasyBuild. And of course "already in use" is a big part here as well.

Some of the highlights of last year. When we had the user meeting last year, I talked a little bit about EasyBuild 4, which was coming up. We finally got EasyBuild 4 out the door last September. That was a big relief. I was very happy that we finally made that release, because we were working on EasyBuild 4 and on EasyBuild 3 updates in parallel, which was a lot of work and a lot of shifting attention between the two, and making sure the separate branch we had for EasyBuild 4 did not get out of date compared to EasyBuild 3. So that was a bit of a hassle. But now the release is there, and we're back to a single branch where all development is being done.

The community keeps growing, and I'll show some stats on that. We also see significant growth in contributions, which is very good, and we're still managing to keep up with the incoming contributions, which is very important as well. If you get a thousand pull requests but you can only merge 200, then you're very overwhelmed. I think we've done a pretty good job keeping up with contributions, even though we see significant growth there. We have over 12% increase in supported software packages, which doesn't seem like a lot, but we're close to 2,000 supported software packages now.
So that's significant growth. I should also mention that we archived a lot of old easyconfigs, so we actually dropped support for some software packages that are no longer relevant, no longer updated, or that nobody is interested in anymore. We archived those easyconfigs in EasyBuild 4, so we got a little bit of a dip there, but we still have an increase compared to the beginning of last year.

We now have significantly better support for installing Python packages and bundles of Python packages. You can even do a single installation that's compatible with both Python 2 and Python 3, so we have a good way of dealing with that. We also do a better job at checking the installation. We've discovered the pip check command, which makes sure that all the dependencies that a Python package requires are actually there. We were not doing that before, and we ended up with half-broken installations in some cases. We do a pretty good job now of avoiding those.

We jumped on the GitHub Actions bandwagon. GitHub now has a native continuous integration system, or service I should say, inside of GitHub. So everything is in GitHub, and you don't really need a separate service like Travis anymore. We were using Travis before, and we're still using Travis. We're basically using both Travis and GitHub Actions for now. The long-term plan is to shift everything to GitHub Actions, because it's native in GitHub, it provides a lot more resources, you can run more tests in parallel so you get quicker feedback on pull requests, and overall it's a lot more stable than Travis is. For now, I want to say we're not confident enough yet to drop Travis entirely, and it's also not very easy in GitHub Actions to test EasyBuild on top of Python 2.6, which is now very, very old but which we still support, whereas in Travis that is very easy.
So we'll certainly keep Travis for that part for the coming months, and we'll see how that evolves in the future.

The GitHub integration that we have in EasyBuild was significantly improved, and Sam will talk about that tomorrow in the tutorial. Also the support for building Singularity container images was improved quite a bit in the first half of 2019, and that's probably going to come up in Shahzeb's talk this afternoon as well.

So, the most important changes in EasyBuild 4. The flagship change, or the biggest feature, is the fact that EasyBuild now works on top of Python 3 as well. Any version newer than or equal to Python 3.5 should work now, and we still support Python 2.6 and 2.7 as well.

We've dropped some required dependencies. The setuptools Python package was required by EasyBuild at runtime because of some technical internal details, and we had this vsc-base library, which is a Python library that we've developed inside of HPC-UGent, which has the option parser, the logging support, and so on. This was a required dependency in EasyBuild 3. We've dropped these as dependencies. We basically ingested vsc-base into EasyBuild itself, because it was not ported to Python 3 yet. It still isn't, actually. So ingesting it and doing the porting inside of EasyBuild itself was a first step to supporting Python 3. It also simplifies the installation of EasyBuild itself, because sometimes we ran into problems because of breaking changes in setuptools, which don't make any sense at all to me, or people having a very old setuptools installation on their system and then running into problems because of that. It's not very easy to update setuptools in some cases. So we just got rid of all of that, and you don't need these packages anymore. You just need EasyBuild itself and the Python standard library. That's it.

We've deprecated the dummy toolchain for two reasons.
It had some quirky behavior in terms of when dependencies are loaded, so we got rid of that, and the name was a bit silly as well. So we renamed it to the system toolchain, which probably makes more sense. You can still use dummy in easyconfig files for now, but you'll get a warning that it's deprecated. It still does the same thing it did before; it's just an alias now for the system toolchain. So it's not a big issue if you're still using it, but you should try not to.

Another change, which we made quite late in preparation for EasyBuild 4, is that EasyBuild will now detect unknown easyconfig parameters: keys that you use in easyconfig files that are not known to EasyBuild. These could be typos, or just mistakes, something that you think is supported in an easyconfig file but actually is not. It will now print a warning if it sees anything it doesn't know. Together with that, we had to implement a naming scheme for local variables. Sometimes you do use variables in an easyconfig file that are not easyconfig parameters, and if you prefix them with local_ or just an underscore, EasyBuild will not complain about those. All of these changes are well documented on a separate page in the documentation, which is linked from the main page.

A bit more on the road to Python 3. It took us a while to port EasyBuild to Python 3, mostly because we don't have a lot of time to dedicate, or I don't have a lot of time these days to dedicate, to EasyBuild development, and also because we didn't want the porting to Python 3 to disrupt the EasyBuild 3 version. So we kept doing EasyBuild 3 releases with bug fixes and improvements, and we did the porting on the side, so it was a bit of a duplicate effort. I started the effort in December 2018.
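To make the easyconfig changes described above a bit more concrete, here is a minimal Python sketch of how unknown parameters could be flagged while local variables are left alone. It is not EasyBuild's actual implementation: `KNOWN_PARAMS` is a small illustrative subset of parameter names, and `find_unknown_params` and the example easyconfig text are hypothetical. The sketch relies only on the fact that easyconfig files use Python syntax.

```python
# Illustrative subset of easyconfig parameter names (assumption: the real list is much longer).
KNOWN_PARAMS = {"name", "version", "homepage", "description", "toolchain", "sources", "dependencies"}

# Hypothetical easyconfig text: uses the system toolchain, two local variables,
# and one deliberate typo ("sorces" instead of "sources").
EXAMPLE_EASYCONFIG = """
name = 'example'
version = '1.2.3'
toolchain = {'name': 'system', 'version': 'system'}
local_major = version.split('.')[0]
_scratch = 'not an easyconfig parameter'
sorces = ['example-%s.tar.gz' % version]
"""


def find_unknown_params(easyconfig_text):
    """Return names defined in the text that are neither known parameters nor local variables."""
    namespace = {}
    # Evaluate the easyconfig as Python code and collect every name it defines.
    exec(easyconfig_text, {}, namespace)
    return sorted(
        key for key in namespace
        if key not in KNOWN_PARAMS
        and not key.startswith("local_")
        and not key.startswith("_")
    )


for unknown in find_unknown_params(EXAMPLE_EASYCONFIG):
    print("WARNING: unknown easyconfig parameter: %s" % unknown)
```

In this sketch only the misspelled `sorces` is reported; `local_major` and `_scratch` are skipped because of their prefixes.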
So that was before the previous user meeting. We ingested vsc-base first, because that wasn't ported to Python 3 yet. We just copied everything we needed into EasyBuild itself and then ported it there, because the impact of porting vsc-base is a lot bigger inside of HPC-UGent, and you have to be very careful there. Then we made sure that all the unit tests for the EasyBuild framework were passing on top of Python 3.6, which was our main goal at first. It took us, or mostly me, about a month or two to get that working. Then we made some additional changes to also support Python 3.5 and 3.7.

One of the issues with Python 3 now is that pretty much every 3.x release has some breaking changes. It seems like they're somehow scared to release Python 4. I guess it's the whole mess with Python 2 and Python 3, where it took forever for people to pick up on Python 3. I think there will never be a Python 4. They will just keep breaking Python 3 over and over again.

Once that was done, by mid-March, we started doing actual testing of EasyBuild itself, installing software with EasyBuild running on top of Python 3, and then discovering some bugs in easyblocks, but also in the framework, and we fixed those over the course of a couple of months. Then finally we had an EasyBuild 4 release that supported Python 3 in September 2019. When people started using EasyBuild on top of Python 3, a couple more small bugs popped up, and those were fixed in the weeks and months after. With the 4.1.1 release, which was done earlier this month, there are no known issues on top of Python 3 anymore, so it's pretty stable when you want to use EasyBuild on top of Python 3. Some may still pop up.
We'll see, certainly if people start using Python 3 more for EasyBuild, but I'm fairly confident that things are working quite well. Also, when we run regression tests, we run EasyBuild on top of both Python 2 and Python 3, so we have a pretty good idea of what's broken and what is not.

Looking at supported software over the years, this is since the very beginning. When EasyBuild went public in 2012, we supported about 150 software packages. We're now ramping up to over 2,000 different software packages. You can see it's pretty much going up, and it's not really slowing down at all. Bioinformaticians keep popping up with new tools, because the old ones didn't work for some reason. A big part of the supported software is in bioinformatics: about one third of all the software packages we support. Then another quarter or so is supporting libraries and tools, so basically dependencies, or things like CMake that you need to build software with. And then it's a bit of a splintered view for the rest of the software.

It looks like we had a small dip here, because we archived things in EasyBuild 4 that were no longer relevant, or that had a very, very old toolchain and nobody was updating them anymore. That's why we have a bit of a drop here. But it looks like if we keep up this rate, we'll cross 2,000 supported software packages this year. These counts do not include extensions. We install R with, I don't know, close to a thousand different R packages inside of the R installation. If you count all of those, you probably double this number.

Then back to the survey. One of the questions was: which operating systems are you using EasyBuild on most commonly? People could answer multiple things here.
So the numbers don't add up to a hundred percent. The most prominent one is CentOS 7.x: almost three quarters of people are using CentOS 7 on at least one of their systems. A large number of people, over 15%, are still using CentOS 6. I was a bit surprised by this, but I guess these are old systems that are still running, and it's no longer worth the effort to update them to a newer OS. I guess that's mostly the issue there. And we see the rise of CentOS 8 and RHEL 8 coming up as well. That's probably going to be a big shift in 2020: for new systems, people are probably going to go with this rather than sticking to CentOS or RHEL 7. There's quite a bit of usage on other operating systems as well. Ubuntu is up to 15% for the most recent version, Debian is a bit less, and then SUSE and Cray. These are the more adventurous people who want to be different from everyone else, but we get some usage there as well.

Then the Python version that people use to run EasyBuild with. Since EasyBuild 4, you can run EasyBuild on top of Python 3, and you see the splintered use here: Python 3.5, 3.6, 3.7. This is probably going to stay like this for a while. It depends on which operating system you are using and which is the standard Python in there. So we have to be careful to keep testing with each of these Python versions and make sure EasyBuild stays compatible.

80% is still using Python 2.7, even though it's officially dead. That doesn't surprise me, and I mean, the Python core developers are giving very mixed signals there as well. Officially the end of life was January 1st, 2020, but they're going to do another Python 2 release in April. So I don't know what these people are smoking, but I want some. I expect things to change significantly in 2020. I think by the end of 2020 most people will still be using Python 2, because it still works, and certainly in CentOS 7 it's the standard Python.
So why would people change to something else for running EasyBuild? But there's hopefully going to be a big increase in the use of Python 3. Let's see if those predictions hold up when we do the next survey.

We've asked this next question in all three surveys, I think: how concerned are you with EasyBuild dropping support for Python 2.6? Which is, what is it, 15 years old by now? We're now at the level where only 2% of the people say it's problematic or worse. So I'm sorry, but those people will have to update to a newer Python. We've deprecated support for Python 2.6 in EasyBuild 4.0, or maybe even 4.1. If you're running EasyBuild on top of Python 2.6, you'll now get a warning that says you're using Python 2.6 and that you should probably upgrade to a newer Python. This change was made before we did the survey, and it seems like we anticipated things well. I guess this is still the people who are stuck on CentOS 6 and haven't bothered to upgrade to Python 2.7 there, but this shouldn't be a big hurdle to take, hopefully.

Then basically the same question, but for Python 2: what if EasyBuild goes Python 3 only? Would that be a problem? There, pretty much 25% of people say: please don't, it's too early. Maybe they are not confident enough with EasyBuild on top of Python 3, even though it should work, and if there are any issues, please let us know so we can fix them. Going forward, Python 3 is officially supported, and any version from 3.5 onwards should work. But for now, we will not actively drop Python 2 support. I don't see a reason to. I don't think EasyBuild is going to become a lot better if we go Python 3 only. It will take us a while to pick up on Python 3 specific features that we can use, and we're probably stuck with Python 3.5 anyway, so stuff that's only in 3.6 or 3.7 we won't be using anyway.
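The Python version policy described here can be summarized in a small sketch. This is only an illustration of the policy as stated in the talk, not code taken from EasyBuild; the function name `python_support_status` and its return values are made up for this example.

```python
import sys
import warnings


def python_support_status(version_info):
    """Classify a (major, minor, ...) Python version according to the policy stated in the talk."""
    major, minor = version_info[0], version_info[1]
    if (major, minor) == (2, 6):
        # Still works, but a deprecation warning is shown.
        return "deprecated"
    if (major, minor) == (2, 7) or (major == 3 and minor >= 5):
        return "supported"
    # Anything else (for example Python 3.0 to 3.4) is outside the stated policy.
    return "unsupported"


status = python_support_status(sys.version_info)
if status == "deprecated":
    warnings.warn("Running on top of Python 2.6 is deprecated, please use a newer Python version.")
elif status == "unsupported":
    warnings.warn("This Python version is not covered by the support policy.")
```

Keeping the classification in a pure function, separate from the warning itself, makes it easy to check the policy for versions other than the one currently running.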
So I think it's not worth the jump yet. I'm thinking about deprecating Python 2 support in the future, probably in some EasyBuild 4 version. You'll start getting a warning if you're still running on top of Python 2, but maybe not before the end of the year. We'll see. Maybe we need to do another survey first and see this going down significantly before we start trying to push people to Python 3. In the Python community this is very controversial. Still supporting Python 2 is already controversial; still supporting Python 2.6 is like, what the fuck are you doing? But I guess the HPC community is different enough that this makes sense.

Then this is a new question in the survey; we didn't have this before. Which processors are people using, or rather, for which processors are people installing software using EasyBuild? Skylake is pretty popular there, which is probably not a surprise. So are the older Intel generations, even down to Harpertown and Nehalem, which are quite old by now. So people are still keeping very old systems running, just because they work, I guess. We do see the rise of Cascade Lake and these newer architectures, and AMD Rome is coming up as well: over 10% are already using EasyBuild on an AMD Rome system. I know there are some concerns there in terms of which compiler and libraries you should use, so this is definitely something we will have to keep an eye on in the coming year and years. Then there's a small minority of people using it on POWER, and even one person using it on ARM. I hope this is just for fun and not for production. EasyBuild works on ARM, it knows about ARM. But how well will bioinformatics tools work on ARM?
That's not very clear to me.

Then the same question, but for accelerators, and here NVIDIA rules, surprise surprise. The most expensive Volta-type GPUs seem to be quite popular. One thing that surprised me a bit here is that only three and a half percent answered "we don't build software for accelerators". I don't know, maybe I'm naive in being surprised by that. HPC-UGent was quite late with buying a GPU cluster. Only in September or so did we have our first GPU cluster, so maybe we were just very late to the party, or scared to buy some GPUs. But it surprised me a bit that this number was so low.

And then EasyBuild toolchains. This is probably an almost unreadable graph, but we've been asking this in pretty much every survey: which toolchains are you using, and which generations of the common foss and intel toolchains? A lot of people are still using 2018b, which is not ancient, but it's getting a bit old. This is based on GCC 5, I think, or 6, 6.4 probably. We're ramping up to GCC 9 now, so it's getting a bit old. But I guess the biggest reason is that there are lots of easyconfig files that use this toolchain. It was used for longer than the six months we usually try to use a common toolchain, so it has a big stack of easyconfig files. Things are certainly still working there.
I guess that's why people are still using it. This kind of information is useful for us when we run the regression tests: which toolchain generations do we really have to keep an eye on? It's clear that we should definitely keep testing with 2018b, because lots of people are still using it.

The foss toolchains are way more popular than the intel toolchains, because there's the license cost there, or at least there used to be; this is changing. There are no big changes here between the different versions. And then there are the common toolchains with CUDA, not on top, but on the side, underneath the MPI library. fosscuda is certainly being used quite a lot; intelcuda less, but that one is also more recent, so I guess that makes sense.

Then there's a long tail of other toolchains people are using: Intel compilers and MKL combined with OpenMPI, or all these other mixes. Some people are using those. We don't have a lot of easyconfig files in the central repository for these toolchains, but people can of course create their own, or just take a foss one and tweak it to whatever toolchain they prefer. I'm sure people have good reasons to prefer these specific toolchains.

Clang-based toolchains are not showing up here, but I think there's some interest at least in giving that more attention in the coming year and years, because Clang is getting a lot better, and also because it's finally getting a decent Fortran front end. After two or three attempts, it seems like it's going to have proper support for Fortran. So it will certainly become more attractive in the coming months and years.

How frequently should we be updating the foss and intel common toolchains? We now do it twice a year. We try to do it in January and July, which doesn't always work. Sometimes we delay the update a bit for specific reasons.
So there's no 2020a yet, because we're waiting for an OpenMPI release which has some important bug fixes. Sometimes you shift things a bit, but for now I think we're trying to keep it up at twice a year. Some people say once a year is enough. We could try it, but I think we'll run into annoyances there. For AMD Rome, for example, it's quite important that we have a new toolchain that's GCC 9 based. If we only do it once a year, that may mean you have to wait a year for a new toolchain that's the standard in the EasyBuild community, and that may be too long for some people. So that's why I guess we can keep it up at twice a year, at least for now.

How many installations did people do in the last year, and how many installations do people have in production? There's a big variety of answers here, from one to ten installations to over one thousand installations. It's quite easy to install lots of software with EasyBuild, so that's why people manage to perform over a thousand installations in a single year. If you have a new cluster, it's quite easy to roll out a software stack on the new system, thanks to EasyBuild. And if you look at how many installations people have in production across the different clusters in their infrastructure, ten percent have between five and ten thousand. I think HPC-UGent is probably in this range. So that's quite crazy. If you had to do that by hand, it would be close to impossible, even if you had a team of people. But there's a wide variety. Some people are only using it for very specific things, while other people are just installing everything they can get their hands on. So that's good to know. It's important to know.

Is EasyBuild your only way of installing scientific software? More people are saying yes than before.
There's a small increase here. Most people say it's at least the main way, or a common one, and then eight percent of people are also using other tools. This may be Spack or conda, or wherever the software they need to install is supported, I guess. And here we see the answer to the question whether people are still installing software manually. We see a small increase in "never", so I guess EasyBuild is doing a better job, or is supporting more of the software that people need. That's certainly a good sign as well.

Which easyconfig files do people use? Stuff they write themselves: lots of people, over 60% or almost 70%, are writing their own easyconfig files, maybe just tweaking toolchains or versions or things like this. And quite a lot of people are using at least some of the easyconfigs we include with EasyBuild. So the effort we put into having a central repository of easyconfig files, accepting contributions, and testing these contributions is certainly appreciated. Lots of people have their own repository as well. So maybe we need to look into organizing or listing those repositories somewhere, so people can easily search for stuff they need or want, and maybe even have a way of pulling things in centrally if people are not actively pushing them to the central repository. Maybe we need to be a bit more active on that front. And then 25% of people say: wherever I can find an easyconfig file, if it fits my need, I will use it; I don't really care where it comes from.

Custom easyblocks: over half of the people said they're not using any custom easyblocks at all, which is a bit of an increase.
I guess that's a good sign that the easyblocks we have are more flexible, or are enough for whatever people want to do. And then some people have fewer than five customized easyblocks, either for site-specific things or maybe things they haven't contributed back yet. And it's quite easy to use your own custom easyblocks in EasyBuild: it has an include option (--include-easyblocks) where you can just give the location of the Python files, and EasyBuild will work as if the easyblocks are included with EasyBuild itself. It will not be able to tell the difference.
Are people using site-specific customizations to EasyBuild? Here there's a big increase in "no customizations": almost half the people are not customizing EasyBuild at all, which I think is a good sign. It tells me that there's enough flexibility that people can do what they need to do without having to dive into the code and change whatever we hard-coded or whatever decisions we made. That need seems to be dropping significantly. So the previous survey had 80 percent no customizations, now we have close to half. So that's pretty good. I'm not sure what we did specifically to enable that. Maybe it's the hook support that we now have, which makes it easy. I should add a separate entry for people that use hooks, yeah, because maybe it's not clear whether that's customizing or not. Yeah, that's a good point. I'll take a note of that. Well, I can do it in the next survey, make that change. You'll have to tell me again. So yeah, to me this is a good sign that we're doing quite well.
Which EasyBuild version do you use, and what reasons do you have if you're not using the latest release? Here I'm very happy that people are really picking up on EasyBuild 4, even though it was only released last September.
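To make the hook support mentioned above a bit more concrete, here is a minimal sketch of what a site hooks file could look like. EasyBuild picks up functions named after a step, such as `parse_hook`, from a Python file passed via `--hooks`. The package name `Example` and the patch file name are made up for illustration; they are not from the talk.

```python
# hooks.py: a hypothetical site hooks file, used as: eb --hooks=hooks.py ...

# Assumption for illustration: a fictional package and a fictional site patch.
SITE_PATCHES = {
    "Example": "example-site.patch",
}


def parse_hook(ec, *args, **kwargs):
    """Called right after an easyconfig file is parsed.

    'ec' is the parsed easyconfig; parameters are accessed like dict keys
    and the software name is available as 'ec.name'.
    """
    patch = SITE_PATCHES.get(ec.name)
    if patch and patch not in ec["patches"]:
        # Inject a site-specific patch without editing the easyconfig file.
        ec["patches"].append(patch)
```

The point of doing this in a hook is that the easyconfig files themselves stay identical to the ones in the central repository, while the site-specific tweak lives in one separate file.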
So EasyBuild 4 is quite recent, and over 75%, or about 75%, is using a version of EasyBuild 4, so that's quite good. It tells me we did a good job at working on things in parallel and avoiding breaking changes in EasyBuild 4 wherever we could. So people seem to be quite happy with that. I made a lot of effort to document well what breaking changes there are in EasyBuild, to keep them as minimal as possible, and, if there are breaking changes, to have an easy way out, like auto-fixing easyconfig files and things like this, and not aggressively breaking things just because we feel we need to. And it seems that's paying off. I was really hoping that people would not be stuck on EasyBuild 3, and it seems that's more or less the case.
Some people are still using EasyBuild 2. That's pretty historic. So if some people here are still doing that, let me know why you're not making the jump. If it's just a lack of time, or if there's something in EasyBuild 3 or 4 you don't like but that's working fine in EasyBuild 2, we want to know about it, so we can get you out of EasyBuild 2.
And not a lot of people are using develop. This was important to see here as well. I think most of the EasyBuild maintainers just work straight on top of develop, and in Ghent we even do that for production installations, because we're quite confident that develop is stable. We have issues in develop occasionally, but not a whole lot, so I'm quite comfortable with doing that. But it's clear that most people rely on a stable release that's been regression tested and all of that. So it's important for the maintainers to realize that things that get fixed in develop, we should try and get in a release as soon as we can, so people can actually pick up on it.
Why are people not using the latest release yet?
Because they're happy with the current version, because they don't have time. So most of these reasons are not because of problems that the newest EasyBuild version has, but other reasons. To me that's good: it tells me we're not making any changes that prevent people from updating.
And then, what about the frequency of the releases? Most people seem happy with that. Some people even think it's too frequent. The current release rate is about once every six to eight weeks. We want to push out everything we have in the develop branch into a release, which I think is quite important because most people are using a release rather than a develop version. Seven percent of people think it should be more frequent. Yeah, that's very difficult, because it's about a day of effort to push a release out, including regression testing, keeping an eye on the results of the tests, and fixing bugs last minute that pop up during the regression test, which does happen. So it's a lot of effort, and I'm very grateful that Miguel, who's probably watching this stream now from Singapore, has been helping out with the EasyBuild releases a lot more than he already was before. The last two or three releases were actually done by Miguel rather than by myself. I still help out with the regression test and all of these things, but he's saving me quite a bit of time by helping out with that. So thanks a lot for that.
What's your favorite EasyBuild feature? This is a bit all across the board. Some people like the automatic installation of missing dependencies. Some people really like the try-toolchain or try-software-version options. I'm definitely in the GitHub integration part here, which is what Sam will be talking about tomorrow in the tutorial. If we wouldn't have these features in EasyBuild, we would not see the growth that we're seeing in contributions, and we would not be able to keep up with incoming contributions.
So for myself, as an EasyBuild maintainer, this is very important. And there are actually features that are not mentioned here at all, which I may talk about on Friday, in the "10 things you didn't know about EasyBuild": things that we maybe should promote a bit more, that people should at least know about.
And then maybe a bit of a dangerous question: which part of EasyBuild do you not like currently? I gave a bunch of suggestions, so things that I know are not perfect or are certainly up for improvement, or complaints that have been coming back for a while from several people. Those were added as possible answers, and there was an open field as well where you could type anything. So 20 percent of people added something in the "other" category, and I'll have to go through them, because there's a long list of things. I didn't have the time for that yet.
Here, the thing that jumps out most is the fixed versions for dependencies we have in easyconfig files. There's very little flexibility: you cannot say CMake bigger than 3.5 or anything like that, everything is a very specific version. About one third of the people are not very happy with that. I'm sure Massimiliano is smiling now in the back, because the flexibility in terms of juggling different versions is one of the major features in Spack, so you can talk about that
at the end of the day. In EasyBuild, I think, first of all, if we would add more flexibility, we would shoot ourselves in the foot in terms of testing things. If we put out easyconfig files that have flexible versions, you don't really know what people will be using to install the software. And then some people will bump into a problem because they're using a newer CMake which has a bug, while other people, using the previous CMake which was working fine, are not seeing it. All these issues will be popping up. And also, how do you regression test things that can float around everywhere? It would be very difficult to test things, and I think that has been an issue in Spack as well: what do we test, how do we test, all these things. And you can talk about this at the end of the day.
A couple of things I added specifically because I know they're up for improvement are the error messages. If EasyBuild does an installation and it works fine, the sanity check passes, everything is happy, you're okay. If there's a make command that fails, you get a pretty weird looking error message at the end, and you have to start hunting in the EasyBuild log for what the actual problem is. I think we can do a better job of highlighting the probable cause of the problem, like finding the first error message in the log file and mentioning that in the output, maybe some highlighting with red colors, or I don't know what. I think we can do a lot better job there, and just improve some of the errors that EasyBuild itself produces as well. If it needs a source file and could not download it, and it's trying to do the installation, it will now barf a very ugly error message. And I think we can do a better job: I was trying to find this file.
I looked there, couldn't find it, and maybe this is how you fix it, by using this download URL yourself, or something like that. Yeah, so lots of things to work on here, and I don't think there are any big surprises here. So that's good to know.
What surprised me here is the pink bar, conda. Over a quarter of people combine EasyBuild and conda. I don't, and I also try to tell our researchers not to use conda on our system, because at first it works, and then six months later they realize they have a big issue and they have to redo everything, or it just doesn't work anymore. So if people are doing this here, I'm interested in hearing how they're doing that and why they're doing that. So definitely at coffee breaks or lunch, come and tell me what's up with that. This surprised me a bit.
Lots of people are also using Singularity and containers. 10% of people are still using Singularity 2. Yeah, it's the same thing as with EasyBuild: people are stuck on old versions for god knows what reason. This is quite high, is it 45 to 50 percent? So are people using EasyBuild in containers here, or just containers on the side, because the software comes pre-built as a container? Also here, if people have more information on that, I would like to hear about what's going on. And then, yeah, some of the other tools are more niche, I guess.
In terms of module naming schemes: what do the names of the modules that you install with EasyBuild look like? Lots of people are using the default EasyBuild module naming scheme, or are maybe not aware you can tweak the naming scheme to your liking. This is one of the gaps in the documentation right now. It's not really clearly mentioned in the documentation what type of naming schemes EasyBuild supports, or how to use a different naming scheme.
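As a rough illustration of what a module naming scheme decides, the sketch below compares a flat name with a hierarchical one for the same installation. It is a simplified model, not EasyBuild's actual implementation; the function names and the example versions are assumptions made up for this sketch.

```python
def flat_module_name(name, version, toolchain=None, versionsuffix=""):
    """Flat scheme: software, version and toolchain all end up in one long name."""
    tc = "" if toolchain is None else "-%s-%s" % toolchain
    return "%s/%s%s%s" % (name, version, tc, versionsuffix)


def hierarchical_module_name(name, version, compiler=None, mpi=None, versionsuffix=""):
    """Hierarchical scheme: return (subdirectory, short module name).

    The subdirectory only becomes visible to 'module avail' after the
    corresponding compiler (and MPI) module has been loaded.
    """
    short_name = "%s/%s%s" % (name, version, versionsuffix)
    if compiler is None:
        subdir = "Core"
    elif mpi is None:
        subdir = "Compiler/%s/%s" % compiler
    else:
        subdir = "MPI/%s/%s/%s/%s" % (compiler + mpi)
    return subdir, short_name


if __name__ == "__main__":
    print(flat_module_name("HDF5", "1.10.5", toolchain=("gompi", "2019b")))
    print(hierarchical_module_name("HDF5", "1.10.5",
                                   compiler=("GCC", "8.3.0"),
                                   mpi=("OpenMPI", "3.1.4")))
```

In the hierarchical case the toolchain information moves out of the module name and into the directory the module file lives in, which is why the names users type become shorter.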
So we should probably fix that gap in the documentation. One of the nice, cool features that EasyBuild has is support for hierarchical naming schemes. So who has no idea what a hierarchical module naming scheme is? Everybody knows what it is? Okay, so just very quickly.
A flat naming scheme looks like this: if you do module avail, you can see everything. You can have these pretty long names. In EasyBuild speak, you'll see the software, software version, toolchain and toolchain version, and maybe even a suffix at the end. So you get a pretty long module name.
If you organize your modules in a hierarchy, it looks a bit different. If you do module avail after starting a new session, you will only see a couple of modules, typically compilers. Then, as soon as you load a compiler and you run module avail again, you'll see the MPIs that are built for that compiler. If you load an MPI, you will see applications or other libraries built with that compiler and that MPI. And the good thing here is that the module names become very short, and you can no longer combine things by accident that will not work together because they were built with a different compiler or a different MPI. So this has a lot of advantages. One disadvantage is that your users need to know what a compiler and an MPI is, or at least have a notion of what that is. Which sounds like a joke, but in reality lots of people just don't know or don't care, and probably don't have to. Now, organizing your modules in this way has a lot of small details that you have to get right for this to work well, and EasyBuild knows how to do that. So you can just tell EasyBuild "use a hierarchy" and it organizes things in the right way for you, which is very useful.
Then, how long are modules installed, or how long do the modules stay there once they are installed with EasyBuild? Lots of people said forever. So "forever" means as long as the system is alive. The system is dead?
There's no point in keeping the installations. We're also in this box in Ghent, and we should probably do a better job of actively pushing people to new installations and deprecating old installations, because we often see people using very old software or very old toolchains that are slower or that are known to be problematic. Yeah, we're not doing a very good job there, and it seems like lots of other sites are in the same category.
And in terms of community: the EasyBuild mailing list and the Slack/IRC channel. Although IRC is pretty much dead nowadays, it's not very active anymore. We see less participation in the mailing list and more on Slack, which is probably not a surprise. So even though we still get about 800 messages a year on the mailing list, people seem to like the interactivity of Slack a bit more, I guess. What's also important here is that a big part of the EasyBuild community only reads the mailing list or Slack, or is not even interested, or has not joined yet. So it's not because we ask the community something on the mailing list or Slack, or both, that we get good coverage in the community. That's very important to realize as well. Lots of people don't speak up, for whatever reason they have, and it's important to realize that as EasyBuild maintainers and developers.
Subscriptions to the mailing list keep going up. So even though it's getting less active, it now has over 270 people, which is very good. This is the traffic on the mailing list: about 700 messages a year, which is enough that you have to pay attention to it, but it's certainly less than the over 1000 we had in 2017, thanks to Slack. This is the activity on the Slack channel. We have over 200 people there that have an account, and in 2019 about 50 people that were active on a weekly basis. So that's very good. If you're not on Slack yet, there's a small app here where you can drop in your email address.
That way you get an invite to join Slack. It's probably the best way now to reach out to the EasyBuild community to get some quick help with things you're stuck with.
And then, before we take a look at contributors: where's Oriol? Should we do the coffee break first, or should we continue? Who's up for coffee? Yeah. So maybe we should have a short coffee break first, before I give some more. Yeah, then we'll have some time for questions as well. Just for people on the stream, what time are we coming back? In half an hour probably, so at half past 11 we'll continue.
So I'll continue with the survey. We asked people if they actively contribute back to EasyBuild. A lot of people, just over half, just report issues or complain that things don't work as expected, which is a form of contributing back: if we don't know there's a problem, we'll probably not fix it. There's again a significant growth in Slack usage, and we also see more people claiming that they contribute back by submitting pull requests for easyconfigs. We also see this in the other statistics that I'll show, so that's very good. And there's a decline in people that don't contribute, or at least claim they don't contribute. So that's good. I guess the rest is more or less stable.
Here we look at the forks on GitHub of the different EasyBuild repositories. The green line is the easyconfigs repository, where we get by far the most pull requests. The counting is a bit weird compared to the number of forks GitHub says we have: if you pull down the data you get 350 forks, but there are indirect forks as well, so people forking from a fork, and apparently that doesn't show up here. But yeah, you get the idea: it's going up, and it keeps going up at the same rate.
So the community is still growing. The other repositories are getting fewer forks, because there you have to not just change a version to make a pull request, but actively do some Python coding, or, the bottom line, write some documentation, which is the least popular way of contributing, unfortunately. But yeah, the community is still growing; that's mostly what you get from this graph.
If we look at pull requests to the framework repository over the years, last year was a bit of a bump. That's mostly because of the effort in porting to Python 3. I did way more pull requests than I've done in previous years, mostly because I try to do things in small chunks, let's say: do the porting in small bits and pieces, so I could do it in between other work. So it's about, I don't know, 50 pull requests in total for the porting to Python 3. We're now at a bit over a quarter of the contributions to framework being done outside of the HPC-UGent team. So we're still responsible for most of the development there, and I'm hoping to get that down a bit, but it's not that easy.
For easyblocks we do see more and more contributions coming from people other than myself. I only did half of the pull requests there, and the others were about 50/50 between other maintainers and outside contributors. So that's quite good: updating easyblocks for new software versions because things have changed, or making enhancements to generic easyblocks, is also done by people not in the team of maintainers. So that's very good.
And then easyconfigs. This is a bit scary. In 2016 we added the GitHub integration to EasyBuild, which makes it very easy to contribute back easyconfig files. You can do it from the EasyBuild command line, so you don't have to leave your terminal.
You don't have to click around on GitHub; you can just open pull requests straight from the EasyBuild command line. It was more or less stable in '17 and '18, but last year we got over 2000 pull requests in the easyconfigs repository. So that's nuts, right? If you take into account the number of days you work in a year, it's like 200, 220, something like that. So you're looking at 10 pull requests per day that you have to process. That's quite a lot.
We see an increase in the ratio of pull requests by non-maintainers, so that's quite good as well: external contributors are opening more pull requests. That means we're getting new software and software updates that we don't have to spend time on, other than reviewing and testing, which we should be happy with. But it's a lot of work to process all these contributions.
There's also good news here. This is the same graph, just with different categories. Here the darkest green is me merging pull requests. So that's about, what is it, 600 out of the 2000-something, so about a quarter of the pull requests are handled by me. The other maintainers are doing a very good job of processing the contributions as well. This is time I don't have to spend on looking at incoming contributions.
So we have this "maintainer of the week" role, where we try to find maintainers who have time to spend at least a couple of hours that week looking at incoming contributions. That seems to be helping in making sure that things get processed, even though there's a growth in contributions. So hopefully this keeps improving, and I can actually make the dark green bar go down without leaving contributions unprocessed.
And then the right graph is showing how many pull requests were opened via the --new-pr feature in EasyBuild, so straight from the EasyBuild command line. That has increased significantly, both the number of pull requests and also the ratio: over three quarters of pull requests are opened this way. Which is very good, for a number of reasons. For the contributor, it's easy: they don't have to leave their terminal, they don't have to clean up branches in git, they don't have to figure out what the name of the easyconfig file should be or where it goes. All of that is handled by EasyBuild. And for maintainers it's also easy, because they see it's a standard pull request. We can easily see it was opened via --new-pr, so there's a bunch of things we don't have to check, because we know EasyBuild gets it right. So it's a lot easier to process a pull request that was opened this way than one that was opened manually. We are of course still happy to process manual pull requests as well, but it takes more time for us. So if you're interested in contributing back, definitely take a look at the documentation we have on the GitHub integration, and check out Sam's tutorial tomorrow.
And in terms of unique contributors, we still see this going up as well. We broke the 200 unique contributors in easyconfigs last year, and it looks like this year we're going to break the 250. So it keeps going up; new people keep coming in and contributing to EasyBuild, mostly in easyconfigs, but also in framework and easyblocks.
So it looks like the Python code that we have is fairly accessible for people to dive into and make changes and improvements as well, and we've put a lot of effort into making it accessible.
And then this is unique contributors per year. We get 90-something unique contributors every year for easyconfigs in the last couple of years. For framework you have about 25 different people making pull requests every year; for easyblocks it's a little bit more than that. And this stays pretty stable. Well, there's a bit of a shift up and down, but it's not rapidly going down or something like that. So that's good. More contributors may be interesting here, but if we get a hundred contributors to framework, reviewing these pull requests and making sure everything is fine is a lot more work than for easyconfigs. So I'm happy with the current situation. Let's keep it at that.
And then the last question in the survey was: how do you rate the overall quality of EasyBuild? Lots of people seem quite happy with it. And apparently Massimiliano and Todd also did the survey, because they said it could be better. Just a joke. But overall it's fairly positive, and I agree it could be better, and there are things we should fix or things we should improve on. But overall it's very, very positive, so I'm quite happy with that.
Then a bit of outlook into 2020. Some goals I have for EasyBuild, things I certainly want to work on or things I know we'll have to pay attention to. First, the incoming contributions for easyconfigs. We now have over 2000 PRs a year, which I'm very happy with, but it's also a lot of work. In 2020 we will break the 10,000 easyconfig PRs since the beginning of time, so that's a lot of stuff to manage. We will soon have over 2000 supported software packages, which is a lot of stuff to test for every release.
So we have to get better at this, in the sense that we have to automate this more where we can. We're already doing a lot of automation, but there's more we can do. We can maybe even set up a way of automatically submitting test reports for easyconfig pull requests: after a maintainer has reviewed the PR and says "this looks good to me, but it needs to be tested", you could maybe just set a label, and then a bot could wake up and say "oh, I can test this", actually do the installations, and report back on whether things work or not. There's a couple of technical hurdles to go through there, and some security-related things on GitHub that you have to be careful about, so you don't want to have GitHub tokens leaking into environments and things like this. But it's certainly possible. It's a bit of work to set it up, but I'm sure we can do it.
And also, as a contributor yourself, you can help making things easier to process. If you open a pull request and the tests fail, please see if you can fix the tests yourself, just by adding commits in the pull request. Submit your test report yourself, so show that it works for you and tell us on what kind of system it works for you: which Python version you were using, whether it's AMD Rome or Intel Haswell. This is all included in the test report, so it's quite easy to upload a test report for the pull request you open. That helps us to be more confident that this is probably good to go, and to take a look at it and merge it quickly. So maybe we should also get better there, and explain to contributors how they can help in helping us.
Then the EasyBuild regression test that we do now. For every release, we try to rebuild all the easyconfig files we ship, so by now it's eight or nine thousand easyconfig files.
There are always surprises that pop up: easyconfig files that are broken because they auto-download stuff from the internet and AWS is down, or whatever happens, or things have moved. So, yeah, you need to sift through these test installations and see what's actually a bug in EasyBuild and what is failing for another reason. And it's too manual now: the regression test is sort of scripted and sort of not, and certainly checking the results is a lot of work now. So that slows down the release process quite a bit. We don't have the regression test results public now. We should probably have a public dashboard, so people can see "this is broken in this situation, but it works on maybe another system". And I think ReFrame, which Vasileios will talk about tomorrow, could be a good way of improving on the current situation.
One thing I've been wanting to do for a long time as well is actually run the test installations in a container environment, so you have full control over the packages in the operating system, which operating system and version you are using, and different EasyBuild configurations that maybe you can test in parallel, like hierarchical and flat naming scheme, RPATH or not, all these things. It's fairly easy to do if you use container images to just run EasyBuild in. It's doable, but again, it needs time to set these things up and make it work.
A big help there is the test infrastructure we have access to now, which is hosted at CSCS. It's a shared infrastructure for the EasyBuild maintainers, so all the EasyBuild maintainers get a login there, and we basically share our test installations there. So we don't have to rebuild things over and over again if another maintainer has done it before, and that helps us a lot in saving time and in testing contributions. And maybe we can even take it as far as using QEMU to emulate processor architectures.
Architectures we don't have access to, that is: if we have an Intel Haswell system but we also want to test for Skylake, something like QEMU could help there, to sit in between. It will make the installations slower, maybe even a lot slower, but you don't really care.
And then the error reporting, which popped up in the survey as well. I hope we find time to make it significantly better than what EasyBuild throws at you now, which basically forces you to dive into the log file to figure things out. And I know there are some ideas there. I think Spack uses a mechanism from CMake, or error patterns that it got from CMake, to make errors stand out a bit more, or to help you pinpoint the exact issue. We should probably look at what they did and then have a better way of doing the same thing.
And then some challenges, certainly for 2020 but probably after that as well. I have the impression scientists are getting more and more software packages that they want to have installed, certainly in bioinformatics. It never ends. They don't open a single installation request, they open a dozen at a time, and then every week again. So it never ends.
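Coming back to the error reporting idea above: one way "find the first error message in the log and show it" could look is sketched below. The patterns are a small hypothetical list chosen for this example; they are not the patterns Spack or CMake actually use, and this is not how EasyBuild currently behaves.

```python
import re

# Hypothetical patterns for illustration only.
ERROR_PATTERNS = [
    re.compile(r"\berror\b[: ]", re.IGNORECASE),
    re.compile(r"No such file or directory"),
    re.compile(r"undefined reference to"),
    re.compile(r"command not found"),
]


def first_error(log_text, context=2):
    """Return (line number, surrounding lines) for the first suspicious line.

    Line numbers are 1-based; returns None if no pattern matches.
    """
    lines = log_text.splitlines()
    for idx, line in enumerate(lines):
        if any(pattern.search(line) for pattern in ERROR_PATTERNS):
            start = max(0, idx - context)
            return idx + 1, lines[start:idx + context + 1]
    return None
```

Reporting the first match rather than the last is a deliberate choice in this sketch: the tail of a failed build log is usually make giving up, while the actual cause tends to appear earlier.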
So the growing number of installation requests is something we'll have to deal with: we'll have to make it easier to install new software, or become better at sharing things than we already are, like making it easier to discover things in other repositories that people have that are not central, so we can leverage each other's work.
One other concern that has popped up quite a lot recently is the discussion about R and the huge number of extensions that we have in the R installation. It's becoming an issue to manage that and to update that. John's easy_update tool helps a lot with that, but it's still a concern. And also, recently we've noticed that with the downloads from the CRAN repository for R packages, if you download the source tarballs, the checksums change for no reason at all. If you compare a new download with the download you've done before and you do a diff on the source code, there are literally no differences in the source code; only the publication date of the package changes, which changes the checksum, which breaks EasyBuild because it checks the checksum. And I'm like, what's going on there?
So one thing I want to do in the coming weeks is find a good contact at CRAN, one of the people that maintains CRAN, and ask them what is going on. Why does this happen? And can you avoid it, or can we? Well, we probably can't, but is there a way to fix this, like what they do on PyPI, the Python package repository, where they hard-block you from re-uploading a package with the same version? Apparently CRAN is not doing that, because people republish the same version even if they don't have code changes, for I don't know what reason. Is there a cron job running somewhere? Are they accidentally running a publish script?
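To check whether two downloads of "the same" source tarball differ only in metadata, one option is to compare a digest of the contents while skipping the volatile parts. The sketch below assumes, based on the description above, that only publication-related fields in the `DESCRIPTION` file change, and additionally assumes that a top-level `MD5` file listing per-file checksums would change along with it; both lists are assumptions, not confirmed behavior of CRAN.

```python
import hashlib
import io
import tarfile

# Assumption: DESCRIPTION fields that may differ between re-published tarballs.
VOLATILE_FIELDS = (b"Date/Publication:", b"Packaged:")
# Assumption: files that only hold checksums of other files.
IGNORED_FILES = ("MD5",)


def content_digest(tarball_bytes):
    """SHA256 over file names and contents, ignoring volatile metadata."""
    digest = hashlib.sha256()
    with tarfile.open(fileobj=io.BytesIO(tarball_bytes), mode="r:*") as tar:
        members = sorted((m for m in tar.getmembers() if m.isfile()),
                         key=lambda m: m.name)
        for member in members:
            basename = member.name.rsplit("/", 1)[-1]
            if basename in IGNORED_FILES:
                continue
            data = tar.extractfile(member).read()
            if basename == "DESCRIPTION":
                kept = [line for line in data.splitlines()
                        if not line.startswith(VOLATILE_FIELDS)]
                data = b"\n".join(kept)
            digest.update(member.name.encode("utf-8"))
            digest.update(data)
    return digest.hexdigest()
```

If two tarballs have different regular checksums but the same `content_digest`, the change is limited to the ignored metadata. This does not make the strict checksum check pass, it only helps to confirm that nothing else changed.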
I don't know why people republish, but CRAN should probably block them from doing that, because it doesn't make sense. So maybe we can try and complain to the CRAN maintainers and just explain why it's an issue. Whether that will work out or not, I'm not sure.
We're now running the tests for the EasyBuild repositories in both Travis and GitHub Actions, which is working fine. I have a bot running which re-triggers failing tests in Travis if they fail for no good reason. Something that often happens in Travis is that the tests fail because there's no internet connection in whatever VM the tests are running in. If you just re-trigger the tests again, suddenly they pass, because somebody fixed the internet. We don't see that a lot in GitHub Actions, so there we don't need a bot that keeps an eye on things and re-triggers things. I'm a bit in doubt whether we should switch away entirely from Travis and only test in GitHub Actions, and just stop dealing with all these issues on Travis. There's an issue there: when you open a pull request and the tests fail for a good reason, my bot will notice and will add a comment in the pull request saying "this failed, this is the error message, please try and fix it". That only works because it keeps an eye on Travis, and it's not easy to make it work the same way with GitHub Actions, again for the same reason that you don't want to make a token leak. It's a bit technical why it doesn't work, but it's not that easy to port the bot from Travis. So that's why, yeah, I certainly need to spend more time to switch entirely away from Travis.
And then one thing I also think will be an issue, or that we'll certainly have to be careful with: we now support both Python 2 and Python 3, but every new Python 3 release has some minor breaking changes, and we'll need to be aware of those and update the EasyBuild framework, maybe even easyblocks, to take these breaking changes into account. So we'll have to keep pace with what things Python is breaking, again.
I really think they're avoiding releasing Python 4 because they don't want to go through this whole mess again, and people are joking about it. So it's something we'll have to deal with. Then, Python 2 is dead, or very soon will be, which is not a concern for EasyBuild itself anymore, but there will be lots of scientific software packages out there that only support Python 2, not Python 3, and we'll need to deal with that in some way or another. I don't know if this is going to be a big issue, but how long we'll have to drag Python 2 forward is not clear at all. Certainly beyond 2020, because there will be lots of scientific software that is just frozen in time, and nobody will fix it for Python 3.
We're slowly coming out of the Intel-only age in terms of processors. AMD is back and very much alive. POWER and Arm are getting more interesting as well for people to invest in. AMD Rome, we already know, probably needs a very recent GCC, and may need a different library than OpenBLAS, maybe BLIS or their own fork of BLIS, for very good BLAS/LAPACK performance. So it's starting to complicate things a bit. It complicates testing for us as well: we don't have access to a POWER system, and even if we do, retesting everything on POWER takes a lot more time. So it complicates things a bit. And also, especially with AMD Rome in mind, I'm not sure whether there's going to be an impact on the common toolchains as well.
Can we still go with foss, that is GCC, OpenMPI, OpenBLAS, FFTW, or do we have to start varying things based on the architecture you're on? Maybe for AMD you want BLIS and not OpenBLAS, while on Intel you want to stick to OpenBLAS or MKL. And yeah, things are getting complicated.

And then one thing we've been in touch about with Intel itself is their new oneAPI thing, which they've been making lots of noise about, certainly at Supercomputing. We had a conf call with Intel, at their invitation, recently. And apparently this is going to replace Parallel Studio. So what's now the Intel compilers, MPI, MKL and a bunch of other libraries next to it is going to be included in oneAPI, with some other things: a new "distributed C++" compiler, or whatever it's called, so they actually have two C++ compilers in there. Or rather, Data Parallel C++ compiler.

They've also renamed a couple of things: it's no longer icc but icx, it's no longer ifort but ifx, and they seem to think it's a good idea. In EasyBuild that's not a big issue, it's just a version check and then using a different name. But I'm pretty sure it will be more than just changed commands: they change options, they change behavior, they change defaults. We will have to be aware of that, start playing around with it, and see what the impact is going forward.

They're not giving us a date for this now, but if you look at the versions they use in the beta, it looks like oneAPI will be the new thing in 2021 and they'll just drop Parallel Studio. Now, there's no confirmation on that, and they're very careful with specifying dates, but that's what it looks like: the versions are like 2021 dot, yeah, something, dot 3. Yeah.
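The version check mentioned above for the renamed Intel compilers could look roughly like the sketch below. This is only an illustration, not EasyBuild code: the function names, the dictionary keys and in particular the 2021 cutoff are assumptions (the talk only says the beta versions look like 2021, without confirmation from Intel), and only the two renames mentioned in the talk (icc to icx, ifort to ifx) are covered.

```python
def parse_version(version):
    """Turn a version string like '2021.1.3' into (2021, 1, 3).

    Parts that are not purely numeric are skipped, so this is only
    good enough for simple dotted versions.
    """
    return tuple(int(part) for part in version.split('.') if part.isdigit())


# Assumption: versions starting at 2021 use the new command names.
# The talk only says the beta versions "look like" 2021.
ONEAPI_CUTOFF = (2021,)

# Only the renames mentioned in the talk are listed here.
CLASSIC_COMMANDS = {'cc': 'icc', 'fc': 'ifort'}
ONEAPI_COMMANDS = {'cc': 'icx', 'fc': 'ifx'}


def intel_compiler_commands(version):
    """Pick the compiler command names based on the compiler version."""
    if parse_version(version) >= ONEAPI_CUTOFF:
        return dict(ONEAPI_COMMANDS)
    return dict(CLASSIC_COMMANDS)


if __name__ == '__main__':
    for ver in ('2019.5.281', '2021.1.3'):
        print(ver, intel_compiler_commands(ver))
```

As the talk points out, the command names are likely the easy part; changed options, behavior and defaults would not be captured by a lookup like this.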
Yeah, so you see 2021 popping up. And when we had the call and we pointed this out to the Intel guy, he was like: "Huh, I didn't see this yet, but I don't know when it's going to happen." 2021 is when it's going to happen.

They changed things. If you install oneAPI, first of all, it's not a single installation. It's like a base package and an HPC package and a bioinformatics or, I don't know, deep learning separate package, so you have to install things on top of each other. And the way it's organized internally on the file system is very different as well. Things have moved around, and it's cleaner than what it was before: you have a compiler directory and an MKL directory and an MPI directory, so it's pretty clean. But in EasyBuild we need to know where things are, where to pick things up, so it will be a bit of work to make that work.

The good thing is we have a good contact at Intel, and we promised we will try this by the end of, what did we say, February or March. We're going to give them feedback on this, so we can probably influence this a little bit. We already told them: don't change the compiler commands if you don't have to. If you don't have a good reason, don't do that, because people will have to actively switch to it and it's going to be an issue. And they seem to listen. Whether it's actually going to make a difference, I'm not sure.

So yeah, lots of things to worry about. Maybe not to lose sleep over, but still to be aware of.

And now I'm getting a bit creative. These are just ideas for things we could add to EasyBuild. So this is not implemented at all. If you try this, it will fail hard, because the option is not even there.

One of the things I've been hoping to find time for is to basically go beyond what we now have with the --try options. You can use --try-toolchain to try installing an existing easyconfig file with a different toolchain. And it knows about subtoolchains.
So if you use --robot, it will do the mapping for the subtoolchain as well. But I'm not very happy with the command line interface it has. We have --try-toolchain, we have --try-software-version. We don't have an easy way to tweak dependencies yet, which we probably should have, but I don't want to do that with a --try-dependencies and then a --try-whatever-else. We should probably have a new command line option that tackles this in a better way.

And what I came up with last night, after having three glasses of wine, is something like this: a new option which we could call "tweak", or I don't care how it's named. You could specify things to change. In this case, only change the version. In this case, only change the toolchain, to this toolchain name and this toolchain version. Only bump the Python version that's used in here. Well, the second one is supposed to be something else. But rather than hard-specifying the versions, you could give it a file that specifies the versions, because maybe you're keeping track of a list of 100 dependencies that you want to use specific versions for. So this could be both on the command line or in a file.

You could make it robot-aware, just like --try-toolchain is. So you could change a version, change the toolchain, and then the robot will make sure the toolchain is also changed for the dependencies that are in here.

Or maybe even go as far as having something that allows you to update to whatever the latest version is that's available. So something that, for Python packages, or for R in this case, would go to CRAN, check the version, update to that version, check if there are dependencies that should be added, and add those dependencies at the latest version. So basically what easy_update does, but then integrated into the framework.

And in implementing this we could take it step by step. It doesn't have to do the whole thing at once to be usable. We could start with a tweak that can only change the version, and then do the
toolchain combined with --robot as a next step, and then go on like that. So it could be stepwise, but I think it's a cleaner interface to do these updates. And I'm sure surprises will pop up here: things that we think we can implement but that in practice are quite difficult. So this is something I hope to find time for in 2020.

You could merge my PR.

I could merge your PR, which does this? Yeah. Does which part? Okay, I should take another look at that.

Yeah, so there are things flying around. I mean, some of these things are already supported with --try-software-version, --try-toolchain and the robot thing that's already there. Maybe we need to make it a little bit more accessible, or a bit more uniform. And apparently I should look at that pull request for the dependency stuff.

And then what I mentioned for the latest version of extensions, you could extend that to just software versions as well. So you want to install the latest TensorFlow? Why do you have to go to PyPI first and check what the latest version is? Just let EasyBuild check it for you. So you do eb, "update version", --robot, and it could update the software and all the dependencies to the latest version: give me easyconfigs for that, and then let's try and install these and see what happens.

That's easier said than done. For things like Python packages, you have a single repository to query. For R, you have a single repository to query. But say you have OpenFOAM. Where do you query the version? How do you query the version?
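Going back to the "tweak" idea described above, the first step (parsing what to change and applying it to an existing easyconfig) could be sketched as follows. Everything here is hypothetical: the option does not exist in EasyBuild, the `key=value` syntax (`version=...`, `toolchain=name,version`, `dep:Name=version`) is invented for this illustration since the syntax shown on the slide is not in the recording, and the easyconfig is represented as a plain dictionary rather than through the real framework API.

```python
import copy


def parse_tweak(spec):
    """Parse one (hypothetical) tweak spec into (kind, name, value).

    Supported forms, all invented for this sketch:
      version=1.2.3
      toolchain=foss,2019b
      dep:Python=3.8.1
    """
    key, sep, value = spec.partition('=')
    if not sep or not value:
        raise ValueError("tweak spec must look like key=value: %r" % spec)
    if key == 'version':
        return ('version', None, value)
    if key == 'toolchain':
        name, comma, version = value.partition(',')
        if not comma or not version:
            raise ValueError("toolchain tweak needs name,version: %r" % spec)
        return ('toolchain', None, {'name': name, 'version': version})
    if key.startswith('dep:'):
        return ('dependency', key[len('dep:'):], value)
    raise ValueError("unknown tweak key: %r" % key)


def apply_tweaks(easyconfig, specs):
    """Return a tweaked copy of an easyconfig-like dict; the input is untouched."""
    tweaked = copy.deepcopy(easyconfig)
    for spec in specs:
        kind, name, value = parse_tweak(spec)
        if kind in ('version', 'toolchain'):
            tweaked[kind] = value
        else:
            deps = tweaked.get('dependencies', [])
            if name not in [dep[0] for dep in deps]:
                raise KeyError("no dependency named %r to tweak" % name)
            # Keep any extra tuple elements (e.g. a version suffix) as they are.
            tweaked['dependencies'] = [
                (dep[0], value) + tuple(dep[2:]) if dep[0] == name else dep
                for dep in deps
            ]
    return tweaked


if __name__ == '__main__':
    original = {
        'name': 'example',
        'version': '1.0',
        'toolchain': {'name': 'foss', 'version': '2019b'},
        'dependencies': [('Python', '3.7.4'), ('CMake', '3.15.3')],
    }
    print(apply_tweaks(original, ['version=1.1', 'dep:Python=3.8.1']))
```

Note that this leaves version suffixes alone on purpose. Keeping a suffix in sync with the dependency version it encodes is a separate problem that a real implementation would have to handle.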
How do you query the five or ten dependencies it has? Everything is going to be different for each of these non-standard things, so that makes it trickier, and you need to have a good way to implement that. You don't want the framework to be aware of how to check an OpenFOAM version; that should be done in the OpenFOAM easyblock. So you need a change in the internal API, so easyblocks can tell the framework how to check for a version, how to check what the latest version is. So this has to be worked out.

And then something I actually worked on for a while and opened a pull request for, but never merged because it wasn't finished, so it doesn't make sense to merge it yet. In the latest release we have a --copy-ec option to copy an existing easyconfig file to somewhere else, and a --show-ec to just print it to the screen. These are quite new, and I used these names because in the longer run I would like to add options like "new ec", where you can just throw stuff at it without telling it what it actually is, and EasyBuild can make a good guess of what the values that you pass mean. I mean, if you give it something like this: it's a string and I don't know what it is, so it's probably the software name. This has digits and dots, so that's probably the version. It can just take the values as they come and make an intelligent choice of what they are.

And then you can pass it the location where to download the sources, so it can tell: this looks like something I can unpack, so this is the file name, this is the source URL. And you can then make it smart enough that you can say: this value is the dependencies.
So don't try to make an intelligent guess, but use what I tell you. Up to some level I got this to work, and it's like AI in EasyBuild, or you can sell it that way. But at least this can give you an easyconfig file to start from, so you don't have to remember what the syntax is in an easyconfig file. Just throw stuff at it, it gives you a file, and at least that's a template or a starting point you can use. This is fun to work on, but yeah, time-consuming as well.

That's what I have in terms of presentation. I'm sure there are lots of questions. There will be plenty of time throughout the coming days as well: coffee breaks, lunch, dinner. If there's more difficult stuff, we can discuss it over wine or beer. But yeah, grab me or grab any of the EasyBuild maintainers with the red name tag if you have any questions. And we can probably take some questions now as well. We have some time until lunch: lunch is at one and John's talk is about half an hour. So if there are any questions, we can take them now. Or maybe people watching the stream can ask questions on Slack too. Okay, send them to me, I can try and come up with an answer.

Any questions? Yeah, maybe pass the mic, so people on the stream hear the question as well.

Thank you, Kenneth, for a nice talk. I actually have a comment more than a question, or a discussion point, whatever you want. You were talking about the dependency versions and how you don't want to get rid of the fixed dependency versions, because it makes it more difficult to test things. I completely agree. I'm very much in favor of giving something where you say: okay, I built it with this dependency and at least I got it to work. So there's at least one person in the world who got it to work with this dependency version. Yeah, that's nice to know. But can't we do something like an argument saying: okay, now I want to do a try argument.
I want to try it with flexible dependencies, try it with a dependency larger than this version, or something like that.

This could be implemented in the tweak idea, right?

But then you really make a new...

Yeah, I mean, you're making an easyconfig with a fixed version, but you get a command line option to change the version in place.

Yeah, but then you still need to figure out: what do I have available on my system?

Right, you might already have, let's say, a CMake available. You could ask EasyBuild to detect anything above CMake 3.something, or you could ask EasyBuild to detect it. Rather than hard-fixing a version here, you could say "Python:avail", so whatever module I have that works, use that.

That's what my PR does at the moment.

So that's great, because my preference would be to still have an easyconfig with one version where you say: okay, this is what we test, this is what we guarantee works. Yeah, but I have something else, so feel free to try whatever you want.

That makes sense. Yeah, cool. Thanks.

How old is that pull request? A year?

A year, yeah. But it's a bit complex. It's not the dependencies that are a problem, it's things like version suffixes. Because you have things like binutils version suffixes, which also have a version, right? So you have double versions going on, and so to update to the latest actually means updating two things, maybe. And the same with Python. That kind of was disappearing, because we were going to have the Python 2 and 3 combined thing. That would have been solved, and now it's going to appear again. It's coming back because we're mostly going forward with only Python 3. But there is something there to look at.

Yeah, but that's what makes it uglier than it needs to be.
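The "use whatever I already have available, as long as it is recent enough" idea from this exchange can be sketched as below. This is not how the pull request under discussion works (its details are not in the recording); the function names and the `available` mapping of software name to installed versions are assumptions for illustration, and the sketch deliberately ignores version suffixes and toolchains, which are exactly the parts described above as making the real problem ugly.

```python
def version_key(version):
    """Turn a plain dotted version like '3.15.3' into a comparable tuple."""
    return tuple(int(part) for part in version.split('.'))


def pick_available(name, available, minimum=None):
    """Return the highest available version of `name`, or None.

    `available` maps a software name to the list of versions that are
    already installed. If `minimum` is given, older versions are ignored.
    """
    candidates = available.get(name, [])
    if minimum is not None:
        candidates = [ver for ver in candidates
                      if version_key(ver) >= version_key(minimum)]
    if not candidates:
        return None
    return max(candidates, key=version_key)


if __name__ == '__main__':
    installed = {'CMake': ['3.9.6', '3.15.3', '3.12.1']}
    print(pick_available('CMake', installed, minimum='3.10'))
```

Comparing numeric tuples rather than strings matters here: as strings, '3.9.6' would sort above '3.15.3'.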
Yeah. I did a very quick try, not in the framework itself but in a Python script that uses EasyBuild, to do some of these things, and I quickly bumped into issues like this as well. At first sight it's very straightforward, but then there are corner cases that you have to take into account.

It's complicated, and then there's this issue as well with the extensions, because the extensions are hidden inside the files themselves, right? So if you want to update extensions as well... Yeah, there are other things.

Yeah, I think we have to pull things apart, maybe. Do it first for dependencies; extensions are really a separate problem. And that's why I think the implementation of something like this has to be stepwise. You have to have a plan to do it piece by piece.

All right, I just wanted to make a comment about the fixed versioning, and I'm surprised you didn't bring this up, but that's part of the selling point of EasyBuild for us at our site. This is the idea of reproducible science. People are publishing papers and they're using, you know, something from an easyconfig, from a module, that's a hundred percent reproducible. Anyone can build that and reproduce that science. So that's one of the big features that we push with EasyBuild locally at our site: this idea that your paper would be reproducible at another site.

Yeah, I guess that's also the reason why we have fixed versions. Another reason is that it's a lot easier to have fixed versions than to make it flexible. I know in Spack it's very flexible, but that has partially blown up in their faces as well. I'm sure Massimiliano will disagree with me. That's based on what I see passing by on the mailing list: I'm on the Spack mailing list as well.
I'm silent, but I'm there. So I see people sometimes running into issues because of the flexibility that Spack has.

I wouldn't say so.

Yeah, yeah, get the mic first. Maybe we can get back to this at the end of the day, when you do the Spack talk. Because I will ask you questions like this. Think of that.

So once you have something installed, you can get a file where everything is pinned down, and you can give this to somebody else.

Yeah. And would you say the chances of that working for somebody else are as high as they are in EasyBuild, or are there reasons why a lock file you give me would not work for me? The Spack version, or the processor architecture?

Using the same version. It has to be the same version.

Yeah. So I guess one main difference between EasyBuild and Spack is then that in EasyBuild we have a central repository where we share stuff. That's not there for Spack, is it? A central repository where we basically share what you call lock files. And is that just something the Spack community is not interested in, or...? Okay. Yeah. Is this stuff you'll talk about in your talk as well? Yeah, all right.

Maybe I didn't see this in the documentation, but is there an option for --new-pr to do a local syntax check, and if the checksum is...

Yes, and that's something Sam will talk about as well.

Okay, great.

There's a --check-contrib option that scans your easyconfig files, and that can tell you about code style issues and things like missing checksums. It doesn't do everything though, because in a pull request we run a bunch of checks only for the files that are touched in the pull request. That's not something --check-contrib is doing now, and that's something we have to fix, because we have one way of checking things in pull requests and then another way of checking things in --check-contrib. They should be one and the same, not two different things.

Thanks.

Is anyone keeping an eye on questions on the Slack channel as well?
Okay, yeah, Shahzeb.

Yeah, so one of the features that I really use is searching with the dry run option, and I've noticed that over the releases this feature is becoming slower and slower, and it's obviously because of the number of easyconfigs that we have.

Yes.

And now it's almost to the extent where it's just not really practical to run it, it just takes forever. And given that, as you mentioned in the slides, we're getting to like 10,000 easyconfigs by this year, is there any plan to manage this in some way?

It's a good point. So first of all, there's a bug we fixed recently that should fix some of that. I don't know if it's in 4.1.0 or 4.1.1, but there was something we fixed along those lines. So if you're not using the latest version yet, try the latest version, it may be better.

But another idea I've had, and I should have mentioned it on the slides, is to have support in EasyBuild for building a cache, like Lmod has. When you install EasyBuild, it comes with, let's say, 10,000 easyconfig files. It could have a single file that has all the needed information, like where the easyconfig files are and which ones are available, in a single file, which will be a lot faster. It can just load that file in memory and then scan through that, rather than hitting the file system all the time. That's the main reason why things are so slow.

So that's one idea, and I don't think it's very difficult to implement. But when you want to fix something with a cache, then you have two problems, right?
You have the problem, and the cache to worry about. And I'm sure Robert will tell you the same thing. Lmod has a good spider cache that it can use, but now you have a cache that you have to keep up to date. So there are a few issues around that, but that would probably solve the issue with dry run being slow.

Another problem with dry run being slow is the file system. We changed the file system where we store the software, and the dry run speed increased by, what did you say, 5, 10 times.

Yeah, it's lots of small files, and that's not typically a common use case on HPC, so yeah.

But I guess if you did have a cache file that came with the release, for example, then at least you'd be caching that 10,000, or whatever that figure is.

Yes.

And then what people do locally would be on top.

Yeah, it depends on what goes in the cache file. I would expect that the location of the files will have to be in the cache file, which you cannot pre-cache, since it depends on where EasyBuild is installed. We could probably make it relative to where the cache file itself is, or something like that. That would help: if the cache file is next to your easyconfig files, we know where stuff is and you don't have to do file system lookups for that. So we can ship a cache file, and have an option in EasyBuild to update the easyconfigs cache if you're adding stuff to it. But then you need to do things like keep track of when the cache was updated, maybe print a warning if it's older than a day or a week, and make that configurable.

But I mean, even if you did it only for the stuff in the release, not updated at all, right? Just do it for those.

That could already be a big help. But I'm sure people here have their own repository, which also has hundreds or maybe thousands of easyconfigs. So yeah, this could be a stepwise thing as well. You could do it only for
the ones we include, then make it possible to scan an external repository and build a cache file for that, and just see what happens. I mean, Lmod does things with the cache that we will probably also have to do. It has an auto-expire setting that you can tweak: when is a cache file considered to be stale? And you have to be careful in updating the cache, because it has to be atomic, pretty much. So yeah, not that trivial.

Any more questions? Okay, if there are none for now, I guess we can get set up. John, that gives you an extra 10 or 15 minutes.
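The easyconfig cache idea from the last question can be sketched as follows: one index file stored next to the easyconfig files, paths kept relative to that file so a shipped cache still works wherever it is installed, an atomic update, and a configurable staleness check. This is a minimal sketch under stated assumptions, not EasyBuild's or Lmod's implementation: the file name `.eb_index.json`, the JSON layout, the function names and the one-week default are all invented for illustration.

```python
import json
import os
import tempfile
import time

INDEX_NAME = '.eb_index.json'  # hypothetical file name


def build_index(root):
    """Walk `root` once and record all *.eb files, relative to `root`."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith('.eb'):
                full_path = os.path.join(dirpath, filename)
                paths.append(os.path.relpath(full_path, root))
    return {'created': time.time(), 'paths': sorted(paths)}


def write_index(root, index):
    """Write the index atomically: temp file in the same dir, then rename."""
    fd, tmp_path = tempfile.mkstemp(dir=root, suffix='.tmp')
    try:
        with os.fdopen(fd, 'w') as handle:
            json.dump(index, handle)
        # os.replace is atomic when source and target are on the same
        # file system, so readers never see a half-written index.
        os.replace(tmp_path, os.path.join(root, INDEX_NAME))
    finally:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)


def load_index(root, max_age=7 * 24 * 3600):
    """Load the index; also report whether it is older than `max_age` seconds."""
    with open(os.path.join(root, INDEX_NAME)) as handle:
        index = json.load(handle)
    stale = (time.time() - index['created']) > max_age
    return index, stale


if __name__ == '__main__':
    with tempfile.TemporaryDirectory() as repo:
        os.makedirs(os.path.join(repo, 'g', 'GCC'))
        open(os.path.join(repo, 'g', 'GCC', 'GCC-9.2.0.eb'), 'w').close()
        write_index(repo, build_index(repo))
        index, stale = load_index(repo)
        print(index['paths'], 'stale' if stale else 'fresh')
```

A search would then scan `index['paths']` in memory instead of walking thousands of small files, and could print a warning when `stale` is true.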