Okay, we can start. It's time. So, I'm actually one of those people who is using Guix in a production environment. The website is called genenetwork.org. It's hosted by the University of Tennessee Health Science Center in Memphis, and it's a site that has been developed over the last 20 years. Even though it's essentially a simple setup, we're running Python and MySQL to offer these services, I'm going to explain to you how it got a bit out of hand.
Anyway, what is Guix to me? It's all about sleeping at night. For the last 20 years I've been managing systems, sometimes running 40 servers or something in that order. And you would get a call in the middle of the night and somebody says, hey, you've got to run to the data center. That's something I really want to avoid. I could have somebody else go now, I think. But even so, we are doing better. So, it's about controlled software deployment. And it's about controlling the full dependency graph, and I'll explain that in a bit. And also, as Ludo so nicely pointed out, you want your system to be deterministic. Once you deploy it, you want to really know it's the one that you actually deployed.
So, I'm going to talk about four things. One is deployment in a development, testing, staging and production environment. The second point I want to discuss is easy installation, and that will be the longest point. Distributed workflows I'll discuss briefly. And then orchestration of services is something we could see as the next step.
So, when you talk about deployment, it's essentially simple. But it's all about dependencies, and there's a link here at the bottom. This slide you can find online. And I'm going to show you a little bit of what it looks like. So, this is a graph that has been produced by Guix. When you put all the dependencies together, it can show you an SVG. And I hope I can scroll it.
So, somewhere at the top, there's GeneNetwork, right? This is actually the starting point. GeneNetwork depends on quite a few things. Let's go sideways and see what's here. So, there's Redis. And there's R/qtl, which is an R library. There's Python Parallel in there. So, here are some of the R packages, many of which Ricardo has packaged. It continues for a bit. And then, let's see. There's also somewhere... Oops. Fontconfig? Yeah. So, fonts, CUPS is even there. God knows why. This is the glibc stuff. And then we're getting into XML parsing. We're getting into fonts. Some X stuff somehow. So, how is this all possible? Well, let's see what's here. I'm missing out on the Python stuff, because it's also quite huge. Anyway. LZ4 is there. Valgrind is even there. glibc is fortunately there. [unclear] Anyway.
So, to build a system deterministically, you have the immediate dependencies, which are quite obvious. Maybe we're using SciPy, or R/qtl, which depends on a few other R packages. But to get to these environments and deploy them, these packages have their own dependencies too. So, when I showed this to my professor, he said, how come Ruby's in there? There's not supposed to be any Ruby in there. But there's a dependency on Ruby somewhere. One of these packages requires Ruby. So, this is the stuff we're finding out.
And the versioning is extremely important, right? When you want to have something deterministic, you don't want to be building on shifting sand. One thing I didn't show you is that there are three Pythons in there. There's Python 2.4, Python 2.7, and Python 3.3, I think, or 3.1, whatever they are. So, there are packages also depending on different Pythons. So, let's go back to the slides.
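The way an unexpected package such as Ruby ends up in the graph can be illustrated with a small transitive-closure walk. This is only a sketch: the dependency table below is hypothetical and heavily trimmed (the name `some-build-tool` and every edge are made up for illustration), and it is not how Guix itself computes or draws the graph.

```python
from typing import Dict, List, Set

# Hypothetical, heavily trimmed dependency table. The edges are invented
# for illustration; the real GeneNetwork graph is far larger.
DEPENDS_ON: Dict[str, List[str]] = {
    "genenetwork": ["python", "redis", "r-qtl"],
    "r-qtl": ["r"],
    "r": ["fontconfig", "some-build-tool"],
    "some-build-tool": ["ruby"],
    "python": ["glibc"],
    "redis": ["glibc"],
    "fontconfig": ["glibc"],
    "ruby": ["glibc"],
    "glibc": [],
}


def closure(package: str, graph: Dict[str, List[str]]) -> Set[str]:
    """Return every direct and indirect dependency of `package`."""
    seen: Set[str] = set()
    stack = [package]
    while stack:
        current = stack.pop()
        for dep in graph.get(current, []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def chain(start: str, target: str, graph: Dict[str, List[str]]) -> List[str]:
    """Return one dependency chain from `start` to `target`, or [] if none."""
    stack = [[start]]
    visited = {start}
    while stack:
        path = stack.pop()
        if path[-1] == target:
            return path
        for dep in graph.get(path[-1], []):
            if dep not in visited:
                visited.add(dep)
                stack.append(path + [dep])
    return []


if __name__ == "__main__":
    print(sorted(closure("genenetwork", DEPENDS_ON)))
    print(" -> ".join(chain("genenetwork", "ruby", DEPENDS_ON)))
```

Asking for a chain rather than only the closure is what answers the "how come Ruby's in there?" kind of question: it names the intermediate packages responsible.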
So, essentially, what is simple is actually not so simple. There are also dependencies on OpenBLAS and ATLAS, which are kind of competing libraries. And then there's the Ruby dependency. And the number of dependencies is actually growing over time. Any time you add a package, you'll get more dependencies in. And I've so far avoided doing the JavaScript stuff, because the JavaScript we're loading ad hoc now, which npm, the Node Package Manager, is actually great at. But it's such a mess. It's unbelievable. How am I doing for time?
Also in development, the developers were essentially using different systems. They were doing it ad hoc. So, you say, okay, I want Python, so you install Python. You want to install SciPy, you will get some version of SciPy. And this raises real questions. If you find a bug, is it actually inside the codebase itself? Or is it inside the underlying dependencies? There's not a chance of reproducing these unless you work on exactly the same machine.
So, a year ago, heroically, I started packaging this stuff in GNU Guix. Essentially, what I did is I took a checkout of Guix itself. I started from the trunk of Guix and started building from there. This means that at that point in time I sort of fixed the environment. I started building against exactly the version of Python that was already in Guix. The problem with Guix is that if you keep updating the Git tree of Guix itself, these packages change as well. So, over time, you will be hitting issues with dependencies that are not working anymore. So, I fixed it in February. And next to that, I was using the GUIX_PACKAGE_PATH. So, I put in my own packages, or our own packages. And in this repository, the link is here, there are now some hundred-plus packages.
And some of them are now duplicates of what is in trunk, because the trunk has moved on. I actually did synchronize in August last year, and it was not trivial to move from February to August, because there were a number of packages that were duplicates, but there were also packages that somehow were broken. So, I had to revert some things. So, switching to an updated Guix tree after half a year is not trivial, actually. I'm not looking forward to the next one because it's going to cost me a day, maybe two days. It will be possible. But it's work that you never look forward to.
Another solution I put in is using shared profiles. Essentially, what we do now is we have GeneNetwork. We fixed the Guix tree. We fixed the GUIX_PACKAGE_PATH with our own packages. And then we deploy it. We deploy it under /usr/local/share, in a GN staging directory. So, this is a staging branch. And then comes the version of the tree. When we get to the next stage, if we want to move it to production, what I do is mount the profile under /usr/local/share in a GN production directory with this particular version. So, this is how we move forward. And all the developers and deployment people share the same structure. So, we know exactly at any point in time what is actually in GeneNetwork 2.0 at a8fcff4, based on the August checkout of Guix.
Guix channels could be an improvement, and that's what I'm going to touch on in a bit. It's a discussion we have on the main mailing list. Guix channels do not exist at this point, right? But the idea is that we allow people to provide different versions of the Guix tree. So, for example, say you have a product, like GeneNetwork, and I want to tell somebody else: please install this product at this version.
I would provide him a channel, essentially. I would tell him to tell Guix to use this channel. And from that point onwards, he'll be installing software from that particular version of the tree. That's ideally what I want. And you can also roll backwards. Ruby 1.8.7 is a very old Ruby, and sometimes you need it. You could actually provide a channel for that. Say, okay, I'm going to provide a Ruby 1.8.7 channel that other people can use, which is disconnected from the main Guix trunk.
Another problem with the GUIX_PACKAGE_PATH at this point is that it is disconnected, because it doesn't actually look like the Guix tree itself. So if you create a package in your own GUIX_PACKAGE_PATH and you want to migrate it into trunk, it's actually a bit of work. Work is not good. So if we had a channel that is actually a reflection of the Guix tree, it would become much more trivial to merge patches with the main trunk. For now, I'm the one who is juggling branches, because I cannot even explain to others how to do this stuff. They're not ready for it.
The second part of my talk is about installation, and I gave this also yesterday for the HPC group. We also have the problem that we are running on high-performance computing systems and supercomputers. We want to run Guix packages there, and the guys and girls who manage these systems are highly resistant to giving you root. They don't want to give you administrative privileges. And Guix itself requires the Guix daemon to install stuff, and to start that you need administrative privileges. It's a no-go in many HPC environments. There are a few exceptions now, Roel there has got one. But it's rare. It'll come over time.
The administrators will start to realize that we are actually saving them time and that they get better environments. But anyway. To circumvent this problem in HPC environments, what people start to do is use stuff like Brew, Conda and EasyBuild, which are build environments that work under a local user account. The downside of these build systems is that they're non-reproducible. Well, they're reproducible as long as you're using the same home directory that you're building stuff in. But it's also hard to share these build tools. And you need to build from scratch every time, which is a lot of work. You need to bootstrap. Docker and container solutions in HPC environments are not an option. There are actually HPC environments that are trying to provide them now. But even Docker needs administrative privileges, and people are always worried that their systems are going to get screwed.
So something I wrote in the last couple of months is called relocatable Guix. It was based on an insight which I had with Eelco Dolstra a few years back at FOSDEM. Eelco was here a few hours ago, but he left. He's the inventor of Nix, and Guix was forked from Nix, so they share many similarities. One of them is that we have a path /gnu/store, then a hash value, then glibc and its version, and then, for example, the name of the file. And the key insight is that, if you look into files, this is quite recognizable, right? This hash value is unique. So what about just patching these files with something new? If you look at LDC2, which is a D compiler, and you look at the shared libraries that it uses, you can see this pattern of hash values. All fingerprints. Some of them are shared. So the glibc ones are shared.
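To make the "recognizable fingerprint" point concrete, here is a minimal sketch that scans the raw bytes of a file for store references. The regular expression is an assumption (a 32-character lowercase alphanumeric hash after `/gnu/store/`, followed by a dash and a package name), and `find_store_refs` is a made-up helper, not part of Guix or of the relocation tool from the talk.

```python
import re
from typing import Set

# Assumed shape of a store reference: /gnu/store/<32-char hash>-<name>.
# The character classes are a loose guess, good enough for a sketch.
STORE_REF = re.compile(rb"/gnu/store/[0-9a-z]{32}-[A-Za-z0-9._+-]+")


def find_store_refs(blob: bytes) -> Set[bytes]:
    """Return the distinct store directories referenced inside `blob`."""
    return {match.group(0) for match in STORE_REF.finditer(blob)}


if __name__ == "__main__":
    fake_hash = b"a" * 32  # placeholder, not a real store hash
    blob = (
        b"\x7fELF\x00"
        + b"/gnu/store/" + fake_hash + b"-glibc-2.23/lib/libc.so.6\x00"
        + b"/gnu/store/" + fake_hash + b"-glibc-2.23/lib/libm.so.6\x00"
    )
    for ref in sorted(find_store_refs(blob)):
        print(ref.decode())
```

Because the match stops at the first `/` after the package name, both library paths in the example collapse to the same store directory, which mirrors the observation that the glibc fingerprints are shared.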
So how about relocating those? Replace them with something that is a target prefix. So if you have a home directory /home/user, we're going to replace them, and it will look like this. In this case, I installed it in the home directory under opt, in an LDC test directory, and I've rewritten the hash value to look like this, and these are the files. So after installation, this LDC2 is actually sitting in the home directory under opt. And will this resolve? The answer is yes. It will just work. So just to reiterate, we are replacing that value with that value. And let's see. Looks like I'm skipping something, anyway.
Now I did this as an experiment, and I used a tool that Eelco also created, called patchelf, which allows you to rewrite ELF files. And it worked. For textual files like Ruby, Perl and Bash scripts that have the same fingerprints, it also works to replace them. But some formats, like compiled Python files and JVM files, turned out to be a little less easy, mostly because their strings are not zero-terminated. They're strings that have a length indicator in front of them.
So one night I came up with the solution and said, why don't I keep the size of this path exactly the same? Then I don't have to worry about what it actually looks like inside the file, whether it's zero-terminated or has some other length indicator. So that was the second insight. If you see these two paths, they now have the exact same length. I just patch it in, you just slam them in. And it works. It's definitely the easy way to go. So I replaced it in all files. There are some URLs on here where all these projects are, so you can visit them at leisure if you want. So this is the idea. In this case, I have /home/user/opt in the first one, so I slam it in. In the second case, I tell it to install in /usr/local/share, LDC 1.01.
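A same-length rewrite of this kind might look roughly like the sketch below. It is not the actual tool from the talk: the target layout (`<prefix>/<shortened hash>-<name>`), the way the hash is shortened to make room for a longer prefix, and the names `relocated_path` and `rewrite` are all assumptions made for illustration, as are the fake hash and version number.

```python
STORE_DIR = "/gnu/store/"


def relocated_path(store_item: str, prefix: str) -> str:
    """Build a replacement for `store_item` under `prefix` with identical length.

    Assumed layout: <prefix>/<shortened hash>-<name>. The hash is shortened
    by exactly as many characters as the longer prefix needs.
    """
    if not store_item.startswith(STORE_DIR):
        raise ValueError("not a store path: " + store_item)
    hash_part, dash, name = store_item[len(STORE_DIR):].partition("-")
    if not dash:
        raise ValueError("expected <hash>-<name> after the store directory")
    prefix = prefix.rstrip("/")
    # Characters left for the hash once prefix, "/", "-" and name are placed.
    keep = len(store_item) - len(prefix) - 2 - len(name)
    if keep < 0:
        raise ValueError("prefix too long for a same-length rewrite")
    if keep > len(hash_part):
        raise ValueError("prefix too short: the hash cannot fill the gap")
    return prefix + "/" + hash_part[:keep] + "-" + name


def rewrite(blob: bytes, store_item: str, prefix: str) -> bytes:
    """Replace every occurrence of `store_item` in `blob`, keeping all offsets."""
    old = store_item.encode("ascii")
    new = relocated_path(store_item, prefix).encode("ascii")
    assert len(old) == len(new)
    return blob.replace(old, new)


if __name__ == "__main__":
    # Placeholder hash and hypothetical version, for illustration only.
    item = "/gnu/store/" + "a" * 32 + "-ldc-1.0.0"
    for target in ("/home/user/opt/ldc-test", "/usr/local/share/ldc"):
        moved = relocated_path(item, target)
        print(len(item), len(moved), moved)
```

Since the byte length never changes, a length-prefixed string and a zero-terminated string are patched the same way, which is the reason given for preferring this over format-aware rewriting.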
And you can see it has the exact same length. What I actually do is cannibalize the hash value. So it becomes shorter in the second case because the prefix is longer. You can cannibalize a long way. I can cannibalize all the way back until it doesn't exist anymore, and then I can go a bit further too. If you add it all up, you have about 40 letters you can use as a prefix. You have to come up with these unique paths. And cannibalizing the hash value is not that important in this case, because in Guix we use it to isolate all these directories, but here I tell it to install somewhere unique. I'm telling it to install in this folder, or in the LDC test folder over there. So what it looks like internally in the end doesn't matter, because we're not sharing.
Okay, so for the LDC compiler, there's an example you can actually download. It's online. It's a 40 megabyte download. You unpack it, and it's about 140 megabytes. Installation of the binary takes three seconds on my laptop. So this is very different from these EasyBuild ideas or even Docker installations. This is very lightweight. The only thing I'm doing is replacing all the paths. And there are many people who've tested this now for the D compiler. So yeah, they're happy.
There are two other things that are important. One is that this package, I can call it a package now, contains the shared libraries all the way down to glibc, and even the Linux loader is in there. With many systems, you have the problem that if you deploy binaries somewhere else, you will get conflicts, for example because they want to load the underlying distribution's glibc or libc, or LZ4 library or whatever. There can be problems. There can be mismatches.
So you see a lot of internet forums discussing these problems about libraries not working. And that's the reason. Ten more minutes.
So I also did the same thing with Ruby, SSL and Nokogiri, where Nokogiri is infamous for installation, because it's a Ruby gem that depends on libxml2. I think half of the internet is filled with messages about people having problems installing Nokogiri. It just works.
And Sambamba is a tool we developed. It's written in D and is used in many sequencing high-performance computing centers around the world. And there was one bug that was very, very hard to reproduce. It segfaults occasionally, right? In some conditions, in some environments, but it never reproduced for us. We could see it pop up once in a while. So I created the binary distribution. It was deployed on a cluster in Australia. They ran it for a day. They saw some of these segfaults. They ran the GDB debugger against it. And we found the exact location where the segfault happened and we fixed it. And it's actually not in Sambamba, it's upstream. So I'm actually doing remote debugging. Somebody else is running it and I'm helping with the debugging. And we're going to add more.
So the potential here is also that we can actually create one-click installs for binary packages. And it's something I want to discuss also in the session on the future of Guix. There are security concerns, of course, when you create a binary that's downloadable from the web and you just install it with your eyes closed. We need to work on that somehow. But I think it is really cool that we can use a Guix package that is well tested and has possibly been used by 100,000 people already.
And we just deploy it by rewriting the internal paths.
So the third part of this talk, and I hardly have time left so it's going to be really short, is about workflows. We have a need for running tools in order, sequentially and in parallel, on these high-performance computing systems. Essentially that's called a workflow. And there have been many standards developed for workflows. At the moment there's a project called the Common Workflow Language, and it has a lot of momentum in our community. And it started nicely. It started really as a good idea, as a descriptive document of the workflow. But soon they embedded JavaScript, because they wanted to do more and they wanted to avoid repetition. And they also have looping now. So deterministic it no longer is. And when you look at these workflow standards, occasionally someone will pop up and create a new standard like this one, and it turns out it's actually quite hard to do, because people want to put in the kitchen sink. That's what the problem really is.
So Guix essentially is already a workflow engine, right? We can handle dependencies, which is serial execution, and we can also handle parallel execution in a way, because of the way the build farm has been designed. Roel is going to talk about workflows at 3:30 in this room, and he's been doing a lot of work in this area. I'm not saying that build farms are a great idea for this, and he's probably doing something else.
The fourth part of the talk is orchestration. Making the rounds, visiting companies and academic institutions, this is something that pops up every time. People are using tools like Puppet and Chef and the like. Ludo described services which are inside a computer. But we also have services that run across computers now, right? So we have systems that depend on each other. And this is called orchestration.
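Both the workflow point (dependencies force a serial order, independent steps can run in parallel) and the question of which service has to be up before another come down to ordering a dependency graph. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) for that; the service names and their dependencies are hypothetical, and nothing here reflects how Guix, the build farm, or any orchestration tool actually schedules work.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def stages(needs: Dict[str, Set[str]]) -> List[List[str]]:
    """Group nodes into stages: each stage only needs earlier stages.

    Everything inside one stage is independent and could run in parallel.
    A circular dependency raises graphlib.CycleError.
    """
    sorter = TopologicalSorter(needs)
    sorter.prepare()
    result: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for stable output
        result.append(ready)
        sorter.done(*ready)
    return result


# Hypothetical services: each key needs the services in its set.
NEEDS: Dict[str, Set[str]] = {
    "web-frontend": {"database", "cache"},
    "job-runner": {"database"},
    "database": set(),
    "cache": set(),
}

if __name__ == "__main__":
    for number, stage in enumerate(stages(NEEDS), start=1):
        print(number, stage)
```

Grouping into stages rather than producing one flat order keeps the parallelism visible: here the database and cache could be started together, and the two services that depend on them afterwards.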
One service may need another service to run. How do you set up the order of these? And when one service goes down, what do the other services do? This is called orchestration, and I think with the functional paradigm that we have with Guix, we can actually move forward and make things happen here too. So this may be something for the final discussion today, which is about the future of Guix.
Conclusion. Guix allows for controlled and sane software deployment with profiles and Git, the way I'm doing it, and we should have channels to make it easier for me. Guix has relocatable binary packages now, and we should continue experimenting with that and see how far it gets us. Guix will handle workflows, I have no doubt, because Roel is working hard on it. And orchestration should be on the agenda. That's my conclusion.
And I have acknowledgments for people who are working on this. So Roel, Ludo and Ricardo, and especially the Guix community. One of the things that really strikes me with Guix is that it attracts so many really intelligent people, so I'm really happy you're here. And then I want to thank Professor Robert Williams, who's paying my bills. Thank you. So Chris, you can set up.
Have you got any questions? Yes?
Yes. The source code of what? I missed the last bit only.
If you are not afraid that somebody will ask for the source code. What do you really provide?
The question is, should I be afraid if somebody asks for the source code? It's all online. I only do open source software.
Okay, well, but you have special packages in the GUIX_PACKAGE_PATH.
Okay. We do have the GUIX_PACKAGE_PATH, but it's also online. It's out in the open. I'm highly resistant to using any type of proprietary software. And our own systems are also completely in the open. It's all open source software. So the GUIX_PACKAGE_PATH is just a construct to import a second Git tree into the packaging tree. Right?
I need the second packaging tree only because those packages are not in the main tree. But it is out in the open. Anyone can use it. Anyone can update it. Another question.
How do you save the previous package version?
Sorry, let me hear just the last bit again.
How do you save the package version? Do you just copy and paste a piece of text, or...?
Yes. How do we handle the versioning? For every package you have to say explicitly, okay, this is the version I want to use, or the Git checkout or whatever. You describe it in the package. Have a look at the links that are in my talk, because it's all there. Yes?
Okay, so the question is, how do we handle Python packages in Guix? There are really three ways to go about this. The best way is to actually write Guix packages for Python modules, which also describe their dependencies. You install the package with Guix and it pulls it in with the dependencies. To make it easy we also provide a generator. So if, for example, the package is already in pip or PyPI or whatever they're called, it can pull that information and then create the package for you automatically in Guix. And for this I use the GUIX_PACKAGE_PATH now, for example. But you're not completely at the mercy of Guix, because Guix actually has a package called Python virtualenv. So you can start up a virtual environment for Python and use Python from there, with pip and PyPI, to pull in packages. And the third option is just to use a source tree and to provide that in the PYTHONPATH. So Guix does not force you to work in any one way. You can gradually, incrementally work towards the perfect solution.
So... Yeah, we also have two minutes left. Oh, yes. I'm trying to steal off from you. It's alright.