But even with this pipeline we have to think about how we get our packages from the developer desktop into our production services. We normally work with packages and with versioned packages, but we still need some kind of process for this. So we'll first look at a problem we had, where at least one colleague did something that wasn't that good, and then we'll see step by step how we can improve the process.

We used to have a developer with a package, let's call it possum, with version 1.0. There was a bug in this package, and all the tooling around it was rather difficult. So what he ended up doing to fix the bug was to unpack the tarball, fix the Python code inside, repackage everything, and then deploy that to production. Obviously that's not the way you want to do things. What he also tried was to attach a "golden copy" marker to this particular package, so that you knew it was the special one and not the ordinary one with the unfixed bug. As you can imagine, if you go ahead and let your developers do stuff like this, you will get burned rather quickly. I don't want to blame the developers, because all of us do this: we take the path of least resistance. But it's not the path you want, so you have to invest in tools and processes to improve it. We'll now look at what we did.

So basically we had this problem, and the question was how to solve it. Our solution was to say: we need immutable packages. When we create a Python package, we want to create it only once and then store it forever. When we give this Python package a version, we want to derive that version from our git repository, so that we know exactly: it has this git tag, and there are five commits after it with this checksum. Whenever you see a package, you know exactly which code is in it. Last but not least, we want to ship these as wheels. This means we don't use source distributions but wheel packages, because a wheel is basically just a zip archive of your Python files, C code and so on, and at installation time pip just unpacks the wheel for you. That is much safer than a source distribution. And when we install a package, we pin everything, so we are sure that not only our package has the expected version but also all of its requirements.

I have a short example of how we actually do this. Most of our internal tools use an open source package called setuptools_scm, or a wrapper around it which we call PyScaffold, but the idea is basically the same. The example repository here is also called possum and has two git tags, 1.0 and 1.1. When you go ahead and run your Python setup to build a wheel, setuptools_scm looks at your tags, sees that the latest tag is 1.1 and that there are two commits beyond this tag, and creates a version number like 1.2 with a suffix: it bumps the tag and then appends the number of commits since the tag and the commit checksum. So if you want unique version numbers for your packages, look at setuptools_scm; it will basically solve this for you.
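As an illustration, a minimal setup.py using setuptools_scm could look like the following sketch. The package name matches the possum example from the talk; everything else is an assumption, and PyScaffold wires this up for you automatically.

```python
# Minimal sketch: derive the package version from git tags via setuptools_scm.
from setuptools import setup, find_packages

setup(
    name="possum",
    packages=find_packages(),
    use_scm_version=True,               # version is derived from the latest git tag
    setup_requires=["setuptools_scm"],  # needed at build time
)
```

Building a wheel with `python setup.py bdist_wheel` two commits after the 1.1 tag then yields a version along the lines of 1.1.1.dev2+g1234abc with setuptools_scm's default scheme; whether the tag is bumped to 1.2 as in the talk depends on how the version scheme is configured.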
From the things mentioned before, we now have the foundation laid out: we have packages with unique version numbers. But now we have to upload them somewhere, because it's not enough to have created a unique package on your desktop or on Jenkins; you have to do something with it. For us that means we upload those packages to devpi. devpi is a package server for Python packages and wheels, developed by Holger Krekel, and it has the great advantage that it is compatible with pypi.python.org. You can use it just the way you would use PyPI, but you can run it on your own hardware and your own stack, so you don't have to upload your packages to the internet; you can use your local devpi server.

Here is a short example of how this looks. In the first line we say: I want to talk to this devpi server, with this particular user and this index. We log in as this user, then upload the package, and the package ends up on the server. And as you see in the last line, it's rather easy to install a package: it's an ordinary pip command where you just specify the index you want to install from, so you take the URL of your devpi server and install from it. From the perspective of a single developer, that's basically what devpi offers. Developers also get a website, similar to pypi.python.org: you see the metadata, which packages have been uploaded, the documentation and the docstrings. That's what you get out of the box.
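A sketch of that round trip; the server URL, user and index names are placeholders:

```bash
devpi use https://devpi.example.com/myteam/generic   # point the client at server and index
devpi login myteam --password=<secret>               # authenticate as that user
devpi upload                                         # build and upload the package from the checkout
                                                     # (add the appropriate option to upload wheels rather than sdists)

# installing is an ordinary pip call against the devpi index
pip install --index-url https://devpi.example.com/myteam/generic/+simple/ possum
```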
So we can now look at the entire workflow we've built. Our developers check into git, so you get git tags and commits, and then Jenkins fetches the latest checkout, builds the packages and runs all tests. Once all unit and integration tests have passed, our Jenkins jobs upload these packages to our devpi server, and we basically store all our packages on devpi forever. When we want to install something in production, we just point at devpi and install from there. That's how we do these things, and I guess for some of you in the audience this might already be a suitable workflow.

For us at Blue Yonder it's a little more difficult, because we have a few constraints. First of all, we have multiple teams that collaborate, and with multiple teams there are multiple artifacts being uploaded; we want to keep the artifacts of these different teams separate. To give an example: before we used devpi, we just had one network share and placed all our packages in one large folder on it. It was total chaos, because sometimes people uploaded buggy versions and you didn't know who did what. For our new solution with devpi we wanted to separate teams a little, like namespaces. That's the first constraint. The second constraint is that we ship not only pure Python code but also things like Fortran code or numpy, so stuff that's compiled, and we don't want to compile in production. We don't want GCC on our production servers, and we don't want to spend the time compiling on a desktop either. So we upload binary packages to devpi, all in the form of wheels, and our setup has to support this as well. And the last thing: even though we want to keep our teams separated, we still want to make it easy to manage their dependencies. If my package depends on the package of another team, and on some open source packages, we have to provide a solution that works for all of this. We'll now look into the devpi box up there and see how we achieved it.

So, on to our devpi usage. Back to the example we've seen before, there are two important things to notice. First, look at the wheel. I've said that we want to upload binary packages, and those are built on a specific distribution: on our Jenkins we build a package on Debian 7 or on Debian 8, but unfortunately a package built on one of them won't necessarily run on the other, because we've compiled against system libraries. The wheel format has some support for this, but unfortunately the platform tag only says "linux", not which distribution, so we need to work around that when we lay out our devpi users and indices. The second important thing you see in the upper corner: devpi supports users and indices, and you can have multiple users, each with multiple indices. It's a little like GitHub: on GitHub you have multiple users and each user can have multiple repositories. It's the same on devpi, just with users and indices, and you can think of an index as a bag of Python packages.

Okay, so how do we use this to manage our stuff? What we ended up doing is: for each team we create a devpi user, so they have their own credentials to log in, and then for each operating system that we support we create one index. When we build packages on Jenkins, we upload to that particular index: when we build on Debian 7 we upload to the Debian 7 index, when we build on Debian 8 we upload to the Debian 8 index. This works for the binary packages, but there are also pure Python packages with no compiled parts, and you don't want to upload those to every operating system index. So we also introduced a so-called generic index where these pure Python packages go, and then we use a feature of devpi called index inheritance: we can say the Debian 7 index inherits from the generic index, and then on the Debian 7 index you also see all packages that are available on generic. That is basically the view for one team, but as I said we have multiple teams. So we create one of these users for each team and then add a so-called aggregation index, or as we call it the platform index, which inherits from the corresponding team indices. If you look at the platform user, it has, let's say, a Debian 7 index, and this Debian 7 index inherits from the Debian 7 indices of all the different teams. So in the end, when a user wants to install a library, they just run the single command line below.
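A sketch of how such a layout could be created with the devpi client. All user and index names here are made up to mirror the description, and it assumes the server has already been selected with devpi use:

```bash
# one devpi user per team, one index per operating system, plus a generic index
devpi login teamA --password=<secret>
devpi index -c generic                          # pure-Python wheels
devpi index -c debian7 bases=teamA/generic      # binary wheels built on Debian 7
devpi index -c debian8 bases=teamA/generic      # binary wheels built on Debian 8

# an aggregation user whose indices inherit from all the team indices
devpi login platform --password=<secret>
devpi index -c debian7 bases=teamA/debian7,teamB/debian7

# installing then goes against the platform index for the target OS
pip install --index-url https://devpi.example.com/platform/debian7/+simple/ possum
```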
So you say: pip, install my package from this devpi server, from the platform user, and the last part is the index for the operating system you are using, and then you get the pre-compiled packages for that particular operating system. This is basically how we install all of our packages, except for a few specialties.

One of these is open source packages. These are not built or crafted by our teams but come from the open source community, let's say Django and requests and so on. The different teams depend on those open source packages, so their software won't work without them, but we still have to upload them somewhere. What we decided is to treat the whole open source community as one more team in our company: we created one particular user for it, again with the different operating system indices. We now have a company-wide whitelist of all the open source packages we depend on, we build them as wheels using a tool we call the devpi builder, it uploads them to this open source user, and from there we can install them.

That's basically what we use for libraries, but there's one more thing. Sometimes you have packages that are not a library but a specific application with a specific configuration, supposed to run on one specific machine, and you don't really want to share it. For example, we build machine learning models, and when you have a model for one specific customer you don't want to upload it to the platform part, because then everybody could use it, but they are not supposed to and it wouldn't make any sense for them. So we also create consumer indices, which are basically just indices inheriting from the platform: they get everything, but they can also upload their own stuff.
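A sketch of such a consumer index, with made-up user and index names; it inherits everything from the platform index, while packages uploaded to it stay private to that consumer:

```bash
# assuming the server was already selected with `devpi use`
devpi login customer_x --password=<secret>
devpi index -c debian7 bases=platform/debian7   # sees everything from the platform
devpi use customer_x/debian7
devpi upload                                    # e.g. a customer-specific model package, visible only here
```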
That's basically our index layout and how we use it. I can imagine some of you think this is totally complicated and somewhat unnecessary, but for us it has been a great success. We introduced devpi in our company about a year ago, and within days or weeks everybody had migrated to it, so the adoption has been great. In this one year we have uploaded about 10,000 packages and documentation tarballs, to about three hundred fifteen indices, and all the downloads accumulate to about 18 terabytes. That's quite a lot for packages, but we were able to do all of this with small machines with four cores and four gigabytes of RAM, so the whole setup is rather lightweight, which is a great thing.

But we are now using all of this on Jenkins, on desktops, on servers, everywhere, and we have the problem that if devpi goes down, most of our developers could go home for the day: when they commit something, the Jenkins job won't run because it can't upload the packages. Basically, if devpi goes down, we can't work. So we have to invest: how do we make it stable, how do we keep it up all the time? This is how we do it.

This is first a single-host setup, so we just look at what processes are running on this host. At the front we have an nginx, which is the recommended setup. But what we also do is use a replica and a master; this is a concept of devpi. You can think of the replica as a kind of transparent cache: if you talk to the replica, it can directly serve your packages when you want to install something, but if you want to upload a package, the replica transparently talks to the master, the master performs the state change and stores your package, and finally the replica can serve your request and report that everything worked. For us this has the great advantage that when we want to take a backup of the master, we can simply disable the master while the replica stays up. Requests will still come in, nginx will pass them to the replica, and the replica will allow people to install packages. If someone tries to upload a package during that window it won't work, but at least we can serve traffic. When the backup is done we reactivate the master and everything works again, so we can do zero-downtime backups, at least for installations. It also helps when we have to do a major version update of devpi, where you have to export all the state from the master, create a new master and import all the data again. That's also a case where we can use this setup: we just disable the master and create a new one while the replica keeps serving traffic.

This is the single-host setup, but you still have the problem that this single host might go down. So we've extended it to a multi-host setup. The idea is basically the same: you have two hosts and a reverse proxy in front, which routes incoming traffic to one host or the other, and on each host we again have a replica. There is a single master, because devpi supports only one master, but that's not a problem in our case, because we keep basically the same guarantees as before: if host A goes down, that's all right, because host B is still there to serve traffic using its replica; if host B goes down, we can still serve traffic and also upload packages. For us the serving-traffic part is the important one. We always want to allow people to install packages; uploads only happen during the daytime, so we can also fix our host setup during office hours, but the installation path has to work at night as well. So this is a great setup for us, because we don't need to get up at night to fix it.
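As a rough sketch of the single-host idea; the directories and ports are made up, and the exact devpi-server flags vary between versions:

```bash
# one master that owns the state, one replica that serves the traffic
devpi-server --serverdir /srv/devpi/master  --port 3141 &
devpi-server --serverdir /srv/devpi/replica --port 4040 \
             --master-url http://localhost:3141 &

# nginx then proxies all client traffic to the replica (port 4040);
# the replica forwards uploads to the master, so the master can be
# stopped for backups or upgrades while installs keep working.
```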
This is roughly how we use devpi; here is the general structure again, as a short repetition of what you've seen. First, we create wheel packages, so binary packages, and we give them unique versions using setuptools_scm. We then upload all of this to devpi indices, with one index per operating system. Here's one fact I've missed before: we flag all our devpi indices as non-volatile. This means you can upload new packages to them, but you can't overwrite existing ones and you can't delete existing ones. That is important, because then we only ever create packages, and a user doesn't have to fear that installing the same thing again gives them another version; they are always sure to get the right one. The last thing is that we create users for our internal teams. So that's basically it.
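A sketch of the non-volatile flag mentioned above; the index name is a placeholder, and devpi lets you set this when creating or modifying an index:

```bash
devpi index teamA/debian7 volatile=False   # new uploads allowed, overwriting and deleting are not
```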
If you adopt this setup, the first three steps might be enough for you for now, and later, when it has to be highly available, you can look into the replica feature and use it to your advantage.

One last set of tips if you decide this is interesting, some lessons we have learned that might help you. First of all, you want to set up a test server that your users can use. We had many cases where people created new build jobs for new artifacts, and as you know with Jenkins, your first run won't always succeed, so you get many broken packages. Then people had to come to us and say: can you please delete this package, I somehow messed up the versioning. But because all these indices are non-volatile they can't delete it themselves, they have to come to us, and that is overhead. You can solve this by also giving them a test devpi server and letting the test installs go there; that helps a lot. The other thing: it is advisable not to let your users use the devpi client directly. It has some important command line arguments that you have to use in a shared environment, or when you're using things like setuptools_scm, and people tend to forget them. If you use a client wrapper, you can fix those arguments for all of your users, so they can't get it wrong.

So, that's basically the run through our setup. I hope you've enjoyed it, and thank you for your attention. I'm open for questions now, but you can also come by our booth and ask us afterwards. Thank you.

Any questions?

Q: You were talking about the group for all the dependencies from the open source community. Can that work like a proxying repository for them? So if I request a new package that hasn't been downloaded yet, would it download it and cache it for later?

A: Right. devpi has a mirroring feature, so by default you wouldn't need this OSS index the way we use it. devpi has a special user and index called root/pypi, and through it you get all packages from pypi.python.org by default, so you don't have to do anything manually. It's just that we decided we want to repackage all those packages as wheels, and most of them are not available as wheels on pypi.python.org, so we upload them explicitly. Doing it as a manual step, where you explicitly say these packages should be on my index, also has an advantage: as a company, and because we do software as a service, we have to look at the licenses we are using. We are not allowed to use just any package from the internet; we have to check the licenses, and for example BSD or MIT is okay. So we do this license check when users ask to add a package and a version to this large whitelist, and then our tool goes ahead and uploads them automatically. So we have this step in between, but you don't need it; by default you can use the PyPI mirroring feature.

Q: How do you do authentication when installing packages? I assume the devpi server is available publicly, on the internet, so that the servers can actually install from it.

A: You mean authentication during install or during uploads?

Q: Exactly, because I think you upload private packages, so not everybody should be able to install them.

A: That is somewhat a problem. The server is standing in our company, only internally, so it's not available from the internet. That is feasible for us because we don't run stuff in the public cloud but only on our own computers in our own data centers, so this works for us. With devpi, by default you have passwords when you upload; you can use LDAP users and LDAP groups for authentication, or just users you created on the devpi server. But during install there's no authentication: by default, if you know an index, a user and a package that's there, you can install it. That's the current state, so the only way would be basic auth or something like that from nginx.

Q: Once again, please?

A: Basic authentication would work; that's what we use, but I thought maybe there's a better solution.

Q: Yeah, that might work; I would be interested to talk to you afterwards. But for now: you use it to build wheels in your private repositories, but you also use the mirroring feature. Is there any way that I can have a local index with my packages that aren't exposed, still use the mirroring feature, but also build a wheel for lxml or NumPy or whatever?

A: That works, because as I said you have this root/pypi index, and when you have your own index you can inherit from root/pypi. So first of all you will see your own packages and everything from the internet, and if you then decide you want a specific package as a wheel, you repackage it as a wheel and upload it to your index. Then you basically get the rest of the internet, your own packages, and this one as a wheel. But there's one thing you have to look out for; it's somewhat a security issue, or a feature of devpi. When you upload a package that's also available on the internet, devpi goes ahead and says: I won't look at the internet for this package anymore, because otherwise some guy might go to pypi.python.org and upload a package with the name of your private package and a higher version number, and then you would get that one. So there's a whitelisting feature, and you have to explicitly say: I uploaded this one as a wheel, but also keep looking for it on the internet.

Q: So I kind of have to pin the version? If I upload lxml 3.4 and 3.5 gets uploaded, I have to rebuild it?
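A sketch of the variant discussed in this answer: an index that inherits from devpi's built-in pypi.python.org mirror, with one package whitelisted so devpi keeps looking upstream even though a wheel of the same name was uploaded locally. The index name is a placeholder, and the option name differs between devpi versions (pypi_whitelist in older releases, mirror_whitelist in newer ones):

```bash
devpi index -c wheels bases=root/pypi pypi_whitelist=lxml   # local wheels plus the PyPI mirror
```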
Q: Thanks for the talk, firstly. Secondly, do you support something like snapshotting? I want to deploy a version to production, and I want to know that version 2.0, for example, has exactly one set of packages, while I can still develop further.

A: Unfortunately I didn't get the question, can you repeat it, please? ... By default there's no such feature, but you can somehow build it yourself. There's a feature called devpi push, so you can say: this package in this version on this particular index, I want to move it to another index without modifying it. So you can basically do the snapshotting yourself: move all the packages of a release to another index, and then they can't be modified. Is this what you want?

Q: Yeah.

A: Maybe an explanation why we don't use this feature: we have some guys in our company who like to automate stuff, so they would simply automate this devpi push command at the end of the Jenkins job, and then there wouldn't be any advantage anymore. So we said we don't need it, because they will automate it anyway.

Any more questions? Okay, thanks very much, Stefan.