So, this next talk is by Patrick Malboer. He will explain how to split a monolithic application repository, okay? So, good luck.

Thanks. Yeah, hi, welcome to my talk. I'm Patrick, I'm working as a software developer at Blue Yonder. If you came here because you only read the title without the abstract, like I do sometimes, and are now expecting that I will show you how to split a monolithic application into a microservice architecture, then I have to disappoint you: that's not what I'm talking about. I want to talk about what we did with our application, which consisted of various Python packages, all of them in one repository. At one point we decided we wanted one repository for every package we developed.

So, let's start with an easy example of how a Python package usually looks. You have, for example, this structure: the actual package, my_super_library, with the standard __init__ module in it. Maybe you have a requirements file, you probably have a setup.py file, and of course you will have tests. For us it looked something like this: we had lots of these packages in one repository. What we also had was one requirements file for the whole application, and this was probably the worst design decision at the beginning. Even for the unit tests of each of these libraries we just installed everything into the virtualenv and used that to run the tests. And we actually had a lot more than four of these packages in there.

So, why did we even want to split it? There were various reasons. One was that other teams in our company started using some of the libraries in their own projects, and they always complained: we want to contribute, but every time we have to look into this big repository it's such a pain just to get it running. So that was one reason. Another reason was what I called spaghetti code here.
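The package layout described here might be sketched like this (the names are illustrative, not the actual project):

```
my_super_library/
├── my_super_library/
│   └── __init__.py
├── tests/
│   └── test_my_super_library.py
├── requirements.txt
└── setup.py
```

The monolithic repository then contained many such package directories side by side, plus the one shared requirements file for the whole application.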
What I mean here is cross dependencies. For example, in library one you import from library two, even though you don't actually want that in the end, because it's not well structured. Also, as other teams started to use our libraries, we had to release them, and that's easier if they have their own Git repository. And we wanted to use something like setuptools_scm to get automatic versioning.

If you don't know setuptools_scm: it creates versions for your Python package out of your Git or Mercurial tags. Usually a setup.py file looks something like this, with a version keyword argument where you have to manually adapt the version string every time you want to do a new release. setuptools_scm does this for you automatically. So here you would just say: my setup requires setuptools_scm to be available, and I want to use this SCM version. What such a version then looks like is something like this. Here we are one commit after the latest tag: before, I had set a tag 0.0.1, and the dev1 part says you are one commit after the latest one. At the end, after the plus and the g, you have the start of the current Git commit hash.

Okay, so talking about Git commit hashes: we decided we wanted to split our monolithic Git repository, but how do we do it? If you just move certain sub-packages somewhere else and initialize a repository from scratch, you lose all the history you already have. But with Git — I don't know if there's something similar in Mercurial or any other SCM — you have subtree, and subtree has a sub-command, split, which creates a new history of commits for a specified prefix. Here I have library three as the prefix, and if you also specify a branch name, it will create a new branch that contains exactly this newly generated history.
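A minimal setup.py using setuptools_scm as described here might look like this (the package name is illustrative):

```python
from setuptools import setup

setup(
    name="my-super-library",
    # derive the version from the latest Git tag instead of a
    # hard-coded version="0.0.1" string
    use_scm_version=True,
    setup_requires=["setuptools_scm"],
)
```

With a tag 0.0.1 and one commit on top of it, setuptools_scm then produces a version such as 0.0.2.dev1+g followed by the short commit hash.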
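The subtree split described here could look like the following (repository paths and branch names are illustrative):

```shell
# In the monolithic repository: build a new history containing only
# the commits that touched library_three/
git subtree split --prefix=library_three --branch=library_three-split

# Elsewhere: create the new repository and pull that history into it
mkdir library_three && cd library_three
git init
git pull /path/to/monolith library_three-split
```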
What you can then do is create your new package, initialize it with Git, and pull this branch you created before from the monolithic repository. Now you have a new repository for library three, with all the history that affected library three. This has to be done for every package in the monolithic Git repository.

What then changed for us was our continuous integration workflow. Before, when we had just this one repository, it didn't matter in which package we made changes for our next feature: Jenkins just checked out the latest commit, every change in every one of our packages was available, and that was easy. That's one of the advantages of monolithic applications or architectures. What we did first was: okay, now we have lots of Git repositories, let's just check out every one of them at the beginning and create our application artifact out of that. This ended up in a really messy Jenkins job. If you don't know Jenkins: it has lots of plugins, and one of them is the Multiple SCMs plugin, where you can specify multiple repositories to be checked out at the beginning of a job. We did this for all of our extracted libraries and had this huge list, and it was horrible. It was especially horrible when we had to do bugfix releases. I mean, can you imagine how hard it is to configure all this? In Jenkins you have to specify which tag, branch or whatever has to be checked out, so if we wanted to do a bugfix release, we had to specify the tags or commit hashes that were used back then for our release, so that we only changed the repository where the bugfix was needed. I think you can imagine that this was really horrible, so don't ever do that. What you actually want, of course, is to use your libraries in your application just like any other library.
And the problem with that was: if you add your libraries to your application's requirements.txt, then every time you change something you have to raise the version in your requirements file, because of course you pin your requirements. That's also not a good workflow, because it happened quite often that we implemented new features in our library packages, the unit tests for the library passed, and we thought we could do a release — but when we actually used the new version in our application we saw: oh no, that's not working at all. You want that feedback a lot faster.

So what can we do here? We came up with this workflow: we run the unit tests of our libraries, and if they pass, we let Jenkins upload a wheel to our internal devpi server. At the beginning of our application job we just install those with the --pre option of pip install. With this option you can install pre-releases: beta releases, alpha releases, or those dev releases you saw earlier, created by setuptools_scm. So in our continuous integration pipeline we always had the newest versions of all libraries — very new, but at least the unit tests passed. Then we created another job for doing actual application releases. If we want to release our application, we now do releases of all the libraries where we know it works with this version, and then run this extra release job, where the --pre option is not used.

Okay, I mentioned devpi — who of you knows devpi, or uses devpi? Okay, not many hands. devpi is a PyPI server. We use it at Blue Yonder for our internal packages, but you can also use it on your laptop. It's also a mirror for PyPI: for example, if you are on the train and want to hack but don't have an internet connection, you can still install a package offline if it's already cached. You can also whitelist and blacklist packages and do lots more. Another topic is requirements pinning.
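The effect of --pre can be illustrated with the packaging library, which implements the same PEP 440 matching rules pip uses (the version numbers are illustrative):

```python
from packaging.specifiers import SpecifierSet

spec = SpecifierSet(">=0.0.1")

# By default, dev and pre-releases are not considered valid candidates
print(spec.contains("0.0.2.dev1"))                    # False
# With prereleases=True (what pip install --pre enables), they are
print(spec.contains("0.0.2.dev1", prereleases=True))  # True
# A final release always matches
print(spec.contains("0.0.2"))                         # True
```

This is why the CI job with --pre picks up the setuptools_scm dev releases, while the release job, without --pre, only ever sees proper releases.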
Who of you has ever had this VersionConflict exception? Okay, my colleagues and a few others. So what we agreed on: we don't pin the requirements in the setup.py file, in install_requires. Because if, for example, library A requires requests greater than 1.0 and smaller than 2.0, and another library wants to use a feature that came with requests 2.0, then you get this annoying VersionConflict error — and most often it doesn't even make sense to say "I want smaller than 2.0" in the first place. The application is then responsible for pinning the correct requirements, and this avoids a lot of these exceptions. One other comment: we don't pin the requirements in the setup.py file, but we do pin them in a special set of requirements for running the tests. Another developer can then check out, say, tag 5.0 and see with which requirements the tests actually passed back then.

Okay, so now we have all those repositories — what did we actually gain? We can use setuptools_scm for every repository, and we have happy library contributors. On the other hand, for us application developers it got a lot more complex: if you develop a new feature, sometimes you have to make changes in three, four, five repositories, you have to keep them updated all the time, and that can be really annoying sometimes. But the quality and the structure of our code really improved. Now every library defines the requirements it needs — just this minimal set — so it doesn't happen so easily that we get these ugly cross imports which you don't actually want. And that's actually a point: now that you have a cleaner structure, it might be easier to see which components change at the same speed, and where it might be good to introduce a new service with only those library packages.
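The convention described here — loose requirements in setup.py, exact pins only for the application and the library's own test runs — might look like this (names and versions are illustrative):

```python
# setup.py of library A: declare what the code needs, without upper
# bounds that would later trigger VersionConflict in the application
from setuptools import setup

setup(
    name="library-a",
    # no "<2.0" pin; the application pins the exact version itself
    install_requires=["requests>=1.0"],
)

# requirements file of library A, used only to run its tests, so a
# later checkout of a tag shows which versions the tests passed with:
#     requests==2.0.1
#
# requirements.txt of the application, where everything is pinned:
#     library-a==5.0
#     requests==2.0.1
```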
So I think what we did up to now is a step before actually getting to microservices. So, yeah. And I think that's all for now, if you have any questions.

Okay, thank you for your talk. Are there questions from the audience?

Thank you for your talk, it was a nice experience — we have pretty much the same problems in our project. You mentioned that during deployment, your Jenkins job is a simple pip install and it always installs the latest versions, right? I mean, the latest development versions for CI and the latest stable versions for the production release. What if you deployed some broken package to production and you need to roll back to the previous version? How do you deal with that? Since I don't think you pin versions during your deployment, how do you roll back to the previous version?

No, no, that does not happen. We don't pin what the libraries want, but we do pin in our application, and we also install with --no-deps, so that we don't get any recursive requirements. I'm not sure I understood your question correctly.

So do you pin versions of your libraries somewhere during production or deployment?

Yes, of course. Like I said, in the libraries we don't pin in the install_requires section of setup.py, but we have an extra requirements file where we pin the versions, and we use these versions to run the tests for the library. In our application we then pin, say, library A == 5.0, so it cannot happen that something else comes in.

Okay, we can talk later, maybe. Thank you.

You mentioned that you have dependencies between libraries — library A imports library B and vice versa. How did you resolve that?

It depends. Sometimes code duplication is better than having these cross dependencies. Yeah, that's basically it.
It depends on how the cross reference looks; I don't have a specific example right now, sorry.

How about one library depending on the other — so for library A you require library B?

Okay, sometimes that's not a problem at all, but we noticed that in some cases we had libraries which depended on another library, and this one again required lots of others, and we again had this big unstructured thing. So, yeah, I don't have the...

Did you consider using Git submodules for your problem, for pinning versions and then being able to have different kinds of release branches?

We said from the beginning we don't want to use Git submodules.

You tried?

No, we didn't; we just felt that it's too easy to get into a mess with submodules.

Okay, because we are doing that — that's how we pin different versions and have even hotfix branches on a super-repository. So maybe we can talk.

If it works out for you, great.

Hi, you mentioned something about code quality — that by following this refactored code structure you had some improvement in code quality. I'm just curious how you measured that. Do you measure some kind of cyclomatic complexity or something else? What were your metrics to say that your code quality improved?

Okay, I have to admit that's just a feeling I had. It's difficult to explain — maybe we can talk after this.

Another one? No? Okay, thank you so much, Patrick. Very well.