Hi, many thanks for the warm welcome. It's great to be at EuroPython today. My name is Alexander. I work on the exchange team at Smarkets, and I'm one of the main authors of Marge-bot, which is the technology we're going to discuss today. This is my colleague Mika. Do you want to quickly introduce yourself?

Hello, I am the head of security at Smarkets, and a few other things.

So, what we want to talk about today is Marge-bot. It's an open source project we are doing, and the aim of it is to make sure that you automatically always have a green master, which means the head of your master should always pass CI tests. Sadly, this is not generally the case in most CI setups, and I will discuss why that is. It's one of my pet peeves, because it's one of those things that is actually not particularly difficult to fix, and I think it gives you big benefits. It should really be a default with GitLab or GitHub or whatever, and at the moment it's not; you need some external tool like Marge-bot.

I'm going to come back to this later: there's something slightly wrong with this picture. Have a look, maybe you can spot it. I will reveal it later.

Cool. So let me give a brief outline of the talk. I'm going to talk about what the problem is with a typical CI setup, which is also how our journey at Smarkets went, and how to fix a typical CI setup, from a conceptual perspective and also from a practical perspective, because conceptually it's actually quite easy but you need to do more work to make it work in practice. In addition to solving the problem of never having a broken master from green pull requests, I'm also going to talk about some of the additional features that Marge provides, which are of interest in our particular setup but I think also for other people. As for the audience takeaways, I think the obvious one is: if you're using GitLab, I encourage you to use Marge-bot. But even if you're not using GitLab, I think there should be some useful takeaways.
It should be clear how to adapt those to other things, and I will also briefly mention some alternative technologies you can use, e.g. for GitHub, or how to do it yourself. And hopefully there's also going to be a little bit of useful Git workflow discussion.

So, cool. First of all, at the risk of stating the obvious, let's briefly see why a broken master is bad. The foremost problem, which actually was quite painful when I started at Smarkets, is that when you can't rely on master working correctly when you start new feature work on a branch, it can be a very high overhead to figure out what actually is your fault and what was a fault you inherited by starting off from a broken master. Obviously it also makes it more likely that you ship broken stuff to production, which is very undesirable. A more subtle point is that it also makes it much harder to retrace your steps. If you find some undesirable behavior which you would like to bisect, bisect is going to break pretty badly if you don't know which commits have random unrelated failures which are not really the bug introduction you're looking for.

Cool, so let me start out by just mentioning the Smarkets workflow as it was when I started at Smarkets. Basically Alice, the rockstar coder, wrote some code and sent a patch via Slack to Bob, the software builder, and Bob either rejected it, in which case we go back to square one, or approved it, in which case Alice pushed it straight to master with some additional meta information for auditing purposes, which Mika can talk about a little bit. But that kind of was it, and then CI was run. Obviously this is not ideal, because it caused frequent breakage. In principle, of course, developers were meant to test things thoroughly before pushing them, but it's much better to automate that with CI.

So basically there are two reasons that your master can be broken.
So the first one is the workflow, and I'm going to talk about how to fix that; Marge-bot will do it for you. The second one is admittedly a bit more difficult: it's flaky builds. Even if you have a good workflow, which most people don't, you can still be bitten by non-determinism, and that's difficult to avoid in a sufficiently complex project.

Okay, so the obvious solution to improve this workflow is: maybe do CI first, before you push things to master, and maybe don't Slack patches around but use some review system like GitLab or GitHub or Phabricator provide. In order to get this happening, I first had to get approval to get GitLab Enterprise Edition, which I did, and then we were closer to the typical best-practice workflow, which is: you send a pull request, someone reviews it in some review system, it passes CI and gets merged into master.

Can you just raise your hands: how many people roughly use a workflow like this? So I would say the majority of people. And do you sometimes have a broken master despite using this workflow? Yes. Okay, cool. Excellent.

So there are complications to this, because master moves. There really is another choice point, namely: do you have a merge conflict, yes or no? If yes, you go back to square one; if no, it becomes the new master. But there's more. Master moving of course obsoletes your CI results, so the next choice point is: do you have a logical conflict? And if yes, you've now got a broken master, and that's that.

So how can we fix this? Well, it turns out there actually is a way to fix it with GitLab out of the box, without any other tooling. This might be a little bit hard to read, but basically you need to configure GitLab so that you only allow merge requests to be merged in if they pass CI, and you insist that it needs to be a fast-forward merge. The effect this has is that the new master will always be what you tested in CI, which is what you want, right?
I mean, you don't want to merge something into master if it hasn't been tested properly. The unfortunate thing about the way it works is that the someone who does the rebasing all the time is you. Someone approves your merge request, you rebase it to make sure it's current, and then it goes through CI, and by the time CI has finished, maybe someone else has got their merge request in, and so you need to rebase again. This can be ever so slightly painful, because it can take a very long time: everyone who is merging is racing to rebase their things faster, the CI slaves are spinning like mad, and you end up with a sad rockstar developer.

That rockstar developer in this case was my former colleague Daniel Gorin. As a sad rockstar developer he decided to do something about it, and this is the story of this bot.

So what does this bot do? Well, it does the rebasing onto master for you. So what CI actually tests will be the new master branch as it will be, and therefore it will always be green. Does anyone notice the difference to the previous thing, where all of this was done manually? There's one box missing: "has master moved?" And that's kind of important, because if all the bot did was press rebase, rebase, rebase, it would still not scale at all: you would have N branches, where N is the number of open pull requests, rebuilding all the time whenever something gets merged to master, which is not scalable. The way it works instead is that Marge-bot maintains a queue and merges things in one by one. Technically there still is this check, but it only matters if people push things directly to master, subverting the process.

Cool.
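The queue idea described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Marge-bot's actual code: a change is just a name, master is a list of names, and `ci` is a hypothetical stand-in for a CI run on a given combination of changes. The two example changes each pass on their own but fail together, which is the kind of logical conflict a merge-conflict check never sees.

```python
from collections import deque
from typing import Callable, Deque, FrozenSet, List, Tuple

# Hypothetical model: CI is a function that says whether a given
# combination of changes passes.
CI = Callable[[FrozenSet[str]], bool]


def process_queue(
    master: List[str], queue: Deque[str], ci: CI
) -> Tuple[List[str], List[str]]:
    """Merge queued changes one at a time, testing each on top of current master."""
    rejected: List[str] = []
    while queue:
        change = queue.popleft()
        candidate = master + [change]   # "rebase": the change sits on top of master
        if ci(frozenset(candidate)):    # CI sees exactly what master would become
            master = candidate          # fast-forward master to the tested state
        else:
            rejected.append(change)     # hand it back to the author
    return master, rejected


def ci(changes: FrozenSet[str]) -> bool:
    # Each change is green alone; together they are a logical conflict.
    return not {"rename_api", "call_old_api"} <= changes


master, rejected = process_queue(["base"], deque(["rename_api", "call_old_api"]), ci)
print(master, rejected)  # prints ['base', 'rename_api'] ['call_old_api']
```

Because only one change is in flight at a time, every state of master in this model is a state that CI has seen, at the cost of one CI run per change.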
I want to briefly discuss how it can happen that you have green pull requests and master is still broken. One thing that's important to realize is that even if you've got a good setup, not all your commits will actually go through CI, generally because it's simply not scalable. Not even Google can do that, for example. Normally what happens when you push something is that the last commit you pushed goes to CI, and all the other commits before it don't make it through CI.

But let's look at those two pull requests here, pull request one and pull request two. Say both are green on the last commit and tested in CI. That doesn't mean there is no logical conflict between them, and then when you merge them in you end up with a broken master.

Okay, a couple of examples of how this can happen in practice. One thing that happens quite frequently is that you change an API in one pull request to make it better, like fixing a typo in some method name, and someone else adds an additional call to it. A similar thing is if I improve test coverage in one pull request and someone else changes the API; this is also going to break. The most interesting case, I think, is the fragile base class problem. Even if you have quite a good workflow and people don't step on each other's toes in general, you can have interesting breakage due to some base class far away being modified in one pull request and you modifying a subclass in some other pull request, where you just inadvertently rely on some implementation detail. None of those things will actually cause a merge conflict, so you will get a broken master with the default CI model.

So I just very briefly want to show that it's quite simple to set Marge up. You basically create an SSH key, create a GitLab account and token, and then you just type in this line.
So you see there are three things in orange. This is what you need to supply: the SSH key, the token, and your GitLab URL, and then you're good to go. You then add Marge-bot to all the projects you want her to take care of for you, and I'm actually going to do this towards the end. But basically the workflow is exactly the same as you would normally have with GitLab: instead of pressing "merge if build succeeds", you assign to Marge and Marge takes care of it for you. If it succeeds it will merge it in; if it doesn't succeed it will leave a comment and reassign it back to you, telling you what the problem was.

Cool. So the conceptual fix, as I said, is actually quite simple: all you need to do is maintain a queue and go through it one by one, and your master will always be green. The main difficulty was actually making it work in practice. You need usability, familiarity and scalability for this. Familiarity means you need to build on some existing solution, like GitHub or GitLab, that people are happy to use. So there's a bit of work to be done to bend the GitLab API to actually allow for this, because it's not really designed with this particular use case in mind. Then there are various things you need to do for usability. For example, Marge-bot is quite good at leaving comments telling you what's happening, we have some Slack channel integration, stuff like this. We also have a small trick of prefixing the name with a space, so it comes up first when you assign: it's the first user you see.

Scalability is more work, and you don't actually need it immediately. We started out doing Marge-bot in a non-scalable way, and it worked completely fine for maybe a dozen users or so. But if you've got 70 users, like we have now, it no longer works if you basically run a single queue and do each merge request individually in this queue. So you need to batch them up, and I'm going to very briefly explain how
this works. This is similar to the previous slide. We have a couple of pull requests, three: one is failing, the two other ones are fine. What Marge-bot does is take all the pull requests, up to a fixed number, that currently have passed CI, and create a temporary branch with all the good pull requests, which means they have passed CI and also don't give you a merge conflict when you try to merge them into the existing temp branch. The rest is ignored for now. Then you try running CI on the temp branch. If it works, you throw the temp branch away and instead merge all the individual branches in the same order you merged them into the temp branch. Otherwise, if it fails, you fall back to either splitting the temp branch or merging things one by one. And that makes it quite scalable.

Cool, and that's basically it. This is how you get a green master. But there's more. Before we go to that, Mika is going to tell you about the peculiarities of our workflow and our requirements.

Thank you, Alexander. So, among the other things I mentioned, I worry about compliance. I can see snickers. And I worry about how it affects engineering. The workflow we have is impacted by requirements due to regulatory jurisdictions and auditing. As for background, we have about 70 engineers split across 11 teams, and they maintain roughly 130 unique and individual services. That's a lot. Commits land on master every few minutes, as you can see, and in branches at an even higher rate. Marge is merely a gatekeeper. We ship to production about once an hour, so 10 times a day, more or less. And as we add more engineers, more teams, more projects, we cannot have this become any slower.

So, are there any of you who work in a regulated industry? A few. Have you had to change your workflow due to auditing requirements?
I have. Well, more hands than before. Okay, question: were those changes for the better, or did they at least improve your workflow?

Okay, so this is where Marge comes in. As I mentioned, we are in an industry where we get audited all the time, and requirements vary country by country. We can't cherry-pick what we do; we satisfy all of them at the same time. That's not optional; it has to be done. And since we want to maintain velocity, we cannot cripple the workflow. This is the thing that Marge actually provides us, and it has also made auditors' lives and auditing easier, which you wouldn't believe is even possible.

So think about what requirements you might have. Auditors want to know: who wrote the code, and when? How was it tested? Who approved it? When was it shipped? Who shipped it? Who thought it was even a good idea to write this piece of code? These are questions you might want answered even if you weren't being audited. They are basically development hygiene.

What we do with this is add git and Marge to the mix. Out of the box we get the commits themselves; they tell us who and when. That's the easy part. Git and Marge-bot in particular, with trailers, which Alexander will expand on later, tell us exactly who tested, who approved, when it was built, and part of which series. We have Marge rewrite the commits and add these as trailers, so we can see from the commits when things happened and what the history actually was. Then, for deployments, we adapted our ship tool to record all the deployments against the commits.
These are recorded in git notes on the commits they were built from. Don't do this. The UI for git notes is atrocious, but it actually works nicely when you know what you're doing.

Eventually the question is: why do we do this with git? Mostly because, while you can get all of this from other tooling (there are third-party solutions that give you most of this out of the box), what you then have is that the information is held in the vault of that particular product. You are locked into a particular vendor, a particular product, a particular workflow. You can't change what you do. You can see a couple of white items; those are business decisions. I can't help you there, nobody can. You know what you are doing and why. But the point of doing this in git is that it makes it easier to work with, and it allows anyone to clone the repo and get the full history of not only what was done but also when it got into production.

As far as auditors are concerned, git is pretty handy. It does not provide immutability, which auditors would love, but it gives the next best thing: tamper evidence. You can rewrite history, but at that point you change your commit hashes, and it is obvious that someone has mucked about and done horrible things to your git repo. It's familiar: you all know how to use it, and if you don't, you're very close to learning it. And above anything, it's platform-agnostic. It runs absolutely anywhere. Maybe not on some QNX, but who uses that? In any case, the idea is that we are not bound to any particular solution or any particular vendor. What we have is a solution which scales with us and allows us to move, as time passes, to other vendors and other products, and keep developing as we want. I'll hand over back to Alexander for how the workflows work in practice. So, over to you.

Cool. So at Smarkets we use a monorepo and a rebase-based workflow, and I wanted to talk a little bit about some motivations, and about git workflow theory versus practice, because I think in theory some things sound quite attractive.
So for example, submodules sound quite attractive as an idea. Let's say you've got a development model where most of your things are microservices. Why not have one repo per microservice, and have one big repo, which is the actual production repo, with the microservices as sub-repos? That's quite nice: it gives you a macro view of what actually is in production and a micro view in the individual repos. Typically developers for a particular service only have to care about that service; when they do git grep or clone or whatever, they just get stuff that's relevant to them. It's easier to set up CI, also. So it sounds quite tempting, right?

Similarly, merging sounds like quite a good strategy, because again you have this sort of macro/micro distinction. When you rebase, you typically lose the information of where stuff came from, whereas when you merge, typically with a feature-based approach, you can see a merge commit for every feature that was merged. So the merge commits are the macro view, and the individual things on the branches give you some drill-down into the development decisions that were taken. An important point also, for auditing or whatever else, is that you have the same history all the time, and that has many benefits.

Okay, but as so often, theory and practice are, I think, not that well aligned. In practice, with submodules, when you do a Google image search for them, you even find people complaining about them there, and I think the solution is basically: just go monorepo, or use some other solution. I think submodules add far too much brittleness and complexity to be worth it in practice.

Another problem with a merge-based approach is that reverting things in git is a pain. There's a famous email where Linus tells you how to revert things properly, and, well, good luck getting everyone in your organization to actually understand its contents and not mess up your master branch. Let's see how we revert things at Smarkets.
We say git revert merge request and then the merge request number. So which one would you rather use if production is broken?

Cool. A similar problem with merge-based default git workflows is that it's difficult to figure out what actually broke something. Typically, when you do feature-based development, you care about which pull request was merged and broke things in production. You might be thinking: okay, I know, I'm just going to use git bisect to find this out. Well, good luck to you, because in practice it is quite difficult to actually get anything useful out of git bisect: without some third-party tooling you can't easily just look at merge requests, for example. Another problem with bisecting things is, as we saw before, that in a realistic setup not every commit will be tested, which means not every commit will have passed CI and work. That means bisect will stumble over some random problems. You might be looking for some severe bug, and the test fails because some linter failed on this commit, or some nonsense like this.

Okay, so how do we deal with that at Smarkets? Well, the Marge-bot way is you say git bisect run tested with test.sh, where test.sh is a shell script that actually exhibits the problem. What this does is only look at commits that actually went through CI, so you can be pretty sure that there's no completely spurious breakage which messes up your bisection.

So how does all of this work? As a quick reminder, this is how we were running a bare-bones Marge setup, and you can add more flags and you get more features. So let's look at a concrete example.
So here's an actual bare-bones commit, which we used to bump our version of Marge in our own environment at Smarkets. If we add the add-reviewers feature, you can see that Marge-bot rewrote history to add a Reviewed-by trailer, so Tony was the reviewer in GitLab. Then for the bisection we can add the add-tested flag. What that does is take the last commit on a feature branch, which is the one that you know passed CI, and add a Tested-by trailer which leads you back to the merge request. Apart from making it easier to bisect things, that is also very convenient for figuring out why something is there, or for seeing the actual tests that were run. And finally we've got Part-of. It's almost the same as Tested-by: it adds a link back to the merge request, but it does it on every commit in the pull request, not just the final one.

This combination is quite nice, because basically you sort of have the benefits of both a merge-based and a rebase-based workflow. One big thing about merge-based workflows is that you can see that certain things are features that were shipped, and you can still see this here.

Cool, and then once we have those things we can basically just have a couple of git aliases which use those trailers to provide the functionality I showed before, like bisect run tested and also revert and a few other ones.

Okay, so assuming you're kind of convinced that those things, or some subset of them, are useful: what should you do if you can't use
Marge-bot? Well, at least you now know what you're missing. And if you don't need something general, it's actually not that hard to do some custom solution. We also welcome PRs if you want to add another backend. And there are some similar tools; other people have seen the light. For example Rust's Homu or so does something similar for GitHub.

Good, so I'm going to wrap up with this. In summary, a good PR workflow doesn't test just a feature branch: it tests the future master and ensures that your master branch won't be broken. That isn't as common as it ought to be, but in fact it's not that hard to do. If you use GitLab, I encourage you to try Marge; otherwise look for some similar solution or build one yourself, it might not be that hard. With Marge you get some extra perks which are quite useful. And finally, I think monorepos and rebase-based workflows are nicer.

So, what's the wrong thing here? There's a slight contradiction between us claiming to automatically maintain a repository of code that always passes all the tests and having those "build passed" badges. The only reason why you have them in GitHub or GitLab is because your workflow is kind of broken, so you want to know when master is broken. And given that we host Marge-bot on GitHub, where it doesn't work, we are not dogfooding it at the moment.

Cool. So I'm just briefly going to pop up the credits, and I'm ready to take questions.

Anyone? Hello, if anyone has a question, please stand up and come here. If there are no questions, I can say a little bit about implementation details. No questions?

How deeply baked in is the GitLab integration? I mean, does it just call the API?

It just calls the API.

All right, so it would port straightforwardly to GitHub?

Yeah.

Yeah, I was also wondering because we are actually using Jenkins for CI.

Yeah.

And I was also wondering about...

That's not a problem.
Because the way it works is: you can configure different CI backends in GitLab, and all that Marge does is rely on you having configured GitLab correctly, namely saying you need to have passed CI. So it doesn't care about the underlying build system or CI system. It just uses the GitLab API to merge things.

Any other questions?

Okay, if there are no more questions, please thank Alexander and Mika, and thanks for the awesome
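The batching scheme described in the talk (take the merge requests that have passed CI, up to a fixed number, combine them on a temporary branch, run CI once, and fall back to one-by-one merging if that fails) can be sketched roughly as follows. This is a simplified illustration, not Marge-bot's implementation: `MergeRequest`, `merge_batch` and `max_batch` are hypothetical names, merge conflicts are not modelled, and the fallback shown is the one-by-one variant rather than splitting the batch.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Tuple

CI = Callable[[FrozenSet[str]], bool]  # hypothetical stand-in for a CI run


@dataclass(frozen=True)
class MergeRequest:
    name: str
    branch_ci_passed: bool  # did CI pass on the feature branch itself?


def merge_batch(
    master: List[str], requests: List[MergeRequest], ci: CI, max_batch: int = 3
) -> Tuple[List[str], List[str]]:
    """Try to land several merge requests with a single CI run."""
    # Only requests that are already green are batched; the rest wait.
    good = [mr.name for mr in requests if mr.branch_ci_passed][:max_batch]
    temp_branch = master + good          # temporary branch with the whole batch
    if ci(frozenset(temp_branch)):
        return temp_branch, []           # one CI run covered every request
    rejected: List[str] = []
    for name in good:                    # fallback: test and merge one by one
        attempt = master + [name]
        if ci(frozenset(attempt)):
            master = attempt
        else:
            rejected.append(name)
    return master, rejected


ci_runs: List[FrozenSet[str]] = []       # records each simulated CI run


def ci(changes: FrozenSet[str]) -> bool:
    ci_runs.append(changes)
    return "broken" not in changes


requests = [
    MergeRequest("a", True),
    MergeRequest("b", False),            # failing on its branch, ignored for now
    MergeRequest("c", True),
]
master, rejected = merge_batch(["base"], requests, ci)
print(master, rejected, len(ci_runs))    # prints ['base', 'a', 'c'] [] 1
```

In the happy path two merge requests land for the price of one CI run, which is where the scalability comes from; only a failing batch costs more than the simple queue.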