Thanks for coming. Today I'm going to talk about monorepos and many repos. Monorepos are used by a lot of big companies like Facebook, Google and a bunch of others, but most of you are probably using what I'm calling the many-repositories strategy. In the first part I'm going to talk about what a monorepo is and what a many-repo is, the differences between the two, and when you need to use one or the other. Then I'm going to talk about how I've been managing a lot of different projects using those strategies.

The big question is this: monorepos work really well for big companies, but is it something that you can use for your own projects? Why would you want to do that for a small project or a small team? You will see that my personal answer is yes: it is very interesting to use a monorepo, even for the smallest projects and with very small teams. I've been using monorepos for the last five or six years now, both for open-source projects and for private projects.

The monorepo strategy is about storing all the code for a given project, or for several projects, in one repository. When I talk about a repository,
I'm talking about a directory, and that directory can be stored in source control software like Git, Subversion or whatever.

One such example is the Symfony repository. We have one big GitHub repository, symfony/symfony, and under that repository we have more than 40 different projects. When I say projects, I mean standalone projects. If you have a look at Drupal 8, you will see that Drupal is using a lot of different Symfony components, but not all of them. I'm saying that they are standalone because if you look at any one of them, you will see that it is a subdirectory of the main repository, and in this directory you have a composer.json file. That means it is totally independent, with its own tests, documentation and everything. Symfony is quite large. We have a lot of contributors and a lot of activity, and we will see that having just one place with everything in it helps a lot when managing a repository.

But monorepos are not only for big projects, or for projects that use only one language. For Symfony we only have PHP, that's all: no JavaScript, no HTML, no CSS, or just a few files here and there. Blackfire is not an open-source project. It is a private project and it is much smaller: about ten people and about ten different projects. But we are also using a monorepo there, with many different languages: we have some C code, Go, PHP, JavaScript and some more. The documentation and the books are also part of the main repository.

The many-repo strategy is totally different.
It's where you try to split your project into independent, reusable subprojects, and for each subproject you create a new directory and a new Git repository. One of the goals is to be able to reuse some part of a project within another one. You have ten different projects and you try to factor your code so that you can reuse some libraries between them.

Actually, Symfony is a monorepo, but we are also using many repositories. If you have a look at symfony/something, you will see that we also have 43 different repositories, and each one is actually an extraction of one directory from the main repository. We are trying to get the pros of both strategies, and I'm going to talk about that later on.

Before talking about the many advantages of using a monorepo, you need to understand that monorepos are not about code distribution or code deployment. It's really just about the development process. It doesn't mean that you need to deploy all your projects at once. It doesn't mean that you need to have one versioning strategy for all the projects within the monorepo. It doesn't mean that you don't have many different teams working on different parts of the project. You can totally distribute several independent components but still develop everything in one tree, which is exactly what we're doing with Symfony. The same goes if you have a microservices-oriented architecture where you have a lot of different small projects talking to each other: you can still use this strategy for that kind of architecture.

And of course a monorepo does not mean that you have tightly coupled code. Whether your code is coupled or not is not related to how you actually store your code.
It's more about the way you work with your code and the way you structure your code internally.

So monorepos are awesome. The first advantage, and that's the big one, is that you can make large, backward-incompatible changes very easily, because everything is in one place. If you want to make one big change, and the change has a lot of different impacts on different subprojects, you can create one pull request and update everything in one go. You can change an API endpoint, for instance, and then all the usages of this endpoint in the same pull request. This is very important because the other developers reviewing your pull request can understand everything that is going on: the main change and its impact on all the other parts of the project. It also means that if you have tests, you can be sure that all the tests pass for all the subprojects.

If you have many repositories, you need to create several different pull requests, and they are not linked together. That means you need to review several pull requests and understand the interaction between the different changes, and it becomes very complex to know whether you are going to break something somewhere.
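As a rough illustration of why a single tree makes this kind of change tractable, here is a minimal Python sketch that lists every file mentioning an endpoint, grouped by subproject. The subproject names, file names and the endpoint string are all hypothetical, and treating the first path component as the subproject is an assumption made for the example.

```python
import tempfile
from pathlib import Path


def find_usages(monorepo_root, needle):
    """Return {subproject: [files]} for every text file that mentions `needle`."""
    root = Path(monorepo_root)
    usages = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        if needle in text:
            relative = path.relative_to(root)
            # Assumption: the first path component names the subproject.
            usages.setdefault(relative.parts[0], []).append(relative.as_posix())
    return usages


# Build a tiny throwaway "monorepo" with two hypothetical subprojects.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "agent").mkdir()
    (root / "website").mkdir()
    (root / "agent" / "client.go").write_text('url := "/api/v1/profiles"\n')
    (root / "website" / "Api.php").write_text("$path = '/api/v1/profiles';\n")
    (root / "website" / "README.md").write_text("No endpoint here.\n")
    result = find_usages(root, "/api/v1/profiles")

print(result)
```

With many repositories the same search would have to be repeated in every checkout, and nothing guarantees that all of them are cloned and up to date.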
With many repositories, everything is isolated.

A monorepo is also a great way to, for instance, remove obsolete APIs and all the calls to them. You can do that with confidence because you know whether all the tests pass, and it's very easy to grep the project to find all the API usages so that you can remove them in one go.

I've already talked a bit about changing an API endpoint. If you are working with a microservices-oriented architecture, you are doing that a lot, because you have a lot of communication going on between the different microservices. If you can change everything in one pull request, it's much easier, and the code reviews are also much easier. You don't need to coordinate the merges between different repositories.

Then there is continuous integration, and here I'm talking about different things. The first one is unit tests. For unit tests you obviously don't care whether you are using a monorepo or many repos, because they are isolated to just one subproject. But you probably also have integration tests, and for those you need to be able to change all the code. Being able to have one pull request with all the code changes, run Jenkins for the unit tests but also for the functional and integration tests, and be sure that everything is green before merging the pull request: that helps a lot.

Productivity increases as well, because there is no switching between repositories depending on the project you are working on. That's a small thing, but it matters on a day-to-day basis: if you need to switch between ten different repositories all the time, at some point it gets annoying.

Collaboration between teams becomes natural. Just because you have all the code in one repository, even if you are a PHP developer, you can have a look at the C code or the Go code or whatever languages you are using. So even
if I'm just a PHP developer, I can have a look at the Go code and try to figure out whether I can also change that part of the code. That's what we saw with Blackfire. We have a bunch of PHP developers who don't really understand Go, but because they were able to read the Go pull requests, they started by making some small changes here and there, and at some point they were able to make more complex changes to parts of the code they were not responsible for.

That means that developers can fix bugs in all projects as well. No one owns the code. Ownership is sometimes a problem: you have different repositories and different teams, and all the teams are very isolated. If you need one change, you need to ask someone to make the change and create a pull request. Then you need to wait for the pull request to be merged, then wait for a release, then update the dependencies in your project to pull the new version. It takes a lot of time.

It also means that you have less management, because everything is centralized. You basically have just one big repo, one Jenkins, one ticketing system, whatever you're using, so the management overhead is really small.

Dependency management is not an issue for your internal code. You don't need to manage different versions of your code, because everything is in one repository: if you make one change, you can make the change everywhere. When you have many different repositories, you need to manage different versions and pull a specific version into your code. But if you are not going to reuse that code anywhere else, because everything is stored in the same place, you don't need dependency management anymore. That helps a lot as well.
So that's why I think that monorepos are very interesting. But I can just as easily demonstrate that the many-repositories strategy is also very useful.

You get clean boundaries between projects: it's very clear what belongs to which project. That can be a problem when you start a new project, because at the start you don't really know how to split the project into different subprojects. At some point you might have to merge subprojects, or split a project into two different subprojects, which means that you would have to create new repositories or merge repositories, and that is not always easy. But if you know what you're doing, you get clean boundaries very easily.

Code is more reusable in other contexts. If you have different big projects, you can create one repo per library that you want to reuse.

Access control is very easy as well. If you have contractors, for instance, and you want to give a contractor access to only a subproject or part of the project, that's very easy to achieve with many repositories.

Continuous integration can be simpler as well. With the monorepo strategy, whenever you make a change anywhere, Jenkins is probably going to run all the tests, even if you've just changed a small part of the code. You can of course configure Jenkins to take into account the directory where the changes happen.
You then only run the tests for that directory and subproject, but it means that it's more complex to configure. On the other hand, for many repos unit tests are straightforward to configure, because you have one Jenkins per subproject, but integration tests are much more complex.

So my conclusion is that you should probably use both a monorepo and many repositories. That's actually the strategy that I've been using for the last five years for almost all my projects. I've talked a bit about the Symfony many repositories, but the same goes for Blackfire: we have a bunch of repositories, one for each subproject. I'm going to talk about how that works in practice now.

The monorepo is where the development actually happens. That's where we create pull requests. Then we also have the many repositories, and those are read-only and synchronized in real time. So we need something in the middle that actually synchronizes the monorepo with the many repos.

We have one monorepo with different directories, and some of those directories we want to split into individual repositories: those are the many repos. For Symfony, for instance, we have the Console component. It is stored under src/Symfony/Component/Console, and we extract this directory into its own repository. That's how it works, and it's the same for all the other ones.

The monorepo is not aware that there are many repos, and the same goes for the many repositories: they don't know that there is a monorepo.
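The directory-to-repository mapping described here, and the earlier idea of running tests only for the directories that changed, both come down to matching changed paths against a list of prefixes. The following is a minimal Python sketch of that matching; the `SPLITS` table and the file names are illustrative assumptions, not the actual configuration used for Symfony.

```python
from pathlib import PurePosixPath

# Hypothetical mapping: monorepo directory -> read-only split repository.
SPLITS = {
    "src/Symfony/Component/Console": "symfony/console",
    "src/Symfony/Component/Debug": "symfony/debug",
}


def affected_splits(changed_files, splits=SPLITS):
    """Return the split repositories touched by a list of changed file paths."""
    affected = set()
    for name in changed_files:
        path = PurePosixPath(name)
        for prefix, repository in splits.items():
            # Compare whole path components, so "ConsoleExtra" does not match "Console".
            if PurePosixPath(prefix) in path.parents:
                affected.add(repository)
    return sorted(affected)


changed = [
    "src/Symfony/Component/Console/Application.php",
    "README.md",
]
print(affected_splits(changed))
```

The same function could feed a CI job (which test suites to run) or the synchronization step (which read-only repositories need a new split).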
So everything works as if the two strategies were run independently. The many repositories are read-only, but if you want to make a pull request on one of the many repos, that's possible. You then need to move the pull request to the monorepo so that you can merge it there, and it is synchronized automatically back to the many-repo.

Let's take one example: the Symfony Console component. On the left side you can see where it is stored in the monorepo, and we want to extract the content of the Console directory into the symfony/console repository.

If we have a look at the Git history, at the top you have the history of the main repo for the Console directory, and at the bottom you have the same for the many-repo, the split of the Console. You can see that the history is exactly the same: same commit message, same author, same committer, same dates. Everything is exactly the same, the only difference being the hash. The hash is different.
I'm going to talk a bit about why the hash is different.

Everyone is familiar with Git? Who is not? Okay. So everyone knows that Git is kind of like a file system. The big difference between Git and a file system is that Git knows nothing about the directory structure. Git is only about storing blobs. Git stores blobs and trees and tags and some other things, but mainly blobs and trees.

If we run this command, you can see that for the Symfony Console component we have a tree, and the tree has a hash. The tree is a pointer to the content stored under the Console directory. Now if I do the same in symfony/console, for the commit equivalent to the one coming from symfony/symfony, you will see that the tree has the same hash. That makes sense, because the content is exactly the same.

That means that when we extract a directory from the main repository into a subtree split, the tree is exactly the same, so it does not consume any extra disk space. What we are doing is creating a commit that points directly to this tree, which is just a subtree of the main commit. That's why the hash of the commit is different: the tree is not the same.

Look at the history. At the top you can see the history of symfony/symfony. For each commit, either we have the tree or we have nothing: for the second commit, for instance, we don't have anything under the Console directory. So when we split, for each commit in the history we create a new commit, and the new commit points to a tree coming from the main repository. Of course we skip the commit if the tree does not exist: there is no commit because there is nothing under the Console directory. We do that for all the commits, then we reconnect the parents, and we have the new history, which is exactly the same as if you
had created the many-repo directly, without the monorepo.

That works because of the way Git actually computes the hash of a commit. This slide is not that interesting, but if you run this command, it gives you everything you need to get the same commit in two different places: the commit hash, the tree, the parent hashes, the author, the committer, and the body, the body being the subject and everything below it. If you have exactly the same values, you are going to get the same hash, the same SHA-1.

To do that with Git there is a command, git subtree split, which was added to Git by default two or three years ago. git subtree actually has many subcommands. It's not just git subtree split: you also have git subtree merge and a bunch of other ones. But I've only ever used git subtree split, and I'm going to explain why.

By default, git subtree allows you to synchronize two different repositories both ways. You can commit wherever you want, and then you can merge from one repository to the other one and the other way around. But doing that means that Git needs to store some metadata about the merges, so that it knows what to merge and when. I wanted to avoid that, because I don't want the monorepo to be aware of the many repos, and the other way around. That's why I want the split to go only one way, from the monorepo to the many repos.

We started to use git subtree split before it was added to Git, five or six years ago now. But we are only using split, and not merge.
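The way Git derives a commit hash from those values can be sketched in a few lines of Python. This is a simplified model that assumes a plain commit object (no GPG signature or extra headers); the identity, timestamp and the two tree hashes are made up for the example.

```python
import hashlib


def git_object_hash(kind, content):
    """SHA-1 of a Git object: '<kind> <size>', a NUL byte, then the raw content."""
    header = f"{kind} {len(content)}".encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def commit_hash(tree, parents, author, committer, message):
    """Hash of a plain commit object built from the values listed in the talk."""
    lines = [f"tree {tree}"]
    lines += [f"parent {parent}" for parent in parents]
    lines += [f"author {author}", f"committer {committer}", "", message]
    return git_object_hash("commit", "\n".join(lines).encode())


# Made-up identity, date and tree hashes, for illustration only.
who = "Jane Doe <jane@example.com> 1450000000 +0100"
main_tree = "a" * 40     # root tree of the monorepo commit
console_tree = "b" * 40  # tree stored under the Console directory

in_monorepo = commit_hash(main_tree, [], who, who, "Add Console\n")
in_split = commit_hash(console_tree, [], who, who, "Add Console\n")
split_again = commit_hash(console_tree, [], who, who, "Add Console\n")

print(in_monorepo != in_split)   # different tree, so a different commit hash
print(in_split == split_again)   # identical inputs always give the same hash
```

This is why a split commit can never share a hash with its monorepo counterpart, while two independent splits of the same history agree with each other.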
That choice meant that the performance was really bad, because whenever we add a new commit to the monorepo we need to split all the many repositories from scratch, and that takes a lot of time.

How much time? This is the command that you can use to split the Console component of Symfony from the main repo. You can see that it takes more than 15 minutes from scratch on my laptop, which doesn't make any sense. We have so many commits going into Symfony that this is not scalable. It doesn't work. On average it takes 10 minutes from scratch, and then, once we have a cache (which is not the case by default, and I'm going to explain how to get the cache), it took around two minutes to compute the difference between the current state and the new state. Two minutes is also a large amount of time, and those two minutes depend on the size of the repository, which means that it gets slower and slower over time.

So at the beginning, for Symfony, we started to split only every night. It was not real time. And at first it took one hour, then two hours, then six hours, and then more than a day to actually split the repository, so it was not manageable.

The problem is that git subtree split actually spends most of its time creating subprocesses, so basically doing nothing. git subtree uses the plumbing commands of Git, and it executes almost ten Git subcommands for each commit. You can imagine that at scale that's a huge problem. Nowadays Symfony has more than four hundred different splits, so doing that from scratch just doesn't work.

You can also use this other command, git filter-branch. There are a lot of blog posts about it on the web. It is much faster: it takes less than two minutes for the same split.
The problem here is that you need to start from scratch every time, because there is no way to cache anything. The other problem is that it actually overwrites the master branch that you are splitting, which is not the case with git subtree split: as you can see, the last option is a branch where you want all the commits to be stored, which helps.

So what we did at first was to optimize git subtree split. The first thing that we did, five years ago, was to move from /bin/bash to /bin/sh. It was a huge boost. You don't need to do that anymore: it's now done by default. The second thing has to do with the way subtree split works by default. It caches the association between the hash in the main repo and the hash in the split repo, but at the end of the process it just trashes everything. So we just commented out the line where it deletes the cache, and all of a sudden it was much faster.

So it helped, but as Symfony grew it was not enough anymore. It was very slow. We needed a lot of disk space, I think something like 15 gigabytes for Symfony, and we needed a lot of inodes.

So a few months ago, I decided to rewrite the command in Go with libgit2. As you can see, the performance numbers speak for themselves. Now we can start from scratch, and it's not one minute, it's actually 10 minutes. Then incremental splits take less than 10 milliseconds, and they no longer depend on the size of the repository. They only depend on the number of commits that you want to split. We have more than four hundred different splits for Symfony. Doing the splits from scratch takes less than 10 minutes. Using subtree split, it took more than a week.

Disk space: not that much, and now you understand why, because we are mainly creating commits pointing to the same trees.
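To make the split procedure and the role of the cache concrete, here is a toy in-memory model in Python. It is a sketch under stated assumptions, not the actual Go implementation: commits are plain tuples, ids are toy hashes rather than real Git hashes, and the rule that a commit leaving the directory untouched reuses its parent is my assumption about what a practical split does.

```python
import hashlib
from typing import NamedTuple, Optional


class Commit(NamedTuple):
    id: str
    parents: tuple          # ids of the parent commits in the monorepo
    subtree: Optional[str]  # tree hash of the split directory, None if absent
    message: str


class SplitCommit(NamedTuple):
    tree: str
    parents: tuple
    message: str


def split(history, cache, split_repo):
    """Split `history` (parents listed before children); return the new commit ids."""
    created = []
    for commit in history:
        if commit.id in cache:
            continue  # handled by an earlier run: this is the incremental part
        if commit.subtree is None:
            cache[commit.id] = None  # nothing under the directory: skip it
            continue
        parents = []
        for parent in commit.parents:
            mapped = cache.get(parent)
            if mapped is not None and mapped not in parents:
                parents.append(mapped)  # reconnect to the split parents
        # Assumption: a commit that leaves the directory untouched reuses its parent.
        if len(parents) == 1 and split_repo[parents[0]].tree == commit.subtree:
            cache[commit.id] = parents[0]
            continue
        new = SplitCommit(commit.subtree, tuple(parents), commit.message)
        new_id = hashlib.sha1(repr(new).encode()).hexdigest()  # toy id, not Git's format
        split_repo[new_id] = new
        cache[commit.id] = new_id
        created.append(new_id)
    return created


history = [
    Commit("c1", (), None, "Initial commit"),
    Commit("c2", ("c1",), "t1", "Add Console"),
    Commit("c3", ("c2",), "t1", "Fix something elsewhere"),
    Commit("c4", ("c3",), "t2", "Improve Console"),
]
cache, split_repo = {}, {}
first_run = split(history, cache, split_repo)

history.append(Commit("c5", ("c4",), "t3", "Fix Console bug"))
second_run = split(history, cache, split_repo)  # only c5 is processed

print(len(first_run), len(second_run))
```

Because the new commits only reference existing tree hashes, the split adds commit objects but no file content, and keeping the cache between runs makes the cost proportional to the number of new commits.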
So we have just one repository with everything inside. And splitsh is more than just a rewrite of git subtree split. We added a lot of different features to help manage this kind of repository.

The first one is a tool to automatically move pull requests from a subtree split to the main repository. It's even better than that (I think I'm going to talk about it later on anyway), because we try very hard to keep all the information, so that when you merge a pull request on the monorepo and there is a split again, the hashes of the commits are exactly the same as the ones coming from the initial pull request. That means that if you are using GitHub, the status is going to be "merged" and not "closed", which is nice. From the contributor's point of view, there is no need to be aware of the fact that we moved the pull request somewhere else and then synchronized it back. From their point of view, we just merged the pull request directly.

Continuous integration and continuous deployment can happen anywhere. We have a tool that helps you tag subprojects. For Symfony, for instance (and that's optional, actually), whenever we make a release of Symfony we also release all the components. But I don't want to manage all the tags for all the components. So when we have a tag on the main repo, we have a tool that helps us synchronize the tags as well, automatically, making sure that we actually tag the same commits.

Access control just works. If you want to give some contractors access to one subproject, you can do that really easily. They can create a pull request directly on the subproject, and you can move the pull request. That works.

Okay, I've already talked about that. We have a full API to manage a repository. For instance, you can query the tool.
For a given commit, what is the equivalent hash in the monorepo, or the other way around? Things like that. For tagging, we have PGP support as well. There is a sanity checker, just to be sure that the state of the many repositories is exactly the same as the monorepo: same tags, same branches, everything up to date. We also have Packagist support. We also have a compatibility mode for old versions of subtree split; I don't care much about that.

Blackfire, for instance, is using all those features. We have one big repository with everything inside. We have one contractor working on one project. Some projects are even open source. That's also something you can use: you have one project, one small library that you want to open source. You don't need to extract it, because that's a nightmare and it means overhead. You can just subtree split it and it just works. And you can see that some subprojects are always released as a bunch, and for other ones we have different release strategies.

Do you want a small demonstration? Yes? Okay. Hopefully the Wi-Fi is going to work. Can you see my screen? No? Let's mirror the screens. Is it big enough? Yeah? Okay.

So the first step. We have a bunch of subcommands, project list for instance. Okay, so the Wi-Fi is quite slow. I'm going to switch to 3G. splitsh is used by Symfony. It's used by Blackfire, of course. It's also used by other open-source projects like Laravel and phpBB. Okay, as you can see here: Silex, Symfony Polyfill, and I also started to create one for Drupal. It works really well.

I can show you the configuration for the Symfony project, for instance. That's a JSON file, really, so nothing really... Okay, that's really slow. Okay. Ten is not enough.
Sorry. So basically, this is a JSON file where we say: I want to split this directory, and I want to be able to push it to this GitHub repository. I only want to split those branches, and I want to ignore those tags. That's for the sanity checker: we did not have tags for Symfony 2.0, so we want to ignore them. We don't want to synchronize them between the repositories.

I'm going to show something more interesting. The Blackfire one is interesting because we are doing... oh, the commit is there because it was one of the optimizations that we had for the git subtree split command. Basically it says that instead of starting to split from the first commit, we want to start from this hash, because some of the components were added years later. You don't need to go through everything before that, because we know there is nothing in there. I'm sure it's not that useful anymore nowadays, but it was very useful when we were using git subtree split.

So here, this one. Signify is a small Go binary that we have, and basically it is used to sign requests, whatever. Anyway, here we want to split more than one directory into one repository, and that's not possible with git subtree split. We want to split the signify directory and put the files directly under the root directory of the new repository. We also want to split this directory and put it under some other path, and then this one as well. We are doing that because the last two directories are actually tests shared by many different subprojects. So we get the best of both worlds, where we can easily recreate repositories and move things around.

Okay, so now the main command, which is split, of course. You can get a list of the last splits. Here I'm in a Symfony directory, and I have an environment variable saying that I want to work with the Symfony project.
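The multi-directory configuration described a moment ago can be pictured as a list of prefix rules, each sending a monorepo directory to a path in the target repository. The Python sketch below is purely illustrative: the key names, the repository URL and the `tests/shared` path are hypothetical, since the talk does not show the actual file format.

```python
from pathlib import PurePosixPath

# Hypothetical configuration: every key and path below is illustrative.
config = {
    "target": "git@github.com:example/signify.git",
    "branches": ["master"],
    "prefixes": [
        {"from": "signify", "to": ""},                   # files land at the root
        {"from": "tests/shared", "to": "tests/shared"},  # shared tests keep their path
    ],
}


def target_path(monorepo_path, prefixes):
    """Return where a monorepo file ends up in the split repository, or None."""
    source = PurePosixPath(monorepo_path)
    for rule in prefixes:
        origin = PurePosixPath(rule["from"])
        if origin in source.parents:
            return (PurePosixPath(rule["to"]) / source.relative_to(origin)).as_posix()
    return None  # the file is not part of this split


print(target_path("signify/main.go", config["prefixes"]))
print(target_path("tests/shared/fixture.json", config["prefixes"]))
print(target_path("docs/index.md", config["prefixes"]))
```

Mapping several source directories into one target is the part that a single-prefix split cannot express, which is why the shared test directories need their own rules.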
That's why I don't need to say --project=symfony. Here you can see what's going on behind the scenes. "API" means that the split was triggered via the API. "Git" means that we had a webhook coming from GitHub. You can see that it takes anywhere between one and ten milliseconds to actually split this thing. And here you have a bunch of other information. The duration is several seconds, because it takes a lot of time to push the new version to GitHub, unfortunately. Then here you can see the number of commits that were traversed by the tool, the number of commits that were actually created by the tool, and things like that.

Okay, let's create a split. I'm going to create a split via the API. I want to split Console master, and I want to watch it in real time. So here we made an API call and... done. It's real time. It's really fast. You can also say: I want to split everything under Console, so all the branches. That should be fast as well. Done.

That's the first part of the tool. I talked about the sanity checker. We can say: I want to check that everything is fine for all the projects. I'm going to do that for Silex, because I have fewer things there, and I want to watch the result. Here it checks, not the tags actually, but that the branch tips are the same between the main repo and the split one. I can add --tags, and now it's going to check that the tags are also synchronized between the repositories.

Okay. If you want to tag something, you can say tag sync if you want to synchronize tags, or if you want to create a tag you can say blah blah, whatever, and if you say -s it's going to GPG-sign all the tags.

We also have an option which is tag sync Packagist, and I'm going to explain why. Whenever you create a tag on GitHub there is a webhook, so there is a call from
GitHub to Packagist so that Packagist can update the version. The thing is, splitsh is creating 50 different tags at the same time, which means 50 different API requests from GitHub to Packagist, and that doesn't work. It crashes Packagist, which means that most of the tags are not on Packagist, which is really bad. So this command just checks whether each tag exists on Packagist, and if not it triggers a call to the Packagist API to update the tags.

Okay. Now I'm going to do something that I should probably not do. This is a Symfony pull request. I have one here, and we are going to merge it and see what happens. It's probably not going to work. This pull request is about a change under src/Symfony/Component/Debug, all the files, which means that it should update symfony/debug. This one is on Symfony 2.7, so we're going to go to 2.7 here. Right now you can see that the last commit was in March, I think.

It's going to take some time. g8 is a small tool used by the Symfony core team to help us merge pull requests, so we're not using the green button on GitHub. It does a lot of different checks to be sure that everything is okay before actually merging anything. It's also a nice tool because if the pull request should have been merged into 2.3, for instance, I can say g8 merge, switch to 2.3, and it does the work automatically, switching branches, so that we don't need to ask contributors to create a new pull request on the right base branch.

So we're checking that the core team actually accepted the pull request. Tests, okay. Proofing is okay. So we're going to fetch the branch and do the merge. Okay.

Your phone is really slow. What do you mean? Do we have a tool for that?
So if you want to use splitsh after that, that's very easy, and you can keep the history. Yeah, sure. I don't have any tool for that, because I've always created a monorepo first. I've never been in a situation where I wanted to aggregate different repositories. But... so that's a bug. It's a bug. We need a category, because the changelogs are created automatically based on that information.

But I think it's very easy to move a repo into one directory of another repository. Yes, Nicholas? Yeah, very easy. If you don't know how, you can ask this guy. It should not be that complex. And the great thing is that when you have this big repo and you split again, you're going to have the exact same hashes for all the commits that you had before. Yes.

Okay, so I want to send a nice message to Nicholas. Thank you. Then I push the changes to GitHub. Yes, we're pushing. And I can monitor the split. It doesn't exist yet, so we need to wait for GitHub to actually make a call. We are fetching the new stuff, and we are mainly fetching there. That's just GitHub being really slow nowadays. I'm not sure about you, but for me GitHub has been really slow in the last couple of months. And bam, three milliseconds. Go back here, Debug, refresh. Yes! That works. Never do that at all.

Okay, let's go back here. Actually, that's all for today. splitsh is used nowadays by Laravel, Symfony and phpBB. It might be used by Drupal, if you think that helps. That's totally possible. It's going to be released as open source only if there is some interest in the community, because it takes a lot of time to actually make it available as an open-source project. If Drupal is interested, I can also host that on our server. That's no problem. It doesn't consume any resources.
So that's a no-brainer, really. Anyway, if you're interested you can shoot me an email. My email is there, or you can come and we can talk about it. I'm here for the next couple of days. That's it. Thank you.