Good afternoon, everyone. So that figure is not something that I've drawn. What we are going to talk about now is the kind of experience a team has had developing code in a monorepo. So how many of you have worked on monorepos or are working in a monorepo? Quite a few hands going up. And how many of you are hungry and waiting for lunch? Fewer hands, that's good. I'm Jay Santhosh, I work for Microsoft, and that's my Twitter handle. I'm part of a team called Microsoft Search, which builds search experiences across a bunch of product applications in the Office 365 suite, OK? What we build is a set of search controls which are integrated across a number of host applications or products that are part of the Office 365 suite, as well as Windows and Bing. The complexity increases because we have to build these controls across six platforms: web, iOS, Android, UWP, Win32, as well as macOS. The work our team does is essentially to build controls like the search box, which is integrated across these applications and across platforms. We have the search results page, and the key for search experiences is that they have to be really performant. The complexity also increases based on the confidence we have in the search results. These kinds of experiences are what we call high-value experiences: they are an important action for every user of these products in their day-to-day life, and they exist commonly across applications and across the Office suite. The other principle we bear in mind when developing these experiences is that they must feel familiar; the user shouldn't feel that they are drastically different in each application. We are like a parasite that plugs into the host apps, right? So within the host app, we also need to ensure that the user doesn't feel a difference when they jump onto a search page or a search experience.
So it has to be homogeneous within the app as well. Let me take a slightly more relatable example: the people card. The people card is something that started as a contact card and slowly grew in complexity. The complexity it has now includes a bunch of insights between you and the person on the card, along with the contact information: events in common between you two, files shared between you two, and a lot of insights and information around the conversations and emails you've had, and so on. Each part of this people card grew in complexity: you have conversations, you have information about that person, you have a lot of documents, and so on. This means the code has gotten much, much more complex. While developing these experiences, you can broadly break the process down into two loops. One is an inner loop where you write code, raise a pull request, get it reviewed, get it merged into your master branch, and so on. Then you fork off into a release branch, release into various rings through production, get user feedback, and iterate again, right? Now imagine this being done across six platforms. When you are doing this across six platforms, you may start off with separate expertise and separate repositories, which means that over time, as these controls get more and more complex, they are being developed in silos, and you are losing out on agility in writing code. That's where we tried to see how we could bring all the source code together in one repository, so that the team understands and can build on the same code; that's something we worked on. So here the source code spans multiple experiences like this, and we have more than 150 developers working across time zones and over 100 packages getting built.
So just to sum it up, the goal is to achieve agility and coherence across a fairly complex experience like this, across applications and across platforms as well. If I break it down roughly, I can break the code into smaller packages: for the people card, I can first start building an API package, then a common auth package, then some core components which are common, and a bunch of components which can be shared across platforms. Then you add the platform-level wrappers over these components, and over time, as you add more and more things, the number of packages increases, right? We also need to be critical about how we're logging things, so we unify the telemetry across performance and user interactions as well. We started thinking about ensuring that these packages are smaller in their responsibility; that way we were able to keep the interfaces between the packages very clear, and the responsibility became more and more singular, right? Since they are smaller in nature, iterations over these packages can be released and published independently, right? The other part was that when we started breaking things down and releasing small packages, say, for example, the Teams app wanted to pick just the org chart from the people card and use it in their application, it's easier to pick just the org chart package and use it in their app, right? This way it also allows us to manage releases a lot more easily. So what we decided, since we combined all the source code, was to put it in a monorepo, and this allowed us to achieve a good amount of uniformity across these packages, right? So what is a monorepo? A repository with multiple packages. There are two parts to this. One: why do we need multiple packages?
As I explained before, other repositories or other packages can reuse small parts of code easily, and you can stitch together smaller units to customize or integrate them much more effectively. The other part is to ensure that you have good code reuse: because the interfaces are very clear, developers can write clear code, and it's easier to integrate these smaller bits of code when you're collaborating across multiple packages or multiple teams. Why in one repository? Because it gives the developer more agility, which means you can develop a lot faster. There is good consistency in code because you have a bunch of unified processes across all your packages. It also means each developer can discover code very easily, because it's all in one repo, and can figure out, "OK, I can probably use this instead of redoing it myself." Right? But to ensure this all goes smoothly, you need a lot of tooling; that's important. Tooling spans a bunch of areas, and I'm just highlighting the main ones that we need. You need a good amount of tooling to manage all the package dependencies and the packages in a monorepo very effectively. You also need cross-package orchestrators; when I say orchestrators, I mean commands or scripts that can be run across packages in a uniform manner, very easily. You also have linters and testing tools that need to run across packages in a unified manner, plus the build system and infrastructure. Since the bottom three topics come up a lot in various other talks, we'll mainly focus on the first two. So first, let's look at package management. When you're building a package in a monorepo, your monorepo also consists of various other packages. We have about 100-plus packages, and these can pull in over 2,500 dependencies.
Which means that if I'm going to install 2,500 dependencies whenever I check out, or sometimes when I'm rebuilding, there is a strong need for de-duping common dependencies across these packages. So let's understand packages a bit better. Package managers read the package.json file in your package, and essentially fetch a bunch of files over the network and write them to disk, right? Node, at runtime, doesn't understand packages; it just understands modules. Whenever you require some module, Node resolves it to a file on disk. So these are two independent operations, which means this is something we can take advantage of. Let's look at how Node resolves modules. Say some package in your code requires something like react. Node first looks in that particular package's node_modules folder. If it doesn't find it, it goes up in the hierarchy and sees if the parent folder has a node_modules folder where it can find react. It keeps going up, all the way to the root of your hard disk if necessary, until it finds the module. This kind of resolution is something we can take advantage of. So let's see how. Initially, say you have two repositories for two packages. Now you want to put them in a monorepo, so you put them in a monorepo like this. Now let's look at the dependencies. You can see that package one has three dependencies, and package two has one dependency, which is in common with one of the dependencies of package one, right? So B v1.0 should be deduped when installed. What we can do is move these up to a common ancestor, the common monorepo root folder, and put all these dependencies in its node_modules folder. This way, when each of these packages is required, it can still resolve its dependencies easily, because they're just two levels up in the chain, and it's easier to manage.
We also do something extra, where we symlink each of these packages into the node_modules folder as well. Say package three later on requires package one; it can easily find it, so I don't need to install it separately. This allows us to dedupe packages very efficiently, and this concept is called package hoisting. You can do this with a bunch of tools. Microsoft open-sourced a tool called Rush.js, which allows you to install packages and run commands across packages very easily. The other tool is Yarn workspaces, which allows you to create a workspace of multiple packages and manage it effectively with package hoisting. You also have other tools like pnpm workspaces and Lerna. We use Yarn workspaces, so I'll explain that a bit, just to give the gist of what these do. To configure a Yarn workspace, you add a package.json file in the root of your monorepo, right? You have to ensure you're adding it as a private package. Then you define your workspace, i.e., the packages in your workspace. You can give the list of packages explicitly, or you can use a glob path; this is what we usually use, because upon new package addition, it automatically gets added to the workspace. Optionally, if you want to restrict hoisting for some packages in your code, you can add them under nohoist as well. Then it's as simple as running yarn from the root of the monorepo, right? Your packages all get installed in the root folder of your monorepo, and this allowed us to bring down package installation time across the 100-odd packages that we work with from 20 minutes to about six minutes. We're still improving that; it's still not good enough, but that's a good boost in developer productivity, right? The other task we talked about was cross-package orchestration. This is where you can run commands across packages easily, right?
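The Yarn workspace setup just described might look like this in a root package.json (the package name and the nohoist entries here are illustrative, not our actual configuration):

```json
{
  "name": "search-monorepo",
  "private": true,
  "workspaces": {
    "packages": ["packages/*"],
    "nohoist": ["**/react-native", "**/react-native/**"]
  }
}
```

With the `packages/*` glob, a newly added package joins the workspace automatically; running `yarn` from the root then installs and hoists everything, except packages matched by `nohoist`.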
There are a lot of operations you would do on a monorepo: pre-build steps, post-build steps, you might want to clean up some packages, run tests, run build, run watch and pack across your packages, and so on and so forth, right? We have a big list of tasks; this is only the portion I could fit in the viewport, but there is a huge list of tasks that we run across the monorepo. Some of the popular tools you can use here are Rush and Lerna, OK? You can also use Gulp, or just plain shell scripts. But the latter two can be a little difficult to maintain as your monorepo gets more and more complex. In our monorepo, we used a tool called Lerna. I'm not going to explain Lerna fully, but I'll cover the essentials. You add a lerna.json configuration file in the root of your monorepo. You add the list of packages it should operate against, right? And then you can also switch the npm client if you want; if you have a custom npm client, you can use that as well. Since we use Yarn, I've added the Yarn configuration. This is only a subset of the configuration that you can add. The other thing is, if you've set up workspaces in your monorepo, you can tell Lerna, "I've set up workspaces, leverage that," right? Then it's as simple as running lerna run with the script parameter. This runs the command very efficiently across all the packages. So this way we're able to easily run the scripts. Lerna also makes things more efficient by computing the package topology and accordingly seeing if it can run some scripts in parallel. There were two main things we were using Lerna for. One of them was versioning.
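A lerna.json along the lines of what's described here might look as follows (a sketch, not our exact file; `useWorkspaces` and `npmClient` are real Lerna options from the Lerna 3.x–6.x era, and `"version": "independent"` is what lets each package keep its own version):

```json
{
  "packages": ["packages/*"],
  "npmClient": "yarn",
  "useWorkspaces": true,
  "version": "independent"
}
```

With this in place, `lerna run build` executes the `build` script of every package, ordered by the dependency topology.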
When you have 100-plus packages in your repo and you're iterating on them constantly, with something like 150 developers, probably 200 now, working on these packages, people want to publish independently according to their own sprint cycles and so on, right? Which means I want to maintain these release versions easily and effectively. Whenever you run lerna publish at the root, it only creates new versions for the packages which have been updated; it doesn't touch the others. That way, untouched code doesn't just get bumped up in version numbers. This allows us to manage package versioning very easily. The other really useful command we found was lerna diff: if you give it a package parameter, it shows you the difference in code for that package. Another is lerna changed, which lets you figure out which packages in your repo have changed. These become really useful in your build pipeline and in your release pipeline, where constructing this information becomes much, much easier. The other part we really liked about Lerna was how you can scope commands to a subset of the packages you're working on. In the monorepo, I mainly work on the search experiences, so I only need to care about the packages that the search experience requires, right? What we do for this is usually have a dev app, or developer test app, because we integrate these experiences across multiple products and can't just run all of those products during development while writing code. So we have a dev app for each of these experiences, and I can scope the commands I need to only the packages that my search dev app depends on, right?
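The versioning and scoping workflow described here boils down to a handful of commands. These flag names are from recent Lerna versions (older ones spell the last flag `--include-filtered-dependencies`), and the package names are hypothetical:

```shell
# Bump and publish only the packages that changed since the last release
lerna publish

# List the packages that have changed since the last tagged release
lerna changed

# Show the code diff for a single package
lerna diff @search/people-card

# Run "watch" only for the dev app and everything it depends on
lerna run watch --scope search-dev-app --include-dependencies
```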
When I run something like that, if you can read the second line, say you want to run watch across all the packages that the search dev app requires, right? I run lerna run watch, and I can add some extra parameters that let me run the watch processes in parallel for packages which are not dependent on each other, based on the package topology. So if I have, say, 10 packages which can be run independently because they don't depend on each other, I can run watch on those easily, right? Parallelizing these efforts allowed us to bring down the build time during watch, and also the time to pick up the files that have changed, right? That was something we really felt Lerna was helpful for, because we were developing multiple packages and we ultimately didn't need a single large bundle for a single-page app. Since we are not building single-page apps, we just release packages, right? Each of these product apps integrates them and releases across their platforms. So our release process is: we publish a version of a package, these applications just bump up their version, and they easily get integrated experiences across the platforms. The other thing we really have to take care of is how smoothly this process goes, because if you are launching a very horizontal project like this, you need to ensure that every release or package deployment is smooth and lets you release across products in time for all users, right? Lerna has very good community adoption as well. Popular repositories like Babel, Create React App, Jest, React Router, and of course our repo, and a lot of other repos, have started using Lerna a lot more over the last two years, and it's being developed and improved every month as well.
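The topology-based parallelism can be sketched roughly like this: group the packages into batches where every package's local dependencies sit in earlier batches, then run each batch concurrently. This is a simplified illustration with hypothetical package names; real Lerna does considerably more:

```javascript
// Group packages into batches: every package in a batch depends only on
// packages from earlier batches, so each batch can be run in parallel.
function topologicalBatches(deps) {
  const remaining = new Set(Object.keys(deps));
  const done = new Set();
  const batches = [];
  while (remaining.size > 0) {
    // Packages whose local dependencies have all finished already.
    const batch = [...remaining].filter(pkg => deps[pkg].every(d => done.has(d)));
    if (batch.length === 0) throw new Error('dependency cycle detected');
    batches.push(batch);
    batch.forEach(pkg => { remaining.delete(pkg); done.add(pkg); });
  }
  return batches;
}

// Hypothetical slice of a graph: api and auth are independent,
// people-card needs both, and the dev app sits on top.
const graph = {
  'api': [],
  'auth': [],
  'people-card': ['api', 'auth'],
  'search-dev-app': ['people-card'],
};
console.log(topologicalBatches(graph));
// → [['api', 'auth'], ['people-card'], ['search-dev-app']]
```

The first batch here contains the 10-independent-packages case from the talk: anything with no unfinished local dependencies can have its watch process started at the same time.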
To sum it up, when you are building a monorepo, you need to ensure that you are picking the right tools based on the complexity of your app. If you have only a handful of packages, you may not need a lot of complexity in building your monorepo, and you may get by with a few shell scripts; but depending on the rate of growth of your monorepo, pick the right tools, whichever are useful for you. As for linting, unit tests, perf testing, functional tests: this is a topic we could discuss all day, because each of you will have your own preferences, but if you're using Lerna or Lerna-like tools, they let you run these linting and testing processes very easily across packages, right? The other part: if you're using TypeScript modules, while there are a lot of advantages, there is another prerequisite step before you can generate JavaScript code; you have to handle TypeScript compilation. For a better developer experience, you want to ensure that the typings can be generated much sooner, so that IntelliSense in your IDE can pick them up; that way the developer doesn't need to wait for the full build process to run before they can start coding, right? We have a bunch of optimizations there, because we use TypeScript a lot. Since there is a BoF session on TypeScript separately, and if you have a lot of TypeScript-related questions, you can always catch me after the talk as well. So yeah, that's pretty much what I had. These are some of the references I've used material from in my talk. Vincent, who is on our team, published a Medium article; he was part of this monorepo team since the beginning, and he has written a very good article on how it evolved, the difficulties they had to go through, and how we have now evolved to a much better state, right?
I've also given references to the tools I've mentioned in the talk, which can really help you if you are using a monorepo setup. So I think it's time for questions. We finished a little ahead of time, so there's a lot of time for questions.

Q: Hi. I have a few questions. The first question is: you were showing all the packages, right? Is it only front-end, or is everything combined regardless of language?

A: Like I said, in your monorepo you can have API-related code as a package, and auth-related code as a separate package. It may not be just front-end components; it can be all the back-end code as well, which can be broken down into smaller packages and reused.

Q: Yeah, but in that case, how does the dependency handling work? You were talking about node_modules, right? But there are some packages which are back-end, so their package management is handled in a different way, the dependencies are managed in a different way. If it is npm-only, it is handled within Node.

A: Yeah, so obviously, in this monorepo, if you're writing all JavaScript or TypeScript packages, you would put them in the same repo; but across languages, probably only if you have the relevant transpilers, depending on your requirements. I'm not sure if that's a...

Q: Yeah, currently, how is it done in your search? I just want to...

A: Yeah, so we have a lot of middle-tier code and data-layer code which is part of the monorepo. The search experiences are integrated in a slightly different way in Bing.com, but they still use the common API-calling mechanisms, telemetry mechanisms, and even some of the core package code that we have across all the search components; that's something they use as well. This grows in complexity as you add platform-relevant wrappers over each of them, and then you have a lot more to handle.
So breaking it down into smaller units really helps you a lot.

Q: And how are the interdependencies between packages handled? Like when one package is dependent on another, right?

A: Yeah, as I said, probably on that slide: apart from the package dependencies, you also symlink the rest of the packages into the root node_modules folder. So if, as I said, package three depends on package one, it is already available and installed in the root folder, so it can be resolved easily. Yes.