Thanks everyone for joining this evening, this morning, wherever you're joining from. You know, I appreciate you all coming to Selenium Conf Lite. This is kind of the precursor for the main event. I like to tell my team that this is kind of a test for the main event, so we test if everything is in place and we can, you know, go from there. So I would love to introduce Simon, who's going to be kicking off the lite event. I'm sure most of you know Simon. He's the author of WebDriver and has done a whole bunch of things with Selenium. I would say today's Selenium is Selenium to a large degree because of Simon's contributions. So Simon is going to be talking about some of the CI-related stuff, which is an extremely important part. In fact, I think of it as the lifeline for most projects. So Simon's going to be talking about that today and sharing his wisdom of, you know, how the Selenium project itself uses this. So without much delay, I want to hand it over to you, Simon. Brilliant. Well, thank you very much for that introduction. That's very kind of you. And it's fantastic to be back here at Selenium Conf Lite. Let's find out if everything works. As Narek said, if you've been part of the Selenium community for a while, you may know me as the person who led the Selenium project from 2009 until the end of 2021. But if not, it's nice to meet you and have a chance to talk to you. Now, normally in these kinds of talks, I introduce new features or talk about the direction of the project. But today I thought I'd take a different approach. Hopefully, if you're here at Selenium Conf Lite, you believe in the value of automated testing and you probably have a CI pipeline. So I thought I'd explain a little bit about how we build and test Selenium itself, and why I think you might want to look at adopting some of the tools and techniques we use. Now, this here is a screenshot of the current source tree on GitHub.
As you can see, the Selenium repo is effectively a monorepo containing bindings for Java, C#, Python, Ruby, and JavaScript in the same tree. The language bindings all share some common code, which is written in JavaScript and which we call the atoms, because they're the smallest indivisible units of browser automation. These are compiled using the Google Closure Compiler to create minified lumps of JavaScript that we pull into each of the bindings. The support for Chrome's own debugging protocol, the CDP, is also shared between languages, with code generators wrapping the raw JSON descriptions of the CDP to create language-specific implementations. Of course, not only do we have language bindings, but we also have the Selenium server in the same tree, and this is written in Java with a React-based web UI. Finally, we try and share as much of our test code as possible, and this includes shared web pages for test cases and a basic web server to serve these out. As you can see here, the code is primarily organized at the top level of the project by language, and then within each language we organize by whatever seems right for that language. When we started the project, it was a little unusual to arrange code the way we do now. Nowadays, with the rise of complex UIs written in JavaScript and protobufs being more widely used, it's increasingly common to find these polyglot code bases, that is, those with a mix of languages all within the same repo. Even if you're not using lots of languages, things like Docker and Kubernetes are changing how we perceive our code can and should be built. So Selenium is a project and code base that's almost ideally suited to being built with a build tool that can handle multiple different languages, and I think more projects are starting to head into this world. Before I begin describing what we do now, I think it might be helpful to see how we got to where we are.
When I joined the Selenium project in 2009, there was one person left who understood how the whole build worked. The bulk of the code was in Java, so it was natural to use Maven back then. This was great for the server-side pieces, but for the language bindings there was an obvious problem: people wanted the APIs for all supported languages to match and keep pace with each other. To solve this problem, the API was described in an XML document, and this was transformed using XSLT into the specific language bindings. Maven would then delegate to each language's native build tool to compile and package these language bindings properly. Needless to say, there was a surprising amount of stuff that was needed to get the build working, and it depended quite heavily on the developer having set up the machine just so. Building away from Windows was somewhat tricky for the obvious reason that .NET wasn't really portable at the time. Surprisingly, the complexity of the build wasn't really a problem. After all, because of the way that Selenium RC worked, the language bindings were mostly dependent on the XSLT. Once that was working, the real action happened either in the server, which was written in Java, or in the injected JavaScript, which was shared. Put another way, if the build worked in Java, it probably worked everywhere, and that meant the only people who needed to worry about the complexity of the entire build were the ones doing a release. Now, WebDriver started in 2007 as a small open-source project independent of Selenium. I knew from the start that I wanted to support multiple different languages and that I'd need to compile them all in a single build, so I went hunting for the right tool. The problem is, at the time there weren't really any great choices for a polyglot build tool. Ant and Maven existed for Java, but they really didn't handle Python or C# well, if at all.
MSBuild and NAnt were great for your .NET projects, but when it came to Ruby or Java, no good. So I looked around for a solution, and the one I settled on was Rake, which is a Ruby build tool. Why Rake? Well, while Rake is ideally suited to building Ruby gems, it has the interesting property that it's a DSL for describing builds, and each Rake build is itself a Ruby program. That means that it's entirely possible to extend the DSL with custom types of build artifact, providing you can figure out how to express that compilation in code or as a shell script. Better yet, Ruby makes it easy to execute random shell commands. The original build for WebDriver then consisted of a single build file, which declared some helper functions and then just ran a fairly simple build. I say fairly simple, but as the project grew and the number of browsers and languages we supported expanded, this Rakefile became increasingly terrifying. In the end, there was only one person on the project who felt comfortable working on it, and that was me. Clearly, the situation had become completely untenable. We needed to find another way of doing things. By this time, I was working at Google, which had a really impressive build system. The way that builds at Google worked, and I believe still work, is that there are build files scattered all over the source tree. While there are a lot of build files, it's easy to see how they link together, and by that I mean that changes in one part of the tree don't affect changes in another. The other nice thing with the build files was that rather than describing a set of steps to perform, they contained a description of the outputs that would be produced: a Java library here, a Python binary there. Each of these outputs listed the inputs it needed. It looked very much like a series of Python calls, and that's because originally they were Python calls.
These ideas of scattering build files through the tree and having them describe the outputs to be built were utterly, utterly brilliant, and I loved it. So I asked if I could open source that idea, and the open source people at Google had no problem with it. By now, we'd merged WebDriver and Selenium into one project called Selenium. We still used Rake for the project's combined builds, and you'll remember that Rake builds are essentially a DSL that executes as a Ruby program. You can do whatever you want within that structure. So I did. Using a parser generator called Ragel, I described the build grammar I wanted, and then I implemented that as a series of Ruby functions. The build started by finding all these build.desc files in the tree, and then used those to generate Rake targets using a target syntax that looked remarkably like Bazel's does today. Once that was done, we just let Rake do its thing. On this slide, you can see one of the build.desc files. But there was a problem. We'd accidentally written a not entirely awful polyglot build tool at this point, but the Selenium team has never been huge, and we already had our hands full maintaining what was rapidly becoming an extremely popular and large open source project. We didn't really have the time or energy to support a second project. So we used some social engineering to make sure that no one would try and extract and use the build tool we had in Selenium for themselves. We gave it a ridiculous name that described how much I'd enjoyed working on it: we called it Crazy Fun. Crazy Fun was all well and good, but it had some drawbacks. We never really got parallel builds working properly, so although we could build anything, we could only build that anything very slowly, and that's suboptimal. By this time, I'd joined Facebook, and there I had met up with Michael Bolin, who'd also worked at Google.
Now, an interesting thing with the Xoogler diaspora is that many of us missed the wonderful tooling that Google has. Michael was no exception, and he was missing Blaze, their build tool. Rather than moping, he'd created a new build tool called Buck, which was rapidly gaining usage within Facebook. It shared the concept of build files scattered through the tree, and also had a very familiar syntax for its build files. Best of all, it was optimized for speed on a developer's machine and could automatically parallelize work by analyzing how targets within the build files related to each other. The question I asked was: what if Crazy Fun delegated to Buck wherever possible and just handled the bits of the build Buck couldn't? It turns out it wasn't hard to wire the two together, with Crazy Fun delegating to Buck where possible. And suddenly, Selenium builds were a lot faster. This was joyous. Coding was fun again. At this point, it sounds like we had everything sorted out, right? But if that's the case, why is this talk called CI with Bazel? Well, while Buck was definitely a step forward, it wasn't ideal. And why not? Well, there are plenty of reasons, but one of the most important technical reasons was that extending Buck was extremely difficult. When we'd been working on Buck at Facebook, adding an extension mechanism had always been a high priority, but it never quite made it to the top of the list. That meant the only way to extend Buck was to fork it and add your own extension to Buck's own source before compiling the fork and uploading it somewhere. Being able to extend our build was really important for Selenium. We were building things that people at Facebook just didn't need, like the XPIs for Firefox and the .NET code. In addition, the Selenium team had been very patient humoring me with various build systems, but it was clear that people would be happier using something that other people knew.
And that leads me to the other reason for moving away from Buck: the social ones. The frankly weird build system that Selenium had made it really hard to find help when it went wrong. There was no way to find the answer to your problem on Stack Overflow. There was no one to ask for help other than the Selenium developers themselves. We really wanted a build system that had an active and vibrant community, a build system where we could ask questions and where Stack Overflow might be able to help. Community matters. Sadly, from the outside, it also looks like Buck has basically ground to a halt, and it never really had a large community built around it, for any number of reasons. I'm really looking forward to the rumored Buck2 that's coming out at some point, but for now, the version of Buck that Selenium used looks like a dead end. So we decided to look for another build tool entirely, one that was well supported, fast, flexible, and good enough for Selenium. What criteria did we use to assess the build tool? Stepping all the way back, let's answer the question of what a build actually is. Regardless of your language of choice, we can think of it as a process of transforming the original source code into something we can ship to our users, be that a binary, like the Selenium server, libraries people can depend on, or just a simple script that we can run to do something interesting. If you're using a compiled language, such as Java, Go, or Rust, one of these transformations is to invoke the compiler to generate binary outputs. For JavaScript folks, one transformation might be concatenating all the source files based on inputs, and another might be tree shaking to reduce the size of the generated JavaScript. Another kind of transformation would be to take generated outputs and package them up into zips or JAR files. If we tilt our heads and squint a little, we can even say that running tests is one of these transformations.
After all, we take test code and transform it into a series of passes and failures. Now, behind the scenes, build tools typically attempt to reduce work by looking at the inputs of each step of the build, and if those inputs haven't changed, they simply skip the step. What inputs should we be considering? Well, let's take the case of using NPM to compile some TypeScript. The obvious inputs we think about are the actual source files, but we've all run into the problem where the build works on one machine but not another, so that can't be all there is to it, right? One reason is that tooling, such as the TypeScript compiler, is present on one machine but not on another. So the TypeScript compiler is an input. But they fix bugs between releases, so I guess the version of the compiler also matters. The version is an input, too. Because these inputs aren't normally stated, we can think of them as implicit inputs. Worse, compilers can be affected by things like the command line flags used, or even environment variables, or the OS you're running on. These are more implicit inputs, and very few build tools enumerate them properly. This is one of the reasons we all suffer so much from "works on my machine". So in the ideal build system, we'd have a way of listing not only the obvious but also the implicit dependencies. Another thing to bear in mind is that the outputs of one function are inputs to another. We need to properly track what's being produced rather than just dumping it out on the disk and hoping for the best. Not tracking this stuff properly is one of the major reasons we need to run clean builds, and those are incredibly wasteful if we do them, and risky if we don't. I guess we'd better add that to our list of things for an ideal build system: track outputs. Another implicit input is when we're running our build. Try and resolve the latest version of some dependency you use right now, like Selenium, and you'll get one version. Try the same thing in six months.
You'll likely get another version entirely. How do we handle this? With lock files. These allow one person in one place at one time to figure out which particular set of dependencies play nicely together, and write that down somewhere we can check in alongside the build. This means that subsequent builds don't need to take the time to do the same dependency resolution, meaning a faster build, and we've also removed the importance of the exact time and date when the build is being run. It's such a good idea that almost all modern build systems have some mechanism for creating, sharing, and using lock files. Clearly, the ideal build system would do this too. Once we've nailed down implicit inputs, our next problem is that we need to be able to tell if an input has actually changed or not. One common way to do this is to take a look at the time the input was last modified, the mtime. If that's later than the last time a particular output was generated from it, then clearly it's dirty and we should regenerate the output. That's the way Maven and Gradle work: by looking at the mtime. But this isn't a great way to do things. It's all too easy to get the modified timestamps out of line with the derived outputs, causing us to either under- or over-build. Either way is not great. What else could we do? Well, the thing that really matters is what was in the input files, right? We don't actually care when they were modified, just that they've been changed. The way that this problem is commonly solved is by hashing the file and using the hash to figure out if the file has changed. There are plenty of fast hashing algorithms out there, and there are ways we can avoid having to read the entire file system each time we do a build. So although this might be a bit slower than checking the mtime, the reduction in redone work is enough to make this a really good way of figuring out whether something has changed. Our ideal build system will hash inputs rather than relying on the modification time.
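To make that concrete, here's a minimal Python sketch of hash-based change detection. This is purely illustrative, not how Bazel actually implements it: the cache key covers the file contents plus implicit inputs such as the tool version and command-line flags, so a change to any of them forces a rebuild, while a mere timestamp change does not.

```python
import hashlib

def action_key(sources, tool_version, flags):
    """Build a cache key from everything that can affect the output:
    the file contents (not their mtimes), plus implicit inputs such as
    the compiler version and the command-line flags."""
    h = hashlib.sha256()
    for name in sorted(sources):          # deterministic ordering
        h.update(name.encode())
        h.update(sources[name])
    h.update(tool_version.encode())
    h.update(" ".join(flags).encode())
    return h.hexdigest()

def needs_rebuild(sources, tool_version, flags, last_key):
    # Rebuild only if the content-derived key differs, no matter
    # what the file timestamps say.
    return action_key(sources, tool_version, flags) != last_key
```

Notice that bumping the compiler version alone changes the key, which is exactly the "works on my machine" class of bug this approach catches.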
There's another thing we'd like from our ideal build system: that for each step, the same inputs lead to the same output. Why is that something we'd like? At the simplest level, it makes caching a lot easier, meaning that we can do less work in our builds. Now, sometimes some non-determinism in outputs is unavoidable. I'm looking at you, C compilers. But most modern toolchains for most languages we care about make this possible. So our ideal build system will do its best to make sure that each step of the build is what's known as a pure function: given the same inputs, it will always give you the exact same output. Now, if we know the hashes of all our inputs and we also believe we have a build where the same inputs lead to the same outputs, we can start doing some pretty clever things. For a start, it's possible to store each and every thing we produce in content-addressable storage, using keys derived from hashes of the inputs. If you're not familiar with content-addressable storage, just think of it as a giant dictionary or hash: a key points to a specific value, an output. We could host that on some central server, and then everyone in the team who's trying to build the same thing can take advantage of the work that's already been done. Better yet, if our listing of the implicit inputs is sufficiently good, there's no reason to be constrained to just running on our local machine. Why not have a fleet of boxes running in the cloud, each of which can reach into the shared cache, grab the inputs it needs, and run the build steps contained in a single pure function before uploading the results into the cache again? That way, you could run hundreds of build steps at the same time, and everything would go faster still. It would be nice if our ideal build system supported remote caching and, through that, distributed builds. This is something we ran into really quickly with Selenium, but more and more people are running into it now: our builds are seldom just one language.
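As an illustration only, here's a toy content-addressable cache in Python: a dictionary keyed by input hashes, standing in for the central server described above. Real remote caches, like the ones Bazel talks to, speak a dedicated protocol, but the shape of the idea is the same.

```python
import hashlib

class ContentAddressableCache:
    """A toy stand-in for a shared content-addressable store: outputs
    live under a key derived from the hashes of an action's inputs, so
    anyone computing the same key reuses the stored output instead of
    redoing the work."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def key(inputs):
        h = hashlib.sha256()
        for name in sorted(inputs):
            h.update(name.encode())
            h.update(inputs[name])
        return h.hexdigest()

    def get_or_run(self, inputs, action):
        k = self.key(inputs)
        if k not in self._store:       # miss: run the pure function once
            self._store[k] = action(inputs)
        return self._store[k]          # hit: skip the work entirely
```

Because each build step is a pure function, it's safe for the second, third, and hundredth builder to take the cached output without running anything.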
It's not uncommon to have a JavaScript front-end talking to backends written in Java, Go, or Rust, or maybe all three, maybe using protocol buffers to define the shape of the data in the RPCs that are used. And then we want to be able to run these things locally but also package them as Docker images. We need almost all these things in the Selenium project. Our repo contains code written in JavaScript, Java, Python, C#, and Ruby. Each of these languages uses fragments of JavaScript compiled down to those small atoms of reusable code. So we all need to consume JavaScript. The Java build includes a web-based front-end written in React, and the language bindings also have code generated from CDP definitions, which are stored as JSON. We want to be able to test by deploying the grid in a fully distributed mode to a local Kubernetes cluster. There are so many interdependencies and so much shared code between the languages that separating the repos would be a tragic waste of effort. If we relied on language-specific build tools, the number of tools would be mind-boggling. Weaving them all together to produce a cohesive and sensible whole would be a daunting task, and one that most people would quite rightly avoid if at all possible. Crazy Fun already offered us the ability to run Java tests with the same ease as JavaScript tests or Python ones. We could already build things like Python wheels, Ruby gems, and Java JARs. We quite liked the build files being easy to work with, with a common syntax being used no matter which language we were building with. Our ideal build system would retain this polyglot nature and the simple build files. We're asking a lot of this build system, right? It'd be like finding a unicorn, right? There's a slew of new build tools that have appeared recently, but the one we settled on was Bazel. Originally, this was called Blaze and was part of Google's secret sauce for dealing with their massive monorepo.
It meets many of the criteria we have for our ideal build system. It tracks inputs and outputs carefully, so we seldom need to run a clean build. Each step in the build, called an action, attempts to be a pure function, and often succeeds in being so. It works with all the languages we support, although the .NET support is shaky at best; it's also something someone else is working on. There's a wealth of support options available and plenty of answers on Stack Overflow. There's another reason why making the jump to Bazel was good for us: it made getting started building the project significantly less painful. Using Bazel means there are fewer dependencies that need to be on the developer's machine. We now use pinned versions of Python, Ruby, and the JDKs used to compile the project, as well as pinned versions of NPM and all of our third-party dependencies. We even have pinned versions of Firefox, Chrome, and Edge, as well as the drivers for each of these browsers. That means that someone new to the project doesn't need to have all these things already installed before getting something done. Instead, they need to install Bazelisk, which is a widely used wrapper for Bazel that allows us to pin the version of Bazel we use, the .NET developer tools, the various runtime libraries the browsers need, though not the browsers themselves, and Python 3, which we transparently call from a script run before every build. It took us a while to fully adopt Bazel, and members of the Selenium team have been responsible for improving many of the language-specific Bazel rule sets we use. It's not always been fun to get there, but we made the leap, and now many of the rough edges have been smoothed. All this means that getting started on the Selenium build is a lot easier than it used to be. But there's another reason for picking Bazel, and that is that it opens up some really interesting possibilities for our CI builds. A lot of this rests on the fact that we can query the build graph. The build what now?
Graphs are really useful data structures. They represent nodes that are connected to one another with lines. Think of a subway map and you've got an idea of what a graph might look like. The kind of graph we're interested in is a directed acyclic graph. Directed means the lines connecting the nodes have a direction, and acyclic means that there's nowhere in the graph where, if you followed the lines in the direction they point, you'd be able to make a circle. You can imagine that your build describes one of these directed acyclic graphs. Now, some build tools use a really coarse build graph, but Bazel opts for a very fine-grained one. Why is that? The nice thing with graphs is that there are some really well-known algorithms for figuring out how to parallelize operations on each node of the graph. The finer the build graph, the better the chance parallelism can be used, and the faster our builds. Within the Selenium project, we aim for a single build file per Java package. On this slide, you can see a simplified build graph generated by Bazel of the dependencies of the core WebDriver APIs. Each node has a label attached to it. These begin with a double slash, have a path, and then a colon followed by another string, and they tell us exactly where in the source tree each target is. I said it was important for us to be able to easily extend our build tool. Bazel does make this easy by allowing us to write extensions in the form of our own build rules. This one is from the Selenium project. Bazel rules are written in a Python-like language called Starlark. This means that if you're comfortable with Python, you'll feel pretty comfortable straight away. If you're not, then Starlark has taken away much of Python's complexity to make the programming model even lighter. Don't worry too much about what this rule does, but the interesting thing is that each rule is composed of a series of actions. In this one, we create two actions: the first creates a directory, and the second runs a shell script.
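To illustrate why a fine-grained graph helps, here's a small Python sketch (purely illustrative; Bazel's scheduler is far more sophisticated) that groups actions into "waves", where every action in a wave has all of its inputs ready, so the whole wave can run in parallel. The finer the graph, the bigger the waves.

```python
def schedule_waves(deps):
    """Group build actions into waves that can each run in parallel.
    `deps` maps each target to the targets it depends on directly;
    the target names below are illustrative, not real Selenium labels."""
    remaining = {target: set(d) for target, d in deps.items()}
    done, waves = set(), []
    while remaining:
        # Everything whose dependencies are all finished is ready now.
        ready = sorted(t for t, d in remaining.items() if d <= done)
        if not ready:
            raise ValueError("cycle detected: the graph is not a DAG")
        waves.append(ready)
        done.update(ready)
        for t in ready:
            del remaining[t]
    return waves
```

With a toy graph where the Java and Python bindings both depend only on the atoms, both land in the same wave and build simultaneously, while the grid server waits for Java.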
To the user, the build is composed of a series of rules that are tied together in the build files. But under the covers, the build is actually a series of actions, each of which can be executed when its inputs are ready. This allows Bazel to have an even more finely grained build than it otherwise would, and again allows builds to go just a little bit faster. So let's assume you've checked out the Selenium project. How does one build, for example, everything in the Java and Python trees? Easy: like this. Notice that we're building both languages at the same time with the same bazel build command. The build is also pulling in various bits of JavaScript to make things work the way we expect them to. This is an actual build of Selenium. For me, on my machines, a clean build can take about two to three minutes, but here I've already partially built some stuff, notably the React UI, so the build is a lot faster, only taking about 11 seconds in the end. Now let's run all the smallest tests we have for Python and Java. I could do the other languages, but I think this shows the idea. It only takes about a minute to build everything and run the tests, and you'll see there's a failing test in there if you've got keen eyes. One thing that Bazel is doing here is automatically parallelizing the tests for us, so we're using my machine as efficiently as possible. If you've got keen eyes, you'll see that some of the tests haven't been run; they've been cached. That's because I ran a subset of these earlier, before recording the video. Bazel knows that I've not changed anything and the tests passed before, so it's not running them again, which is probably what you'd hope a build system would do. I don't need to know anything other than the fact that the tests are there to be run. Even though I'm running both Python and Java tests, the same command is used to run them.
This means that anyone on the team can make a change in the shared JavaScript and rerun all the affected tests without needing to know how to do this on a per-language basis. It's a really powerful thing. So now you know a lot more about Selenium, why we chose Bazel, and the journey we took to get started using it. Along the way, we've made plenty of contributions to the Bazel project and the rules that we use. I'm really proud that the Selenium team has had a positive impact, and I hope that some of you get to benefit from the work we did. But this talk isn't about the history of the Selenium project and the why and how of our migration. It's about what you can do with Bazel in CI, using Selenium as an example. So: continuous integration. For the sake of this discussion, I'll ignore the CI builds where we were still marrying Crazy Fun and Bazel, and instead focus on just the Bazel bits. I guess we should start at the beginning. The Bazel migration took a long time, and we did it in stages, starting with Java. We started our Bazel CI builds the way that many teams do: by running bazel test on everything and merging when it was green. This approach has the advantage of staggering simplicity. With it, you know the code definitely builds and the tests definitely pass, or you know they don't. The approach worked well, but the machines that our GitHub Actions run on are pretty underpowered, so we decided to make a few changes. Remember, the Selenium repo is a polyglot repo, with different languages being broken out into different top-level directories. It's simple enough to create a workflow per language and fan out. So rather than bazel test everything, it's possible to run bazel test on everything in the Python tree, and bazel test on everything in the Java tree. If caching has been set up right, then if those tests turn out to be a no-op, they'll finish pretty quickly, since they'll just be hitting the cache.
The simple heuristic we could have used is to check the top-level directory of every file within a given PR and just run the per-language workflows for those, but there are a few problems with that. The first is that occasionally we'd modify a file that wasn't under a particular top-level directory, so it didn't have a matching language-specific workflow. I guess that's not much of a problem, because then you just do nothing and let the build pass, right? Except then you hit the second problem. Sometimes those top-level files actually matter. You do want to run those per-language workflows. As an example, bumping the .bazelversion file to contain the latest version of Bazel really should force all the tests to run. Related to this, sometimes there's code in one tree that affects the others. Most often this happens in the Java tree, where the grid server is used by the tests of the other languages, and the JavaScript tree, where there's a substantial body of shared code. So language-specific workflows are great, but we needed a more sophisticated mechanism for indicating what we should do. It would be nice if we could use Bazel's build graph. We already know that Bazel maintains a build graph. It also provides a query command that allows us to introspect that graph. As an example, while I was preparing this talk, I noticed that if I change the Color class in the support package, an awful lot of tests run, and it's not obvious why. Let's ask Bazel and see if it can help. First of all, we want to get a list of the tests that need to be run if the Color class changes. Then we'll pick one of those and find out the path between that test and Color. I don't know about you, but I find text quite hard to read when it's like this. Let's generate an image of the graph so we can actually see the reason. I know we're going quite fast here, but the idea I'm trying to get across is that Bazel allows you to easily introspect the graph. This opens up all sorts of interesting possibilities for us.
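The "find the path between that test and Color" query is, under the hood, just a path search over the build graph. Here's a toy Python sketch of the idea, loosely mirroring what a Bazel somepath-style query reports; the graph and target labels are made up for illustration.

```python
def dependency_path(deps, start, goal):
    """Find one chain of dependencies explaining why `start` (a test)
    ends up depending on `goal` (the changed class), via a plain
    depth-first search. `deps` maps each target to the targets it
    depends on directly."""
    stack = [(start, [start])]
    seen = set()
    while stack:
        node, path = stack.pop()
        if node == goal:
            return path            # one explanatory chain, start -> goal
        if node in seen:
            continue
        seen.add(node)
        for dep in deps.get(node, []):
            stack.append((dep, path + [dep]))
    return None                    # no path: the test is unaffected
```

The returned chain is exactly the kind of "it goes through this shared module" explanation the graph image shows.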
Don't worry if you missed the details, as long as you have the idea. Here you can see why the tests depend on Color: it's because the dependency goes through this driver module, which is used by all of our tests. Fantastic. It's good to know that. This idea of querying the build graph to figure out what's changed is often called target determination. A target determinator is a tool that identifies the files that have changed between two revisions of the codebase, then runs a series of queries to identify the targets that have been affected. Each of the language-specific workflows we have is gated by a script which is very heavily based on a similar script used by Bazel itself in its own CI builds. You can see the meat of the script here. For each file in the PR, we identify the matching Bazel targets. We then look up the reverse dependencies of each of those targets in the entire repo, before filtering that down to a list that identifies only tests. Once we have that list of test targets, we can compare it with the per-language prefix. If and only if there's a match there, we allow the per-language workflow to continue. That's where we are now. If you're building with Bazel, you can also adopt some of the tooling we've created around the project. Specifically, the idea of target determination can take a traditional build pipeline of "run all the small tests, then run the medium tests, then finish up by packaging things", and turn it into a far more focused process. By running only the specific subset of tests you need, your average CI times can drop dramatically. On one project I worked on, build times dropped by an average of about 40%, dropping from two hours to about 20 minutes. But there are some things we can do to go even faster and have our build be even more precise. The first of these is to improve our target determinator. The one we have is fine, but it misses some pretty important edge cases, such as when we change the pinned version of Bazel.
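The gating script's logic can be sketched in a few lines of Python. This is a toy model, not the actual script: walk the reverse-dependency edges out from each changed target, keep anything that's a test, and then filter by the per-language prefix; all the labels here are illustrative.

```python
def affected_tests(rdeps, tests, changed_targets, language_prefix):
    """A toy target determinator. `rdeps` maps a target to the targets
    that depend on it directly; `tests` is the set of test targets;
    the language prefix gates the per-language workflow."""
    seen, stack = set(), list(changed_targets)
    while stack:                       # walk reverse deps transitively
        target = stack.pop()
        if target in seen:
            continue
        seen.add(target)
        stack.extend(rdeps.get(target, []))
    # Keep only tests, and only those in the language tree we gate on.
    return sorted(t for t in seen & tests if t.startswith(language_prefix))
```

If the result is empty, the per-language workflow is skipped entirely; that's the whole trick behind the dramatic CI-time savings.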
Fortunately, as part of the bazel-contrib GitHub organization, there's now a rather nice target determinator that's being actively worked on and used by a few projects. It'd be really nice to adopt this, and in the demo I'm about to do, this open source project is the second tool I use: a binary called target-determinator. Here's a demo of it in action on a recent Selenium commit. You can see that when we run the current target determinator it returns an error, and so we don't run any tests. Now, the change that we're looking at only touches the IE driver, but we don't build that with Bazel, since it's effectively frozen. Because of that, we're not expecting any changed targets to be detected, or any tests to run. Now, you'll notice that the second target determinator appears to be a lot slower than the first one. Part of this is because it's actually working; Bazel didn't even get a chance to do any work previously. But the other reason is that the second target determinator is doing a lot more work, as it's comparing both the current and previous versions of the repo to make sure that nothing is falling through the cracks. The tradeoff with completeness is speed. Given that our CI builds can take an hour, identifying that no targets need rebuilding has saved us an awful lot of time. We'll just let this finish running; I think it's nearly done now. It appears to be querying the state of the project after the change has been made. It's all very exciting watching a script run, isn't it? Just a few seconds more... there we go. After a minute, it figured out nothing needed to be done. We saved an hour. I mentioned it's possible to run Bazel tests remotely. Now, there are a handful of open source projects that allow you to set up and manage your own distributed build grid, much like Selenium's own grid.
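Before we move on, the target-determinator binary from the demo is driven from the command line. A typical invocation looks something like this; check the project's `--help` output for the exact flags, as this is sketched from memory:

```shell
# Print the targets affected between a base revision and the working tree,
# one label per line. The merge base with the main branch is the usual anchor.
target-determinator "$(git merge-base origin/trunk HEAD)"

# An empty result means nothing was affected, so the CI run can stop early.
```

Because it checks out and queries both revisions, it is slower than a naive diff, but it catches the edge cases, like a bumped Bazel version, that the simple script misses.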
I've been using Buildbarn recently and it's been great for us at work, but for an open source project, I'm not particularly keen on running the infrastructure for a Bazel build grid. Given that we're all Selenium users, I guess most of us are familiar with companies such as Sauce Labs and BrowserStack that offer Selenium as a service. Fortunately, there are some build-as-a-service offerings that are starting to appear. One that I like is called EngFlow, and I have a fork of the Selenium project that can use it. Just as before, we're going to run our tests, but now, rather than running them locally, I'm going to run them on EngFlow's distributed build grid. It's as simple as adding one more flag to the Bazel command, telling it to use the EngFlow config. Now, one thing this allows us to do is run more tests than I can on my regular machine, because there are hundreds of tests and the grid can scale horizontally. But the other thing is that we can share work easily. In this example, we're running the same build on two different machines. I'm using separate Docker containers, and there's nothing shared between them. You can see this plugging away. Now, I started running this test while I was talking; this retry-request test takes quite a long time to run, so settle down and make yourselves comfortable. You'll see that we've run 653 tests. Scrolling up, I'm going to copy the command that I used to run these tests. You'll also notice the host name begins with 4-8-F. Here, we're on a different host. There's nothing shared between these machines other than the EngFlow distributed build cache. They can be on different computers, different planets; it doesn't matter. Different planets would be really impressive. When Bazel starts, all it does is take a look at its workspace and figure out what's in those BUILD files and what it has to go and build.
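As an aside, that "one more flag" is a `--config` selection. A sketch of what the command and the `.bazelrc` stanza behind it might look like; the endpoint here is a placeholder, not EngFlow's real address:

```shell
# One extra flag picks up the remote configuration from .bazelrc:
bazel test --config=remote //java/...

# The stanza it refers to might look something like this:
#
#   build:remote --remote_executor=grpcs://demo.cluster.example.com
#   build:remote --remote_cache=grpcs://demo.cluster.example.com
#   build:remote --jobs=50
```

`--remote_executor` sends the build actions to the grid, `--remote_cache` shares their results between machines, and a higher `--jobs` value lets Bazel exploit the extra parallelism.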
Then, for everything, it goes up to the remote end and asks: have you built this already? Have you done this already? You'll see that's what it's doing now. These tests are coming through really fast. You'll see we're a quarter of the way through, a third of the way through. That thing there being built is the React UI front end for the grid. And you'll see here that zero out of 653 tests needed to be run again, because they were all cached. Imagine that. Now, our local build times using Bazel are already pretty good, but with remote builds enabled, I think we can get these wickedly fast build times. That's why, if I had to choose between a better target determinator and getting Bazel to do distributed builds, I'd always plump for distributed builds. But time is drawing to a close; perhaps I should start wrapping up. Selenium is a complicated project with lots of languages. Would Bazel be a good fit for you? Let's look at the downsides first. Most importantly, there's the learning curve. If you have access to someone who's used Bazel successfully before, it's a lot easier to get started. If you've not got access to a local expert, then I'd be cautious for now. The nice thing is that the pool of people who are familiar with Bazel is growing all the time. It's already less of a problem than it was when Selenium adopted the tool, and that's a situation that's only going to improve. The next rough edge with Bazel is that writing build files can be hard. Fortunately, that's also changing pretty rapidly. There are build file generators for protobufs, Go, Python and Java. If you're interested in exploring this area, the tool the Bazel community is working with is called Gazelle, and it's well worth a look. Another problem is that pinning everything comes at a cost.
Downloading all those toolchains and third-party dependencies can mean fetching hundreds of megabytes of stuff before the very first line of code can be compiled. Waiting all that time is dull, and it's worse because we can't really do anything until it finishes. Well, at least we only need to do this once, on a fresh checkout, or when someone updates a dependency, and that doesn't happen often. It's also possible to reduce this cost by having caches which are stored between Bazel builds. Now, I will note this isn't a problem unique to Bazel. Anyone who's using a modern Java or JavaScript build tool will be familiar with long pauses as they download all the required dependencies or do dependency resolution. It's true that Bazel's approach leads to more things being downloaded, but at the end of the day, it's typically a cost we seldom have to pay. Now, some toolchains, like Go's, are incredibly quick already, and adopting Bazel for them isn't such an obvious win. But the moment you start needing something to coordinate steps, perhaps if you want to generate some files, then any build tool starts to look more useful. And if your project is going to grow and develop over time, and you're going to use multiple languages, then Bazel becomes an option worth considering. IDE support for Bazel isn't great. If you're on the happy path of working with Java, C++ or Go, you'll be okay, but once you're off that path it gets less comfortable. Fortunately, the plugins for both IntelliJ and VS Code are improving, and support gets better with each release. The other day, JetBrains announced that they were taking on co-maintainership of the IntelliJ plugin. And finally, Google uses Linux for almost everything, and there are very few people there using Windows. Bazel is derived from Google's own Blaze build tool, and so it has a strong Unix bias. It works great on macOS, but Windows can be a bit of a struggle.
If first-class Windows support is something you need, Bazel isn't for you. Now, lots of the Bazel docs talk about how great it is at massive scale, and plenty of Bazel advocates, myself included, are monorepo fans. But it scales down too. This is largely to do with it doing better rebuilds and being able to selectively run subsets of tests. If there's access to a build grid, then the point where build speed improves comes even sooner. Another massive benefit is the simplicity of the shared command line for building, running and testing. Just knowing that to build something I use "build" and to test something I use "test" is conceptually simple, and because Bazel does such a good job of caching, I typically just choose to test everything. And one of the nice things about Bazel's build files is that they're language-agnostic. That means that if I can read one, I can figure out how the various bits of the build hang together without needing a deep knowledge of how multiple tools work. Again, it makes it easier to work in a polyglot repo, but it also makes it easier to switch between repos in the same company that all use Bazel. If one project is using Java or Kotlin, one's in JavaScript, and another is in Go, no biggie. As engineering organizations get larger, that ability to move smoothly from repo to repo becomes more important. Because Bazel pins dependencies, there's far less variance between what one developer is using and another. Couple that with being able to build remotely, and you can dramatically reduce the number of cases where something works on my machine but not on someone else's. The number of engineering hours I've seen this save is incalculable. As you saw earlier, builds are ultimately defined as actions, and these have all their inputs and outputs defined. This means that if you have access to a Bazel grid, you can use it for anything that Bazel supports. You've seen examples of that today when we did the distributed builds.
It's such a powerful thing to do, and so useful when you have access to it, that it's hard to go back to building locally. And remember, the nice thing is that Bazel is a lot like Selenium: if you wanted to, you could set up and manage your own grid. But if you'd rather have someone else manage the updates, make sure the latest versions of things are available, handle the security side of things, and ensure the uptime of the system, there are options already available that you can turn to. So I hope you found this talk interesting and informative. Thank you very much for your time and attention. I hope there's enough time left over for some questions, but if there's something you need to ask, just come and find me on the Hangout, or on the Selenium or Bazel Slack channels. I'm on both of those publicly, and I'll be happy to help you as much as I can. Thank you very much. Awesome, thanks. Thanks a lot, Simon. I see four questions in the Q&A section, so if you still have questions, folks, please use the Q&A section. We'll take about five minutes and go through a few questions, and of course, like Simon mentioned, he'll be available on the Hangout for more questions. If you're raising your hand, it might be helpful to just put in the questions, and we'll take it from there. All right, so here we go, Simon, I'm just going to read out the questions on my screen. What are the upcoming plans for Selenium WebDriver development in the next two years?
So, I stood down as project lead, so it's not up to me anymore, but I definitely know that on the list is support for what's known as WebDriver BiDi. The original WebDriver model is a request-response one, where the test sends a request to the browser, the browser does something, and the response comes back. But that doesn't allow you to do bi-directional communication, and sometimes it's useful to get events from the browser: things like errors logged by executing JavaScript, watching the life cycle of the page, doing request interception, and stuff like that. So the WebDriver BiDi spec is an attempt to allow that bi-directional communication to come through, and it's being worked on by Google, Microsoft, Mozilla and Apple; all the major browser vendors are involved, and it'll come soon. If you're using Selenium 4, you can already get a taste of some of the things that we're planning on offering by taking a look at interfaces such as NetworkInterceptor, for example, in the Java bindings. But I think that's the big feature. Cool, all right, and there will certainly be a lot more talks about what's coming up in the main conference, so please do attend that if you're interested in these topics. Moving to the next one: in your view, how would you compare UI automation tools like Cypress and WebdriverIO to Selenium? How competent do you think they are? I have opinions. I think Cypress has a really nice developer experience: the UI is nice, the IDE is nice, everything is packaged up, and it reduces the amount of thought you have to put in. But fundamentally, the technology is based on what we abandoned with Selenium RC. Attempting to inject things as JavaScript into pages is incredibly painful, and they've only just managed to land support for navigating between domains. When we did that with Selenium RC, it was painful, man. They're just going to run into more and more problems.
Things like Puppeteer and Playwright use bi-directional communication. They take some shortcuts, so it looks like they start faster than Selenium does, but if you put them into the same mode, they're about the same speed. The problem with both of those is that they're tied to specific versions of specific browsers, rather than being a general tool, so you can't take the test you wrote today and run it with the current version of Chrome, the next version of Chrome, the current version of Edge, and so on and so forth, and Firefox. So you're limited in your testing options. I think the capabilities that BiDi offers will mean that the things you can do with those frameworks rapidly become less of a differentiator, because Selenium will be able to do all those things too. There's a question... so far the questions were anonymous, so I didn't read out the names, but the next question is from Mahesh, asking, again about Playwright already taking up API integration: any plans for API integration in future in Selenium? You're asking the wrong person; the technical leadership committee of Selenium is the right people to ask, and they would be able to help. I think Diego is speaking later today, and he's on the TLC. One question from Madan: after completion of your session... I think he's asking something more specific about a problem he's facing with the Chrome 103 driver and .NET. So I guess, Madan, if you can join the Hangout table, you might be able to deep dive with Simon. It would be even better if he joined the public Selenium Slack channel and asked for help there. I haven't run into this issue, and I don't use .NET, so debugging that is going to be quite complicated for me, but people who are on the public Slack channel will be able to do it. To get there, go to selenium.dev, click on support, and one of the options is to join the public Slack; that's where you'll be able to get the best help. Perfect, all right, we'll take one last
question and then we'll move to the Hangout. Again from an anonymous attendee: can we have the video recording facility in Selenium WebDriver? If you use Selenium Grid, you already have it; it's there as an option. So use Selenium Grid, or use something like BrowserStack or Sauce Labs, which also offer the ability to have videos recorded of your tests. Perfect, all right, I think with that we'll wrap up this session. Thanks again, Simon, for spending this evening with us; it's always great to hear your insights from the Selenium project. So I'll now wrap up the session. Thanks again, everyone, for joining as well. We will now move to the Hangout tables; meet Simon over there. See you later, thank you, bye bye.