So before I start my talk, let me ask you some questions. First of all, who knows what Selenium is? Raise your hand. OK, I'm just checking you are at the right conference. Now, who knows what Selenium is? Raise your hand. Ten people. Great. And who needs to run millions of Android tests? Raise your hand. Several people. Let's move on to the topic then. My name is Ivan Krutov. And today I'm going to talk about scalable Selenium clusters and how to run millions of Android tests with them. A few words about me. I have been a software developer for the last decade, and my main experience is related to the Java and Go programming languages. I also actively participate in open source projects. For example, I have made a lot of contributions to the Selenoid project and I'm one of its core maintainers. My main activities during the last four years go by the buzzword DevOps. That means that I'm creating and maintaining various infrastructure. And one of the biggest products I'm working on is a big Selenium cluster. So how big is this cluster? Compared to a typical Selenium grid with 50 browsers executing 10,000 sessions per day, my cluster has more than 5,000 browsers running in parallel and executing more than 2 million sessions per day. This cluster is distributed across five data centers. The average load is about 4,000 requests per second. The traffic is about one gigabit per second. And this cluster is certainly working all the time. This cluster has all popular browsers and platforms, including the latest versions of Firefox, Chrome, Opera, all versions of Internet Explorer, Microsoft Edge, Android emulators running on hardware servers, and iOS simulators running on Mac minis. And for some teams, we also have real devices, phones and tablet PCs, connected to the same cluster. And today, I would like to talk in detail about Android. But why do we need native Android testing at all? For several reasons.
First of all, according to statistics, more than half of internet traffic today is mobile traffic. Then, three-quarters of mobile devices run on Android. Let's check this. Who has an Android device? Raise your hand. So I'm seeing the majority. And finally, according to our experience, there are a lot of Android-specific bugs that cannot be reproduced even when using mobile emulation in desktop browsers. Having said that, I have bad news for you. Android automation is complicated. You may disagree. It should be simple. We take a computer, add a USB hub with multiple ports, and connect the respective number of desired device models. As usual, the devil is in the details. First of all, there are a lot of Android device models with different Android versions, phones and tablets, x86 and ARM, with different screen sizes, resolutions, and pixel ratios, with hardware buttons and without them. This gives you a lot of combinations to cover by purchasing more and more devices, which is expensive. But buying a lot of devices is only the beginning of the story. Real devices require a lot of manual maintenance. They often discharge, lose their Wi-Fi connection, and log out from the Google account. From time to time, you will have to manually confirm software updates on every device. You will also have to buy a rack or a stand to store the devices and a USB hub with sufficient charging power. And even doing everything correctly doesn't prevent some of your devices from dying during your product release test run. So real devices are a nightmare. We need something similar to desktop automation, which scales well. And before we dive into the details, let me ask you: who knows how Selenium works on the desktop? Sure. So I think most of you already know what a typical Selenium architecture looks like on the desktop.
It consists of a Selenium server handling test commands, a browser installed in the operating system, and a WebDriver binary translating test commands to browser-specific commands. Simple, right? Compared to desktop browsers, Android automation is slightly more complicated. You have to deal with a lot of components, including Android Debug Bridge, or just ADB, which is a command-line tool used to do the most common operations with real devices and emulators, like installing, starting, and stopping applications, copying files to the device and from the device, forwarding network ports, executing shell commands, debugging running applications, and many more. The Android SDK also includes an Android emulator, which is a desktop application showing you just the same screen as the real device and allowing you to test mobile applications without real devices. When launching the Android emulator, you can choose from a list of ready-to-use configurations corresponding to real device models. But you can also create your own configuration with a custom screen size, device pixel ratio, screen orientation, SD card size, and so on. The Android emulator supports both x86 and ARM platforms. Internally, every emulator is a standalone virtual machine using KVM and QEMU technologies. And Android Debug Bridge works exactly the same way with both real devices and Android emulators. Android automation is also based on the Android instrumentation framework, which is a low-level Java API allowing you to subscribe to any events in the running applications, such as opening the application, typing text in the fields, clicking on the buttons, and so on. Additionally, it also allows you to send any events to any of the application parts. Thus you have full control of what's happening within the tested application. The next Android automation component is called UI Automator.
This is also a Java-based library used for cross-application testing as well as testing the interaction between user and system apps. And UI Automator, when testing the application, sees this application as a black box. The last piece of the Android puzzle is called ChromeDriver. ChromeDriver is a standalone binary using the JSON-based Chrome debugging protocol to send commands to the browser. The same binary is used to automate both desktop and mobile versions of Chrome. And it's already distributed as a standalone web server compatible with the Selenium API. We now know all the Android automation components and need a single Selenium-compatible web server using one of these components depending on what the user requires. And such a server exists and is called Appium, certainly. So Appium is a powerful web server implemented in Node.js which introduces its own protocol, the Mobile JSON Wire Protocol, which is a superset of the WebDriver protocol adding mobile-specific operations such as tapping on the screen, rotating the device, locking the screen, and so on. And to support all these new commands, Appium maintainers provide client-side libraries for different programming languages based on the original Selenium code. So there are a lot of ways to shuffle all of these components, and let me show you what a truly efficient combination can look like. So first of all, we start by installing a small server called Selenoid. We do this on a server with the latest stable Ubuntu version and a recent Linux kernel, 4.4. We have a recent Docker version installed on the server and we have neither running containers nor images in the storage. So first of all, we go to GitHub and download a small standalone binary called cm, which stands for configuration manager. We go to this page and can just copy the link to a ready-to-use binary. So we copy the URL and download this binary to our server with the wget tool like this. So it's a relatively small binary, about 10 megabytes in size.
So we add execution permissions and execute only one small command to install all the Android automation stuff: cm selenoid start, and we specify Android version 6 as the desired browser. What does it do? First of all, it downloads the latest Selenoid release, and then it downloads a ready-to-use image with Android version 6 inside. It takes some minutes to complete. I accelerated the video a bit here. Then it configures the server and starts it. So after running just one command, we have a ready-to-use Selenium-compatible server running on the standard Selenium port 4444 in Docker and some images already present in the storage. Now we can quickly install the user interface by typing a similar command, cm selenoid-ui start. Could you turn on the light? So we type the command and, thank you. And now we have two containers running. The second one is running on the standard HTTP port 8080. And we have its image already present in the storage. So we can now go to the browser and open the user interface. All this installation takes, I would say, five minutes, depending on your internet connection speed. So we open the UI and we see Android present here in the list of available browsers and platforms. Now we can launch our demo test. So here, this is a Java-based test using Maven for dependency management and the standard Java Selenium client. What does it do? First of all, it creates an Android version 6 session, opens the built-in calculator application, then types two plus seven and checks that the result equals nine. So a very, very, very simple test. Here is the real execution speed of this test, without accelerating the video. Usually a new Android session takes from 10 to 15 seconds to start from scratch. So you'll see right now. And the overall test takes about 25 seconds. So that's it, 24, even 24 seconds. Now we can see the screenshot of the calculator application and the correct result inside, equal to nine.
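The session the demo test opens can be pictured as a single HTTP request to the server started above. The following is a minimal sketch, not the test from the talk (which is written in Java): it only builds the new-session request without sending it. The calculator package and activity names are assumptions based on the stock AOSP calculator, and `browserName`, `version`, `appPackage` and `appActivity` are the capability names as I understand them from Selenoid and Appium; check them against the images you actually run.

```python
import json


def android_session_payload(version="6.0"):
    """Build the JSON body of a new-session request for an Android emulator."""
    capabilities = {
        "browserName": "android",  # tells the server which image to start
        "version": version,        # Android version to start
        "platformName": "Android",
        "deviceName": "android",
        # Assumption: package and activity of the stock AOSP calculator app.
        "appPackage": "com.android.calculator2",
        "appActivity": "com.android.calculator2.Calculator",
    }
    return {"desiredCapabilities": capabilities}


def new_session_request(hub_url, version="6.0"):
    """Return (method, url, body) for the request a Selenium client would send."""
    url = hub_url.rstrip("/") + "/session"
    return "POST", url, json.dumps(android_session_payload(version))


if __name__ == "__main__":
    method, url, body = new_session_request("http://localhost:4444/wd/hub")
    print(method, url)
    print(body)
```

Switching the Android version, as shown later in the demo, then amounts to passing a different `version` value.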
The next feature is the live Android screen. We can place a breakpoint on the first test line and launch the same test. We need to wait 10 seconds again for a new session to be started. So let's wait, let's wait for the session. And we will now see in the UI a button showing this running session. So the session should now start. Please start. We can now go to the UI and see this Android session. When we click on the button, we see the real Android emulator working inside a Docker container. So if we resume the test execution, we now see the Android test running in real time, just in our browser. What else can we do? We can quickly switch Android versions by just changing the version capability in the test. And we can also quickly switch the device orientation or device skin by adding a new capability like this, the skin. Here I'm specifying skin equal to WSGA800, which corresponds to a tablet PC. So I will now rerun the same test and you will see that we will have a pad device, a tablet PC, using just the same container. So as usual we are waiting 15 seconds for the session to appear. In fact, most of the time is spent waiting for Appium to do its magic inside. So we click on the button and this time we see a pad device in landscape orientation. And we continue the test execution and see just the same test working on the pad device. And the last thing I would like to show you is video recording, which also works out of the box. We add one more capability called enableVideo and we can optionally specify the desired video name. So for example, here we have SeleniumConfIndia.mp4. So I rerun the test and when it finishes I will be able to see the video of the overall test execution. This can help, for example, with debugging some tricky tests. So let's wait for the test to finish again. Okay, it's finished. And we can now open in the browser a particular web page showing the list of available recorded video files.
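The extra capabilities from this part of the demo can be sketched as a small helper that decorates an existing capability set. This is an illustration only: the capability names `skin`, `enableVideo` and `videoName` are the ones I understand Selenoid and its Android images to accept, the skin value in the usage example is just a sample Android SDK skin name, and the file name is hypothetical.

```python
def with_selenoid_options(capabilities, skin=None, video_name=None):
    """Return a copy of the capabilities with optional skin and video settings."""
    result = dict(capabilities)
    if skin is not None:
        result["skin"] = skin  # device skin, e.g. a tablet profile
    if video_name is not None:
        result["enableVideo"] = True
        result["videoName"] = video_name
    return result


def video_url(selenoid_url, video_name):
    """URL under which a recorded video file is served by the server."""
    return selenoid_url.rstrip("/") + "/video/" + video_name


if __name__ == "__main__":
    base = {"browserName": "android", "version": "6.0"}
    # Assumption: "WXGA800" is used here only as an example skin name.
    caps = with_selenoid_options(base, skin="WXGA800", video_name="demo.mp4")
    print(caps)
    print(video_url("http://localhost:4444", "demo.mp4"))
```

Returning a copy rather than mutating the input keeps the base capabilities reusable across parameterized tests.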
So it's available on the standard port 4444, under /video. Here is our recorded video file. We can open it in the browser, but in this video I'm copying the URL and showing this video file in the VLC player, for example. So here is the recorded video file for the emulator. Here is the emulator being restored from the snapshot. It launches, then Appium does some magic like unlocking the screen. And now we will see the calculator application starting and just doing the same test. So interesting, right? Let's now dive into the details and see what's under the hood, going step by step. If you remember, this talk is about running millions of Android tests. This is why trying to launch an Android emulator on your workstation makes no sense. Usually you start playing with new technologies on a clean virtual machine in your preferred cloud platform. So having a virtual machine, you download the Android SDK and unpack it, create an Android virtual device and launch an emulator. It starts, but is extremely slow. This is because default ARM Android emulators are very, very, very slow. You spend hours digging through the documentation and find that an x86 emulator can be fast when launched with the QEMU -enable-kvm flag. So you add this flag and it fails to start. Why is that? If you remember, the Android emulator itself is a standalone virtual machine, and KVM stands for kernel-based virtual machine. To run a virtual machine with KVM, your host environment CPU should support a number of instructions. And this is what is usually missing in the standard virtual machines in the cloud. So you can run a fast Android emulator on a hardware server supporting these instructions or on a particular type of virtual machine supporting nested virtualization. So on the next day you take a hardware server and successfully launch an Android emulator. It works slightly faster, but a lot of applications like Google Chrome don't start: they fail at startup. And why is that?
Because Android 3 and above require GPU 3D acceleration support to start these applications. And in order to fix this, we need to configure 3D acceleration drivers, which are usually bundled with the Android SDK. So launching an Android emulator was not easy, right? But the rest should be simple. Having a running Android emulator and an ADB instance, we install Appium, launch our first test and it works, great success. So scaling the solution should be as easy as adding more emulators. For example, you start 10 emulators and you run the same tests. But what's happening? Your tests are randomly freezing. This time ADB is the culprit. If you remember, ADB was initially created as a debugging tool. So it's not so efficient when working with multiple Android emulators. How could we solve this? We need to use one ADB instance with only one Android emulator. And in order to have several Android emulators per host, we need to somehow isolate multiple running ADB instances. Who knows the most popular lightweight isolation engine today, any ideas? Yes, Docker, just the most popular one. So after installing Docker, you create an image containing one Android emulator, an ADB instance and an Appium instance. Having such an image, you can now launch several identical containers, roughly one container per CPU core. But your tests still require a single entry point proxying your requests to the upstream Appium nodes. Who knows a good candidate for such a position? Just a hub, just a Selenium hub. So with such an architecture, you will now be able to run a lot of Android tests without any freezes because one ADB instance is now working with exactly one Android emulator. That's it? Not really. If you leave such a server running the tests for a day or so, you will soon run out of available emulators. Sometimes Appium disconnects from the hub, sometimes ADB or Android emulator processes go down.
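Both the KVM requirement described earlier and the rule of roughly one container per CPU core can be checked on a candidate host before installing anything. The sketch below assumes an x86 Linux host: it looks for the `vmx` (Intel) or `svm` (AMD) flags in `/proc/cpuinfo` and for the `/dev/kvm` device. The function names are hypothetical and the container count is only the rough rule of thumb from the talk, not a tuned value.

```python
import os

VIRT_FLAGS = {"vmx", "svm"}  # Intel VT-x and AMD-V CPU flags on x86 Linux


def cpu_virtualization_flags(cpuinfo_text):
    """Return the hardware virtualization flags found in /proc/cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            found.update(VIRT_FLAGS.intersection(value.split()))
    return found


def can_run_fast_emulator(cpuinfo_text, kvm_device_exists):
    """True when the CPU exposes virtualization and the KVM device is present."""
    return bool(cpu_virtualization_flags(cpuinfo_text)) and kvm_device_exists


def suggested_container_count(cpu_cores):
    """Rule of thumb from the talk: roughly one emulator container per core."""
    return max(1, cpu_cores)


if __name__ == "__main__":
    text = ""
    if os.path.exists("/proc/cpuinfo"):
        with open("/proc/cpuinfo") as f:
            text = f.read()
    print("KVM usable:", can_run_fast_emulator(text, os.path.exists("/dev/kvm")))
    print("Containers:", suggested_container_count(os.cpu_count() or 1))
```

On a standard cloud virtual machine without nested virtualization, the flags are typically absent, which matches the start-up failure described above.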
So in order for this architecture to remain alive, we need to somehow kick the running Docker containers with emulators. You can certainly add a cron job periodically restarting the containers, but this can interrupt running sessions. So it's better, it's safer, to implement an extension for the Selenium hub that will be aware of running sessions and will restart only the containers in the idle state. And the most common way to implement such an extension right now is to add a servlet-based extension to the hub. So what you're seeing right now is the first possible architecture proven to work under high load. It already provides good isolation between emulators through Docker containers. All the issues with ADB freezes are now resolved. And the overall count of emulators remains constant. But it's not ideal. Selenium hub is known to consume a lot of memory and sometimes becomes slow even with dozens of connected nodes. Containers are always running in the operating system and consuming its resources. And the most annoying thing: it's very difficult to start different Android environments, different Android versions, on the same host. Can we do better? Certainly. We need to have a Selenium-compatible web server using the Docker API to start these containers. So when a new session request arrives, a new container is started. And when the test finishes, this container is removed, thus leaving your operating system in the same state as it was before launching the test. And just as a year ago in Berlin, this simple feature still isn't implemented in the standard Selenium server. So welcome to the wonderful world of Selenoid, a lightning-fast Selenium-compatible implementation launching browsers and Android emulators within Docker containers. Who is already using Selenoid? Raise your hand. Several people. Great, you're awesome. So we now know how to create an image with an Android emulator.
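The container-per-session idea described above can be sketched in a few lines. This is not Selenoid's implementation (Selenoid is written in Go and talks to the real Docker API); it is a toy model with an in-memory stand-in for the container runtime, and all names in it are hypothetical. It only shows the invariant: a container exists exactly for the lifetime of one session, even when the test fails.

```python
from contextlib import contextmanager


class FakeRuntime:
    """In-memory stand-in for a container runtime (not the Docker API)."""

    def __init__(self):
        self.running = {}
        self._next_id = 0

    def start(self, image):
        self._next_id += 1
        container_id = "c%d" % self._next_id
        self.running[container_id] = image
        return container_id

    def remove(self, container_id):
        self.running.pop(container_id, None)


@contextmanager
def session(runtime, image):
    """Start a fresh container for one session and always remove it afterwards."""
    container_id = runtime.start(image)
    try:
        yield container_id
    finally:
        runtime.remove(container_id)


if __name__ == "__main__":
    runtime = FakeRuntime()
    with session(runtime, "android:6.0") as cid:  # hypothetical image name
        print("running during the test:", runtime.running)
    print("running after the test:", runtime.running)
```

Because nothing outlives the session, different Android versions can share one host: each request simply picks a different image.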
But if you try to use this image with Selenoid, you will face a new issue. The Android emulator starts very, very, very slowly. Just compare. A typical container with a desktop browser takes from five to 10 seconds to be ready. And the same Docker container with an Android emulator on the same hardware takes from 30 to 40 seconds to start. So just the same amount of tests will run several times slower just because of that. Can we fix this? It would be great if we could start an Android emulator within several seconds. Yes, with the latest releases of the Android SDK, this is now possible. In December 2017, Google introduced a new feature called Android Quick Boot, which is very similar to the hibernation feature in desktop computers. So here's how it works. First of all, you do a cold boot of an Android emulator, which takes as usual from 30 to 40 seconds. Then you stop this emulator and its memory snapshot is saved to the hard disk. Next time when you start an emulator, this memory snapshot is read from the disk and used to quickly restore the emulator state. And with this feature, the Android emulator now starts in five to 10 seconds. So we now have all the major Android issues resolved, and let's now build an Android cluster from ready-to-use pieces. So first of all, we need the images with Android emulators. We don't want you to waste your time creating your own images, so we provide a set of ready-to-use images for different Android versions. Every image includes a headless X server, an Android emulator configured to use exactly one Android version, an ADB instance, an Appium instance to be compatible with the Selenium protocol, optionally a ChromeDriver to be used for mobile web testing, and an Android Quick Boot snapshot. So sometimes you will need to create your own images with a custom Appium version or a custom ChromeDriver version, or you would like to add your custom APKs inside the image.
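Returning to the start-up times quoted above, a rough back-of-the-envelope calculation shows why Quick Boot matters for throughput. The numbers below are assumptions derived from the talk: a cold boot of about 35 seconds and a quick boot of about 7.5 seconds (the midpoints of the quoted ranges), and a test body of about 12 seconds (the 25-second demo minus its start-up). The slot count of 30 is also just an illustrative value.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sessions_per_day(startup_s, test_s, slots):
    """Upper bound on sessions per day when every slot runs tests back to back."""
    return int(slots * SECONDS_PER_DAY // (startup_s + test_s))


if __name__ == "__main__":
    test_s = 12   # assumption: test body without emulator start-up
    slots = 30    # assumption: parallel emulators on one server
    cold = sessions_per_day(35, test_s, slots)    # cold boot, 30 to 40 s
    quick = sessions_per_day(7.5, test_s, slots)  # Quick Boot, 5 to 10 s
    print("cold boot:", cold)
    print("quick boot:", quick)
    print("speed-up: %.1fx" % (quick / cold))
```

Under these assumptions the same hardware handles a bit more than twice as many short tests per day; the gain shrinks as the test body gets longer.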
So we also provide all the automation scripts used to build these images. So all this stuff is open source. All these images have a remarkable feature you have already seen. They can be run with any desired Android device skin. So by just specifying an environment variable, you can quickly change the skin using just the same image, and this allows you to use parameterized tests to quickly check that your application works well on different Android versions and device models. So having such images, we can start Selenoid, for example, on a server, and one server, depending on the hardware, can run up to 30 parallel emulators. And what happens next? Everybody wants to use such a server, and your Android research server quickly becomes a production installation used in the release procedures of multiple teams. So let's now quickly scale this solution, because your friends are already waiting, so let's finish this work. In order to have a reliable cluster, a fault-tolerant cluster, your Android servers should be installed in two or more data centers. For example, we install several units here in Bangalore. As usual, it fails in five seconds. So again, that's a bug, that's a bug in the office. Okay. So you install several units here in Bangalore and the rest, for example, will live in Mumbai. Now we take our extremely efficient load balancer called GGR and install it in one data center. So GGR has its configuration files knowing all the Android hosts as well as data center information. Being reliable software, GGR can still fail because of a network loss or a power outage in the data center. So we need to install a second instance. Both instances distribute Selenium requests across all Android servers. And the last thing we need to do is to deliver a single entry point to this cluster. If you remember, Selenium is based on the HTTP protocol.
So we can use the well-known HTTP load balancing scheme, including two or more Nginx instances, for example, and a reliable network load balancer. So a few words about the load balancer. Usually this load balancer is also distributed across multiple data centers but has a single fixed IP address. So what you need to do is just assign a domain name to this load balancer IP address and use this domain name in your tests. So it sounds a bit complicated. It sounds complicated, but let me quickly show you how easy it is to install such a configuration. So initially we have, for example, two virtual machines, one in Mumbai and the second one in Bangalore. Both have a recent Docker version installed and as usual they are clean. So there are neither running containers nor images in the storage. What we are doing here: we just go to GitHub and take a ready-to-use configuration. So I just open-sourced our configuration for the cluster. It is open-sourced in the form of a Docker Compose configuration. Who knows what Docker Compose is? Great, so everybody's using it. Here in the Docker Compose file, we start two services. One is Nginx and the second one is GGR running on the non-standard HTTP port 55555. So we also have here an Nginx configuration file with the upstream section and the proxy pass using this upstream with the hosts. So it's randomly distributing requests across two or more servers. And in the grid router directory, we have a file with users, so the user test and its encrypted password, equal to test password in this case. And we also have a quota file for this user with all the information about the upstream hosts with Android emulators. And to install this stuff, we just clone the repository and execute only one command. Who knows this command? No? Just docker-compose up -d, where -d stands for detached. And then Docker Compose does all the rest of the work.
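The quota file mentioned above is an XML document listing, per browser and version, the upstream hosts grouped by region (data center). The sketch below generates such a file for the two data centers from the example. The element names and the namespace follow the GGR quota format as I understand it, so verify them against the GGR documentation; the host names and counts are entirely hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace used by GGR quota files.
QUOTA_NS = "urn:config.gridrouter.qatools.ru"


def build_quota(browser, version, regions):
    """Build a quota XML string; regions maps a name to (host, port, count) tuples."""
    ET.register_namespace("qa", QUOTA_NS)
    root = ET.Element("{%s}browsers" % QUOTA_NS)
    browser_el = ET.SubElement(root, "browser", name=browser, defaultVersion=version)
    version_el = ET.SubElement(browser_el, "version", number=version)
    for region_name, hosts in regions.items():
        region_el = ET.SubElement(version_el, "region", name=region_name)
        for host, port, count in hosts:
            ET.SubElement(region_el, "host", name=host, port=str(port), count=str(count))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical hosts: one Android server per data center, 30 emulators each.
    print(build_quota("android", "6.0", {
        "bangalore": [("android-1.blr.example.com", 4444, 30)],
        "mumbai": [("android-1.bom.example.com", 4444, 30)],
    }))
```

The `count` attribute tells the router how many parallel sessions a host can take, which is what lets it weight requests across servers of different sizes.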
So it pulls all the images, copies the configuration files, mounts the volumes, and so on. So in 15 seconds, we have everything configured. We have two running containers: Nginx on the standard Selenium HTTP port and the upstream GGR, for example, on port 55555. And now you install the network load balancer and assign, so here I'm also showing the images, two images. Now you configure a network load balancer with a name, for example, like this: seleniumexample.com. You check that its port is configured correctly and is open, so connection succeeded. And the only change you need to make in your tests in order to work with such a cluster is to change the URL. So you just use the new domain name, and you add the test user and test password. Those of you who are already working with Sauce Labs, I think, already know such notation. So that was it. You have now seen how complicated Android automation can be. Please use the right tools to have Android automation working like a charm. Trust me, Android automation can be painless. And as usual at the end of my talk, some references. Here we have links to the GitHub source code projects, our Twitter account, our Telegram support channel, our website with links to various Selenium-related articles, and my personal email. So thank you for your attention. You can now ask your questions.
Is it on? There are a lot of tools available in the market for Android automation, both commercial and open source. So what main parameters should be considered when selecting an Android automation tool?
You think? I don't understand.
Okay, there are tools available in the market, like various tools for Android automation. Appium is there, Perfecto is there, SeeTest is there. So in order to automate Android apps, which parameters should be considered while selecting the tool?
Could you translate the question for me? Actually, I just don't get it. Oh, why Appium?
Just because, certainly, we had an idea to implement everything ourselves, because we prefer using, for example, Go to implement our tools. And as you know, Appium is implemented in JavaScript. And certainly we had an idea to rewrite everything in Go because it would be faster and so on and so forth, but it's too much work. So we prefer just creating stuff from ready-to-use components. And Appium is, first of all, Selenium-compatible, because it's our requirement to have Selenium-compatible stuff. We could, for example, use Espresso, I don't know, the Espresso framework, which is not compatible with the Selenium protocol. So in fact, Appium is one of the most popular open source tools compatible with Selenium. So that's the answer.
Hi, I have two questions. One is, have you tried to use Genymotion, like virtual devices?
Genymotion, yeah. It's something commercial so far as I know.
They did have some free, like they give you a couple of free devices.
No, because we prefer open source stuff. And we prefer official stuff. So regarding Android emulators, we take the official Google stuff. So everything is just official.
The reason why I ask is because, at least in this part of the world, a lot of users use Samsung devices, right? So those images are kind of different: they have a lot of wrappers and skins and a whole other layer built on top of the Google images. So the application is not exactly tested in the intended environment. So that's where we get some disconnection.
That depends. I think there is no clear, let's say, answer to this question, because it depends on your application, as usual. So you need to have your own statistics on which bugs are reproduced on each platform. We can say just the same, for example, for the Samsung browser, stuff like this. But we are usually testing using Chrome. So there are differences.
So this is all about having your own historical bug statistics and deciding whether you need to cover these patched platforms or not. So no clear answer, I would say.
Yes, this definitely helps. I mean, you can go some way. Thank you.
Hi, this is Saurabh, and my question is: everything is open source so far, as I have seen.
Yes.
How are you guys making money?
How?
How are you guys making money out of it? Because you are competing with BrowserStack, Sauce Labs and these people, and this stuff is so amazing. I just want to know how you guys are making money.
I would say that right now we don't have any money from this stuff, but we certainly understand that in order to provide good quality, this should be somehow monetized. So currently we have just a parallel application for the Kubernetes platform, which is called Moon, and this is what has been commercial from the beginning. So Selenoid, I would say, will always be open source, open source and free, all the images will always be free, and our idea is to create another product that will probably bring us some money. So these products will always remain open source.
Thank you.
You can see all the licenses are Apache 2. So really open source.
All right, we've got time for about one more question.
Are there any auto scaling capabilities available with this? So in your example, you showed that you created two machines and you routed to them through an Nginx server, right? But what if we wanted an auto scaling capability, let's say on AWS?
Auto scaling.
Auto scaling capability on an EC2 instance.
Okay, I'm understanding just the distinct words only. So regarding the auto scaling, you mean using AWS, stuff like this? So far as I know, for example, in Google Cloud, I don't know, I'm not an AWS user, but in Google Cloud, you have virtual machines with nested virtualization support, so we can just configure a virtual machine like this.
And they also provide auto scaling. It's called, so far as I know, scaling groups in Google Cloud. So some clouds already provide virtual machines with nested virtualization. So far as I know, Microsoft Azure also provides such virtual machines. And in AWS, you only have bare metal, I think. Hardware servers only for the moment. So you can use such virtual machines also. But they are certainly slower than hardware servers. So it's possible, but you should check whether the speed is acceptable.
All right, thank you Ivan.