Hello everyone, namaste, namaskaram. I'm Shyam, and welcome to the session. Let me share my screen. Welcome to "When Ansible Meets Selenium Grid: The Story of Building a Stable Local iOS Simulator Farm". I'm currently working as a senior software engineer at Carousell, Singapore. Originally I'm from a small, beautiful island called Perumbalam in Alappuzha district, Kerala, India. I'm one of the co-founders of the Tequila QE community in Singapore. You heard it right: the pronunciation is the same as the famous Mexican drink, but here Tequila stands for Test Automation Quality Engineering Lab. I have created a Chrome extension for quality engineers called Relative XPath Helper, where you can find the relative XPath expression between two elements in two clicks. If you're interested, please check it out.
A little bit about Carousell: we are Southeast Asia's largest online classifieds marketplace, founded by Siu Rui, Lucas and Marcus in 2012. We are the fastest growing mobile classifieds marketplace. Our mission is to inspire every person in the world to start selling and buying, to make more possible for one another.
I have a small poll for you. I just want to know how many of you have tried Selenium Grid, so you can go to the poll section and click the yes or no option. I want to understand the audience in terms of whether you have any prior experience with Selenium Grid. I can give you 30 seconds for that. You will find the poll section on the right side, where you can scroll down and select the question, "Have you tried to set up Selenium Grid?"
While you are voting, I can go ahead with the agenda of the talk. I will be sharing my experience as a speaker and how my journey started. Then the problem statement: running UI tests against each pull request. Why is Ansible a good fit for configuring Selenium Grid?
How can we auto-configure the hub and node setup using Ansible? How can you restart and stop the Appium servers using Ansible? Then there is a small walkthrough and a demo, and I will also share the key takeaways and links.
Sorry to interrupt. Since we lost 10 minutes, we might need to hurry a bit through the obvious things. Let's move to the poll. Thank you.
I can see from the poll that more than 80% have already tried Selenium Grid. That's really cool to know. So let's move on. I always wanted to speak at a major testing conference like Selenium Conf, but I didn't have enough courage or confidence to do it. I wanted to get a breakthrough, and I submitted different proposals 14 times to various testing conferences across the world. Each and every time I got rejection emails. I'm a big fan of Jim Morrison, and this is my favorite quote by him.
My first Selenium conference experience was in Berlin, where I had a chance to attend the Selenium Grid workshop. I'm honored and privileged to talk about the same subject at the same conference today. While I was in Berlin, I was seriously thinking about appearing in the lightning talk session. I did submit my talk, but unfortunately it was in the 11th position, and the selection was purely first come, first served. I couldn't believe that I had lost this golden opportunity to speak at the Selenium conference.
After six months, I had another chance to attend one of the main conferences, SauceCon 2018 in San Francisco. This time I was very much determined to give my first lightning talk. I was eagerly waiting for the submissions to open, and I managed to submit two proposals. But there was a small surprise. Instead of first come, first served, the selection was based on the total number of votes from the conference attendees. And they announced that only six speakers would be selected from 23 submissions.
Well-seasoned speakers like Richard Brushow, Jana Dalibs and Aaron Robyn had already made their submissions. And I was like, oh boy, you're going to fail here too. But my real-life clickbait title worked well this time, and I spoke about my Relative XPath extension in the SauceCon lightning talks. That was the first time I spoke at an international conference, and it helped me a lot to regain my confidence.
I was feeling super lucky that day, so I decided to participate in a ping pong tournament at the SPIN bar in San Francisco. And guess what? I managed to win the title, and the first prize was an Apple Watch. I decided to sell the watch on Carousell once I came back to Singapore, and I posted that all the money would go to the Children's Cancer Foundation for charity. A few days later, one of my colleagues bought it from me. We believe there are interesting stories like this behind each and every Carousell transaction.
Getting a chance to speak at a global conference is not easy. This is one of the main reasons why Tequila provides a platform for everyone to speak, irrespective of speaking experience. Last week we held our first online conference, Tequila Lightning Test 2020, where 15 speakers spoke about various testing topics for 10 minutes each. You can check out all the videos over here. You can scan this QR code later.
Now, about the problem statement. At Carousell, we have apps available on iOS, Android and web platforms. Usually we run the regression tests. We were thinking about running a subset of UI tests, around 20 existing critical UI tests, against each pull request. One more criterion was that we wanted the test status within 20 minutes, including the build time.
In other words, when a developer opens a pull request, goes for a coffee break and comes back after 20 minutes, he or she should be able to see the test status on GitHub. That was one of the criteria. We call it the fast feedback test, because you need to give the feedback quickly. You don't need to wait for the nightly regression to find out whether your PR has broken any feature.
But our existing infrastructure was not capable of supporting this. This pic says it all. We had one Mac Mini connected to two Android and two iOS phones back then. We developed this as part of an internal hackathon. There were a few more issues: people thought we had a new charging station, so some of them came to the test infrastructure, disconnected the USB cables from the test phones and charged their own phones.
So the challenges we had were device maintenance (battery issues, OS dialog boxes), connection issues (Wi-Fi disconnecting), ADB-related issues and Xcode-related issues. And we definitely needed more parallelization in order to bring down the total execution time.
There were also spikes in PRs. We needed to support each and every pull request. We have releases every Friday, which means we have a code freeze at 8 pm on Thursday. The number of PRs spikes just before the code freeze, because developers need to ship everything and make sure all the code is in before 8 pm. So we need to support that as well. What if multiple PRs come at the same time? We need to support each and every PR, right? With only two phones, it's not enough. Even if you go with a cloud vendor, let's say you have purchased 10 concurrent sessions. What if four or five PRs come at the same time? The last PR needs to wait for all the other executions to finish, which may eventually take more than 20 minutes. That goes against our acceptance criteria.
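To make the contention problem concrete, here is a rough back-of-the-envelope sketch in Python. The per-test duration (3 minutes) and build time (8 minutes) are assumptions chosen for illustration and are not figures from the talk; only the 20 tests per PR, the 10 concurrent sessions and the 20-minute budget come from the discussion above.

```python
import math


def feedback_minutes(open_prs, tests_per_pr, sessions, minutes_per_test, build_minutes):
    """Worst-case minutes until the last PR gets its result.

    Simplified model: all PRs share one pool of parallel sessions and
    tests run in waves of `sessions` tests at a time.
    """
    total_tests = open_prs * tests_per_pr
    waves = math.ceil(total_tests / sessions)
    return build_minutes + waves * minutes_per_test


if __name__ == "__main__":
    BUDGET = 20  # minutes, the acceptance criterion from the talk
    for prs in (1, 3, 5):
        # 3 minutes per test and 8 minutes of build time are assumed values.
        minutes = feedback_minutes(prs, 20, 10, 3, 8)
        status = "within" if minutes <= BUDGET else "over"
        print(f"{prs} PR(s): {minutes} min ({status} budget)")
```

Under these assumed numbers a single PR fits in the budget, but the last of several simultaneous PRs does not, which is the situation described above.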
So that was another challenge. And lastly, which one should be used for test execution: simulators or real phones? Some people argue that we should always test on real devices, because none of the customers are using simulators or emulators in production. That's a debatable point. When we run our sanity tests before the release, we run them on real devices, but for the fast feedback tests we run on simulators. We decided to tackle the iOS issue first because we felt it was the most problematic one; there are already solutions available for web and Android with Docker containers.
This is how it's going to look. There will be a Selenium Grid. Since most of you already know how it works, I will quickly go through it. There will be a hub, and each Appium server will be running as a node connected to the grid. When we instantiate a driver, we point it to the grid URL, and the grid talks to the Appium server. The queuing is handled automatically by Selenium Grid. There is a one-to-one mapping between Appium server and simulator, which means that if you are using five simulators on a MacBook, there will be five Appium servers running.
This is what the node configuration for Appium looks like with Selenium Grid. You need to specify the port on which Appium is running. You can add the UDID as well, along with the device name and the hub details: the hub port and the hub host, which is the IP address of the hub. All such details need to go into the configuration. When you run the Appium server, you pass this file as a parameter so that Appium understands it needs to register with the Selenium hub.
So how can you automate this process? How can you start a hub programmatically and start Appium programmatically? You also need to create a few configuration files for Appium.
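The node configuration described above can be sketched as a small Python helper that builds the JSON. This is a minimal sketch and not the speaker's actual file: the field names follow the node configuration layout commonly used with Appium 1.x and Selenium Grid 3, while the helper name `build_node_config`, the sample UDID and the default host are hypothetical.

```python
import json


def build_node_config(device_name, udid, appium_port, hub_host, hub_port,
                      node_host="127.0.0.1"):
    """Return a Grid node configuration for one Appium server / one simulator."""
    return {
        "capabilities": [{
            "browserName": device_name,   # shown as the node label in the grid console
            "deviceName": device_name,
            "udid": udid,                 # lets a test target this exact simulator
            "platform": "MAC",
            "maxInstances": 1,            # one simulator per Appium server
        }],
        "configuration": {
            "url": f"http://{node_host}:{appium_port}/wd/hub",
            "host": node_host,
            "port": appium_port,
            "maxSession": 1,
            "register": True,             # register this node with the hub
            "registerCycle": 5000,
            "hubHost": hub_host,
            "hubPort": hub_port,
        },
    }


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    config = build_node_config("iPhone X", "FAKE-UDID-0001", 4724, "127.0.0.1", 4444)
    print(json.dumps(config, indent=2))
```

With Appium 1.x, a file like this is typically handed to the server through its `--nodeconfig` option, together with the port the server should listen on.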
Is it possible to automate that? How can you start and stop the Appium server on a remote machine programmatically? You don't want to do it manually, and sometimes you need to periodically restart the hub and the nodes. How can you do that? Is it possible to automate? The answer is yes. Ansible will help you to some extent.
So Shyam, maybe we can skip through the definitions and move to the workings, because we are running late.
Sure, sure. I will go fast, no worries, Pooja. Ansible is basically an open source configuration management tool. It was created in 2012 by Michael DeHaan, and currently it's part of Red Hat. Everything works over SSH. Let me give you an example. Let's say you have 10 machines, and you have a use case where every morning at 6 am all the machines should be restarted. You can set this up with Ansible. Ansible has a control machine, like a hub, that connects to all the other nodes via SSH. You write your script for restarting the machines at 6 am, and the control machine sends the command to all the connected nodes, so they restart at 6 am. This is just an example. Likewise, we can start an Appium server, create the node configuration, et cetera.
This is a high-level overview of the Ansible architecture. It has something called playbooks, which are plain YAML files; YAML originally stood for "Yet Another Markup Language". There is something called an inventory file, where you specify the IP addresses of the machines to connect to. And there are a few other things, like built-in libraries (core and custom modules), plug-ins and the connection API. You can relate this to Robot Framework, if you have worked with it: it works purely on keywords, and there are standard libraries associated with it.
I will show an example when we go through the demo. This is a quick comparison between Ansible, Chef, Puppet and SaltStack. Ansible is a clear winner here in terms of ease of setup and management. Up to 10 nodes it's free to use, and anyone can use it without much programming knowledge.
Why is Ansible a good fit for quality engineers? First, no agents are needed, which means you don't need to install anything on the nodes to keep the connection. Everything works over SSH. Second, it's idempotent. What is idempotent? An operation is idempotent if the result of performing it once is exactly the same as the result of performing it repeatedly without any intervening actions. In terms of idempotency, Ansible will always give the exact same result. It's very stable. And then the declarative syntax, which is the main selling point of Ansible for us. Everything lives in YAML files, it reads like plain English text, and it's easy to understand. The learning curve is low, and you can learn it without any help. A lot of documentation is available online as well. You can just refer to it and try it out. It's easy to learn.
This is a typical example of what an Ansible inventory looks like. For example, we have two kinds of MacBooks, one group with 8 GB RAM and one with 16 GB RAM. And the last MacBook we keep for running the hub. You can see that the IP addresses are listed here along with the Ansible username. So how does it work? When you run a command, you can specify which group of machines it should run on. If you give the 8 GB group as the hosts, that particular Ansible command runs only on the machines in that group. For example, say you want to start the Selenium Grid hub on a particular machine. When you run that command, you just specify the hosts as grid, so it runs only on that particular machine.
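To illustrate how a hosts pattern selects a group from the inventory, here is a minimal Python sketch that parses a simplified INI-style inventory and resolves a group name to its machines. It is not Ansible's own parser and handles only the plain `[group]` plus `host key=value` shape; the group names `mac8` and `mac16`, the IP addresses and the `qa` username are hypothetical, while `grid` is the group name mentioned in the talk.

```python
# Hypothetical inventory: two MacBook groups plus one machine for the hub.
INVENTORY = """\
[mac8]
192.0.2.11 ansible_user=qa

[mac16]
192.0.2.21 ansible_user=qa
192.0.2.22 ansible_user=qa

[grid]
192.0.2.30 ansible_user=qa
"""


def parse_inventory(text):
    """Parse '[group]' headers and 'host key=value ...' lines into a dict."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            groups[current] = []
        elif current is not None:
            host, *pairs = line.split()
            variables = dict(pair.split("=", 1) for pair in pairs)
            groups[current].append((host, variables))
    return groups


def hosts_for(groups, pattern):
    """Return the host addresses a play with `hosts: <pattern>` would target."""
    return [host for host, _ in groups.get(pattern, [])]


if __name__ == "__main__":
    groups = parse_inventory(INVENTORY)
    print("grid  ->", hosts_for(groups, "grid"))
    print("mac16 ->", hosts_for(groups, "mac16"))
```

The point of the grouping is that the same playbook can start the hub on one machine and the Appium nodes on all the others just by changing the hosts value.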
This is what a typical YAML file looks like. I am using this code for killing an Appium server. It calls a script called killserver.sh, and you can see what happens inside the kill script: it fetches all the process IDs of Appium and kills them one by one. You can also see that this action runs on both MacBook groups, the 8 GB and the 16 GB machines. So it's very easy to understand, right?
This is a high-level overview of the architecture for setting up the iOS simulator farm. There are modules for starting the hub, starting Appium, killing them, and downloading and distributing the build, because when you want to run tests against each pull request you need to get the build for that particular branch. Then you need to pass that build to all the machines, and only then can Appium take care of it. Then, to configure the Appium nodes, you need to write a new configuration file, and you can automate that too. And occasionally, if you want to restart the JVM or the entire Mac machine, you can do that as well. So it's a combination of Ansible, Appium and Selenium.
This picture illustrates how we set up the MacBooks in our server room. We have around six to seven MacBooks connected like that. One machine acts as the hub, and all the other machines act as nodes.
Let me share my desktop again, and we will go through a quick code walkthrough. Are you able to see my screen again?
Yes.
Okay, cool. The code is already available in the GitHub repository; you can check it out. In this example, I'm using my local machine. I am not connecting to any other machine. All the code runs locally, and that's why I have configured it with localhost and a local connection.
There is an Ansible configuration where you need to specify the path of Python, and the inventory file we have already seen, right? We're going to call the setup.yaml file, and it calls a role called Selenium Grid Appium. You can see there is a folder called roles, and inside it Selenium Grid Appium. There are three folders here, defaults, files and tasks, and inside tasks there is a main.yaml where you specify all the actions that need to take place. What it does: it downloads the Selenium standalone jar first, configures the hub and configures the nodes. If the hub is running, it kills the hub, kills Appium, restarts the grid and starts all the Appium nodes again.
Let's see what happens in download.yaml. These are the variables in Ansible; all the variables are stored in the defaults main.yaml. For example, the Selenium jar path: I'm going to download the jar file to this path, a server folder under Users/Shared. Inside the download task, if the directory exists, I don't create anything; otherwise I create a new directory. Then I have the URL for downloading the jar, and I download it to that particular folder. The main.yaml in the defaults folder has the version to be downloaded; you can check that out. If the download fails, it retries up to 100 times, but usually it finishes within two or three tries. So the jar file gets downloaded.
The next thing in main.yaml is configuring the hub. You need to create a JSON file in order to configure it. In the templates folder, you can see a file that acts as the boilerplate. You can pass variables into it, so the corresponding port gets filled in. I have already hardcoded the host as localhost. So it creates a new JSON file for the hub configuration.
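The boilerplate idea can be sketched in plain Python. Ansible's template module renders Jinja2 files; the sketch below uses `string.Template` from the standard library as a stand-in so it runs without extra packages. The three fields shown (`host`, `port`, `role`) are an assumed minimal subset of a Grid 3 hub configuration, not the contents of the speaker's actual template.

```python
import json
from string import Template

# Stand-in for a hub configuration boilerplate; $name marks a variable slot.
HUB_TEMPLATE = Template("""{
  "host": "$hub_host",
  "port": $hub_port,
  "role": "hub"
}""")


def render_hub_config(hub_host, hub_port):
    """Fill the boilerplate with variables and check the result is valid JSON."""
    rendered = HUB_TEMPLATE.substitute(hub_host=hub_host, hub_port=hub_port)
    json.loads(rendered)  # raises ValueError if the rendered text is not JSON
    return rendered


if __name__ == "__main__":
    # Hypothetical values; in a role these would come from the defaults file.
    print(render_hub_config("127.0.0.1", 4444))
```

Keeping the values in one variables file and the structure in one template means a port change is a one-line edit, and re-rendering with the same variables always produces the same file.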
And you can see what it looks like. Ansible has something called template. It picks up the source, hub.json.j2, and creates a hub.json in the given path. The same goes for Appium: you need to create the Appium nodes. You can do that by calling another task file, the configure Appium YAML. I am passing two items in a loop here; you can relate this to a while loop. And I am passing a few parameters: the name, which port Appium should use, what the WDA port should be, and the OS version. You need to make sure that you have a matching simulator on your machine, with the name iPhone X and the OS version 12.0.
What does the configure Appium task do? It calls the instruments command, which lists all the devices, and it picks up the corresponding device name with the OS version. Then it gets the UDID of the simulator using a regular expression, because you need the UDID in order to register the Appium nodes. That gets added to the configuration file, and the file is created. If you look at node.json.j2, there are a lot of expressions that need to be replaced, and that's what happens in the rest of the steps. It adds all the other configuration and creates two Appium JSON files, one for iPhone X and another for iPhone 8.
Before that, we need to kill Appium, right? In case there is any existing Appium session. For that, I am calling a shell script that kills Appium. You have already seen it in the previous slide. I keep that script in the files folder, and you can see it here. It kills all the existing Appium sessions. To start the Appium sessions again, I call another shell script, which starts Appium inside screen. I'm using screen because I want to run it as a background process. Screen is like a virtual terminal: if you run a command in screen, it runs as a background process.
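The UDID lookup step described above can be sketched in Python as follows. The sample listing is an assumption about the general shape of a simulator device listing (name, OS version in parentheses, UDID in square brackets); the UDIDs are made up, and the regular expression is an illustration rather than the one used in the speaker's role.

```python
import re

# Assumed shape of a device listing; all UDIDs here are fabricated placeholders.
SAMPLE_OUTPUT = """\
Known Devices:
iPhone 8 (12.0) [11111111-2222-3333-4444-555555555555] (Simulator)
iPhone X (12.0) [AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE] (Simulator)
iPhone X (11.4) [99999999-8888-7777-6666-555555555555] (Simulator)
"""

# name, then "(os version)", then "[udid]"
DEVICE_LINE = re.compile(
    r"^(?P<name>.+?) \((?P<os>[\d.]+)\) \[(?P<udid>[0-9A-Fa-f-]+)\]"
)


def find_udid(listing, device_name, os_version):
    """Return the UDID of the first device matching both name and OS version."""
    for line in listing.splitlines():
        match = DEVICE_LINE.match(line.strip())
        if match and match["name"] == device_name and match["os"] == os_version:
            return match["udid"]
    return None


if __name__ == "__main__":
    print(find_udid(SAMPLE_OUTPUT, "iPhone X", "12.0"))
```

Matching on both the name and the OS version matters because the same device name can exist under several runtimes, as the two iPhone X entries in the sample show.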
You can just type the screen command to see which screens are running, and you can specify the log file to be generated, so that all the Appium log files are generated locally. Then I am passing the node configuration, because by the time we call the command, the node config file has already been created here in the node path. And I'm passing the callback port and the Appium port over here.
Let's do a quick demo. Are you able to see my terminal? Hello, Pooja, can you see the terminal?
Yes, I can see.
Cool, thanks. I have already checked out the Ansible project here. The command you need to run is ansible-playbook with setup.yaml, and you specify the inventory file. Let me run this and see what happens. It starts, and it tries to download the jar file first. The directory already exists, so it doesn't create it, and it downloads the jar. Then it creates the hub configuration file. Then it tries to find the UDID of the given simulators, iPhone X and iPhone 8. Then it replaces the placeholders in the node configuration files based on the configuration that we are providing. You can see that I'm printing the UDID of the device over here. Then it checks the second phone, and the same actions happen there. It's replacing all the boilerplate components, and I'm killing the existing Selenium Grid and Appium server sessions. So it killed 4724 and 4725, started two Appium sessions, and it's done.
Let's go to the server folder. You can see that the hub JSON file has been generated here, and the node config file for the device on port 4724 has been created. Let's see the content of this node configuration file. You can see the browser name as iPhone X, the device name and the UDID I am passing. I have passed the UDID again, then the application name, the Appium node name, then the URL to hit, the host, et cetera.
I have already given all these parameters as variables, so it creates the Appium nodes over here. Let's see whether they have come up or not. I can check localhost. Two nodes are registered here, and you can see that the iPhone X is here and the iPhone 8 is here. Now you just proceed as usual. You use this hub URL when you instantiate your driver, and you pass the UDID capability. The Appium driver checks whether this UDID has been registered, and it executes on that particular simulator.
Okay, going back to the slides, let me share them one more time. Okay, cool.
So we have the last 10 minutes: five minutes to wrap up and five minutes for questions.
Sure. So, customizing the Selenium Grid. You can see my screen, right? We have this strategy of one node equals one simulator, and we could see that you can run up to eight simulators on a 16 GB RAM machine. We are using a different Appium server for each node, and in order to customize the grid we used a custom capability matcher and a custom servlet. I'm not going into the details of that. You can check this Grid blog by Krishnan Mahadevan. I visited this blog multiple times when I was working on this project. So thank you, Krishnan, for writing this awesome blog. It helped me a lot.
The challenges we faced. First, downloading and distributing the test build, because we needed to get the build for each and every PR. What we did was upload it to a GCP bucket and then download the corresponding build to each machine. You can upload it to a centralized location and tell Appium to fetch it from there, but that's quite time-consuming. Appium is faster if your .app file is on the same machine. Then, handling multiple runs at the same time. This needs more parallel sessions, because multiple PRs will be coming in, and we needed to handle that as well.
Then, periodic restarting of the Appium servers, because sometimes a server gets frozen and you need to restart it to get it working again. Then there were Wi-Fi issues: sometimes the Wi-Fi was disconnecting and we were not able to run further, so we needed to handle such issues. After the COVID situation, this became very painful, because we needed to debug from outside the office network. So we connected a VPN machine to the hub so that we could reach the Mac machines from home. We also needed screen recordings, and that one we solved using Appium's default recording option.
Moving to the key takeaways and learnings. Ansible is very easy to learn, and you can use it without any prior programming experience. It's easy to set up and maintain with readable YAML files. It's truly a DevOps tool, and it's easy to configure Selenium Grid with Ansible. You can refer to these links in order to learn more. And thank you, everyone. It really feels great to get this opportunity. I would really like to thank my colleagues at Carousell: Abhi, Jerry, Longnan, Eva, Tiger. Thank you for all your support. Thanks to all my teammates.
Thank you, Shyam. We learned a lot about Grid and Ansible, and it's more about perspective: what can solve your problem, not getting attached to one tool, keeping problem solving at the core and then choosing what is right for your case. That's a good takeaway I would take from this session. So thank you. We have a few questions in the Q&A. Maybe we can have a look at them.
Sure.
There is a question from Harshita about how to run one test case on all the platforms.
That's a good question. We are on multiple platforms, and keeping different test cases for each platform was not an option for us.
So what the team has done is keep a single set of test cases that supports multiple platforms. Let's say we are using Java with the Cucumber framework. We have a base class and subclasses for each and every platform. We are using the page object pattern. For example, we have a login page, and imagine the login is implemented differently for Android and iOS. What you can do is create a subclass for Android and a subclass for iOS for the login page. Depending on which platform you are running the test on, that class is instantiated and the methods in that class are overridden. That's how we have done it. Basically, we went ahead with writing subclasses for the different platforms, so that at the abstract level each Cucumber BDD scenario looks the same but has a different implementation for each platform. I hope that clears your doubt.
The next question we have is from Srikanth. How can you manage the execution time of a critical test where the login test is a prerequisite?
Oh, okay. That's another good question. Let's say login is something you do for all the test cases, right? And the question is, since this is a common step, can we skip it? We have tried that. You can use something called a deep link. When you start the Appium session, you can pass the username and password through the deep link, and instead of logging in through the UI you go directly to the homepage. You can implement that if you want to save some time. Let's say you have 20 test cases and the login step takes three to five seconds, which means you save at least 60 seconds in a single run. So technically this is possible, yes.
Thanks, Shyam. The next question we have is: shall we rely on simulators or use real devices?
Like I mentioned earlier, that's a debatable question, right?
What I personally believe is that you don't need to run all your tests on real phones. But you may have a specific use case, let's say the camera. At Carousell, we have something called Snap, List, Sell, where you take a photo of a product and list it so that buyers can chat with you about the price, the item's condition, et cetera. You cannot do this with simulators. For such special use cases, it's good to test on a real device during the sprint when you are doing exploratory testing, but not every time, and especially not for running these fast feedback tests. I don't think it's necessary to run on a real device every time.
The next question we have is from Sharath. In parallel testing using the grid, should we configure each node with a UDID?
Yes, ideally. I mentioned this because the UDID helps you route your test to a specific simulator. For example, even though multiple simulators are available, you may want to run your test on a specific device. You are telling your framework, "I want to run this particular test on this particular simulator" rather than picking a random device. In that case, the UDID is very helpful.
Okay. And the last question we can take is from Harish. Can we trigger the test case from a Windows machine to run tests on the Selenium Grid on Mac?
Okay, let me answer this one. When I set up Ansible, I did it on a MacBook. Ansible needs a control machine, like a hub, which talks to all the other nodes. That control machine needs to be either a Linux-based or a Mac-based machine, but the nodes can be Windows-based machines.
As for your question, you can use a Windows machine to trigger the tests, but the machine where Appium runs needs to be a Mac in order to run the simulators. You can install macOS on VirtualBox, but nothing is guaranteed, because you need Mac hardware to get a 100% certified working setup.
So we have come to the end of the session. Thank you so much, Shyam.