Welcome, everyone, to the Appium conference 2021. We are glad that you could join us. In this session we'll be talking about test automation on living room devices with Andy Chamak. So without further delay, I hand it over to you, Andy.

Yeah, thanks, Alisha. Hello everybody, Andy here. As Alisha mentioned, the topic is living room device test automation and how Appium 2.0 can affect this niche market for automation. First, let me introduce myself: I'm a product manager at a company called Suitest, and we focus on test automation for living room devices. This whole talk is a collection of our experiences: what kind of problems we see in this area, what the typical ways of solving them are, and how Appium 2.0 will affect the common issues people have when testing on living room devices.

So let's first dive into what a living room device is. What kind of beast is it? It's pretty much any device that has a big screen and is meant for consuming media content: things like video on demand, Netflix, YouTube and so on. This encompasses quite a big number of different devices and platforms: smart TVs, set-top boxes, media consoles, gaming consoles, streaming sticks and so on. Quite a huge range of devices.

Can you please raise your hand if you are actually doing some test automation for living room devices right now? There should be a button in Zoom to do that. I will check the results later on.

All right, so why should we care about test automation on living room devices? It's such a niche market, and it would probably affect only a few people, right? Well, the market is actually growing quite fast. You can see that in the report by Sandvine, a network intelligence company that does research on internet traffic.
According to their estimate, in 2018-2019 about 60% of all internet traffic was video traffic, which is actually mind-blowing. Yes, videos are heavy; they take a lot of space and require a lot of resources to transfer, but still, 60% is quite a huge number. What's even more interesting is that Netflix alone takes 15% of the whole global network traffic worldwide. Just one company. And the other big content providers, like YouTube, are not far behind.

Another interesting thing we can see in this chart is a trend. Of course, two data points is not quite enough, but we can still see that the share of video content in global internet traffic is increasing. These are not absolute numbers; they are percentages of global traffic. So the share of video is increasing, while the share of the biggest content providers like Netflix and YouTube is decreasing. There is a slight chance that Netflix and YouTube just introduced some new encoder in 2019 and their traffic went down, but I think the more plausible explanation is that there are so many new vendors, applications and services providing video-on-demand content that they are basically taking this share of the traffic from the big guys.

You've probably seen those brands; you're probably familiar with names like Netflix, Apple TV, YouTube, Disney Plus and so on. Some of them are quite new, like Disney Plus, Apple TV and Peacock, but some have been around for a very long time, like YouTube and Netflix. And if we check Wikipedia — which, I know, is not the most reliable source — even Wikipedia knows of 47 video-on-demand services with more than 100K subscribers each. That's 47 services counting only the big players.
Then you have a countless number of smaller video-on-demand services and IPTV providers; pretty much every internet provider nowadays also has an IPTV offering delivered together with the internet. Then there is hybrid television: in Europe, HbbTV and Freeview Play; in the US, ATSC 3.0, also known as NextGen TV. Hybrid TV is basically a broadcast channel — regular TV — plus a layer on top that handles all the extra content: video on demand, live video, catch-up and so on.

Then there are dedicated catch-up services: apps meant to provide the content you missed in the broadcast. For example, you were stuck in a traffic jam and missed your favorite series or a news report on TV, so you can just go to that service and watch it, kind of on demand. Then there are live streaming apps, and a huge number of audio apps, starting with online radio and ending with things like Spotify, where you can listen to your tunes from the TV or from a smart speaker in your living room.

So the market is actually exploding, and we can see that at Suitest too: in recent years we have had a huge number of new leads — people coming to us and saying, we're just starting to build a new app and we need a solution for test automation. So there is a good chance that a lot of you will at some point have to deal with one of these living room devices, even if you're not dealing with them now.

So how hard can it be? Most of those applications are just HTML-based, and you have quite good tools for testing HTML apps, right? Well, let me tell you how hard it is. Device fragmentation is a huge deal for living room devices and smart TVs. For example, the Samsung brand of smart TVs had a 32% market share in the first quarter of 2021, so they are a very big player, and this is a typical model lineup for them: you can see that the devices differ based on the market they are meant for.
There are US devices, there are European devices, and even within Europe there is a separate set of devices for Germany alone — apparently they don't consider it Europe. Then there are different screen sizes, which don't matter that much, and some new technologies Samsung is trying to promote. There is the screen resolution, UHD or 8K, which is very important because it also affects the chip used inside the TV: the bigger the resolution, the more powerful the CPU and GPU need to be. Then there are huge differences between model years, in both firmware and hardware. There are different generations of TVs, and different tuners — again, different hardware depending on the country and the broadcast format it has to support. And there are design and manufacturing differences on top. In the end, a brand like Samsung can easily produce 20 to 30 TV models every year, and even more if you count small differences like screen sizes. So the amount of device fragmentation, even for a single TV manufacturer, is just huge. And Samsung is not the only player: there is the other 68% of the market, and every single manufacturer ships at least several TV models per year — very often more like tens of models per year.

Next, there are some specific problems with testing smart TVs; here is a short overview. There is no standardized protocol for automation. Of course, you can use Appium even now for Apple TV, Android TV or Fire TV, which is nice and it works. But if you want to test something a little more exotic, like Samsung Tizen or LG webOS, there is just no out-of-the-box, ready-made solution you could use. There is also no standardized way to deploy developer applications: if you want to install an ad hoc app just for testing, every manufacturer does it differently.
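To make that fragmentation concrete, here is a minimal sketch of what "every manufacturer does it differently" looks like in practice. The CLI tools named below are real (plain `adb` for Android-based devices, the Tizen Studio `tizen` CLI, the webOS SDK's `ares-install`), but the exact flags vary by SDK version, and the platform keys and device IDs are made up for illustration — check your platform's SDK docs before relying on them:

```python
def sideload_command(platform: str, package_path: str, device_id: str) -> list[str]:
    """Return the CLI invocation that sideloads a developer build.

    Each platform ships its own, incompatible deployment tool.
    """
    commands = {
        # Android TV / Fire TV: plain adb; -r replaces an existing install
        "androidtv": ["adb", "-s", device_id, "install", "-r", package_path],
        # Samsung Tizen: Tizen Studio CLI; -t picks the target device
        "tizen": ["tizen", "install", "-n", package_path, "-t", device_id],
        # LG webOS: webOS CLI from the TV SDK
        "webos": ["ares-install", "--device", device_id, package_path],
    }
    try:
        return commands[platform]
    except KeyError:
        raise ValueError(f"no known sideload method for {platform!r}")

print(sideload_command("tizen", "MyApp.wgt", "tv-123"))
```

Three platforms, three unrelated toolchains — and that is before you get to the vendors whose only "deployment API" is a USB stick.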
There is no standardized protocol for emulating user input, because TVs are very special in that respect: you don't have a keyboard or mouse, and you don't have a touch screen as on mobile devices. You have to use the remote control, with infrared or Bluetooth, and if you want to emulate that for test automation, it's not that straightforward. Then there are poor or non-existent developer tools and debuggers, and virtualization is also not very good. The simulator for Apple TV and the emulator for Android are quite good, but when you go into the slightly more exotic space, the emulators very often don't represent what the actual device can do: they are missing modules like DRM, and they handle the application lifecycle completely differently. So there are a lot of use cases you would normally want to automate that you simply cannot test there.

Now let's talk about solutions: how you can test on smart TVs with existing tools. There are basically two big groups of solutions, some open source, some proprietary. On the left, we have object-based solutions — the ones that get hard data from the device in some way and then let you run assertions on that data. Appium is one of them. There are also the native frameworks, Espresso for Android and XCUITest for tvOS; you're familiar with those tools and you can use them. But for the other smart TVs and set-top boxes, there is no Appium solution at the moment. And that is where Appium 2.0 becomes very important, with its whole ecosystem of third-party drivers that can fix this: vendors and open source communities will start to produce drivers for those exotic devices. Suitest is also one of those object-based solutions, though we do things a little differently than Appium — I will get back to that when we discuss a later topic. And then image-based solutions are a huge part of this space.
Image-based testing, in its simplest implementation, is just reference image comparison: you take a screenshot, save it somewhere, and once you have a new version of your application, you run the same steps and compare one screenshot to the other. If the screenshots match, your test passes; otherwise it fails. There are of course smarter tools that use machine learning, text recognition and computer vision to detect text on the page, so you can also assert on text and so on. But overall, image-based and object-based approaches each have their own limitations and advantages.

So let's now discuss a few of the most common pitfalls people hit when they first try to automate something on a smart TV. The first thing you'll want to do is install your application: I want to install my app on the TV for testing without going through the app store — I want to test it before it goes to the app store. As I mentioned before, the many different platforms all do this differently, so you have a different API for every platform. Usually it's some sort of network API, but there are also platforms that still use USB drives to install ad hoc apps: you put your app on a USB drive, plug the drive into the TV, and then you can install it. Most of the test automation solutions I showed on the previous slide actually do an okay job here. They are not perfect, but they can do a decent job of installing the app for you; you just need to figure out which tool you're using and which platforms it supports.

The next question you will probably have is: how do I record the screen? You might want to record the screen for several reasons. You want to access your remote devices while working from home — this was a huge topic during the COVID lockdown.
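The reference-image comparison just described fits in a few lines of plain Python. Real tools add far smarter diffing, but the core idea — a per-channel tolerance plus an allowed fraction of differing pixels, to absorb camera noise and compression — looks roughly like this (the function name and thresholds are my own, not from any particular tool):

```python
def images_match(ref, actual, channel_tol=16, max_diff_ratio=0.01):
    """Compare two same-sized screenshots given as flat lists of (R, G, B).

    A pixel counts as different when any channel deviates by more than
    channel_tol; the screens "match" when at most max_diff_ratio of the
    pixels differ.
    """
    if len(ref) != len(actual):
        raise ValueError("screenshots have different sizes")
    differing = sum(
        1 for p, q in zip(ref, actual)
        if any(abs(a - b) > channel_tol for a, b in zip(p, q))
    )
    return differing / len(ref) <= max_diff_ratio

# Tiny 4-pixel "screenshots": identical except one slightly brighter pixel,
# the kind of deviation a camera rig or re-encode introduces.
ref = [(0, 0, 0), (255, 255, 255), (10, 20, 30), (200, 0, 0)]
shot = [(0, 0, 0), (250, 251, 249), (10, 20, 30), (200, 0, 0)]
print(images_match(ref, shot))
```

Tuning those two thresholds is exactly where this approach gets fragile: too tight and every camera flicker fails the test, too loose and real regressions slip through.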
I think it still is, although we are slowly climbing out of it. People want to be able to test on devices that are in the office while they're at home, and for that you need to see what is going on on the TV screen — to get some feedback. Then, for image-based testing, you of course need some sort of screen capturing. And when your automated test run is done, you want to be able to review what happened during the test in case it failed, and see where exactly the error was. So you want some recording of the screen.

There are several ways people usually do that. There is programmatic screen grabbing, which is the easiest one; that is what you would probably use for Apple TV and Android-based devices, because they provide it. But it has its limitations because of DRM content. DRM is a way to protect content from piracy: you need a specific decoder for DRM content, and it sits at the device's operating system level — very low level. The device will not allow you to take a screenshot of DRM-protected content; you will just get a black screen instead. Then there are HDMI grabbing tools, but those are legally problematic: HDMI has its own protocol for protecting data from piracy, and although there are tools that can work around it, I would strongly recommend against using any of them. Your legal department will not be happy. And then a camera in front of the TV is a very nice option, because it works for pretty much any device — you don't need a different solution per platform — but it also has its limitations: you don't get a pixel-perfect image, no matter how good your camera is. Here is an example of how this camera-in-front-of-the-TV solution works, from our friends at HeadSpin.
The way it works is that you have a small, high-resolution TV screen with a camera in front of it, and the whole setup is put into a black box so there are no external sources of light. The box is designed to be modular, so you can put it into a server rack and have several of those racks. The obvious limitation here is that it's not very scalable: if you need to cover a huge number of devices, you need a lot of those boxes, and it becomes very expensive very quickly. Also, if you have an 80-inch smart TV, good luck putting it into any box like that. So basically, depending on your requirements — whether your app uses DRM content, whether the device has an HDMI output at all, whether it's a relatively small device — you can choose which solution to use for image grabbing.

Then there is the topic of emulating user input. For most smart TVs it's a matter of sending an infrared signal from the remote control, and we can emulate that in several ways. The most common way is using a network API, if the device provides one — the smarter devices like Apple TV, Android TV, Roku and some gaming consoles do. They have APIs, so you can just use those; it's very easy and very reliable. The only small issue is that it does not cover the whole path of the user interaction: you are not sending an actual infrared signal the same way your end user would. Instead you are using some sort of API, so it's not exactly the same. And then there is a way to blast the actual infrared signal. There are a few companies that provide solutions for that: there is RedRat, which focuses on device management and has a nice solution for blasting infrared signals to different brands of TVs, and Suitest supports this too. This particular image is from our documentation: this is the device that you put in front of the TV, and it just blasts infrared signals. There are also other solutions.
For example, you could use a USB keyboard — or some simulation of a USB keyboard or mouse — connected to the USB port of the TV. It usually works nicely, but the problem with that solution is that your end users are not using a keyboard. You will very rarely find a user who actually connects a USB keyboard to a TV to watch TV, right? They would rather use a remote control or a mobile app. Another possible option is something called HDMI-CEC. This is a protocol inside the HDMI stream that goes from your set-top box or gaming console to the TV, and it was originally meant to let users control everything with just one remote — the TV's: over the HDMI cable, the TV sends signals back to the gaming console or streaming stick, and you can control that device this way. There are adapters, and you can set things up so that you programmatically inject your commands into the HDMI stream. It works nicely, but it also has limitations. You cannot use it for TVs themselves, because TVs simply don't have an HDMI output. And it has a limited set of keys: you cannot send all possible keys, only a subset.

Another very common pitfall when testing on a TV is keyboard input — basically, logging in to your application. There are two very common ways login is implemented in TV apps. One is the on-screen keyboard: you have a regular form with username, password and so on, and when you focus a field, a keyboard pops up and the user is expected to use arrow navigation on the remote control plus the OK button to punch in their credentials one letter at a time. That is extremely time-consuming, especially when you automate it and want your tests to run fast — but that's how it is. There are several ways to work around this. You can send text over the network API; some platforms allow that, some don't.
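That one-letter-at-a-time arrow navigation can at least be scripted generically. Here is a sketch that computes the remote presses needed to type a string on a grid keyboard — the layout, the key names, and the no-wrapping assumption are all hypothetical, since every app lays out its on-screen keyboard differently:

```python
def dpad_sequence(layout, text, start=(0, 0)):
    """Compute the remote-control presses needed to type `text` on a grid
    on-screen keyboard.

    `layout` is a list of equal-length strings, one per keyboard row.
    Navigation is plain row/column stepping with no wrap-around, which
    matches the simplest possible keyboard.
    """
    # Map each character to its (row, column) on the keyboard grid.
    pos = {ch: (r, c) for r, row in enumerate(layout) for c, ch in enumerate(row)}
    presses, (row, col) = [], start
    for ch in text:
        tr, tc = pos[ch]
        presses += ["DOWN"] * (tr - row) + ["UP"] * (row - tr)
        presses += ["RIGHT"] * (tc - col) + ["LEFT"] * (col - tc)
        presses.append("OK")
        row, col = tr, tc
    return presses

layout = ["abcdef", "ghijkl", "mnopqr", "stuvwx", "yz0123"]
print(dpad_sequence(layout, "hi"))  # ['DOWN', 'RIGHT', 'OK', 'RIGHT', 'OK']
```

Even this toy example shows why credentials are painful on a TV: two characters already cost five key presses, and a realistic password costs dozens — which is exactly why you want the network-API or backdoor shortcuts when the platform allows them.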
You can also ask your developers to build some sort of backdoor for you — just a simple line of code that basically executes JavaScript in the runtime of the app and gets you logged in. But then you are not doing the same steps as your end user, so it's somewhat limiting. And another very common way of logging users in is sign-in with a QR code. The idea is that the user sees a QR code on the TV screen, grabs his or her mobile phone, scans the code and logs in on mobile — because that's much faster and easier — and then the TV is logged in automatically. The limitation here is that you basically need two testing sessions to test it: one on the TV to actually run the application, and a browser session running in parallel in the same test scenario that logs you in with regular browser tools. This is another reason why I think Appium will be a big deal here: Appium supports that out of the box. You can create as many sessions — as many drivers or browsers — in a single testing session as you need, and a lot of smart TV test automation tools don't do that. Once you have a driver for TVs, you will be able to cover this use case extremely easily with Appium 2.0.

So, how do I run assertions? Another common topic. How do I get some confirmation that my application is working correctly? Let's recap: I can open my app — I can install it automatically, open it automatically, I can log in, I can navigate through all the menus, select a video to play and so on. But then how do I confirm that it's actually playing, and that the app is actually showing me what it's supposed to show?
For the image-based tools, I've already discussed how it works: it's basically comparing images to a reference image, and/or grabbing text out of the image using OCR, and then you can do assertions based on that. Appium uses a different approach: it gets this information from the APIs the device provides. There is Espresso on Android, there is XCUITest for tvOS, and on top of that there is Appium, which covers it all and uses those native protocols provided by the vendor to request the data needed for your test. And here is where Suitest is different: what we do is inject our instrumentation library into the runtime of the app. So we have a kind of agent running alongside the application, and we are able to grab all the information we need and send it back over the network to our server for assertions. So that's how it's done, depending on the tool you select.

Next, I want to give a few tips, based on the experience we've gained with multiple customers, about how to keep your sanity with all this complexity: how to manage your tests, and how to set up your testing pipeline and processes so that it's easier to cover everything you need to cover. You've probably seen this pyramid multiple times already, but it is a little different for living room devices and smart TVs. At the bottom there are unit tests, and your developers usually cover that: they write unit tests in their own way as part of their process. Then you usually have functional or integration tests — you often hear the label end-to-end tests for this. Usually it's just one section, but I split it into two, and the reason is that you want your business logic tested in some simulated environment — say, a browser for an HTML-based platform, or a simulator for Android or Apple TV.
And you can get as thorough and as deep with that as you want, because business logic usually does not require any device-specific APIs. You can test all sorts of things there: logging in a user, making sure that bookmarks work, that video playback resumes from the same place where it stopped last time, and so on. All these basic things that make your application work, you can test in a browser in a much faster way. A lot of people actually utilize their developer team for that, because there are tools for quick and easy test development by developers — Cypress, for example, or WebdriverIO with Appium, or things like Nightwatch. Then the developers cover not only unit tests but also the business logic with end-to-end tests.

The next section is the most important one, and that is testing on the real smart TV — the real deal, the real devices. The reason it's important to test on a real device is that you cannot cover everything in a browser or simulator. There are a lot of device-specific APIs that are only available on the real device, and a passing test on the simulator will not guarantee a pass on a real device, because of all the device fragmentation across all the different models. It's simply not possible to validate only on a simulator and then just submit to the app store; it will not work like that. So here you would usually have a smaller set of tests focusing on the few user journeys that are most important for the success of your application, plus coverage of those APIs that are only available on the device. You try to keep this set to a minimum. The reason this section is drawn so big is that, although it has fewer tests, creating them is much more resource-consuming. And then, on top of that, there is yet another topic, and that is quality of experience.
This is something that started to come up only a few years ago, and people care a lot about it in smart TV test automation at the moment; it's getting a lot of traction. Quality of experience is basically how your user perceives your application — how your users would rate it: whether it's running fast, whether the loading time is good, whether the video takes too long to start. And also: what is the quality of my video stream? Is it blurry? Am I getting those pixelated rectangles? Is my audio clear and loud? This is extremely important for customer retention in the field of video-on-demand services, so it's very important to cover quality of experience at least manually, if you don't have a tool for it. There are actually quite a lot of tools now that can do it for you. I already mentioned HeadSpin, our friends who are doing just that — they have a very nice tool for quality of experience.

So, next topic: what can we expect next from test automation for living room devices? Well, there is a Cisco report stating that by 2023 about two-thirds of connected TV sets will be 4K. Why is 4K important? You should understand that the lifetime of a single smart TV is extremely long compared to other devices. You have your mobile phone for two or three years, maybe four, and then most people just replace it with a new one — so most of the time you don't need to care about phones older than three or four years. With laptops it's usually something between five and ten years before you want a new one. But with a TV, it can go up to 12 years and even more, because most people just don't see an incentive to replace their TV with something new while it still performs quite well.
And 4K is actually a reason for people to upgrade to a newer device. This is very good news for us, because newer devices have much better APIs, they are much more stable, and they have much better developer tools. It will simply make our lives much easier when people drop those old dinosaur devices.

Another huge game changer is Appium 2.0 — and I'm not saying that just because this is an Appium conference. I really believe this is a game changer for the living room device industry, for a few good reasons. First of all, Appium 2.0 introduces the driver ecosystem. As I mentioned, my expectation is that vendors, enthusiasts and the open source community will start adding more and more drivers for new platforms, so over time we will gradually see all those weird and exotic platforms covered with drivers for Appium 2.0. We will have a driver for Samsung TVs, LG TVs and whatever else you can imagine. In the worst case, even if there is no such driver, you could just go and write one, and you will have the whole ecosystem Appium brings with it: all those testing clients written in different languages, so you can choose whatever language you're comfortable with — Java, Python, JavaScript — and write your tests in it. This is awesome. The other one is the plugin ecosystem. This is a less known feature of Appium 2.0, but I think it's no less important than the drivers themselves, because it will let you run different sorts of assertions on your devices. If you want to run an image-based assertion, you can have a plugin for that. If you want to do some quality-of-experience testing, you can have a plugin for that too, and get this functionality easily and out of the box. I expect such plugins will start popping up once the Appium 2.0 release is out. And I can see from GitHub that the Appium core team is pumping out betas very quickly now.
So I would expect to see the Appium 2.0 release quite soon. And the last piece of information I have for you today is an announcement about Suitest and Appium: we are building our own driver for Appium 2.0. It's meant to be released in October, and it is going to support all the platforms Suitest currently supports — pretty much any living room device, any platform out there, except maybe some extremely exotic stuff. Whatever Suitest already supports, we are bringing to Appium by providing the Suitest Appium 2.0 driver. So we are going to be among the first vendors to deliver this. And even more than that: we have a public device lab, which we launched quite recently, about half a year ago, and the devices in our public device lab will also be available over the Appium protocol. That means you can connect to pretty much any TV in our lab just by specifying the correct Appium hub address in your settings, and then run your tests on that device remotely — the same way as you would do browser testing with BrowserStack, or mobile testing with some Appium hub. That infrastructure already exists. So I hope you're excited about this; I know I am. And this is the last slide I have, so thanks a lot for your attention, and now we can switch to questions.

Yeah, thank you so much, Andy. We have one question here: you mentioned that the way Appium runs assertions is by grabbing data from an API on the device. If the device doesn't provide such an interface, does that mean Appium doesn't work in this case, or is there another way to do it?

Yeah, that's exactly the case. Appium needs to get the data for your assertion from the device somehow. If your assertion says that this button has to be green, we need to get the color of that button before we can compare it with your reference, right? And the way it's done now is that Appium uses APIs provided by the vendor.
So for Android, let's say, it's the Espresso framework; for Apple, it's XCUITest; or for web platforms it's the WebDriver protocol, which is built into the browser. That is the only way Appium can get the data at the moment. And this is exactly why Appium 2.0 is such great news: it opens up the gates. It provides an API for developers to create new drivers — new ways to get data from the device for you. So at the moment, you cannot do it if the device doesn't have such an API and there is no driver for it. In the future, you will be able to build your own driver if you need to, or you can wait until the community or a vendor does it.

OK, there's another question: can we also plug our own devices into the Suitest public lab and share them across a remote team?

Yeah, we have the public device lab, and we also support private labs with your own devices — we provide tools for that. It's all described very well on our website, so just go to the Suitest website and you will find all the information there. And there are quite extensive tools for sharing between team members, so you can access your devices remotely.

We don't have any more questions, Andy, so I guess we can close out this session. Thank you so much, everybody, for joining us, and thank you, Andy, for sharing your experience today. Yeah, thanks a lot.