Welcome everyone again to this session. This is about visual validation, the missing tip of the automation pyramid. Not quite a tongue twister if you slow it down, but yes, it can get interesting. A quick introduction about myself: I'm Anand Bagmar. I've been in the quality space for more than 20 years now. I've worked with various product and services organizations over this time, across different countries, and had great mentors, colleagues, and friends I learned a lot from, and also many from whom I learned a lot about what I should not be doing in my work. But that's enough about me. You can reach out to me on LinkedIn or Twitter, and we can definitely have follow-up conversations beyond this Agile India virtual conference as well.

So let's start, and I would really like you to participate and put in your thoughts. I'm going to ask a few questions; please put your answers in the Discuss section of your browser window. That will make this a little more interactive than just me rambling on. Let's start with an activity: how do you test an ink pen? Can you write the first few things that come to your mind in the Discuss section? Okay, "trying to write with it". Yes, that's right. Anything else? Let me give you some answers that typically come up first when you take a pen in hand and try to test it. Of course, it has to write properly. Here's another: fill the empty ink tank. That assumes there is an ink tank in there, right? You don't yet know what type it is or how you are going to fill the ink. But yes, that is also very important. You would also check that the cap closes properly, that it looks as expected, that it is visually appealing, and that you can hold the pen properly while writing. There are many other scenarios and test cases you could come up with for testing a pen.

But here is the challenge: if you wait for the pen to be in your hand before testing, it is too late. Why? Because a pen is made up of many different components. It could have a cartridge or an ink tank; as Sid mentioned, that was one of his thoughts. So there could be different mechanisms for filling ink into the pen so that it can write. There are also various other components: does the nib fit correctly? Is it the right shape and thickness? Is it smooth enough? Do all those smaller pieces fit well together before the final assembly of the pen can happen? That's how you typically build a pen, and how you would end up testing it eventually. But if you skip any of those earlier validations, what if the ink cartridge does not fit correctly, or comes loose when you shake the pen? The pen is of no use in that case. In fact, your clothes, or wherever you keep the pen, might get ruined. It's of no use to anyone, right? So you really have to think about how you are going to build quality inside out: get those components tested correctly before they integrate into the bigger pieces and eventually become a pen.

Not surprisingly, this applies to software as well. You look at a website or a native app and it seems like a simple enough product, with simple enough interactions.
But in the back end, it could be a really complicated architecture that enables that functionality for the end user. So you have to think about how your product is really built, inside out, and based on that, how you can test early to get that feedback.

Now, what is really missing here? Let's look at that. How do you ensure that what was working before continues to work well now? This is far more applicable to software than to something like a pen, where the architecture is not going to evolve for the same pen; you create different types of pens, but it's not the same pen. From a software perspective, we release a product at some point in time and then incrementally change it, adding functionality or fixing issues based on the feedback we receive. How do you make sure that what was working well earlier continues to work well after those changes? That is a big question you need to get a handle on, because it's not just about testing the changes correctly; it's about making sure the overall product still works well.

So think about a typical testing approach. I'm not going to preach to the choir here. We are at Agile India; Agile has been around for a long time, the XP practices have been around for a long time, and we've heard about them on far too many occasions. We know that test automation is one of the key enablers for getting early feedback and for continually validating existing behavior against expectations as the product changes. With automation, we of course need to think about the pyramid: what are the different types of tests you can automate to get that feedback earlier? Whether you look at it from a technology-facing or a business-facing perspective, you have to think about what kind of feedback you are really getting. If it's slow feedback, can you restructure your tests to get it faster? Alongside all the automated tests, you also need a good strategy for exploratory testing, covering tests that have not been automated, cannot be automated, or things you only understand from the human context of interacting with the product. The pyramid can be broken down further based on the context of the product as well. There are tests you can automate from an NFR perspective, for example performance, security, accessibility, and analytics. These are all very important types of tests, among many others; again, which types you automate depends on the context of your product. But all of these tests matter when it comes to building quality feedback for your product.

That said, there is a challenge, and it is mostly about how we address and approach the non-automated tests for the product. Typically, whoever on the team does this, usually the QAs in most project teams, ends up thinking in terms of differences. How am I really going to do exploratory testing? Yes, I'm going to explore, but there has to be some way of deciding whether the results of that exploration are correct or incorrect, right? And the approach that unfortunately comes up is looking for the differences.
Okay, I'm going to back off a little from where I was, in the interest of time, and get started again. Again, apologies for the technical glitch; this is definitely on my side, as I mentioned, internet issues and no power as well. Let's see if we can get through the next 30 minutes without any interruption from the network.

So, we were speaking about the automation pyramid. We know the value it brings and how it helps us get quick feedback on the overall quality of the product through automated tests. But a big challenge still remains, and that is the manual and exploratory testing. For that, one of the techniques QAs usually end up using, almost subconsciously, is playing the game of spot-the-difference. If I show you this particular image and ask how many differences you see, given a little time you'll be able to come up with the number of differences and where exactly they are. But if I flash it quickly and say, tell me right now how many differences there are, you may find a couple of them, maybe not even that. In this particular image there are four differences. Let's take another example. If I give you this comparison, what are the differences between these two images, and how quickly can you find the ten differences that are supposedly there? Maybe half a minute, a minute or so. But I lied: there aren't really ten differences. Again, there are just a few.

The reason I'm sharing this is that a lot of manual or exploratory testing unfortunately amounts to just looking at the product and, based on your past experience with its context, judging whether it is working correctly. And what happens many a time, which is no surprise, is that this fails in software too. You will see countless such examples in the field: Twitter, where tweets are misaligned, or UPS, where the product does not show up correctly on the tablet version of the app. And for those who spotted it, there are also two stray lines here on the screen, in the title. These are differences you may or may not notice quickly, given how you're testing and the pressure or timeline you're working under. Many more examples: the Financial Times, where a title is too long and overlaps into the content itself. Here's an Amazon webpage from the past, from a big sale launch like the one happening in India in a couple of days now: instead of the numbers growing, they saw the numbers actually dipping. You look at the product and realize, oh shoot, there's something basic missing that we could have caught. Another example, from HDFC Bank: on the homepage itself you see this kind of weird overlapping content. So the challenge is that defects escape because our approach to testing is wrong: we are focused on a very raw, ad hoc way of testing at the top layer of the pyramid. And that is the challenge we are trying to solve.
And that is the challenge I want to discuss with you, along with a solution: how you can use UX or visual testing, which is a missing piece in the overall quality picture you focus on. Okay, yes, ten was a misleading number; you're right, that was exactly the point.

So now let's look at what happens when visual testing is not done. We've already seen some of the examples I shared earlier. For the Financial Times, what happens if that title text overlaps? It's fine; I'll still click on the article and see the full content. The world is not going to end if I can't read those few lines of text, and the same goes for Twitter. But in the case of UPS or Amazon, there is an actual revenue loss as a result. That is the risk when visual testing is not part of your overall testing or quality strategy: you might end up with business or revenue loss, there is a loss of brand and credibility, and, equally important, you start losing your users, because users have a very short attention span. So it is extremely important to think about how you can make sure the visual aspects of your product are also working correctly.

Let's quickly establish, for those who are not aware, what exactly visual testing is. Visual testing is a way to validate the visual aspects of the screen; the name itself is quite indicative. You make sure that what is presented on the screen matches what is expected in terms of layout, appearance, content, and the overall UI experience. That is visual testing. Now, how is visual testing typically done? Manually. We saw that example: we are manually trying to figure out if something is wrong. And that is exactly why it becomes a challenge. Done manually, it is a very tedious process, it is extremely error prone, and it is impossible to scale and repeat. If your approach to testing satisfies any of these conditions, it is going to be a blocker for enabling CI/CD, okay? And that's the part we really need to focus on: how can we remove the blockers and enable the team to move faster? You can use functional automation to help in certain ways, but there are limits to how much functional automation can help here.

So this is where I would like to show you how visual test automation can solve this problem. The first thing visual automation needs to do is create baselines, typically by taking screenshots of the expected UI. Once you have created these baselines, the next time the test runs you compare the new screenshots against the baselines and see whether there is a match. This can be done at the whole-page level or at the level of a snippet of a page. And "whole page" itself can mean what is visible in your viewport, or you can scroll, take a full-page screenshot, and compare against that; that is a further level of detail in the type of validation you can do.
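To make the baseline-and-checkpoint idea concrete, here is a minimal sketch of that naive compare loop in code. This is not from any particular tool; it assumes Selenium and Pillow are installed, and the helper name and paths are made up for illustration:

```python
# A minimal, illustrative baseline/checkpoint comparison: Selenium takes
# the screenshot, Pillow diffs it pixel-by-pixel against a stored baseline.
from pathlib import Path

from PIL import Image, ImageChops
from selenium import webdriver

BASELINE_DIR = Path("baselines")  # illustrative location

def check_page(driver: webdriver.Chrome, name: str) -> bool:
    """First run stores a baseline; later runs diff a checkpoint against it."""
    BASELINE_DIR.mkdir(exist_ok=True)
    baseline = BASELINE_DIR / f"{name}.png"
    checkpoint = BASELINE_DIR / f"{name}.checkpoint.png"

    driver.save_screenshot(str(checkpoint))  # captures the current viewport only
    if not baseline.exists():
        checkpoint.rename(baseline)  # no baseline yet: this run defines it
        return True

    base_img = Image.open(baseline).convert("RGB")
    check_img = Image.open(checkpoint).convert("RGB")
    if base_img.size != check_img.size:
        return False  # different viewport: not an apples-to-apples comparison

    # getbbox() returns None only when no pixel differs at all
    return ImageChops.difference(base_img, check_img).getbbox() is None
```

Note that `save_screenshot` captures only the visible viewport; a full-page comparison needs scroll-and-stitch logic on top, which is exactly the viewport-versus-full-page distinction I just mentioned. And because this compares raw pixels, it is also the approach whose weaknesses we look at next.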
When you do the comparison and differences are reported, it is for one of two reasons: either your product has evolved, meaning the functionality changed and you need to update your baselines, or you've actually found a defect, which is a problem you need to fix quickly. That defect could be purely visual, overlapping text for instance, or it could be functional, meaning I expected certain values or data to be visible there but they are not. We'll see more of this in the demo, okay?

Now, the first challenge of visual testing is false positives and negatives, and this is extremely important to understand. It basically comes down to how the visual test automation itself is implemented. If you use pixel comparison to compare your UI, that is going to be a big problem, because browser version changes can update the rendering engine, which means the pixels for the same page can be rendered differently, so you will get spurious differences. Dynamic data is another problem, and so is responsive web design. Even for native apps, across all the different device types, we've got screens from four and a half inches to six and a half or seven inches, plus tablets. How many baselines will you capture for each of those specific viewport sizes to do an actual pixel-to-pixel comparison and make sure everything is fine? So false positives and negatives are one of the biggest reasons automated visual validation fails. The second challenge is how you create and maintain these baselines, and again, this could be for different browsers, devices, resolutions, and viewports. You need a baseline for each of these combinations; otherwise you are comparing an apple with an orange, and just because both of them are fruit does not mean they are rendered the same on the screen. You need an apples-to-apples comparison to get meaningful results from visual validation, and the result analysis itself has to be rock solid: no false positives, no false negatives. Only then can you really act on it. Those are the challenges from an automated visual validation perspective.
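To make the apples-to-apples point concrete: one common way to handle it, sketched below with made-up naming conventions rather than any tool's actual scheme, is to key every baseline on the execution environment, so a Chrome checkpoint at one viewport is never diffed against a Firefox baseline at another. The multiplication of baselines that follows is precisely the maintenance burden just described:

```python
# Illustrative only: key each baseline on browser, OS, and viewport so
# every comparison is against a baseline from the same configuration.
from dataclasses import dataclass
from pathlib import Path

@dataclass(frozen=True)
class Environment:
    browser: str   # e.g. "chrome-118"
    os_name: str   # e.g. "windows-11"
    width: int     # viewport width in pixels
    height: int    # viewport height in pixels

def baseline_path(root: Path, page: str, env: Environment) -> Path:
    """Return a baseline file path unique to this page + environment."""
    key = f"{env.browser}_{env.os_name}_{env.width}x{env.height}"
    return root / page / f"{key}.png"

# Two environments mean two separate baselines for the same login page:
print(baseline_path(Path("baselines"), "login",
                    Environment("chrome-118", "windows-11", 1366, 768)))
print(baseline_path(Path("baselines"), "login",
                    Environment("firefox-119", "macos-14", 1920, 1080)))
```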
I hope this resonates with you as the problem statement I'm trying to highlight. A quick thumbs-up if you're still with me. Excellent, thank you so much.

Okay, so how can we solve this problem in a better way? This is where I want to talk about one of the options you could use, which fits in very well with an agile way of working, though agile or not, it's really about getting quick, accurate feedback on what exactly is happening with your product. I'd like to show a quick demo with one such tool, Applitools, a Visual AI tool, and how it can solve the problem. There are plenty of other tools as well; this is simply the one I'm using for the demo. What I have here on the screen, hopefully my browser is visible, is Applitools integrated with my functional testing tool.

Now, as you can see from this page, there are more than 40 or 50 different SDKs, based on your choice of functional automation tool. You choose the Applitools SDK that matches your tool to integrate visual testing with your functional testing. The advantage is that you keep using your functional testing tool, Selenium or Appium for example, to drive your application under test through its various functionality: you open the app, log in, navigate to different screens, interact with them and perform actions. That validates, from a functional perspective, that your workflow goes through correctly. What those tests do not tell you is whether the screen renders correctly on the device. Remember the Amazon example? I'm pretty sure the Amazon team ran every possible test on that page before the release, and the tests passed, because Selenium will happily report that it can still click a particular element or find elements on the page. But Selenium will not tell you that a broken CSS rule is causing a rendering issue so bad that users cannot use the product at all, right? So that's where you integrate visual testing with functional testing: the functional tests drive the application under test to simulate end-user scenarios, and at the relevant points, wherever you say "I want a visual validation of this functionality", a checkpoint is captured and the data is sent to Applitools, where the comparison happens.
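As a rough idea of what that integration looks like in code, here is a short sketch using the Python `eyes-selenium` SDK. The API key, URL, and selector are placeholders, and the exact method names can vary across SDK versions, so treat this as indicative rather than authoritative:

```python
# Sketch: Applitools Eyes checkpoints inside an ordinary Selenium test.
# Placeholders: API key, URL, "#feed" selector. Calls follow my reading
# of the Python `eyes-selenium` package; versions may differ slightly.
from applitools.common import MatchLevel
from applitools.selenium import Eyes, Target
from selenium import webdriver

driver = webdriver.Chrome()
eyes = Eyes()
eyes.api_key = "YOUR_API_KEY"  # placeholder

try:
    # open() starts a visual test; app/test names group results in the dashboard
    eyes.open(driver, "Demo App", "Login page renders correctly")

    driver.get("https://example.com/login")  # placeholder URL
    # ... the usual functional steps: fill fields, click, navigate ...

    # Checkpoint 1 - strict match on the full page: flag anything
    # a human would perceive as different.
    eyes.check("Login page", Target.window().fully())

    # Checkpoint 2 - layout match on a dynamic region: validate the
    # structure, ignore the changing content.
    eyes.check("Feed", Target.region("#feed").match_level(MatchLevel.LAYOUT))

    eyes.close()  # raises if any checkpoint did not match its baseline
finally:
    eyes.abort()   # clean up if the test stopped before close()
    driver.quit()
```

The functional flow stays exactly as it was; the `check` calls are the visual checkpoints, and the match level is how you choose which comparison algorithm (strict, layout, and so on) applies where.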
So let's take a quick example. Here's a cross-browser test, comparing the baseline on the left-hand side with the screenshot captured as a checkpoint on the right. The baseline was captured on Chrome; the checkpoint screenshot was captured during test execution on Internet Explorer. Now, if I use a pixel-matching algorithm for this validation and highlight the differences, you will notice that the whole screen shows up in pink. And that is a big problem: the problem of pixel-based validation. In this case, of course, the whole page differs because it's one browser versus another, but you will see similar false positives even between versions of the same browser, depending on what has really changed in that browser. This is why pixel matching never works. However, with an AI algorithm, the strict algorithm in this case, you will notice that the regions highlighted in pink are only the differences a human can actually see. The strict algorithm's job is: show me the differences a human would see. As I switch between the baseline and the checkpoint, you'll notice the pink highlights are exactly what the human eye would catch. Now, in this particular case, comparing one browser with another, strict is not the right algorithm either. If I switch to the layout algorithm, all the pink goes away except one difference highlighted at the bottom right-hand side of the screen.

And that is a defect captured by the layout algorithm. It is actually both a functional and a visual defect: because that particular text is missing, the user will not be able to use this functionality. If I have a Jira integration, I can create a defect right here, report it in Jira, mark the test as failed, and move on to my next execution.

The same approach works for your native apps as well. This is an app with dynamic data, the Yahoo Finance app, so the data differs between runs. You can see that, again, the strict algorithm has flagged everything that is different. Now, there are techniques you can use here to say: I don't have control over this section of the page where the data is dynamic, so I want a different algorithm for that screen. I'm using layout here, which means: ignore the content, focus on the structure of the page. On that basis it still found one difference in the button here, which is a problem even from a layout perspective. So these are some examples of the power of the AI algorithms you get, to check that your functionality as well as the visual user experience is working correctly. And this works regardless of whether it is a web app, a native app, or even PDF forms; for teams with compliance requirements it is very important to validate PDF documents too, and the same algorithms work for PDFs as well. There are localization use cases it supports too: for example, I'm checking the Facebook login page in different languages, and because I'm using the layout algorithm, it says the structure of the page is valid, no problem. If I change it to the strict algorithm, you see all these differences show up.

Now, the other very important aspect is scaling your tests. Your users are not on just one browser or one device; you want to scale execution across all the different browsers and viewport sizes. But there is really not enough value in running your functional tests against every browser and viewport size. So you can use another Applitools feature, the Ultrafast Grid, where you run the test on only one browser but configure it to render on all the other browsers and viewport sizes as well. With just two tests executed, the visual validation is done for every browser and viewport size you specified, and you get that value immediately. Your feedback cycle shrinks drastically: you run the test just once, on your CI machine or your dev box or even a laptop, and you get visual validation results for all the browsers that matter to you.
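Configuration-wise, the idea looks roughly like the sketch below, again using the Python SDK with placeholder names per my reading of it: the test executes once locally, and the grid renders and compares the captured page across every listed browser and viewport.

```python
# Sketch: one local test execution, rendered across many browsers and
# viewports via the Ultrafast Grid. Placeholder API key and URL; exact
# names follow my reading of the Python `eyes-selenium` SDK.
from applitools.selenium import (
    BrowserType, Configuration, DeviceName, Eyes, Target, VisualGridRunner,
)
from selenium import webdriver

runner = VisualGridRunner(10)  # up to 10 concurrent renders
eyes = Eyes(runner)

config = Configuration()
config.set_api_key("YOUR_API_KEY")  # placeholder
config.add_browser(1366, 768, BrowserType.CHROME)
config.add_browser(1920, 1080, BrowserType.FIREFOX)
config.add_browser(1024, 768, BrowserType.SAFARI)
config.add_device_emulation(DeviceName.iPhone_X)  # a mobile viewport too
eyes.set_configuration(config)

driver = webdriver.Chrome()  # the functional test still runs on one browser
try:
    eyes.open(driver, "Demo App", "Home page across browsers")
    driver.get("https://example.com")  # placeholder URL
    eyes.check("Home", Target.window().fully())
    eyes.close_async()  # grid rendering and comparison happen asynchronously
finally:
    driver.quit()
    print(runner.get_all_test_results())  # one visual result per configuration
```

One functional run like this yields a visual result per configuration listed, which is where the drastic reduction in feedback time comes from.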
There is another interesting aspect, important from a shift-left perspective: root cause analysis. If you click on any of the highlighted differences, Applitools will tell you what changed in the DOM and CSS to cause that difference in the first place. This becomes a very powerful way to go beyond just running my tests in CI, where the test fails on a functional or visual defect and I get that feedback. It's not enough to find what has failed; it's also very important to know why it failed and take corrective action. These are some of the Applitools features that help you get that quick feedback and fix the issue as fast as you can. I don't want to get into other features here, but I hope you can relate this to the kinds of problems you'll be able to solve, okay?

So, moving on from the demo; I see there are some questions already, which is good, we'll get to those shortly. I'm almost done, and I want to leave as much time for Q&A and interaction with you as possible. We already saw the test automation pyramid. What we're saying is that we need to add a visual component to that pyramid; that visual component is user experience validation, and it has to be automated. And remember what I showed in the demo: the validation is automated, but taking a decision on that validation still requires a human. You, as a team member, know whether a reported difference is a regression caught in your product or the result of your product having evolved, in which case you need to update your baselines. So it's very important to understand where the value of tools comes into the picture. It's not that AI solves all the problems; AI, used for visual testing, can tell you the differences with accuracy, but you as a team member know why that difference appeared and how to address it. Tools have to be used in the right way for you to get that feedback.

And overall, that is what product quality is. It's not just UI testing, it's not just unit testing; it's all aspects of testing, including the non-automated testing, that together tell you whether your product is of good quality. You have to look at the results of all of these tests, talk to all the team members who collaborated to implement them, and together take a decision about the quality of your product. Remember, we always say quality is a team responsibility. How do you enable that? Otherwise it's just a big phrase. This is a way to start enabling it. Ask your development team: how is the code quality? What is the tech debt? What branch coverage do you have from your unit tests before you even get to your web service tests, and so on? Combined, all of these results tell you the value of your automation. That will help make your pyramid actually look like a pyramid instead of the ice cream cone it unfortunately is in many, many organizations.

So, in summary, what does this really mean? You need to think about a holistic quality strategy for your product. Don't think "I am a QA" or "I am an SDET, this is my scope of influence, this is what I'm going to do", or "I'm a developer, this is how I'll do TDD or whatever I'm implementing, and then I'll throw it over the wall to some other role who will do whatever else is required". No: have a holistic quality strategy for your product. And it says "produced" on the slide; it should actually be "product". Aha, an interesting defect I found right there.
You need to think about how you can shift left to get quick feedback, and shift-left can happen at various levels; you have to get all the different roles involved to make sure you are actually getting feedback faster from your tests. And it is extremely important to consider whether visual testing should be an integral part of your automation journey and strategy. It may not be applicable in every case, but in many cases there is brand reputation, revenue loss, or the risk of losing users because of their short attention span: if something doesn't work well, they'll go to a competing product. What is the risk to your product if any of this happens? Based on that, decide which risks you can take on and which you must mitigate before the next release of your product goes out. Including these considerations in your strategy is a very critical thing to look at.

So, here are some references I have mentioned or pointed to in the slides, and with this I'm going to stop talking. Though we had some hiccups, I've managed to cover the content in time; I rushed a little in between, but now we can get to the questions and have further conversations around them. Please do put your questions in the Discuss panel and we'll get to them. Let me start looking at the questions here.

Okay, the first question: why can't functional tests cover visual testing issues? There is a big limitation on what functional testing can do from a visual perspective. How are you going to check, from your functional tests, whether an image loaded correctly, or whether content overlaps? That is a big challenge from a functional testing perspective. In some ways, maybe you can write a lot of code to validate those kinds of issues, but you'll quickly run into trouble with different viewport sizes: on mobile web versus a laptop versus a 27-inch monitor, the layout changes because the product is responsive. How much code can I realistically write to do that validation, if it is even possible? There are also the aspects of color and so on to consider. I hope those few quick examples answer why functional testing cannot really do visual testing to the level required.

The next question: how do you monitor KPIs across builds as part of CI/CD pipelines, for example login time, time to checkout, et cetera? Again, a very interesting question, not tied to visual testing, but I'll take a stab at answering it. You need to capture metrics that add value to you; the KPIs you mention are metrics. What are the metrics that will add value to you? And value means not just understanding what your test execution has been, but getting information that helps you take decisions about your product quality or your test automation itself. So identify the metrics that are important to you, and once you have, use the 80-20 rule: see how you can automatically capture those metrics into dashboards or reports as part of test execution.
That way you don't have to spend time capturing the data manually before taking decisions on it. The 80-20 rule is extremely valuable here: what can you do to get maximum value from automated capture of results and metrics that you can base decisions on? And for the remaining 20 percent of metrics you haven't been able to capture automatically, ask: is it okay to live with just the 80 percent and skip the 20, because the 80 percent gives you enough data to take decisions? That is an approach I actually take many a time.

The next question: how can localization testing be done? Now, localization testing is a whole topic of its own, and I'll add just one aspect right now. Localization testing is very challenging for a team in terms of judging whether the content is correct. I showed that example of the Facebook login screen; let me quickly go back to it. On this screen, I don't understand either of these languages. Does that mean I cannot do any testing here at all? This is where your strategy comes in again, and it is very important: in which environment will I do which type of testing? Maybe in my dev or QA environment, where I have control over the data, I use a layout algorithm with that fixed data: the layout should not break regardless of what data is fed into these screens, when I'm comparing across languages. However, if I want to check a particular language itself, that's a different strategy: use a strict algorithm and compare a French login page against a French-language baseline, and if the data changes, I'll be able to catch it. So there are different angles on localization and automation you can think about; we can talk separately about automation for localization in more depth.

The next one is a comment, I think: "All of these testing approaches are still reactive unless people start working towards TDD and BDD; I do not think it makes sense with legacy kinds of automation testing." I would differ there. You have to start somewhere; not every project starts greenfield or from scratch. How you get the most value from automation on a legacy product can itself be a big win, as that legacy product evolves into the new legacy it is creating. So even for existing products, you would start adding automation, and potentially visual testing if it makes sense in that context, and as the product evolves you will start seeing the value. The second aspect: even for legacy products that had limited automation, adding more automation after the fact gives you the opportunity to say, I have a test safety net now; can I refactor internally in the product? As long as my tests keep passing, maybe that is good value these tests bring on their own. In fact, that's a strategy I've taken in multiple projects in my past experience: for legacy products, the big risk in enabling a change in product functionality is that there is no automation, so first we build automation based on what we know about the product's functionality, to as much detail as possible.
And then, using that as a safety net, we refactor bits and pieces and make sure our tests keep passing while the refactoring happens. Hopefully that helps answer that comment. Thank you for joining us today. Thank you everyone, and again, apologies for the technical glitch. Thanks everyone, bye-bye.