Here we are with David Burns. He hardly needs an introduction, but anyway, I'll give a quick one. David is the maintainer of the Python bindings, a member of the Selenium project's Technical Leadership Committee, co-editor of the WebDriver spec, and the chair of the W3C Browser Testing and Tools Working Group. The stage is yours, David.

Thank you. Thank you for that, Manoj. What I'm going to be talking about is what happens whenever we make certain calls in Selenium. I'll try to give you a better understanding, at a very low level, of what happens all the way into the browser when it performs an action and then returns all the way back to the client bindings. We're going to go through each of these layers, and I'll give a bit of explanation along the way. The idea is that if you understand how everything works, you'll hopefully be able to write better Selenium code.

So before we get into that, who am I? My name is David. I head up the open-source team within BrowserStack. Our role is to look after key open-source projects that are core to BrowserStack and to make sure that BrowserStack is giving back in a meaningful way, so a lot of our work currently is supporting Selenium. I am the co-editor of the WebDriver specification with Simon. I am the chair of the W3C's Browser Testing and Tools Working Group.
This is the group that coordinates the work between all the different browser vendors and other interested parties, to make sure that the specification works and that it works across all browsers and all vendors. Before joining BrowserStack I was responsible for geckodriver and the WebDriver projects within Mozilla, as well as looking after interoperability projects and things like that. So I have a fair understanding of how browsers work, and hopefully I get to impart as much of my knowledge as possible to everyone. I hope you will enjoy this.

The main thing we're going to look at today is navigation, because this tends to be one of the areas where people get stuck: either pages take too long to load, or once a page has loaded the tests start failing because of other problems. If I can at least impart how this works, you'll see how it all hangs together, and you can learn to make your tests more robust and remove a lot of the flakiness that everyone hits at some point, because you'll know what to expect and where the gotchas are.

But before we start with that, I want to make sure we understand what happens when we have simple bits of code. In this case we're just instantiating Firefox. It looks fairly similar if you're in Java or .NET or Ruby. We know that the bottom line will have a browser started up for you. But what actually happens underneath the hood?
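The slide code is not captured in the transcript. A minimal sketch of what instantiating Firefox looks like with the Python bindings, assuming Selenium, geckodriver and Firefox are installed; the `start_firefox` wrapper is a hypothetical name used only for illustration:

```python
def start_firefox():
    """Start geckodriver and Firefox and return the driver object.

    Hypothetical helper for illustration; the slide is assumed to show
    little more than the webdriver.Firefox() call itself.
    """
    # Imported lazily so this sketch can be loaded without Selenium installed.
    from selenium import webdriver

    # This single constructor call launches the driver executable and the browser.
    return webdriver.Firefox()
```

Calling `driver = start_firefox()` and later `driver.quit()` starts and stops both processes.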
In some cases it breaks a lot of rules when it comes to object orientation, because instantiating an object is doing nearly all the work, and we are always taught that should never be the case: it should do a basic setup and then go. But in this case we know it's going to start up a driver, and it's also going to start up the browser. We've looked at the left-hand column, the client bindings; now I'm going to take you into the driver, and from there into the browser. Within the milliseconds that command takes to execute, we've got multiple applications started, up and running, and working. This feat is even more impressive when you start to think about whether you're running on a Selenium Grid, or using companies like BrowserStack to drive your browsers. Having that quick instantiation and access over the internet is quite impressive, so we always try to make this fast, and that also helps with our scalability.

The first thing the client bindings will do is look for a driver on your path, or, if you've passed one in, take what you've given them. They find that executable and start it up. Starting the driver means starting the executable.
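The lookup described above can be sketched with the standard library. This is only an approximation of the idea, not how any particular binding implements it; `locate_driver` and its parameters are hypothetical names:

```python
import shutil
from typing import Optional


def locate_driver(name: str = "geckodriver",
                  explicit_path: Optional[str] = None) -> Optional[str]:
    """Return the driver executable to launch, or None if none is found."""
    # A path passed in by the caller wins over anything found on PATH.
    if explicit_path is not None:
        return explicit_path
    # Otherwise search the directories listed in the PATH environment variable.
    return shutil.which(name)
```

If this returns `None`, a client would have nothing to start, which is roughly the situation behind the familiar "driver executable needs to be in PATH" style of error.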
Once started, the driver creates an HTTP server that is ready and waiting for commands to be sent through, and the first thing that needs to come through is a new session request. The client sends an HTTP request saying: I need a driver. In this case it's fairly simple, it's going to ask for Firefox, but you can set other details. The driver will then look for the browser, again either on the path or in its usual places: if it knows the browser should be in this location or that location, and it can find one of the different versions of the product, it will try to start one of them up and start working with it. Then, once it trusts that the browser is up and running, it will return to you, so you can start sending commands.

The way it gets to all of this is that its requests are REST-ish. We don't follow true REST, but it's REST-ish. So here we have a POST to /session, which creates the new session, and we set the browser name, so we want Firefox. If you were to send Chrome through to geckodriver, it would error and say it can't start up that browser. It can do that when it starts the browser up, or it can do a bit of introspection before starting it, because that can be quicker. It will go, no, I can't do this, and then error out. Otherwise it gets you up and running.

Selenium tries its best to be as synchronous as possible, because when we think about our tests, about our flows in code, we tend to think in a synchronous way. It's always: I will log in. I will click this button. I will type in this box. I will do this next step.
I will then move to this next page. It has always worked like that, even though the browser itself is designed to be incredibly asynchronous. The browser injects things into an event loop, which is constantly spinning, and when an item gets to the top of the loop and is executed, it might fire off events. Sometimes things can take a lot longer to run. Sometimes there are multiple event queues, depending on what you're looking at: something that is being rendered might be handled slightly differently from the JavaScript event queue, which is single-threaded, and it may be off the main thread. In browser terms, things on the main thread are things that could block the browser. If you have something stuck there, that's when you might get the beach ball on OS X, or your browser stops responding on other operating systems. You never want that. So you've got all these things running around, and where possible we try to do things synchronously so that you don't need to think about it and can start using it straight away.

But now let's get to the nitty-gritty and start looking at navigation. Navigation is the cornerstone: we need to be able to navigate. Selenium requires that you give a fully qualified URL, so it needs to have its scheme, the full path and everything. This is important because we never want to be making guesses. I know some other frameworks allow you to do that, but historically, whenever we tried to introduce it, and I'm talking a good 10 years ago when we were working on Selenium, it never really worked. So it needs its fully qualified domain. If we think about the HTTP, what this does is a POST that says: take me to this URL. It goes to the driver, and the driver may mutate
the packet slightly and then send it on to the browser. The reason it might mutate it is that the lines between driver and browser are different for each browser vendor. ChromeDriver speaks CDP, the Chrome DevTools Protocol. geckodriver speaks its own bespoke transport layer into the back end, which we call Marionette. And safaridriver speaks its own JSON-based DevTools-style protocol into the browser. Everyone is unique, and this is why the drivers need to understand everything that the client bindings are going to send.

The driver sends the command through to the browser and says: right, let's go to this URL. We know that it needs to return when the page is loaded. Unfortunately, "loaded" has different meanings to different people. If I were to go to a browser vendor and say, hey, can you just make sure this page is loaded, they would just look at you funny. So think about applications that you've worked on, or the application you're working on right now. What does loaded mean for you?
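To make the navigation request described above concrete, here is a rough sketch. The `POST /session/{session id}/url` endpoint and its `url` field come from the WebDriver specification; the helper names and the http/https-only check are assumptions made for illustration, and real bindings accept other schemes too:

```python
import json
from urllib.parse import urlsplit


def is_fully_qualified(url):
    """Very simplistic check: require an http(s) scheme and a host."""
    parts = urlsplit(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def navigate_request(session_id, url):
    """Build the (method, path, body) a client would send to the driver."""
    if not is_fully_qualified(url):
        # No guessing: a bare "example.com/login" is rejected, not completed.
        raise ValueError(f"not a fully qualified URL: {url!r}")
    return "POST", f"/session/{session_id}/url", json.dumps({"url": url})
```

For example, `navigate_request("abc123", "https://example.com/")` yields a POST to `/session/abc123/url`, while `navigate_request("abc123", "example.com")` raises `ValueError`.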
It means different things to different people. If we look at Web 1.0, where everything was rendered on the server and sent across, then when you see everything, that tends to mean it's done. That's assuming there's no JavaScript, no Ajax, no progressive enhancement, nothing like that. As it arrives, it's done. Perfect. If we start adding iframes, "done" changes very slightly. If you start adding the rel keyword, which in HTML allows you to download things asynchronously, then "done" and "loaded" mean something slightly different again. So all these features, which are designed to let web pages load super fast and become usable while new and different things are still loading, mean that we get into this weird space.

But before we even get there, we also have the problem of certificates. In my experience, and I know a lot of the Selenium committers and core people have seen this at their work and while we've been supporting Selenium users over the years, people want certificates to prove that their application works on HTTPS. But QA departments are notoriously underfunded for the work that they do, so the answer is: well, you can have a certificate, but you need to sign it yourself. When you get to automation, a lot of tools just don't know what to do with that. So Selenium decided that we needed a way to get around this, and it has that all built in. You would never see this if you had a self-signed certificate; it handles all the basic certificate errors. There are certain certificate problems that are not supported, but that is generally for a good reason.
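In the WebDriver specification, accepting self-signed certificates is expressed through the `acceptInsecureCerts` capability in the new session request. The sketch below builds such a request body for `POST /session`. Whether your bindings set this capability by default is something to verify rather than assume, and `new_session_body` is a hypothetical helper:

```python
import json


def new_session_body(browser_name="firefox", accept_insecure_certs=True):
    """Build the JSON body for a WebDriver new session request."""
    capabilities = {
        "alwaysMatch": {
            # Which browser the driver should start.
            "browserName": browser_name,
            # Ask the browser to proceed past self-signed certificate errors.
            "acceptInsecureCerts": accept_insecure_certs,
        }
    }
    return json.dumps({"capabilities": capabilities})
```

A driver that cannot satisfy `browserName`, for example geckodriver asked for Chrome, responds with an error instead of a session.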
An unsupported case could be that the certificate is signed for the wrong place, or that there are simply errors in the certificate, and it's better to error out than to support those when we're thinking of people who are trying to use this every day.

So we've got all these certificate problems, we've got progressive enhancement, and this leads to the question of when get will return. This is the second-last line: get will return when it thinks a page is loaded. In general terms that means the ready state is complete and the load event has fired. But if you're using React, Ember or Angular, or you've created your own framework, and you're using progressive enhancement so that new things get downloaded into the browser as and when, it's always best to think about the element that you're looking for and make sure you're explicitly waiting for it, because you never know when it's going to be there. In this case I've just looked for the element, and there are other ways to do it, but sometimes you need to wait until other activities have happened. So always be aware that getting something is not as simple as just getting a page.

This is one of the main reasons why people regularly ask: I'd like to use Selenium and see what the HTTP result was for this. And it's like, okay, we could tell you that going to that URL was a 200, but in reality what you're probably looking for is: were there any 404s? Were there any 500s?
In doing that get, that is. That's why access to this has never been allowed. Famously there was issue 141, which made the rounds when everything was on Google Code, and if you search for Selenium issue 141 you'll still find some very interesting reading.

Now that we've got the basic case working, we've got to make sure that there are no edge cases, right? In this case I've navigated to a page, and then I've said: actually, I now need to navigate to some anchor on that page. Because we're already on the page, we're not forcing any reloads. We're just saying navigate. Browsers have a system called the bfcache, and the bfcache allows pages to load super fast, because the browser makes an educated guess: should I be using the cached version? Should I be using the page that's already there rather than reloading it? Or should I be using something else? Service workers come into this as well. Because Selenium's main role in a lot of cases is to emulate what a user would do, we just say to the browser: hey, could you load this? How it loads it is up to the browser, and every browser is unique here.

In the case of loading some anchor, no load event will ever fire. That means the driver can't just say: because you've said navigate, we need to wait for load events. We can't do that. We also can't wait 10 seconds hoping a load event will come and then error, because in this case you've gone to a valid place. There is no error, so erroring is not an acceptable outcome. There are little edge cases like this around navigation that give us a lot to care about.
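The anchor edge case above comes down to noticing that only the fragment of the URL changes. A minimal sketch of that kind of check, using only the standard library; `is_fragment_only_navigation` is a hypothetical helper and a simplification of what drivers and browsers actually do:

```python
from urllib.parse import urldefrag


def is_fragment_only_navigation(current_url, target_url):
    """True when the target is an anchor on the document already loaded."""
    current_base, _ = urldefrag(current_url)
    target_base, target_fragment = urldefrag(target_url)
    # Same document and a fragment to jump to: no load event should be awaited.
    return current_base == target_base and target_fragment != ""
```

When this is true, waiting for a load event would only ever end in a timeout, which is why the navigation should return almost immediately instead.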
Again, sometimes just understanding how navigation works in a browser can be important for removing some flakiness. In this case it should return near-instantly, because browser vendors understand how bfcaches work, so you'll be fine.

So we've navigated, we've done all of that. That's really cool. And I'm pretty sure that's the only way to load a page, right? Right? No, unfortunately not. There's clicking. I'm going to breeze over a few bits and pieces here so I can get back to the navigation part, but clicking involves multiple steps. Obviously we need to find an element. We need to make sure the element is still in the DOM so that we can interact with it. You might have cases where it has been removed from the DOM and then added back; front-end frameworks do this when data reloads, they mutate the DOM, and it can create issues for tests. It needs to check if the element is visible. Visibility in Selenium is a best guess at whether a user can see this element, click on it, and interact with it. Then it will scroll to it, to make sure that the element is within the viewport. The viewport is the visible part of a browsing context, so of a web page: whatever you can see on your screen is the viewport, and if you scroll up and down, the viewport moves through the page. And then we get to the point where there's a click.

So here we are. We've navigated to a page, and it has returned. We're now going to find an element. Again, this is simplistic, this is Web 1.0, but we're going to find a link.
We're going to click on it. The link in this case will navigate to another page. Again, Selenium will do its best to understand what you mean, so it will look for certain events. When you navigate to a new page, and by a new page I'm explicitly not talking about single-page applications, I'll come to those in a minute, the browser loads up a brand new URL and does everything. At that point a browser can look for certain events: is the page being destructed, so is it being taken away and moved out? Is a new page being loaded? We can look for basic things like that, and for a large part of Selenium's life that's what was done. Obviously, as we get into the realms of single-page applications, these edge cases start to happen. We can get into areas where there might be a page load event, and we need to understand where that is and how quickly it's going to happen so that we don't slow down tests. If people are writing tests, the thing they genuinely want all the time is fast feedback. Even if the test fails, they want it to fail fast, and if it passes, they want it to pass fast too. We want this quick feedback loop.

So in this case we click a link, and it does what in the WebDriver specification we call post-navigation checks. It checks that the URL has been loaded. It goes through and checks whether there's a certificate, and it will allow the certificate, ignoring certificate problems, at that point. It then checks that the page has loaded. "Loaded" again is that incredibly overloaded term, excuse the pun, but it needs to know that the page is there, visible, and accessible: the DOM has loaded, images for the
most part have loaded, CSS has loaded, JavaScript has loaded. These are all things you can go and look up in the HTML spec, and I encourage it, because for automation it's incredibly important to understand how browsers and all their different features work. Then you can get the most out of them, and you never know, you might even find issues in how people are developing applications with those features.

So we've seen the Web 1.0 case. Now we've got another anchor. Again, it's somewhat Web 1.0, but it's also a feature that we see in single-page applications: interacting with anchors. Here we find an element and we click on it. It is navigating, so the URL is changing, but no navigation events are ever going to happen. So it's important that we don't just sit idle for half a second to a second waiting for those events inside the browser. Either we wait for certain things, or we just say: this is an anchor on the page we're already on, I'm sure the bfcache is going to take care of this, and I'm going to jump out. That's what tends to happen: it jumps out and moves on to the next page, or the next part of the page.

With single-page applications, the click fires off some JavaScript that loads things. Because there's this movement inside, we effectively need to think of it as a black box. Yes, we can look into it, so it's a grayish-white box, but we've clicked a link, some things have happened, and then we need to carry on.
In those cases we might need WebDriverWaits, or just explicit waiting, to know that the next element we're looking for is there, that it's interactable, and that we can move on.

So we've done clicking, but one thing I think is important to point out, when it comes to navigation and all these other bits and pieces, is that clicking in actions is slightly different from the previous click. Simon and the rest of the committers like to call out the difference as "do what I mean" versus "do what I say". The previous click was do what I mean: I mean click over there and do things. Actions is do what I say: I'm being very explicit. Find this item, find this item, find this item, click on it, drag here, click on it, things like that. In a mobile phone analogy, one is below the screen and the other is above the screen. Actions tends to be above the screen, because it's moving things very explicitly, and you don't need to worry about scrolling and things like that.

In this case, clicks in actions will never invoke any navigation checks. Those need to be done separately from everything else. This is because you've reached a point where the driver anticipates that you know what is happening and what should be happening, nearly down to the second or millisecond. You know that if you do these actions, the exact thing you're expecting is what happens next, and you can either wait for elements or you'll know exactly what to do. So the navigation checks are not done in this case.
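Since no navigation checks run after an actions click, the waiting has to come from the test itself. Selenium's Python bindings ship `WebDriverWait` for this; the sketch below is a standard-library stand-in that shows the same polling idea, with `wait_until` and `WaitTimeout` as hypothetical names:

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never became truthy in time."""


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns something truthy."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            # Hand back whatever the condition produced, e.g. the element.
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)
```

In a real test the condition would be something like "the next element I need is present and interactable", so the test moves on as soon as that holds and fails fast when it never does.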
So beware. Make sure that your code does all the necessary checks by itself.

And that's it. I hope everyone found this useful. I'm hoping a few questions have come up, and I will try my best to answer them. Thank you for your time.

Thanks a lot, David, that was quite insightful. Folks in the panel, if you have any questions, please post them. But before we begin, I'd like to thank BrowserStack for sponsoring Selenium Conf and this track. At the moment there are no questions here, just a comment that a demo would possibly have been helpful.

I did try to think about a demo. Because a lot of these cases go into the browser, and we try to separate out when it's going to do certain things, I found it very hard to write a meaningful demo. For the self-signed certificates, I could do that and just show it going through, but a lot of people have probably seen that, so it's very hard for me to balance it out. If I ever do this talk again, I will try to see how I could do that, and if the commenter has any thoughts on what a good demo would be, I'd love to hear them and can work with them to make sure I can show that.

Okay. So Selenium 4, obviously, we've been working really hard on, and I'm really excited by some of the work that's gone into it. I've had some interns doing some work and some of my team helping out, so I'm really excited by that. The BiDi stuff, I think we're still trying to figure a lot of that out. It's moving around. Unfortunately we're basing the initial work on how CDP works.
That's the Chrome DevTools Protocol, and unfortunately it changes with every version. This is why people need to download a new ChromeDriver with every version of Chrome, and you don't need to do that with Firefox. So we're trying to understand that. In terms of my talk, I don't think many things are going to change too much, other than that you'll be able to access and set mocks for certain HTTP requests. So you can have this hybrid test, like some people do with Puppeteer and Cypress, but with meaningful clicks and proper actions. Those are the things that Puppeteer and Cypress don't have, so you can do more advanced testing. So I'm excited by it. We're still trying to figure out what it's going to look like in a meaningful way.

All right, I think that's it for questions. Awesome. Thank you very much, David.

Thank you for that, Manoj. Thanks, everyone, for listening. Hope you have a great day.