My name is Narayan Raman. I am the author of Sahi, an open source tool for web application automation and testing. It was started in 2005, about the same time that Selenium RC was being cooked up somewhere. What I am going to talk about today is this: Sahi has made a lot of architectural decisions which may run counter to what is very popular right now. What I am going to touch upon is how you would approach the problem of test automation on the web, what the really important questions are that you need to answer, and, based on those answers, what technical decisions need to be made and put into your tool. So what is the primary problem that we are addressing? Contrary to popular belief, it is not building a browser automation tool or library. The primary problem is verifying web application functionality. It is not about building a library for automating the browser. Now that may seem contradictory to some, but it is not. If the primary goal is to verify web application functionality, you can change anything underneath as long as that top principle holds. So what does that mean? The first question is: I want to test web applications. Should I go with the browser or without the browser? In this day and age, it is obvious that you need to test on the browser. You may say, I could do it on PhantomJS. Anybody heard of PhantomJS? You should check it out. PhantomJS is a WebKit-based browser which does not have a user interface. What that lets you do is have blazingly fast tests without a user interface. It will just work there and give you the results. But it is still a browser. Can you do it without a browser? People used to do that before. How did they do that? Actually, everything is HTTP. Everything on the browser is HTTP based.
So you make an HTTP request, get back the response, parse it as HTML and check whether what you are looking for is there or not. Then, if there is a form with a submit button, you fill in the form fields by adding them to the POST parameters of the next HTTP request and then submit it. Everybody understands POST and GET requests? Okay, sure. So you could do it at that level. But the problem is you need a JavaScript interpreter to handle all the JavaScript that is on the web page. Not just the regular jazzy stuff, but also all the Ajax and everything else that goes into a Web 2.0 application nowadays, right? So it is quite clear, and an easy decision, that if you want to check the functionality of your web application, you stick to the browser. Now, of course, there are lots of advantages to the browser and some cons. So what are the advantages? It understands JavaScript, so you don't have to do anything about parsing and all that. But the con is it's a little slow. Every browser is supposed to render stuff on the user interface. If you want to render stuff, you need to figure out where to place it, how to paint it and all that, and that takes some time for a browser. But really, does it matter to us? Most probably not. We need to check the functionality. If you checked it at the code level but you do not know whether it works correctly on the browser, you're still going to spend time checking it on the browser. So you may as well run it on the browser. So you say, okay, we'll run it on the browser, browser wins. Now if we go the browser way, there are lots of browsers, right? IE, Firefox, Chrome, Safari, Opera. Does anybody know how many versions of Firefox came out last year? Yes, that's the right answer. At the last conference, I said, you know, Firefox 7 is out. Somebody said, nine is out.
Okay. But right now, at least Firefox 10 is out, and then there is Firefox 8, which is mainstream. And they also have something called an enterprise release, which will only come out every six months, because enterprises are saying, we can't assimilate as many releases as you have. Give me a release and change it only after the next six months. So there's an enterprise release of Firefox too. Did you guys know about that? Okay, so given that there are so many browsers, if you want to automate them, you have to pick something which is common across all these versions, across all these browsers, and across all these browsers on all operating systems, okay? If you choose something that is common like that, then you have a huge advantage, because you will have very little work to do. So coming back to that, how would you actually do this? There are two ways, okay? One is to work on something that every browser is supposed to expose. The other is to say, I'll work from underneath the browser and get into all those hidden APIs that they have, whatever their developers may have put in there, and utilize them to simulate stuff on the browser, okay? Now, think about it. Let's say that you use the APIs that the browser exposes, okay? If you have to do web functional testing using those APIs, those APIs should be standardized across browsers. Those APIs should be guaranteed to work the same way across browsers. Does anybody think there is any financial motive for the browser vendors to do that, especially for the older browsers? For example, IE9 is out, IE10 is coming up. Would they want to do it for IE6? Most probably not, there's no financial motive to do that. Anybody agrees on this? Anybody disagrees on this? What's the stupid point? So the thing is, the point is, there may be people who want to do it.
Maybe you want to depend on the browser vendor for that. But right now, it is too early to be able to do that, because nobody is actually working towards that, okay? So what does the automation tool need to do? It has to access elements on the browser, and it has to simulate events on the browser. When you say access elements on the browser, you have an API. Does anybody know how you normally access elements on a browser? DOM. DOM, right? The DOM is the standard way by which people access different elements on the browser, and that's of course accessible from JavaScript, right? The other thing is to simulate events on the browser. Did you know that most browsers have methods equivalent to fireEvent, which can simulate a real JavaScript event on the browser? So you can say, fire a mousedown at this coordinate on the browser, this coordinate on the screen, and it'll execute that particular event as if it was done by the user. But there is a difference, right? When a user clicks using a mouse, which reminds me, I don't have a mouse. So when a user clicks with a mouse, okay? What happens is, the OS first queues up that event to be sent on, and then the browser picks it up and says, yeah, I got it at this particular coordinate. This translates to a JavaScript event, and then everything else handles that. If you look at it, you always have onclick equal to a function, and then you get an event object, right? You can create that same event object, and you can simulate the same thing by just firing an event on the browser too. And almost all modern browsers allow it. When I say modern, I'm starting from IE6, pardon me, okay? So essentially, you don't want to see the detail at the end of it. You just want to click on it. And as a functional tester, you want your code to look something like this.
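As a rough illustration of the shape of code being described, here is a minimal sketch in plain JavaScript. The names createFakePage, setValue, click and login are hypothetical and are not Sahi's actual API, and the "page" is just an in-memory object standing in for the browser so that the sketch runs anywhere.

```javascript
// Hypothetical stand-in for a page; a real tool would touch the browser DOM here.
function createFakePage() {
  const page = { values: {}, clicks: [] };
  // Low-level steps: the only operations the tester-level code relies on.
  page.setValue = (field, value) => { page.values[field] = value; };
  page.click = (target) => { page.clicks.push(target); };
  return page;
}

// Tester-level abstraction: "log in as username and password".
function login(page, username, password) {
  page.setValue('username', username);
  page.setValue('password', password);
  page.click('Login');
}

const page = createFakePage();
login(page, 'test', 'secret');
console.log(page.values, page.clicks);
```

The point of the two layers is that the test reads at the level one tester would use when talking to another, while everything browser-specific stays behind setValue and click.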
When a tester talks to another tester, he says, hey, put the username in the text box, the password in the text box, click the login. In fact, he doesn't even say this. He says, log in as username and password, right? That's the abstraction at which one tester talks to another. Now, that's how the code should look, okay? And then underneath, when you go to the next level, even there you need to abstract things out. You just say set value into this username, set value into this password, click the login button. You don't want to know about anything else apart from that, really. That's all you're doing on the browser anyway, right? Why should you code anything more? So that's what we need to do. That's part of the premise of how Sahi is structured, okay? So, possibilities. How would you solve this problem? You could write wrappers around every browser, okay? Past and present, which means IE6 onwards to everything else. Or use something else, okay? So let's go down one path or the other. Let's talk about the constraints. One of the biggest constraints, if you were to write a wrapper around every browser, is developer time and expertise. Working in a team, you always have a limitation on that. You work with what you have, and based on what you have, you choose the best possible solution. This should not ideally be the decision that drives what your solution is, but it should be part of the problem, because you should be thinking, okay, tomorrow if I want to ramp up the team, or switch the team, or make it maintainable without the core people here, will I be able to do it? Yes, you would be, if there is not a plethora of technologies involved. Then there is the knowledge required for automating various browsers and their architectures. Unless you know that, you can't really automate every browser from the inside out. You need a lot of experts, and you need to be able to support and reproduce issues.
If somebody says, hey, it's not working for me on LAN with Safari 8.3, okay? You should be able to bring it up, test it out, roll out a patch, and satisfy your customer, okay? If you are not able to do that, then you are saying, hey, you're a customer, but I don't really care about it. So in order to be able to do that, we need something which ideally is not so dependent on any one browser, okay? And of course I mentioned the lack of standards around browser automation APIs. So is there anything else? As I discussed, I told you about JavaScript events, right? You use JavaScript. Running JavaScript is one of the core features of the browser. That's never going to go away. Think about it: every new browser that's coming up is increasing the performance of its JavaScript. So it's going to be at the core all the time. It's not going to go away, right? Now, of course, JavaScript can identify all the elements on the DOM. You don't need to resort to any other technology to access any element of the DOM, with some caveats, okay? And simulation of events is also very much possible through JavaScript. There are a few small impediments. Events generated on different browsers are different, in the sense that the sequence of events on IE is slightly different from what it is on Firefox. For example, if you have a blur event which kicks off after clicking on another element, in some browsers the blur event may come before the click, okay? So there are a few quirks between the different browsers, but it is just about knowing those quirks and reproducing them exactly when you are doing the automation. You don't have to do it, the tool has to do it, okay? And the DOM itself may be different across browsers. Does anybody know some of the DOM differences between IE and Firefox?
There used to be a document.all which allowed you to access any element by ID in the DOM, okay? Now, I'm sure it is still there, but people no longer use it. They stick to document.getElementById, because everybody else also started adopting IDs. Now, again, the DOM needs to be slightly normalized for these browsers, but it's a small thing, okay? So using JavaScript, you first need to create normalized, simplified APIs to access the DOM. Why would you want something simplified to access the DOM? For example, you could do document.getElementById and give the ID. How many of you believe that every element on the browser actually has an ID in practical applications? Anybody says no, not all elements on the browser have IDs? Anybody? Yes, there's one. Does everybody else say that IDs are present on all elements? No, okay. Some people are just not raising their hands, okay? So the thing is, it is quite possible to add IDs to everything. In fact, the message of the last four or five years of automation has been: add IDs, make it testable. Why? Because my testing tool cannot handle it otherwise, okay? The point is, there are other ways to access different elements, okay? Now, if you stick to the DOM, you may be able to access everything. But if you first say, I'll use XPath and IDs and that's all I'll do, then you are obviously stuck. You need IDs everywhere. This will work on small, simple applications. But say you're using a third-party library, let's say ExtJS or ZK. Has anybody heard of ZK? Anybody? Okay. That's actually a brilliant Ajax framework. You should look at it. There are quite a few, even Dojo, but ExtJS and ZK are the prime ones. ExtJS is fairly common everywhere, right? And they do build their own, what do you call it? Let's say a select box, okay? They have their own select box.
They don't use the browser's select box. And why? Because you can do autocomplete in theirs. You can start typing something, and it'll show up matches there. This you can't do with the regular select box, so you need to build something. Now there is an arrow button on the right-hand side, right? You have the text and then an arrow button. How do you click on that arrow button? It doesn't have any ID, you didn't give it one. The widget made it, right? Now if the widget made it, its parts are going to relate to each other in some particular way. How do you get to it? Most probably you won't be able to, okay? So enforcing IDs on such an element is going to be a wasted effort, really, okay? So: normalized, simplified APIs to access the DOM, and normalized APIs for event simulation, which are tailored for different browsers. As I said, there are quirks between different browsers and we need to be able to simulate the correct sequence of events across browsers. I'm going to call these the Sahi APIs, initially, okay? So is that it about automation? Is that it? I'm not going ahead. Is that it? Okay, so it's like this. Ah! Okay, so execution, okay? The point is, you were able to access an element and fire an event on it, but really, how are you going to execute it, okay? One of the ways to get this JavaScript into the browser is to add a script src tag. Simple, right? Okay, and let's say script src equal to SahiAPIs.js, okay? Where is the script that you're going to execute? These are the APIs that are included, but where are you getting the scripts from, okay? Now, let's say we add a JS function which will read from somewhere, get the list of all the steps that need to be executed, and execute them. Now let's consider a three line script. Set the value of a text box to something, click the login submit button, and then click on compose, okay?
Now let us say that all this was on the browser, using JavaScript. You execute this step, yes, it executed. You execute this one, it executed. The page refreshed here, okay? Now who remembers that this one still has to be clicked? It'll start back from here, right? Somebody has to remember this. Is there anybody who didn't follow this? I'll repeat it, okay? Basically, when a page refreshes, everything in the DOM, everything in the memory of the browser goes away. When you reload the page, it's something else that comes up, okay? So it's a fresh, clean slate. You have three steps that you want to execute. Two steps execute, the page refreshes. It will not know that it now has to execute from the third step. It'll start back from the first step. So you need something else apart from the JavaScript on the page to hold the state of the execution, okay? Now, how do you do that? One of the things you can do is have another server running something. So what I'm doing here is this: to open a link in an iframe, all you do is window.open and pass in the iframe's name, okay? You can also do it in other ways, but this is one of the ways. So I'm going to click on execute. What it's going to do first is load up that web page in the iframe and then execute the JavaScript from the top, okay? Right? So it loaded the page and then said login.value is equal to abcd and then put the password as aaa, okay? So it was as simple as putting in a text area, putting in an iframe at the bottom, and then using some JavaScript to access different elements in the iframe and execute some JavaScript there. Now, if you notice, even if I actually log in to this, okay? So whatever. Basically, even if this page refreshes, the top frame is going to be around, okay? Since it is going to be around, you have a memory space which is permanent and with which you can manipulate the page underneath.
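The state problem described here can be simulated without a browser. The sketch below is only an illustration under assumptions: createController and runUntilReload are hypothetical names, and a "page reload" is modelled as the runner function returning. The step index lives in the controller, which plays the role of the long-lived top frame (or a server), so execution resumes at the third step instead of restarting from the first.

```javascript
// The three-line script from the talk, written as named steps.
const steps = ['setValue username', 'click Login', 'click Compose'];

// Hypothetical long-lived holder of execution state (a top frame, or a server).
function createController(allSteps) {
  let next = 0; // survives page reloads because it lives outside the page
  return {
    hasMore: () => next < allSteps.length,
    take: () => allSteps[next++],
  };
}

// Simulates one page lifetime: runs steps until a step triggers a reload.
function runUntilReload(controller, executed, reloadAfter) {
  while (controller.hasMore()) {
    const step = controller.take();
    executed.push(step);
    if (step === reloadAfter) return; // page refreshes, page memory is lost
  }
}

const controller = createController(steps);
const executed = [];
runUntilReload(controller, executed, 'click Login'); // first page load
runUntilReload(controller, executed, null);          // page after the refresh
console.log(executed);
```

If the counter were a variable inside runUntilReload instead, the second call would start again from the first step, which is exactly the failure described in the talk.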
This could be a good enough solution, but the thing is, not all web applications behave so nicely, okay? Have you heard of frame busting code? Anybody? I'm sure this was very popular a few years back, when JavaScript security was not very well implemented across browsers. What attackers would do is, you know, you have your ICICI Bank website opened up in an iframe. The website which actually loaded it sits in a zero-sized frame, and it can access everything you're putting there: your username, your password, et cetera. Previously, there were no security restrictions around it. Then browsers added them. So what that meant is, sites used to bust the frame in which they were loaded. For example, I have this frame busting code here which says, if top.location is not equal to window.location, set top.location to window.location. So, just for people at the back to see: if window.top.location is not equal to window.location, then set it there. So what it'll do is something like this. I've uncommented this, right? So let me refresh and then execute. I think I'm doing something wrong. Just give me a minute. See what happened? The iframe just blew off. So let me do it again. Feels good. So here we are. I say execute. It loads that page here, and when the page loads, it checks whether there are any frames above it and then just breaks out to the top. Almost every financial organization worth its salt used to do this. So it meant that if you use an architecture like this, you will never be able to automate such websites. Now one of the things that we say is, the tool should be able to handle anything that the guy actually wants to code. So if he writes frame busting code, you should be able to handle it. Now that means that this kind of thing is gone. What else can we do? There's one other way we can do this. We can open that page in a totally new window.
So if it's in a new window, there's no iframe to bust. It'll bust, it's okay. You still have this frame talking to the other window. By the way, you can talk from one window to another. You just need to have a handle to the window, and then you can access its DOM. So let's do that. Now if I execute, it's going to open this page in a new window and then execute. Now we don't have the problem of frame busting code, right? But the thing is, this will only work if this page and that page are from the same domain. If they are from different domains, so let's say you wanted to automate google.com here, you wouldn't be able to. Because this JavaScript code, which reaches from one window into another window or into an iframe, will only work if both are from the same domain. So this solution is also not possible. So we saw how we could have kept it all in the browser itself with plain JavaScript. We tried it, but it doesn't work. So now we're going to do something else. Now what could that something else be? We saw that iframes won't work, new windows won't work. JavaScript security will mess you up. So the solution we took was this. We said that accessing the DOM in anything but the JavaScript way is just stupid, because the browser gives you a very good way of doing it. Now if you want to capitalize on that, what should we do? We need to get into the browser's page to be able to do that. The alternative is to attack it from outside the browser. We say no, we'll get into it. How do we do that? We put a proxy in between. How many of you understand proxies? Anybody who doesn't understand proxies? Okay, yeah, there were a few people who raised their hands. So the way it works is, you have the browser, you have the end website. Normally you talk to it directly. You make an HTTP call to it, and the response comes back.
Now what you could also do, instead of directly hitting the site: let's say in a corporate environment, you don't want everybody to access YouTube. So what do you do? You put something in between. The end user types YouTube.com as usual, but the request doesn't go directly to the YouTube server. It goes to a proxy, and that proxy figures out whether you have access or not. If you have, it'll allow you to see it. Otherwise it won't. The way proxies are done is kind of seamless. Ideally, the end user doesn't see that there is a proxy. The way you configure a proxy on a browser is, you go to something like Tools, Options, and in the network settings you say, I want a manual proxy configuration, or auto-detect the proxy, or something else. Normally it is no proxy. If you set up a proxy, then all requests from there on, whatever the URL, are going to go through the proxy. Which means, if you put your own proxy there, you can do whatever you want with the traffic. You can change the page. If they asked for YouTube, you can send back a page that says, you're not allowed to view YouTube. Of course that's not coming from YouTube, but they see it as YouTube.com, and it says that you're not allowed to see YouTube. So basically you can impersonate any website if you're a proxy. Now, why is that important? Because now you can put the script src into every page. You can include the script in every page. Is that all? No. We include it, which means that all the APIs are accessible from the browser. But the second problem, saving the state, remains. So what you do is, you make a call to the proxy and say, hey, save the state: I'm on the third step, I'm on the fourth step.
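The injection step can be sketched as a pure function over the HTML that passes through the proxy. This is a minimal sketch under assumptions, not Sahi's implementation: the function name injectScript and the script URL are hypothetical, and a real proxy would also have to deal with headers, encodings and non-HTML responses.

```javascript
// Hypothetical script URL; the markup actually injected is not shown in the talk.
const SCRIPT_TAG = '<script src="/injected/apis.js"></script>';

// Insert the script tag right after the opening <head> tag if there is one,
// otherwise prepend it to the document.
function injectScript(html, tag = SCRIPT_TAG) {
  const headOpen = /<head(\s[^>]*)?>/i; // matches <head> but not <header>
  if (headOpen.test(html)) {
    return html.replace(headOpen, (match) => match + tag);
  }
  return tag + html;
}

const original = '<html><head><title>Login</title></head><body></body></html>';
console.log(injectScript(original));
```

Because the tag arrives as part of the page's own response, the browser treats the injected script as belonging to that page's domain, which is what removes the cross-domain restriction met by the iframe and new-window approaches.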
I'm going to explain this a little more, but for this, I'll show you the real Sahi, and then I'll show you how Sahi does it. But before that, let me check if I have a few more things to tell you. So the good thing about proxy injection is that the script behaves as if it's part of the page itself, so there is no concept of a domain restriction. Apart from this, there were a couple of things that we considered. Should it be aimed at testers or developers? For us, the answer was, we should aim it at testers. Or rather, it was not really aim it at testers, it was aim it at getting your testing done. Initially, it started off like that. So we said, it should be simple to use and learn. It should need as little code as possible from the end user. He should be able to write whatever the scenario is with very little code. And helpers like recorders and object spies are necessary. So how many of you believe that recorders are bad? Why so? Why so? Yeah, go on. Inconsistency. What if it recorded consistently, right? Let's say that you have 10 different steps that you want to have in your script. Would you rather code it or record it, if I say that you'll be able to consistently record it without any errors and you'll be able to play it back? Oh, no, you mean to say that if the test case changes, the data changes, right? Is that what you mean? Yeah. So let's consider the first script. Is it good for the first script? Maybe, right? Yeah, okay. So for the first script, maybe, yeah. The thing is, right now most recorders are so bad at the first script itself that you need to put a lot of effort into getting that working. Now, there is another point. Even if you recorded it correctly, okay? The next time you play it back, if the only thing that you needed to change was your data, then it's fine. It's not that bad. But if you had to change the actual recorded thing itself, then it's bad. And when does that happen?
When IDs are dynamically generated by some web applications, okay? In those cases, even if you record it, it's going to be weird. The next time, it's going to be a totally different application as far as the code sees it, okay? Now, this is another thing that's been propounded a lot, which is to say that recorders are all bad. You know, recorders are fine. Think about it. Don't developers use a lot of code generation tools? It's good for a first cut. Come on, you don't want to do all the CRUD applications again and again, write your SQL queries and all that. You just build it off something that you've built before, and then you modify it and go on from there. How many of you here are actually testers? Anybody? And the others are developers? Developers? Developers especially hate recording, okay? You included. So the thing is, developers feel that, you know, testers will just mess it up, don't record it, just type it all in. People are different. These people are domain experts. They know more about testing than about programming. So they just want to get the simplest possible thing going. So for us, we've seen that recorders are absolutely important, but even more important than a recorder is the object spy, okay? This, nobody denies. Everybody uses Firebug to figure out, hey, what's the XPath? What's the DOM thing, et cetera, right? So you need an object spy, definitely. Recorder, okay, maybe. If it's good, it's good. Otherwise, you definitely need an object spy. And when we started, there was no concept of an object spy. Only Firefox had something, Firebug had just come up. IE and the others didn't have anything. Sahi has had an object spy since 2005. So you could go to any browser and look at different elements on it.
And we thought that it was a big time-saving thing. One of the reasons was that DOMs were different. You tried something on Firefox, it worked. You tried the same thing on IE, it failed. Now it says this element is not found on the browser. Now I have to go back, and one easy thing is to do something on that particular element, see what the spy shows, and compare it with what you have. That would have been so much easier, but no, you'll have to do a view source, figure it out, use some JavaScript to check whether it works or not, and then go back. So, object spies are good. The next thing is, should the Sahi language be scriptable? With Sahi, we tried to make it like a DSL, a domain specific language. A lot of DSLs. Everybody is into DSLs. DSLs were a big buzzword about two years back. Are they still? So I just caught on, okay? I used to call it something else. So not only should it be like a DSL, but it should also be scriptable, because there are always times when people want to write meta code around your code. They want to build frameworks out of it, where they hide off something, like, you know, I don't want to see the banking stuff, I want it to be hidden. So yeah, it should be scriptable. You should be able to modularize it into functions if possible. If needed, you may have some if conditions, loops, et cetera. By the way, how many of you think that if conditions are correct in a test case? Anybody has an opinion on if conditions in a test case? Yes, what's your opinion? It's terrible in a test case. It's terrible in a test case? Nice, yeah. Actually, most test cases are deterministic, especially the functional test cases. You put in the data, you do something, you verify it. If you have an if condition there, you could do it based on some parameter that you pass in, saying that, hey, you know what, this is the login path and this is something else.
But if you're doing an if condition on the data on the browser, that's absolutely wrong. Some people do this: if the logout link exists, then click on logout and then log in. This is okay, because sometimes the cookies are wrong. But if you do this for anything else, it's wrong, because you are saying, you know what, I do not know where I'll be. I do not know why it is going here, but if it is going here, go somewhere else. That is going to cause you trouble, because if it takes another path, which it shouldn't have taken, and goes on to succeed, you'll never know what happened. So you set up the data, you do it, and then at the end you verify the result, and it should all be deterministic if possible. So let me start again, okay? We chose the language to be JavaScript. One of the reasons was this. If you're doing web application automation, you are going to get into JavaScript at some point or the other. You cannot stay away from JavaScript. Now if you're going to do JavaScript, you may as well do JavaScript well. You may as well use it for the whole thing. So we said, our whole Sahi scripting will be based on JavaScript. So Sahi Script interacts seamlessly with the browser JavaScript and also follows the JavaScript syntax. There's one other big advantage of using JavaScript. Especially in companies which are not product companies, which do projects, okay? You have a testing team. Now, yeah, in agile everybody should be testing and all that, but you have a testing team, okay? And these testing guys are good at their domain knowledge, let's say, okay? They're good at articulating the business functionality and checking whether it's working or not. Now, if their core competence is that, okay? They may as well stick to one language which works across all web applications.
So you have a Ruby on Rails application, you have a Java J2EE-based application, you have any application. In the end, on the browser, it's just HTML and JavaScript. So you should be able to easily automate using that. So JavaScript is our language. Now, the question is, why not pure JS? I said that Sahi is based on JavaScript. It's not purely JavaScript, okay? So, now this takes us to the demo, okay? I'm starting up Sahi. It starts up a dashboard on the left with all the browsers that are available on your system, okay? This is configurable. Let's say I click on Firefox. It opens up Firefox for me. Now, anything you do on this is through the controller. There's an Alt plus double-click combination on the browser which brings up the Sahi controller. And here, this is the sample website. It's a really simple website. I put in test and secret, click on login. This moves on to the second page, which has a few books that can be added to a shopping cart. This is a really simplistic example. It's more to show you what Sahi does, okay? So you add books here with some quantities, okay? Click on add. It shows up at the bottom, and then you verify this. The bottom part is just JavaScript. It loads it up here and you need to verify that the total is 150, okay? So let's log out and do this. Again, bring up the controller and start recording. I say record, then test, secret, okay? Okay, put in some values here, click on add. You see all the steps that I just did recorded here. And then verify the total. To verify the total, what I do is press the Control key and move the mouse over that particular element. It shows me that element in the accessor field. It also shows me a lot of the alternatives that are there, okay? And I click on assert. It gives me a few suggested assertions. I choose the one that I want and add it to my script. So let's say that I just care about the 1550.
I append it to the script and then log out, okay? So let me stop the script and go back. So I take agile one. I'm going to play it back now: set it and play. So it played the whole thing, right? At the end of it you would like to look at the logs. But before that, I would like to point out something: you see no waits in this particular script, right? This is the script that we just ran. It shows you the logs, and you can click through to go to the line of script. Now, this is just a recorded and played back script. It's fairly simple. It doesn't have anything complicated about it.
Now let's look at just one other thing. This is not how you would ideally leave the test case. What you would eventually have is something like this. You open up the script in the script editor. Then say that I want these three things to be a login function. So I chose the first three steps, clicked on create function, and said I want login here. This is like the refactoring tools in all the different IDEs, Eclipse, et cetera. What it does is create a function and move the steps off into that. It has parameterized it correctly. These are things that a tester actually finds quite useful, since they like that they do not need to move away from here. It's fairly simple to do. Let me add a few more functions: add books, verify, log out. So what you have here is a script file which has different functions abstracted out into something that your business and domain understands.
So you save this, and let's execute this one. We look at the logs. It tells you that these functions were called with these values. You click on it, and it tells you exactly what values were passed in. You click on it to get to the line of code. So this lets you trace back to exactly what happened. And this is also part of the design decisions that we took, which allow us to do this.
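To make the create-function step concrete, here is a rough sketch of what the extracted login function might look like. It is not the script from the demo: the element names ("user", "password", "Login") are guesses about the sample site, and the underscore-prefixed functions below are recording stubs named after Sahi's browser actions so that the sketch runs in plain Node.

```javascript
// Recording stubs so the sketch runs outside Sahi. In a real Sahi script
// these names are provided by Sahi and act on the browser.
const steps = [];
const _textbox = (id) => `textbox(${id})`;
const _password = (id) => `password(${id})`;
const _submit = (id) => `submit(${id})`;
const _setValue = (element, value) => steps.push(`set ${element} = ${value}`);
const _click = (element) => steps.push(`click ${element}`);

// Extracted from the first three recorded steps and parameterized.
// The element names "user", "password" and "Login" are assumptions.
function login(user, password) {
  _setValue(_textbox("user"), user);
  _setValue(_password("password"), password);
  _click(_submit("Login"));
}

// The test now reads in terms the business understands.
login("test", "secret");
console.log(steps);
```

The point of the refactoring is the last call: the test reads as login with these values, and the logs can report the function name together with the arguments that were passed in.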
So far fine? Any questions on this? Yeah. How do I make it data-driven? Okay, I'll come to that later, but it's about reading from a database and putting the data in. There's a function called data drive, into which you can pass a function and a two-dimensional array of data. It'll automatically execute in a loop, catch all the exceptions, log them, et cetera. In the demo? Yeah, yeah, it'll work. It needs a start URL, so I set it and play, and it's on. It'll work on IE too, but IE without a network is a slight problem, yeah? Okay, for that I'll have to get into the architecture bit too, so I'll come back to it. Yes, if it's a browser on a mobile device, it'll work. As long as a browser supports a proxy and JavaScript, it'll work.
So now I want to show you a few things, quickly. Let me open this up on Firefox without the dashboard. What did the dashboard do? Basically, it went here and changed the network settings to use localhost, 9999 as a proxy. That is Sahi's proxy. Once you configure it like this and say okay, that's it: you can automate any browser by configuring the proxy to be localhost, 9999 and then going to the URL. Now, if you look at this, I'm doing a view source of the page. You'll see that Sahi injects some code. It starts here, where it says Sahi inject start, and ends where it says Sahi inject end. So it injects some JavaScript code into the page. Part of it is the APIs, and the other part is the code to maintain state. Now, one thing you'll see is, let me click on this. Look at what it shows in the title. It says source of whatever domain I was working from, /_s_/spr. So all requests are made through the proxy, but some requests need to be processed by the proxy itself.
For example, the files that are being injected by the proxy need to be served by the proxy, not by the end server, because there is no concat.js on, let's say, Google.com, right? So what happens is based on the URL pattern, which is /_s_/ and is specific to Sahi. If the URL has /_s_/, then Sahi will pick the content locally from the proxy and return it. So in any page that goes through the proxy, this much code is injected, and most of these files are picked from the local server.
Now, of course, you don't inject it into every response. Can anybody think why it shouldn't be injected into every response? Can you inject this into JavaScript files? No, you can only inject it into HTML pages, responses which render HTML. If it's JavaScript and you put in some code like this, it will cause a JavaScript error there. You need this only in HTML pages, so the proxy has the smartness to understand that the response is actually HTML, so I'll inject the content; otherwise I will not inject the content, okay?
Yes, so what's wrong with that? Right, this question actually comes up all the time: if it's encrypted content, do you have access to it? It is like this: if the browser has access to it, anything can have access to it. Basically this is what is called a man in the middle. The browser asks me, I ask the server, and I send the response back to the browser. But when I send it back, I could be anybody, somebody who's actually malicious. So when I do this, I have to sign the response with my signature. When I sign it, the browser knows that this signature is not correct. The Google certificate is signed by VeriSign, and the browser can verify that. This guy is saying this domain is Google.com, but it's signed by Sahi, so it's not correct. So what you have to do in these cases is accept the Sahi certificates.
So basically you say: I know that there is a man in the middle, but I trust that guy because he's my guy, he's on my server. So it is possible to do that too. So now we're saying that part of the content is going to be picked from the proxy and part of the content is going to be picked from the actual server, okay? Now let's look at the URL pattern. The URL that is being sent always has /_s_/ if it is a local thing, something for the proxy. Once you set the network properties to localhost, 9999, all requests from the browser go through the proxy. Let's say that I am the proxy and that's Google. Every request comes through me, whatever you do. Now I decide whether it goes there, and I fetch the response and send it back, or whether I just send it back from here itself, where I pick the content from my file system and send it. Okay?
Yes, a proxy is a web server, but it also knows how to make requests to other servers. A proxy is always a web server, and it also knows a little more. Yeah, yes. The end server only knows up to me, so it authenticates to me, and I say, okay, give me the response. Now when I'm talking to that guy, the browser, and I say take this response, he says, no boss, you're not Google, you're Sahi. So the browser will throw a pop-up saying, hey, the security certificate is not correct. The end user then says, okay, I accept it because it's coming from Sahi, and from that point on the communication is fine.
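The two decisions the proxy makes, serve locally or forward, and inject or leave alone, can be sketched roughly like this. This is an illustration under assumptions rather than Sahi's code: the function names, the injected script tag and the idea of recognising HTML by its Content-Type header are all made up for the example. Only the /_s_/ marker (the "underscore s underscore" pattern) and the concat.js file name come from the talk.

```javascript
// Sketch only: names, the injected tag and the Content-Type check are
// assumptions, not Sahi's actual implementation.
const LOCAL_MARKER = "/_s_/"; // URL pattern mentioned in the talk
const INJECTED_TAG = '<script src="/_s_/spr/concat.js"></script>'; // illustrative path

// Requests whose path carries the marker are answered by the proxy itself.
function isLocalRequest(url) {
  return new URL(url).pathname.startsWith(LOCAL_MARKER);
}

// Assumption: HTML responses are recognised by their Content-Type header.
function isHtml(contentType) {
  return typeof contentType === "string" &&
    contentType.toLowerCase().startsWith("text/html");
}

// fetchFromServer and readLocalFile are hypothetical callbacks from the caller.
function handleRequest(url, fetchFromServer, readLocalFile) {
  if (isLocalRequest(url)) {
    return { source: "proxy", body: readLocalFile(new URL(url).pathname) };
  }
  const response = fetchFromServer(url);
  const body = isHtml(response.contentType)
    ? INJECTED_TAG + response.body // inject only into HTML
    : response.body;               // leave JavaScript, images, etc. untouched
  return { source: "server", body };
}

// Fake callbacks standing in for the network and the file system.
const fakeServer = (url) =>
  url.endsWith(".js")
    ? { contentType: "application/javascript", body: "var a = 1;" }
    : { contentType: "text/html; charset=utf-8", body: "<html></html>" };
const fakeFiles = (path) => "// local file for " + path;

console.log(handleRequest("http://example.com/index.html", fakeServer, fakeFiles).body);
console.log(handleRequest("http://example.com/app.js", fakeServer, fakeFiles).body);
console.log(handleRequest("http://example.com/_s_/spr/concat.js", fakeServer, fakeFiles).source);
```

The third call never reaches the fake server, which mirrors the point about concat.js not existing on the real site.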
See, the whole point of encrypted communication is that both sides say, I know who you are and I trust the content that's coming through. I can say that I am Google, but the browser says, no, you're not Google, and throws a validation error, and the user can override it, saying, hey, it's okay, don't panic, I'll use this content. So the browser and the user can actually override that. So yes, some part is actually processed by Sahi. That is one thing.
Now let's look at the script itself. One of the things I said was that when you execute scripts with Sahi, you do not need any waits in them. Did I say this? Okay, let me say it: you do not need waits in a Sahi script. However long it needs to wait, it'll wait, up to a maximum threshold. So how does Sahi know how to wait? There are various JavaScript DOM callbacks which fire when the page has loaded. Because Sahi has injected JavaScript code into the browser, it has a way of knowing whether the page has fully loaded or not. Only if it has fully loaded will it go on to the next step.
Now, this also happens for Ajax. That is something most of the other tools are totally baffled about: what do I do now? You put in an if condition, put it into a loop and check it. No, man, you don't need to do all that. The XHR object is available to be overwritten in the browser's JavaScript. You just put a wrapper around it, do whatever the original was doing, but also set a flag saying I'm done or I'm not done, and keep a count of all that. Sahi does that and takes care of waiting for all the Ajax requests. Once they are all done, it'll say, okay, go ahead. So that is how it actually waits. Now there are other things.
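The wrapper idea for Ajax can be sketched as follows. This is a minimal illustration, not Sahi's actual wrapper: withRequestCounter, the counter object and isAjaxIdle are hypothetical names, and FakeXHR is a stand-in class so the sketch runs in plain Node. In a browser the same wrapper would presumably be applied to window.XMLHttpRequest, whose loadend event fires once a request has finished, whether it succeeded, failed or was aborted.

```javascript
// Stand-in for the browser's XMLHttpRequest so the sketch runs in plain Node.
class FakeXHR {
  constructor() { this.listeners = []; }
  addEventListener(type, fn) { if (type === "loadend") this.listeners.push(fn); }
  open(method, url) { this.method = method; this.url = url; }
  send() { /* a real XHR would start the network request here */ }
  complete() { this.listeners.forEach((fn) => fn.call(this)); } // simulate completion
}

// Wrap the original class: do whatever the original did, but keep a count
// of requests that have been sent and have not finished yet.
function withRequestCounter(OriginalXHR, counter) {
  return class CountingXHR extends OriginalXHR {
    send(...args) {
      counter.pending += 1;
      this.addEventListener("loadend", () => { counter.pending -= 1; });
      return super.send(...args);
    }
  };
}

const counter = { pending: 0 };
const CountingXHR = withRequestCounter(FakeXHR, counter);

// The automation tool moves to the next step only when nothing is in flight.
const isAjaxIdle = () => counter.pending === 0;

const request = new CountingXHR();
request.open("GET", "/books");
request.send();
console.log(isAjaxIdle()); // false: one request is still in flight
request.complete();
console.log(isAjaxIdle()); // true: all requests have finished
```

Keeping a count rather than a single flag matters because a page can have several requests in flight at once; the tool should proceed only when the count is back to zero.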
It's not always Ajax and page loading which delay something. There is this thing in JavaScript called setTimeout. With setTimeout there is no activity happening. You say, with setTimeout, do something after five seconds. In those five seconds you may want to assert on something which doesn't appear until the five seconds are up. Now how do you handle that? You need a failure handling mechanism in Sahi that says: I wanted to click on this, but I couldn't because I didn't find it. Now what do I do? I wait for one second and then try again. And I do that 10 or five or whatever number of times, based on some configuration. Basically you are saying, don't fail it the first time. You're automating something which has variance in the amount of time in which it responds, so handle that internally. Make sure that you wait for the right amount of time. So I always wait optimistically: it's coming up; it didn't show up, okay, wait and try again, wait and try again, till it actually happens. It is the same thing with assertions. Normal assertions can also fail. You say, assert that this thing exists. Now it may not exist at that point in time. So when the assertion fails, you retry it. If you had said assert not exists, it would have returned true. If you say assert exists, it will retry, retry, retry, till it actually succeeds.
So in order to be able to do all this retrying, what we do is take the Sahi step. This is the parsed script that you're seeing here. It takes each... okay, by the way, I never told you where the JavaScript is getting executed. You have a script; where do you execute it? It is all JavaScript. You are not going to run it in the browser, because you can't do window.open or iframe.
So what do you do? There is another JavaScript interpreter called Rhino. Have you heard of the Rhino JavaScript engine? Okay. Rhino is a Java-based JavaScript engine. You can embed it into a JVM. So the proxy also runs a Rhino engine, and the script itself executes in this Rhino engine. After the parsing, what every line becomes is a string. The Rhino engine says, I've executed this line, I got this Sahi schedule. Look at this. It says set value, text box, user, and the actual value of the user. So this is all parsed script. At this point, it takes that, makes it a string and puts it at one place. So all the Rhino engine is doing is parsing the script and creating strings of each line, and when each line executes, it puts the string at one place and waits. Till when does it wait? The browser starts polling for it. The browser keeps polling and asks, is there a step for me? If the answer is yes, it'll pick up the step, take it back and execute it. And once it has executed, after all the failure mechanisms, all the waiting, loading, et cetera, it'll eventually say, yes, I have finished it, the state is done, this is the result. Now the Rhino script runner will say, okay, there is a result there, let me pick it up. If it's done, go on to the next step. If it is not, mark it as a failure and then go on to the next step. So if you look at this kind of automation scenario, unless you do a lot of try-catches, et cetera, you won't be able to do this kind of failure handling without doing something intermediate like parsing. So what we did is parse it, create a string, put it somewhere, let the browser pick it up, execute it and report back pass or fail, and then we go on to the next step.
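The hand-off described above, together with the retrying from earlier, can be sketched in a few lines. Everything here is a simplification under assumptions: StepExchange, browserPoll and runScript are hypothetical names, the exchange is an in-memory object rather than a proxy that the browser polls, the retry does not actually sleep between attempts, and the step strings are only data for the example.

```javascript
// Sketch only: an in-memory stand-in for the place where the script runner
// puts a step string and the browser picks it up.
class StepExchange {
  constructor() { this.step = null; this.result = null; }
  post(step) { this.step = step; this.result = null; }              // runner puts a string
  poll() { const step = this.step; this.step = null; return step; } // browser asks for a step
  report(result) { this.result = result; }                          // browser reports the outcome
  takeResult() { const r = this.result; this.result = null; return r; }
}

// Browser side: pick up a step, try it a few times, then report done or failure.
function browserPoll(exchange, execute, maxAttempts = 3) {
  const step = exchange.poll();
  if (step === null) return; // "there is no step for you"
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      execute(step);
      exchange.report({ status: "done", attempts: attempt });
      return;
    } catch (err) {
      // Element not there yet: a real implementation would wait before retrying.
    }
  }
  exchange.report({ status: "failure", attempts: maxAttempts });
}

// Script runner side: post each line as a string, collect the result, move on.
function runScript(lines, exchange, execute) {
  const log = [];
  for (const line of lines) {
    exchange.post(line);
    browserPoll(exchange, execute); // stands in for the browser's own polling loop
    log.push({ step: line, ...exchange.takeResult() });
  }
  return log;
}

// Fake page: the "Add" button only shows up on the second attempt,
// and there is no "Checkout" button at all.
let addLookups = 0;
function fakeExecute(step) {
  if (step.includes("Add") && ++addLookups < 2) throw new Error("not found yet");
  if (step.includes("Checkout")) throw new Error("not found");
}

const log = runScript(
  ['_click(_button("Add"))', '_click(_button("Checkout"))', '_click(_button("Logout"))'],
  new StepExchange(),
  fakeExecute
);
console.log(log.map((entry) => `${entry.status} after ${entry.attempts}`));
```

Because the runner only ever deals in strings and results, a failed step is just a logged result and the run continues, without a try-catch around every line of the test script.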
This also means that all the Rhino engine knows is how to put a string somewhere. That is all it knows. You can do this using a Java driver, a Ruby driver, a PHP driver. All you need to do is put a string at a particular place, and the browser will poll, get it and do the necessary things. Any questions on this? Okay, let me leave it at this. When you eventually see the script running, let's say I show the Firebug thing... how much time do I have? Time's up? One minute, okay. So this is how it works. I can continue outside if you want. If you have any questions, do ask me.
Basically, the whole thing is that there is a proxy sitting in between. The script that you send is parsed, converted into strings and put in a place, and the browser polls and picks it up. Even in the browser there are frames, iframes, windows, et cetera, and each one polls saying, I want to pick a step, I want to pick a step. If your credentials are right, if the step was for this particular window, for this particular frame, it can pick it up, execute it and come back; otherwise it is told there is no step for you, and those guys wait. So in the end, everybody picks, executes, comes back and says it's over. Because of this, you can go across domains, across multiple subdomains, anything at all you want. There is no JavaScript restriction in this, because each frame is talking one-on-one with the server, taking the step back and executing it. There is no JavaScript communication from frame to frame. So you can go across frames and iframes. Thank you very much. See ya.