I help people get jobs. The word Selenium is so closely associated with test automation that the moment someone says Selenium, the next thing that comes to mind is test automation. At Indeed, we think of Selenium a bit differently: as a way to help people get jobs, more specifically on smartphones. Let's see how.

Before we do that, let me say a bit about smartphones. We have all seen their rise, their advent, and I'd say we are now fully immersed in the era of smartphones. That statement is not an exaggeration; there is literally an app for everything these days. At Indeed, we have an app for job search, and we wanted to understand how far mobile has penetrated job search. Looking at the traffic we gathered over the last year, the share of job searches coming from mobile is about 2% higher than from desktop: right now it's around 52% mobile versus 48% desktop. And it's not just searching; people are actively applying for jobs too. These are some rough numbers from the last year: around 200,000 applications are being submitted through mobile phones, and around 51% of all applications submitted on Indeed come from mobile.

All of this might be a bit of a surprise, because we think of a job as a serious thing. Why would someone apply for a job from a mobile phone? But consider jobs like truck drivers, restaurant workers, and part-time positions, which make up a huge share of the market in the United States. There is a huge flux in these jobs; people change them very often, and it is not reasonable to expect all of them to sit in front of a computer to search for a job. With smartphones and 4G, people are inclined to search for jobs on mobile, and that's why we see such high numbers for searching and applying for jobs on mobile phones.

Now let's look at the way people used to apply for jobs. Even I apply like this: I sit in front of a computer, open an employer's website, go to the careers section, browse the jobs, select one, and start the application. But the scene is changing; people want to apply for jobs from mobile phones. That has always been an option, so what's the problem? There are two typical problems with employer career portals: they are not mobile friendly, and they are slow. Here is a real example of a job description page viewed on a mobile phone. These employer websites were designed with desktop in mind, not mobile. And it gets worse: this is the gigantic form people are supposed to fill in when they apply on a mobile phone, and the experience is atrocious. On top of that, these employer websites are fairly ancient and take a very long time to load. I have not cherry-picked a 99th-percentile latency to make it look worse; the mean latency of an employer careers page is typically around five to eight seconds. Those are the two problems job seekers face when they try to apply for jobs on a mobile phone. Indeed's idea was to fix this problem by designing a mobile-friendly website.
That's the core problem we wanted to solve. Translated into a software requirement, Indeed wanted to fill a magical box called Mobilize that had to generate a simple, fast, easy-to-use, mobile-friendly website from a heavy, clunky, slow employer website.

There were two options. The first, natural option that comes to mind is to use an API. APIs are fast, well maintained, and versioned. But APIs are also costly, and there is a learning curve when you have to learn an API in its entirety. Since we wanted to start with a prototype, we didn't choose the API option. Instead we chose the second option: scraping. Yes, on its face scraping sounds slow; it may be flaky; it may lead to unmaintainable code. All of those costs are real. But at the same time, scraping is quick to start with, so we decided to go ahead with it. The problem statement is now a bit more refined: develop a system that generates a mobile-friendly website out of the employer website, using scraping as the methodology.

Before we build it, we need to be aware of the responsibilities a typical careers portal fulfills. I'll go through each one. The first is jobs. These careers portals host a large number of jobs, and we have to make sure all of those jobs are also present in our new website. The second is the application flow. When you apply for a job on a careers portal, you are taken through a series of pages asking for your personal details, your education, your previous work experience, and certifications if you have any. Each page has a set of questions, and you must answer all of them to submit the application, so we have to emulate this behavior as well. Then there is authentication. Any website that does something serious tends to authenticate people before they do it, and employer websites are no exception. So the question is: do we want to set up a new authentication mechanism? We said no, because we didn't want users to remember one more password. Instead, we leverage the existing employer website's authentication mechanism. And when we leverage the existing system, we act as a kind of proxy between the job seeker and the employer website. The moment you proxy authentication, privacy concerns come into the picture; there is that big three-letter acronym, PII, which makes people wary. Another responsibility is applications. The employer website captures a lot of information about the candidate, and we must capture all of that information as well. Again, since we are gathering information about job seekers, we must respect their privacy; that's a factor we need to take into account. Finally, we are offering a solution that claims to simplify the life of job seekers; if our solution fails, we should let job seekers fall back to the existing one. So falling back to the existing mechanism was another design criterion.

The problem statement now gets even clearer: generate a new mobile-friendly website out of an employer website, use scraping, and emulate all of these responsibilities.
Now with this, let's start building it together. We want a mobile-friendly website, which means we want a web application, so let's start with a web app. The web app is empty; the job seeker comes to it, finds nothing, and can't do anything. What's the next thing a job seeker wants to do on the web app? Find some jobs, which means we need a database to store jobs. So let's add a jobs database. But this database is empty too, so we need a system that can fetch all the jobs present on the employer website and put them in the jobs database. Let's designate a service for it and call it the job service. The role of the job service is to get the jobs from the employer website into the jobs database, and for this we use Selenium. The Selenium code opens the employer website, navigates to the job URL, extracts all the metadata about the job and the entire job information we talked about, and puts the job in the database.

Here's an example of this Selenium code's output. For a specific job called installation technician, we store the URL of the job, which page of the application flow we are on, the actual question we extracted, called "first name," and the type of widget it is, in this case a text box. We also record how to identify that particular text box: we say the locator type is XPath, and we store the locator for the first-name field. Why do we gather the widget type and the locator information? Because this information has to be rendered on the web app, which means the web app has to know what type of widget to render for the field called "first name." The web app sees that the widget type is text box, so it renders a text box with the label "first name."

So we are good here: we extracted all the job information from the employer website, put it in the jobs database, and now the job seeker is able to browse the jobs. He selects a given job and wants to apply. When he applies, just like on the employer website, we have to perform authentication, so let's designate an auth service. The role of the auth service is to take the username and password from the job seeker and validate it against the employer website. Here again we use Selenium: the code opens the employer website, clicks the login button, enters the username and password the job seeker gave, and goes through the authentication page. Either the user is successfully authenticated or the authentication fails; the auth service conveys this to the web app, and the web app relays it to the job seeker.

The moment a job seeker successfully authenticates, he is taken through the series of pages where he answers questions: What is your first name? What is your last name? What is your previous work experience? He fills in all of these pages, and in the end the application is complete. To save the completed application, we need a database, so let's add one. Now we have a bunch of applications sitting in the database, and these applications have to be written back to the employer website.
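To make that concrete, here is a minimal sketch, using Selenium's Java bindings, of what extracting and storing one such field might look like. The record shape and field names are my own illustration, not Indeed's actual schema, and a real scraper would handle many more widget types than a plain text box.

```java
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;

// Illustrative record for one scraped form field; these field names are
// assumptions for this sketch, not Indeed's actual schema.
record ScrapedField(String jobUrl, int pageNumber, String label,
                    String widgetType, String locatorType, String locator) {}

class JobFieldScraper {
    // Extracts metadata for a single text input on the current application page.
    static ScrapedField scrapeTextInput(WebDriver driver, String jobUrl,
                                        int page, String xpath) {
        WebElement input = driver.findElement(By.xpath(xpath));
        // The label text becomes the question the mobile web app will render.
        String label = input.getAttribute("aria-label") != null
                ? input.getAttribute("aria-label")
                : input.getAttribute("name");
        return new ScrapedField(jobUrl, page, label, "TEXTBOX", "xpath", xpath);
    }
}
```

The key design point is that the stored record carries everything the web app needs to re-render the question (label and widget type) plus everything the write-back side will later need to find the same element again (locator type and locator).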
Let's designate a service to do this. By the way, the moment the web app puts an application into the database, it fills in the answer the job seeker gave for the first-name question; the web app's job is to show all these questions to the job seeker, collect his answers, and save them in the database. As I said, we have to write all of these applications back to the employer website, and the service that does this is the apply service. Again, the apply service uses Selenium for this job. In this way we have essentially developed a parallel website for the employer website: we take the entire job and application information from the job seeker and write it back to the employer website.

So how does the apply service write to the employer website? If you look at what we stored, it has all the information needed to load that particular job on the employer website and to answer the first-name question specifically. This is how the apply service goes through every question belonging to that application and fills in every answer on the employer website.

If we do everything we've talked about so far perfectly, or near perfectly, this is how the site looks. We call these mobilized sites, named after the product. The site that looked like this becomes a parallel website that looks like that, and the gigantic form I showed you translates into nice, cute little forms like these.

Now we have solved this problem for one employer. What do we do to solve it for many employers? Obviously we don't want one solution per employer; we want a single solution for all of them. This is the system we have developed so far, and we need to extend it to multiple employers, which means there will be multiple employer-website blocks in the picture. The natural way to extend the model is to add one more chunk of Selenium code for every employer website you implement, which changes the design to this, where each block S is a piece of Selenium code. This is where a bit of software engineering comes into the picture. When you do the same thing, similarly, many times, there is a need, and an opportunity, to extract out the part that does not change. These Selenium code blocks obviously share a lot of Selenium-related utilities and general page-navigation patterns, so we extract the part that is common across all of them and call it the back-end framework. Still, certain parts cannot be 100% generalized; there is information about each client that you have to maintain. For example, the XPath locator for the first-name question for employer one will definitely differ from employer two's. Those parts stay unique per employer, and they go into blocks called C, for configuration. So you have a framework, and for each client you implement, you have a client-specific configuration. We do the same thing for the back-end framework, for the auth service, and for the apply service. This is how we extended Mobilize to many employer sites. I am not supposed to reveal statistics.
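As an illustration of that framework-versus-configuration split, here is a hedged sketch: the navigation logic is shared, and only the per-employer data varies. All names here are hypothetical, not Indeed's actual code.

```java
import java.util.Map;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;

// Hypothetical per-employer configuration: the only part that changes
// from one client implementation to the next.
record EmployerConfig(String careersUrl, String loginButtonXpath,
                      Map<String, String> questionLocators) {}

class BackendFramework {
    // Generic login navigation shared by every employer implementation.
    static void login(WebDriver driver, EmployerConfig cfg,
                      String username, String password) {
        driver.get(cfg.careersUrl());
        driver.findElement(By.xpath(cfg.loginButtonXpath())).click();
        // ... shared username/password handling would continue here ...
    }

    // Generic answer replay: looks up the stored locator for each question
    // and types the job seeker's saved answer into the employer's form.
    static void replayAnswers(WebDriver driver, EmployerConfig cfg,
                              Map<String, String> answers) {
        for (Map.Entry<String, String> e : answers.entrySet()) {
            String xpath = cfg.questionLocators().get(e.getKey());
            driver.findElement(By.xpath(xpath)).sendKeys(e.getValue());
        }
    }
}
```

Implementing a new employer then amounts to writing a new EmployerConfig rather than a new chunk of Selenium code, which is exactly the economics the framework/configuration split is after.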
Yes, the configuration part, the C blocks, is manual, though it depends on the website. For example, a lot of these employer websites share a common back end called an applicant tracking system. If four or five clients use the same applicant tracking system, the locators for each client will be essentially identical: they'll have a standard prefix plus a unique identifier. So we do it manually for one client, but we learn the pattern, and for all subsequent clients it takes a split second. So that's how we extended to many employer sites, and while I can't share the numbers, this model really works.

Okay, now some of the challenges we faced as part of this problem. I'm only highlighting a few, and I'm not going to explain the solutions, but I want to give you a glimpse of how complex it can get. The first is staying in sync with the employer website. It's not that you extract jobs from the employer site, build a client, and go home; these employer sites keep changing. New jobs are added, existing jobs are removed, the employer might add a new page to the application process for a given client, or new questions might be added to the same job. We need to understand (a) when a change has happened, (b) what exactly changed, and (c) how soon we can get that change into our system and into our database. This is a constant process, and you need a separate pipeline to automate all of it. That's the complexity of staying in sync.

The second: we have written Selenium code for everything, for authentication, for writing applications back to the employer website, and for scraping jobs. What if the employer puts a CAPTCHA on the sign-up page? How do you handle that? That's a challenge.

The third interesting thing is dependent questions. If you have applied for a job on a careers portal, you will have come across a question like "Do you know anyone working in this company?" with yes/no radio buttons, and the moment you select yes, a new question pops up saying "Enter the employee ID." The automated Selenium script has to understand that selecting different options for a given question may change the DOM, and it has to be trained to recognize this dependency. And not just recognize it: it also has to save the dependency in a format the web application understands, so the web app can ask the same question and, if the job seeker answers yes, show the second question. This gets even more complex when the page flows themselves are conditional; say a user selects a particular answer for a question on page four, and suddenly a page five appears. The Selenium code has to know that this is a possibility and store these page-level dependencies as well.

There are other challenges too, like how to test our implementations. We don't operate on clients' staging environments; we run our application directly against the original employer website, which means we can't spam it by creating test applications exhaustively to verify our implementation. That's a challenge.
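Going back to the dependent-questions challenge for a moment, here is a rough sketch of how a Selenium script might discover such a dependency: toggle an option and diff the visible inputs before and after. The approach and names are illustrative, not Indeed's actual implementation.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;

class DependencyProbe {
    // Clicks one option of a question and reports which input names newly
    // appeared in the DOM, i.e. which questions depend on that answer.
    static Set<String> revealedBy(WebDriver driver, By option) {
        Set<String> before = visibleInputNames(driver);
        driver.findElement(option).click();   // e.g. the "Yes" radio button
        Set<String> after = visibleInputNames(driver);
        after.removeAll(before);              // keep only newly revealed fields
        return after;
    }

    private static Set<String> visibleInputNames(WebDriver driver) {
        Set<String> names = new HashSet<>();
        List<WebElement> inputs = driver.findElements(By.tagName("input"));
        for (WebElement input : inputs) {
            if (input.isDisplayed()) {
                names.add(input.getAttribute("name"));
            }
        }
        return names;
    }
}
```

The harder part, as described above, is not detecting the dependency once but persisting it in a schema the web app can replay, so the mobile form shows and hides the same follow-up questions the employer's form would.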
Some of the current work we are doing is reducing the drop-off rate. For example, let me go back to the slides for a second: if the user feels that even this personal-information form is too long to answer, he will obviously drop off. We want to reduce that kind of drop-off behavior; that's a challenge we're trying to solve. The other thing is reducing the amount of time we take to implement a particular employer. In the beginning, when we started this, we used to take 30 days to implement an employer. For certain kinds of employers we were able to reduce that to 10 minutes, but there are still complex employer websites where we take around 15 to 20 days to complete the implementation. So reducing implementation time is a problem we still haven't solved perfectly, and we're working on it. That's pretty much it from my side. And as you can see, there is a lot of Selenium happening here, and all of it requires creating browsers, which means we need a browser service. We'll be talking about the browser service we set up in the next part of this talk.

I work as a software engineer at Indeed. If you look at the Mobilize architecture, we have three components: the job service, the web app, and the apply service. All three components have one thing in common: they need a browser to perform their actions. At the scale we operate at, we need hundreds of concurrent browsers at any point in time and thousands of browsers per day.

To design such a service, we came up with a list of requirements. First, it has to be an HTTP browser service. It has to support returning sessions; take the CAPTCHA scenario, where we send the CAPTCHA image to the user, the user answers it, and we submit that answer to the employer website to check whether it's correct. For that we may need a session that extends across multiple processes. We should be able to purge expired sessions: every session has an expiry time, and the browser service should be able to kill sessions that pass it. There should be no single point of failure in the service. We need state and health APIs to reflect the current state of the browser service, and like any highly scalable system, we need logging and real-time monitoring to maintain it well. Browsers tend to leak memory, so we wanted to make the browser service resilient to memory leaks. And it has to be horizontally scalable.

With this set of requirements, the immediate option we had was the native Selenium Grid, the open-source Selenium Grid available on the internet. Let me talk about its architecture first. It has a hub that talks to nodes; the nodes are where the browsers actually run, and client code talks to the hub. The hub is a single point of contact for all the nodes. It looks good, but it comes with its own set of limitations. What are they? The hub is a single point of failure.
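For context, this is the standard way client code asks a native Selenium Grid hub for a browser, using Selenium's stock Java bindings; the hub address here is illustrative. Every command in the session flows through the hub, which is exactly why its limitations matter.

```java
import java.net.URL;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.remote.RemoteWebDriver;

public class NativeGridClient {
    public static void main(String[] args) throws Exception {
        // In the native grid, the hub receives every WebDriver command and
        // forwards it to whichever node is hosting the session.
        WebDriver driver = new RemoteWebDriver(
                new URL("http://hub.example.com:4444/wd/hub"), // illustrative hub address
                new ChromeOptions());
        driver.get("https://employer.example.com/careers");    // illustrative target
        driver.quit();
    }
}
```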
If the hub fails, we lose all access to the nodes. It's also not trivial to add logging and monitoring, which is an important requirement for us. And we can't restart a node gracefully: say a node is running low on memory for some reason; we can't just restart it, because some users might have active sessions on it. It's critical for us to drain all the active sessions and then restart the node, and that isn't available in the native Selenium Grid.

So what we did was come up with our own implementation of Selenium Grid, the Mobilize Selenium Grid. Here we have a load balancer, and behind it a set of grid servers. Client code talks to the grid servers through the ELB, the ELB delegates each request to a grid server, and the grid servers in turn select grid nodes based on the availability of browsers. We also have a component called the grid scheduler, which performs some periodic actions. Let me talk about each component in detail.

The first component is the grid server. It's a web server bundled with a JDK. It has state and health endpoints, which reflect the state of the browser service. It has a node endpoint: whenever a node comes up, it registers itself with the grid server, and we needed an endpoint for that. And if we need a browser session, say a Chrome session, there is a session endpoint where we send a request to the grid server asking for Chrome, and the grid server delegates that request to an appropriate node. The grid server maintains an inventory of nodes and sessions; whenever there is a request for a session, it records it in MongoDB. Every grid server is stateless, so even if one grid server goes down, it doesn't impact the sessions running on the nodes. The important difference from the native Selenium Grid is that here the grid server is not the single point of contact: once we create a browser session, we pass that information to the client code, and the client code talks to the node directly.

The grid node is the one component we share with the native Selenium Grid. It runs a Selenium server, which creates browsers on the grid server's request. The problem with grid nodes is that there was no proper mechanism to monitor the health of a node, so we wrote a set of shell scripts that monitor node health; if some metric degrades, say available memory or available sessions runs low, we restart that node. We also face the problem of unresponsive browsers: once we create a session, there are cases where the browser goes unresponsive and stops answering our Selenium API calls, and the only option we have is to kill those unresponsive browsers.

The third component in our architecture is the grid scheduler. It pings the nodes periodically to check whether they are active, and if something doesn't respond, it removes it from the fleet of nodes. It also checks for expired browser sessions: if a session has crossed its expiry time, the scheduler kills it.

We host our grid in AWS. Right now, in production, we have five grid servers and 25 grid nodes. Each grid server is a c3.large, which is two virtual CPUs and 3.75 GB of memory.
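To make the grid server's role concrete, here is a hedged sketch of the session flow from the client's point of view. The endpoint path, query parameter, and response format are my assumptions for illustration, not Indeed's actual API; and where the real grid server creates the session on the node and hands back its handle, this sketch simply opens the session against the chosen node directly, for brevity.

```java
import java.net.URI;
import java.net.URL;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.remote.RemoteWebDriver;

public class GridServerClient {
    public static void main(String[] args) throws Exception {
        // Ask a grid server (behind the ELB) for a Chrome session.
        // Endpoint path and response shape are assumptions for this sketch.
        HttpClient http = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://grid.example.com/session?browser=chrome"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
        HttpResponse<String> response =
                http.send(request, HttpResponse.BodyHandlers.ofString());

        // Assume the grid server answers with the chosen node's URL. From here
        // on, the client talks to that node directly, so a grid server crash
        // cannot break a running session.
        String nodeUrl = response.body().trim();   // e.g. "http://node-7:5555/wd/hub" (hypothetical)
        WebDriver driver = new RemoteWebDriver(new URL(nodeUrl), new ChromeOptions());
        driver.get("https://employer.example.com/careers");
        driver.quit();
    }
}
```

The design choice worth noticing is that the grid server only brokers placement; it drops out of the data path entirely once the node handle is in the client's hands.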
As for the grid nodes, they are the heavy-lifting machines that actually run the browsers, so we needed higher-spec machines there, with 15.75 GB of memory each.

Here is the comparative analysis against the native Selenium Grid. The top three, heartbeat, returning sessions, and purging expired sessions, are common to the native Selenium Grid and the Mobilize Selenium Grid. In the Mobilize Selenium Grid, we added state and health APIs. There is no single point of failure, meaning that even if a grid server goes down, nothing happens to running sessions. Since we wrote the grid server ourselves, we added logging and monitoring. Whenever we detect a memory leak, we put the node into maintenance mode, so it won't affect the existing active sessions. We have graceful node restarts: we drain the active browsers on a node and then restart that node. And in our case, grid servers and grid nodes are independently scalable; with the native Selenium Grid you can't scale out the hub, whereas here we can scale both.

No system is perfect, so these are the challenges we currently face. The first is rogue sessions: sometimes browsers don't respond as expected. It's a small percentage of browsers, but it's still a concern for us. And we face gateway timeouts: when there is heavy load on the nodes and session creation takes more than one minute, we respond with a timeout error. That's still a challenge.

These are the grid traffic statistics: we handle up to 625 requests per minute at peak, the maximum traffic in a day is 89,000 requests, in the last 30 days we handled 2 million requests, and in the last year 8.7 million.

On to future work. We want to implement a browser pool: we keep a pool of browsers, and when a browser request comes in, we allocate one from the pool. That way, if there is any possibility of failure during browser creation, the pool avoids it. The other item is autoscaling the Selenium Grid. Right now we anticipate the traffic and make sure there is sufficient capacity in our grid, but in case of an unusual spike, autoscaling would help us, so that's one of our future action items. And this is the Indeed engineering blog and talks page, where we share our technologies and systems; if you're interested, please go through it. Thank you.

Yes, that would be a cleaner option, having the employer expose an API, and it would work out very well. But in general these employer websites don't invest much in this area; getting them to set up a web service and add the APIs we'd need to fetch all the jobs or submit an application is an option, but most companies do it when they want to, not when we ask.

Okay. When you're running the Selenium Grid and some of the nodes die, how do you identify those nodes? Basically, we try to open a socket to that particular node, and if the socket creation fails, we treat that node as not responding and remove it from the fleet.

And you said you are maintaining a pool of browsers and allocating out of it? That's the future work; it's not in production yet, so we don't know how it will behave. It's still in its early phases.
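To make that pool idea concrete, here is a minimal sketch of the kind of pre-warmed pool being described. Since it isn't in production yet, treat this as an assumed design rather than the real implementation.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Supplier;
import org.openqa.selenium.WebDriver;

// A pre-warmed browser pool: sessions are created ahead of time so a job
// seeker never waits out the 5-35 seconds that on-demand creation can take.
class BrowserPool {
    private final BlockingQueue<WebDriver> idle;
    private final Supplier<WebDriver> factory;

    BrowserPool(int size, Supplier<WebDriver> factory) {
        this.idle = new LinkedBlockingQueue<>(size);
        this.factory = factory;
        for (int i = 0; i < size; i++) {
            idle.offer(factory.get());   // warm the pool up front
        }
    }

    // Hand out a ready browser; blocks if the pool is momentarily empty.
    WebDriver acquire() throws InterruptedException {
        return idle.take();
    }

    // Replace a used browser with a fresh one rather than reusing it, so
    // leftover state never leaks between job seekers.
    void release(WebDriver used) {
        used.quit();
        idle.offer(factory.get());
    }
}
```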
I mean, it's one thing we feel may address some of our current problems quite well, but right now we don't have any stats; we hope it will mitigate some of them. The current model is that when a client asks for a session, we start creating the session then and there. This process usually takes somewhere between five and 35 seconds, and when it fails, it fails very late, after a minute or so. Also keep in mind that if the job seeker is trying to sign in, we have to create the browser at that moment, which means all the time we spend creating the browser, the job seeker is sitting there waiting for his authentication to go through. We wanted to reduce this time, hence the idea of having browsers readily available: when the job seeker reaches the authentication page, we hand him a browser with no overhead of creating it on the spot. It's essentially a cache of sessions. That way we save around two to three seconds of the job seeker's time, which is quite precious on mobile.

Your example is very good; you are using Selenium not for testing but for other purposes. Right, correct. We also use it for some maintenance checks against systems similar to our production servers. The first option we considered was actually crawling libraries, but we also need to write information back, not just read it, and that gets trickier with crawling libraries. We also use PhantomJS as part of all this; it's not just Chromes and Firefoxes. We found PhantomJS gives us good performance as well: the time it takes to boot up a browser is quite small with PhantomJS, compared to Selenium with Chrome, where you have to set up a whole new process on the OS. With PhantomJS, do you use Unix everywhere? We use Unix everywhere, yes.

On the maintenance criteria: we check for memory, and if the available memory drops below some threshold percentage of the total, we put that node in maintenance mode. Memory is one such criterion; the number of sessions is the other. If the number of sessions available on a particular node falls below a threshold, it goes into maintenance mode. So right now the criteria are low memory and low available sessions.

What do you mean by node saturation, exactly? Since every node is itself a Selenium server, and the Selenium server exposes a sessions URL (/wd/hub/sessions), we query that API from the maintenance script, which is written in shell, and parse the JSON response. There is no guessing here: we know the capacity with which this node registered with the grid server, and when we query this API, we get the current number of sessions being served by the Selenium server. If we are crossing around 85% of the node's available capacity, we think it's a good time to restart the node. The second point you mentioned is, why don't we add this as part of a servlet in the node? You can add it as a servlet in your Selenium server, right? The problem with that is, if we had added it, we arguably would not have needed the grid server at all. But the Selenium servers keep changing; you get a new version of the jar every time, right?
And every time there is a new version of the jar, you would have to change or re-verify your code; instead, it was easy for us to just query the Selenium server's output through its HTTP API. Yes, exactly: the grid server's role is to be a bookkeeper, tracking the absolute capacity all your nodes can offer and their current capacity, and identifying the best candidate node to serve a new session. It's exactly what you said.

What we noticed is that even when only a few browsers are running on a particular node, say a 16 GB machine, we have seen instances where the available memory keeps going down. When we say resilient, we mean we wanted a system where a memory leak does not crash the node: whenever there is a memory leak, we detect it, stop sending any new traffic to that particular node, and all the current sessions continue to run. When you say memory leaks, you're talking about memory bloating at the node level, right? Right, right. I have mostly seen that happen either when a node has been running for a really long time without being killed, or when it serves a lot of browser requests within a short span of time, especially with Firefox, which tends to eat up a lot of memory. Right. Okay, yeah, thank you. At our current traffic rate, every node reaches a memory-hungry state in around three to four days, so a node gets restarted every four days or so, primarily due to the browsers leaking memory.

Any other questions? In your presentation, you mentioned a single point of failure and how your project avoided it, by redirecting the test traffic directly to the node instead of the grid once the session is created. Is that what you are doing? Yes. In the traditional hub model, the hub creates a session on the node, and as far as client code is concerned, the hub is the source of truth; the entire communication happens through the hub to the respective node. In our model, the hub's role is only to create a session and return the handle of that session to the client code, and the client code can then talk to the node directly. So even if one of the grid servers dies due to some crash, the actual communication between your client code and the node is not affected. So that means you made changes to the client code, to RemoteWebDriver and things like that? Yes, correct; we instantiate a RemoteWebDriver client in our client code. You implemented your own custom RemoteWebDriver? Exactly, you're right.

How do you know that the browser is unresponsive? Good question. The rule is that if it doesn't respond to our Selenium requests, we kill the session. If a browser is unresponsive, we don't have a direct way to know the exact moment it became unresponsive, so we usually detect it when killing browsers after they expire: we issue a driver.quit() call.
If it doesn't respond within three calls, we mark the session as rogue. Since we know exactly the session ID and the node URL the session was created on, we use RemoteWebDriver to create a driver object from them. If we are not able to create a driver object, we think there is something wrong with that particular session, and we try this three or four times. If we still cannot successfully connect back to the session using RemoteWebDriver, we conclude the session is broken; we call these rogue sessions in our terminology, and we basically don't do anything about them. When the node goes into its regular maintenance mode, restarting after three to four days, those processes are totally purged. So we leave them lying there, orphaned, until the node goes into maintenance. To be honest, we have not really been able to identify what causes such unresponsive sessions yet.

Right, so the question was how do you know that a new job was added to the employer website, and there are two ways we do this. The way we have been doing it so far is a periodic process: a worker machine scrapes all the jobs from the employer website every four hours, compares them with our snapshot of jobs, and identifies the delta. If a particular job is no longer with the employer, we remove it from our database; if there is a new job, we go and fetch information about that job. That's one model. The second model, which we are trying now actively, builds on the fact that Indeed also does job search: Indeed has a huge cluster of machines constantly scouting the entire internet to identify new jobs for every employer. Since Indeed is already doing job search, there is no point in the Mobilize system also doing the same job search and identifying the delta. The contract we are trying to establish between these two systems is that whenever Indeed's job search engine identifies a new job at a particular employer, it pushes a notification to the Mobilize system to update Mobilize's database. We're trying to get to a model where the two systems don't do the same work and we get notified about job updates.

As in the same job being updated? Yeah, that's a good question. For that, we do a periodic full reload of all the jobs for the client: we essentially forget all the jobs we fetched a week back, scrape all of them again, and re-fetch everything about each job, its pages and its questions. That's how we try to keep ourselves up to date with the employer website. I listed it as a challenge because there is still a window, within that one-week period, where the employer has made a change to a job and we are unaware of it. That's still a challenge, and we need to figure out a solution that is correct yet does not hurt the performance of the original employer's site. We could do the entire job reload every day, that's an option, but we don't want to hammer the original employer website so often.

Any other questions? Great, thanks guys.