It was a pretty interesting counting day for us, and this is going to be a talk partly about our experiences on that day, but more importantly about how we scaled the election visualizations to serve the entire country. Those of you who saw CNN-IBN's TV coverage, the CNN-IBN website, or the Bing election website: that was the visualization that we created. That's what I'm talking about.

The whole thing started when the elections were just about getting kicked off. We had this large screen that Microsoft had put together. They had taken it to the CNN-IBN Noida office and had it installed somewhere in their office. That's roughly what it looked like: the studio was behind, and we just had this large screen with our software running on it. It's a touch screen, a huge one, beautiful when you start playing around with it. The anchor, in the introductory broadcast, starts going to the screen, touching things, showing what the election is like. This is the history of all of the parliamentary elections so far: these are the parties that have been winning, these are the parties that have been losing, drilling down to the next level, who are the candidates that are winning, in which regions are they winning, and so on. And while all this is happening...
...you might see one bearded character with a red t-shirt in the background. That's me, in my first appearance on live television. What I was doing there was very urgently calling home and colleagues, saying: look, turn on CNN-IBN, watch me on TV.

Now, this was the time when the elections were starting. There was almost a month of people going around doing the polls, and in this process we found a number of interesting things. One of the things we discovered, just by playing around with the data, is that there are some constituencies where the elections have had a fairly interesting history. To take an example, here is the assembly election history of Tamil Nadu as a state. Each of these circles represents one constituency; the size of the circle represents the number of candidates that stood for election, and the colour represents the party that won. There were a total of 700-odd candidates that stood for election, and in this particular constituency there were as many as 10 candidates. The next election, about the same number of candidates, and a few new constituencies popped up. In 1977 you see two things: firstly, there's a sudden increase in the number of candidates, almost doubling from 700-odd to almost 1,400; and there's also a shift in the party results. The DMK, in yellow, which had been winning until then, now gives way to the ADMK, which is the largest party. And this goes on. But look at this constituency: we have as many as 90 people contesting in one election. That's quite a bit. How am I going to read through a list of 90 names and figure out which is the right candidate to vote for? In 1989 this problem seems to be all over the place, but in 1991 we have two constituencies, Pallipet with 264 candidates and Avrakurichi with 249 candidates. At that point, you don't have a ballot paper.
You have a ballot booklet, which you flip through, saying: where am I, is this the right place for me? But all of this pales in comparison to the 1996 elections, where Modakkurichi had 1,033 candidates. Now, if you start looking at the names of some of these candidates: when we examined the actual list, there were about 40-odd Palanisamy K.s. So there's a Palanisamy K., a Palanisamy K., a Palanisamy K. How do you even figure out which Palanisamy K. you are if you were going to vote? Which is in fact what happened, because what we found was that they had to put up the full list of candidates out there. Those are all the Palanisamys: the A.s, the B.s, the C.s, the D.s. You realise at this point that this is no longer a booklet; this is a telephone directory that you are effectively handing over as the ballot paper, saying: pick the guy you want to vote for. And it looks like 88 of them couldn't find their own names, because they got zero votes. They didn't vote for themselves, and neither did their friends and family.

Now, obviously there's a lot more fun in the bottom right, where there are zero votes and so on. The news is at the top, but the fun is at the bottom. Another case that we found was this party, which you've probably never heard of, called the Doordarshi Party. This is a remarkable party, in the sense that it has contested in more elections than most parties. It was founded in 1980, I think, and in 1984, when they contested for the first time, they participated in all of the areas indicated in black, about 90-odd constituencies, and they won in exactly zero constituencies. Fine, it's a large footprint, but it's still zero wins. You'd think they would have given up. In 1989 they contested in 298 seats; at this point they were the second largest party by seats contested, only the Congress had more, and they won exactly zero seats. They did not give up in the 1991 elections.
They contested again, this time with an expanded coverage of 321 constituencies, bringing the total number of seats ever contested to 700-plus. Zero wins. Not only zero wins: they were never even the runner-up in any of the constituencies that they contested. At best they were third, and that was only in places where there were only three candidates. Very persistent. Now, one could go into the details of how and why this happened, but I'm going to fast forward to the actual counting day, because what happened there is a lot more interesting and relevant to the scaling discussion we're about to have.

That was Rajdeep Sardesai walking in, the evening of the previous day, saying: we have to get this up, we have to get this ready. The plan was that nobody was going to go home that night. He would apparently walk in the next day at around 5:30, so everything had to be ready for him. So all the staff were out there, all the reporters were out there, and we were all busy trying to get this set up. The key thing was that we were expecting to cross 5 million visitors the next day. It was going to be one of the largest media events that had ever been covered. So how do we get this to scale? Now, the thing was, this bit about 5 million visitors I got to know at approximately 3 p.m.
...the previous day. So it certainly helps that there's warning; at least it doesn't come out of the blue. It was a last-minute, half-day scramble to try and see if we could get this rather sophisticated visualization working on counting day for this many users.

This was the design of the visualization; you can check it on a variety of websites, including the CNN-IBN one, if my connection is good enough. It doesn't matter, I'll talk you through it. You get to see the parties that are leading, in the NDA, in the UPA, and all the others. You get to see where they won on the map, either as a party view or as an alliance view, and you can see the whole thing as a list. Plus there are a whole bunch of filters: if you want to see where the Congress won, where they lost the last time, where they've taken over from the BJP or vice versa, who won in the Muslim constituencies, who won the SC constituencies, who won in X, Y and Z, all of that is available as a filter. So there's a fair bit of dynamism and interactivity going on here; it's not a simple visualization to render. The question is, how do we get this to scale? This is complicated by the fact that there are multiple devices: we're rendering it on the web, we're rendering it on television, we're rendering it on mobile. And since we were doing it on television for the first time, that gave us a whole lot of headaches. For example, this is the actual visualization that you saw on television. You'll notice that it's brownish green. This is what actually looks like white on TV; on screen, proper white looks like fluorescent green. In fact, I got very confused on the first day, because Vinay Talwar, whom you see here, had come in after a session on camera, and he looked like he does on TV, fair and all of that. The next day I saw somebody who I could swear was his brother: much darker, completely different looking.
So I said, OK, fine, looks familiar, but I don't quite know who this is, until he walks over and says: hey, how are you? The amount of makeup you have to put on to look like yourself on TV is terrible. It kind of looks OK on the women; on the men, it looks ghastly face to face. You stay away from them: what is this white face you've got, haven't you removed your makeup? So we had to make all kinds of colour corrections. They took the camera in there and showed us what it looked like on camera and what it looked like on screen, and we corrected dynamically. That's me at the bottom right, looking very carefully at what it looks like on the screen, then turning around to see what it looks like on camera, trying to sort that out.

Having done all of that, we put together the following architecture. The data comes from Nielsen, which gets it updated directly from the Election Commission very rapidly, and that lands on a CNN-IBN server, a Microsoft SQL Server into which all of the data gets pumped. We then had this Windows XP laptop, a really ancient system sitting in a very, very cold data centre, on which we installed a bunch of scripts that would take data from the SQL Server every 10 seconds, break it up in a variety of ways, do a certain amount of processing (which I will describe), and then send the data on to a server in Singapore, an Azure Ubuntu server, on which we had installed the Gramener visualization server. That's proprietary software that I'm not going to be talking about; it's what creates all of these templates and renders them. Finally, we put this behind an nginx front-end proxy that serves it to the end users. What I'm going to do for the rest of the session is work backwards and talk about how we scaled each of these systems.

Let's start with nginx. Firstly, why nginx? That's an easy one.
It's the server that scales up the most: compared to, say, lighttpd or Apache, it does almost an order of magnitude more requests. So that's a pretty easy choice. The question is how exactly you go about configuring it so that it's optimized.

One of the things we did was make sure it's load balanced. We split the load across four servers: we installed four instances of our back-end web server, and whenever nginx gets a request it sends it to one or the other. This also comes with a bit of a fail-safe: in case there's an error, we just send the request to the next upstream. So if at any point one of these servers gets overloaded, the request simply goes to the next server, and the next, and so on.

The other thing we did was make sure it's cached. You don't want the application server to be constantly hit with requests it need not have handled, especially since much of the content is not dynamic. So at least the static part of the content, and the content that does not change when a request is refreshed, is cached for a reasonably long period, again with fallbacks. Now, this is reasonably straightforward stuff; you can read about it in the nginx wiki. The other thing we did, though, was to explicitly create aliases for static files. See, the way a default application server is configured, it has a way of taking static files and sending them to the front-end proxy, and a way of generating dynamic files. Most people don't change that setting. What we found was that about 95 percent of the incoming requests were for static files. We didn't think it was a problem, until 3 p.m. the previous day, when we learnt that 5 million visitors were going to be coming. So we said: OK, in that case the application server is doing nothing more than taking the files and handing them to nginx. Why not let nginx do that?
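A minimal sketch of that setup in nginx configuration terms; the ports, paths, and timings here are illustrative, not our actual production config:

```nginx
# Four local instances of the back-end web server
upstream backend {
    server 127.0.0.1:8001;
    server 127.0.0.1:8002;
    server 127.0.0.1:8003;
    server 127.0.0.1:8004;
}

proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=app_cache:10m;

server {
    listen 80;

    # Serve static files directly, bypassing the application server
    location /static/ {
        alias /srv/app/static/;
        expires 1h;
    }

    location / {
        proxy_pass http://backend;
        # Fail-safe: on error, retry the next upstream
        proxy_next_upstream error timeout http_500 http_502 http_503;
        # Cache responses briefly; serve stale copies if upstreams fail
        proxy_cache app_cache;
        proxy_cache_valid 200 10s;
        proxy_cache_use_stale error timeout http_500 http_502 http_503;
    }
}
```

The `proxy_cache_use_stale` line is the fallback behaviour that mattered during our brief downtime: when the upstream errors out, nginx serves the last good cached copy instead of the error.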
So we made this last-minute tweak to say: all of these files which are just plain static, take them directly from their actual locations.

The next bit, having tweaked the front end a little, was to try and reduce the payload. You obviously want as little content delivered to the client as possible, mainly because of slow bandwidth; this was a mobile application as well. On a mobile broadband connection you get at best 100 kilobytes per second, and on a normal mobile connection as little as 10 kilobytes per second. So how does one go about compressing that content? One obvious thing to do is gzip the content; I'd guess almost all of you do that. In our case it brought the payload of about 1.5 MB down to about 380 kilobytes, almost a factor of 4 compression, which meant that on a relatively slow connection (which is what I was benchmarking on) it would get served in a little over three seconds. Three seconds is not ideal, but we didn't really have time to get it below that, and it was OK, so we left it at that.

What we did do, however, was tweak the level of gzip compression. nginx by default sets a gzip compression level of one or two, depending on the version, which means it doesn't try too hard to compress. Why? Because the aim is to make it as fast as possible.
But in our case the compression has enormous value, and a lot of the content is cached anyway, so we didn't really have a problem increasing the gzip compression level. More importantly, we could get the total content size down considerably, given the processing power we had; these are fairly powerful processors. So I would strongly suggest that if there's one tweak you want to make to an nginx configuration, it is this: add gzip to begin with, and increase the default gzip level that you find.

Now, you can find more documentation about this at, say, the nginx wiki, which will detail step by step exactly what's going on in each configuration. But if you're looking for performance optimization, I would not suggest starting there. Where you really want to start is a place like HTML5 Boilerplate. Just do a search for "html5 boilerplate" and you'll get a site that gives you a starting point for most web applications. One part of that project, available at h5bp.github.io, is a set of server configuration files: an nginx configuration file, an Apache configuration file, a Node.js configuration file, and so on, that do everything I talked about and a hundred things more, based on a lot of experimentation. You really want to use that as your starting point.

So first we've gzipped the content. The next question is: can we reduce the amount of content itself? We only have one image.
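For reference, the gzip tweak above amounts to a few lines of nginx configuration; the level and types shown are illustrative choices, not our exact settings:

```nginx
gzip on;
# Default is 1; higher levels trade CPU for smaller payloads,
# which is the right trade when responses are cached anyway
gzip_comp_level 6;
gzip_types text/css application/json application/javascript image/svg+xml;
gzip_min_length 1024;   # tiny responses aren't worth compressing
```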
It's an SVG file, which we can then colour dynamically. Unfortunately, it's a 3 MB SVG file to start with; that's what we had generated. So can we cut that size down? Now, if it were a raster image, a PNG or a JPEG or whatever, I'd just toss it into kraken.io, which I find has the best compression online right now (and I'd be very happy to be corrected if someone points me to a better site). I would have dumped the raster image in there, got anywhere from 50% to 90% compression, and used that. But what do I use to compress an SVG file?

So let's dive into the contents of the SVG file. This is what it looks like: it has a series of paths, and each path says, here is the coordinate of one particular point, draw a line here, draw a line here. And you'll notice that it specifies coordinates to more than 10 decimal places. I don't need that kind of precision on the screen: at best, if I specify it with maybe two digits, three digits, four digits max, it should be fine. The question is, how do I go about rounding off these numbers? SVG doesn't have any built-in way of doing this. The good part is that tools like Inkscape support it. When you do a File > Save As in Inkscape, there is a drop-down that asks you what kind of file you want (in case you're not able to see it from there, don't worry, I'm not able to see it from here either), and in that drop-down there is an option called Optimized SVG, which allows you to set the precision, effectively the number of decimal places you want to round to. Firstly, always save as Optimized SVG; I don't see why you would not. And set the number of decimals to what you think is appropriate. So I set it to two, and this is what I got. Two is not good enough, but the great part is the file size is extremely small: it came to 95 kilobytes, all the way down from 3 MB. So at least we're going in the right direction. Then let's take three decimals. This looks reasonable, at 145 kilobytes, and that's before gzipping.
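Incidentally, the same rounding can be done programmatically. Here is a rough JavaScript sketch of the idea, truncating every coordinate in a path string to a fixed number of decimals; real optimizers such as Inkscape's also rewrite path commands and strip metadata, so treat this as an illustration only:

```javascript
// Round every decimal number in an SVG path string to a given
// number of decimal places, then strip trailing zeros.
// A sketch of the precision setting in "Optimized SVG" export.
function roundSvgNumbers(svg, decimals) {
  return svg.replace(/-?\d+\.\d+/g, function (num) {
    return parseFloat(num).toFixed(decimals).replace(/\.?0+$/, '');
  });
}

var path = 'M 10.123456789 20.987654321 L 5.500000000 6.000000001';
console.log(roundSvgNumbers(path, 3));
// M 10.123 20.988 L 5.5 6
```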
So at this point it starts looking reasonable, so maybe it's OK. But if you zoom in, this is what you get: it's still pixelated. Now, I don't expect people to zoom in when they're looking at election results, but there are a few people who do. So, four decimal places. At this point we're up to 613 kilobytes uncompressed, which when compressed comes to about 100-odd kilobytes, and given that this is going to be sent only once, we took a call that we'd rather go for this level of quality than drop back to the previous level. Now, we could have picked the right resolution to serve based on the bandwidth and all of that; we did not have time, this happened at approximately 4:30 a.m. This has a pretty decent resolution; when you zoom in, it looks kind of OK. So that was one aspect of content compression. In other words: start with the biggest thing, make it smaller, then go for the next biggest thing, make it smaller, and so on. This was our biggest thing. The second largest piece of content was the data itself, so we had to compress the data, and I'll come to that in a bit. Beyond that, the rest was pretty small.

Which brings us to: how do we go about optimizing the rendering? See, there are so many filters out there that whenever one of them is clicked, you want the response to be generated dynamically, almost instantly. You don't want to go to the server halfway through: the round-trip time is huge. The time it takes to generate the response need not be large, but the round trip will be at the very least about 100 milliseconds. That's very tough to avoid, and it produces a lag: click on it, it's a little laggy. And on a mobile device...
...that's going to be one second, five seconds, whatever. On the other hand, if these filters are all on the client side, I don't even need to go to the server to generate this. This is a different paradigm from the one where complex content is generated on the server: what you do is send data to the client and let the client do all of the rendering. That's a completely different paradigm, and what I'm going to do is walk you through some of the options you have here and how we applied them.

Firstly, how does one go about generating client content using data? Well, if you look at how content is written today, we either do it in a declarative way, which is like HTML (you just type the content as is), or in a procedural way, where you generate the content using a programming language: you can take JavaScript, put a long string in it, and display it. Those are two paradigms by which you can generate content. Now, if you want to take data and apply it to generating content, there are a couple of ways. One is templates: create a long string. The other is bindings: you say, I want this particular data element to be the width of this element, I want this particular data element to be the colour of this element, and so on. You take various columns and map those columns to attributes of the output; that's effectively what binding means. Now, on the client side I have only one language option, JavaScript, so I've got to go with that. But from the combination of these perspectives, declarative versus procedural, templates versus bindings, what are the options that I have?
Well, all four combinations are possible, and here are popular libraries that represent each way in which this kind of programming is done. I'm going to show you some code, which you can see at the repository listed below; at some point today, after lunch or before lunch, I'll tweet these repositories and the location of this presentation. What we're going to do is create a simple bar chart with each of these. The bar chart takes ten numbers, zero to nine, and draws bars in orange. That's it.

Let's start with Underscore. What Underscore does is let you define a template, just like in HTML, except that inside the template you can use the equivalent of server-side includes. You can effectively write JavaScript code, put a for loop out there: for (var i = 0; i < 10; i++). As you loop through the numbers, you put in a div with a width that is computed. In other words, you're embedding JavaScript into HTML. Then there's an _.template function that takes this stuff and converts it to HTML, effectively evaluating all of the JavaScript directly on the client side, and renders it. That's one paradigm.

Now let's take the procedural way of doing the same thing, which you would do in jQuery. Here we loop through the numbers one to ten and take the chart element, then append a string which is dynamically constructed: the width in the string is computed in JavaScript, and you say the width is i times 20. So we're doing the same thing, except not as part of the HTML but as part of the JavaScript; still constructing the string and dumping it out there. Both of these are template-based approaches.
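To make the comparison concrete, here's a tiny, dependency-free version of the same exercise: building the bar-chart markup as a string, which is essentially what both the Underscore and jQuery variants boil down to. This is illustrative only; the libraries add escaping, caching, and DOM handling on top:

```javascript
// Build bar-chart HTML for a list of numbers: one div per number,
// its width proportional to the value (n * 20 pixels), exactly as
// in the templating examples discussed above.
function barChart(numbers) {
  return numbers.map(function (n) {
    return '<div class="bar" style="width:' + (n * 20) + 'px"></div>';
  }).join('\n');
}

// The talk's data set: the numbers zero to nine
var numbers = [];
for (var i = 0; i < 10; i++) numbers.push(i);
var html = barChart(numbers);
```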
You also have the option of bindings, and Knockout is a pretty good example of how bindings work. You say: I'm going to have a bar chart in which I bind this element to a data set called numbers, where numbers is an array from zero to nine. Then, within a foreach, you loop through and create a bar in which you bind the style, setting the width to the number times 20. Here you're not quite creating the HTML; you're letting Knockout do that. What you're saying is: I have this attribute, and this attribute maps to this function; the function could just take the data from a column, or anything else.

The other approach is D3, where you do the same thing not in the HTML but in JavaScript. In D3 you say: I want to create a bunch of divs based on this data, which is the range of zero to ten, meaning all the numbers from zero to nine. Set the class attribute to "bar", set the style to a function which is the number times 20, and we're done.

All four options are possible, and I had to weigh the relative pros and cons of each. There were two factors that concerned me. One was the size of these libraries: D3 is huge, 143 kilobytes; Underscore is six kilobytes. There's just no question which one wins there. Knockout and jQuery are in between. The good part was that I already had a dependency on jQuery in any case, so by default I would have gone for jQuery. But there was one other problem, which is that I also wanted a certain amount of animation: I wanted these bars to move dynamically.
As I click on a filter, I want the bar to gently slide, which I would have shown if I were connected to the net, but I'm going to skip that. Because of this, we had to pick something that could animate. What we did was go in for Underscore, plus an alternative version of Knockout that we created ourselves, a very small library that permits animations. Today, if I were to go back and re-examine that choice, I'd probably have chosen Knockout, or D3 if I could afford the size. But at that point we could not.

So we have client-side rendering, and this is optimized from a responsiveness perspective. I'm not reducing the amount of data; I'm just making it feel faster, because I'm sending all the data out there, and the client can completely disconnect from the internet and still play around with it. Which means they're going to be slightly less worried about when the next result is coming in; they're at least going to be busy playing with these filters, and that occupies their attention. All of these gimmicks you use to distract users come in handy.

But we did have to optimize the data, and what the data looked like is this: another one and a half megabytes of data every refresh, which happens every second. There's no way I'm going to send one and a half megabytes every second. Apart from the fact that it's a lot of bandwidth for the client, we were paying for the bandwidth on the server. Pretty expensive. But some of it is static: the constituency names are not going to change. Further, some of it is repeated; we'll come to that in a bit. So some of it is redundant. I don't need "Peddapalli" repeated 50 times, once for each candidate; if it were Modakkurichi, it would have been repeated a thousand times. We don't need that.
We just need some identifier that tells us this is the constituency. And the worst part is, some of it is misspelled or just plain wrong. Now, "wrong" comes in two parts. To begin with, we thought that the last column out there, titled "winner", was going to be the one that tells us whether a person has won or not. We tested this the week before: there was a session where we were checking whether we got the counts right, we sat on the feed over a span of four hours, and we found that every single one of these winner values was wrong. The winner column is simply not updated in real time. What they do is update the number of votes as rapidly as they can, and then, a few hours later, they come back and fill in the wins. It would have been helpful to know that beforehand; thankfully we did a dry run, so we knew. So how do you calculate the winner? Actually, you don't calculate it. You pretty much cannot find out whether a person has won or not. What you can do is say that the candidate is leading. You don't really know that the election has been called until, firstly, the Election Commission calls it, and then Nielsen decides to take the data and put it in the feed. So you would have seen, at least on CNN-IBN and a bunch of other sites, that they made the assumption that a candidate who's leading has won. At any point you only show the leading candidates and the undeclared ones; you don't bother showing whether they've formally won, and in the heat of the moment, that's acceptable.

But we definitely had to take care of the other kind of problem, things like misspellings. For example, the state codes were different between Nielsen and the Election Commission, so you start mapping those. The other part was: how much data do we really pick up from the server?
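Since the winner column couldn't be trusted, leading status has to be derived from the raw vote counts on every refresh. A minimal sketch, with field names invented for illustration (the real feed's schema differed):

```javascript
// Derive the leading candidate in each constituency from vote
// counts alone, since the feed's "winner" column was not live.
// Field names here are invented for illustration.
function leaders(rows) {
  var best = {};
  rows.forEach(function (r) {
    var cur = best[r.constituency];
    if (!cur || r.votes > cur.votes) best[r.constituency] = r;
  });
  return best;
}

var lead = leaders([
  { constituency: 'Peddapalli', name: 'A', votes: 3400 },
  { constituency: 'Peddapalli', name: 'B', votes: 1200 },
  { constituency: 'Modakkurichi', name: 'C', votes: 900 }
]);
console.log(lead['Peddapalli'].name); // A
```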
There were all kinds of columns; what I showed you earlier was a subset, and the 1.5 MB is a subset. There were several more columns out there, created by various people to help them tag things: whether this particular party belongs to this particular alliance, whether this candidate belongs to a certain alliance (even if a party belongs to an alliance, a candidate doesn't necessarily always belong to that alliance in every circumstance). We didn't really need all of those tags. Now, when we reduced the number of items queried, instead of SELECT * we just selected what we needed from the database, and that improved the query time dramatically. Remember, we have to get these queries running extremely rapidly: we have less than a second to run them, and we want to run them every second, so every millisecond counts.

The other thing was to normalize all of the static data. We need a list of candidates, and we can send that list one time. Then, if the list of candidates is ordered in a predefined way, the only thing we need to send after that is: how many votes did each get? In other words, if there were 8,000 candidates, then I send once an array which contains all of the candidate information (which party they belong to, which constituency they're contesting in, all of that), and from then on I only send the vote counts, as an array of 8,000 numbers. Effectively, we're partitioning the data into what is updated versus what is static. So we created a candidates file, which had all of the static data.
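The static/dynamic split described above might be sketched like this; field names are invented for illustration:

```javascript
// Split candidate data into a one-time static file and a tiny
// per-update dynamic file. Both rely on the same fixed ordering.
// Field names here are invented for illustration.
function splitStaticDynamic(candidates) {
  var ordered = candidates.slice().sort(function (a, b) {
    return a.id - b.id;
  });
  return {
    // Sent once: names, parties, constituencies
    staticFile: ordered.map(function (c) {
      return { id: c.id, name: c.name, party: c.party, constituency: c.constituency };
    }),
    // Sent every refresh: just the vote counts, in the same order
    dynamicFile: ordered.map(function (c) { return c.votes; })
  };
}

var files = splitStaticDynamic([
  { id: 2, name: 'B', party: 'P2', constituency: 'Peddapalli', votes: 1200 },
  { id: 1, name: 'A', party: 'P1', constituency: 'Peddapalli', votes: 3400 }
]);
console.log(files.dynamicFile); // [3400, 1200]
```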
It just said: in this particular constituency you have this particular candidate, who stands for this particular party. The dynamic file looked like this. It had some redundant data, which we put in because mobile devices don't really like to compute a lot of stuff, so we included a certain amount of precalculated data that could have been computed on the client. But the bulk of it was simply how many votes each candidate got as of a certain point in time, and that is what constantly gets streamed.

Now, you'll notice the choice we've made here is JSON, as opposed to CSV or a bunch of other things you could send over the web. Why is that? Well, firstly, let's take CSV versus JSON in terms of file size. CSV is smaller, no doubt about that. But when you gzip them, you'll find there's no difference. What gzipping does is take all of the common stuff and knock it out, and what JSON does is constantly repeat the column names, so these two roughly cancel each other out. The size overhead of JSON is negligible once you gzip. So that wasn't a consideration, and JSON is flexible: I can have hierarchical structures, which I cannot have in CSV. In the vast majority of cases I'd normally go for CSV, but in this particular case it was a no-brainer; it had to be JSON. Each of these updates ended up being a 27 KB gzipped JSON file. In other words, that's all I need to send, down from 1.5 megabytes.
That's all I need to send for a person to know exactly who has won at any point in time, which definitely helped.

So all of this was set up by around 6:30, and we were there, ready, waiting for the 5 million people to come in. It was pretty tense. In the first hour we hit about 0.6 million. So if the traffic kept up for six hours, we'd easily get to 3 million, and through the day it looked like we were going to hit the target of 5 million quite comfortably. By which time tweets started rolling in, and these were mostly good: things like "probably the best done website for tracking elections", "Gramener seems to be the company running the elections, must-watch link", and finally, "Bing's done a much better job than Google at the election results". This ended up being a big theme: almost every tweet that mentioned both the Bing website and the Google website seemed to favour the Bing one. This was relayed to them, and the Microsoft team was completely thrilled.
Actually, in the first hour we had a couple of failures. The first: we have an admin page on our server, and the admin page lets a person log in, gives them various roles, and so on. We deployed this at 6:45 or so, on not much sleep, and we forgot to set a password for the admin page. Which is not a good idea. Despite the big warning out there that says "no administrative roles defined", we just went ahead, and some nice chap in Ahmedabad found this. He spent a good ten minutes playing around with it, tweaking things to see what would happen, and ended up shutting everyone out.

So at around 8:40 we found that the site was not accessible: it said 401 Not Authorized. Panic. We didn't really care what had happened or what was going to happen. Just kill the login, restart the server, get it back up, set a password, move on. Total downtime: approximately one and a half minutes, which most people did not notice because of the nginx caching setting we had used. We had said that if the proxied server is not available, just take the file from the cache and send it. We counted the Not Authorized errors that we served: it affected fewer than 100 people, and we were "lucky" enough to be among those 100 ourselves. So, a message to the Ahmedabad chap who did this: if you're seeing this video, please contact gramener.com, we want to recruit you.

The other thing that happened was that at around 8:55 our server was getting close to about 65 percent utilization. At this point we were seriously worried, because this is a relatively small server. We had planned for load balancing and all that, but in the heat of the moment and all of these changes, we did not turn on the traffic manager.
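As an aside, the nginx behaviour described a moment ago, serving a stale cached copy when the upstream fails, maps onto nginx's `proxy_cache_use_stale` directive. This is a hedged sketch, not the actual configuration from the talk; the cache name, paths, and upstream address are placeholders:

```nginx
proxy_cache_path /var/cache/nginx keys_zone=viz_cache:10m;

server {
    location / {
        proxy_pass http://127.0.0.1:8000;   # placeholder upstream app server
        proxy_cache viz_cache;
        proxy_cache_valid 200 10s;          # cache good responses briefly
        # If the upstream errors out or times out, keep serving the last
        # good cached copy instead of surfacing the failure to visitors.
        proxy_cache_use_stale error timeout http_500 http_502 http_503;
    }
}
```

With a setting like this, a short outage of the application server shows up to most visitors as slightly stale data rather than an error page.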
We did not turn on the backup server. There was one machine, slightly less capable than my laptop, running the visualization, and there wasn't an option to turn anything else on at that point. So at 8:55 we started praying, and it was really worrisome: 1.3 million visits in the next hour. At this point the server ought to have crashed. The one thing that saved us was the fact that Mr. Modi won decisively. The election was called, the interest in the elections gently started coming down, and it was downhill after 9:30. This is probably the weirdest reason anyone has ever thanked the BJP and the NDA, but thank heavens; otherwise our server would have crashed.

The tweets were consistently positive: "thanks Bing, bing.com/elections is the first time I'm more active on Bing than Google", "for once Bing beats Google, check out the visualization", "Bing's election graphics are superb", and so on. And we hit not 5 million but 10 million on that day. That was how the day went. This was the team that created the visualizations, and it was without doubt one of the most memorable days we ever had.

The lessons I took away from this: firstly, you don't need big infrastructure to handle big data. You can get away with a machine smaller than my laptop, if you're lucky enough, and if you follow a bunch of fairly good principles. The first principle is to test, test, test. Data is often wrong. Data is often incorrectly transformed. Data is often verbose. Data is often structured in ways you don't need; you don't necessarily have to keep the structure the way it's given to you. That what you get is right, correctly structured, and so on: all of these are assumptions worth questioning. Lastly, remember that no matter how far you think you've optimized, someone can always beat you. Do not stop until you literally run out of time, because there's always more optimization to go for, and there is no such thing as
a perfectly optimized piece of content. Thank you, and I'll take questions.

Okay, so we're going to open it up to questions right now. If you have a question, please raise your hand and we'll have somebody bring you a microphone.

My question is: why didn't you use a content delivery network? Your users are geographically dispersed, a lot of the diaspora is interested in your content, and if you just host it at one place, and I think it's in Singapore, the round-trip time from across different continents would be fairly high.

Absolutely. So we had set up a content distribution network which would serve traffic out of the US as well as out of Singapore. This was via Azure Traffic Manager: we split the load across two servers in the US and two servers in Singapore, and the plan was that all of these would get deployed at 6:45 in the morning. Which is why I'm saying we got lucky. So the short answer is: yes, absolutely you should. Next question.

I don't know whether it's proprietary or not, but what's the back-end technology you use?

The Gramener visualization server is the application server; the rest of it is not proprietary. If I can find the slide that shows the architecture: that chunk on the right side, that's the Gramener visualization server, and that's proprietary. The rest is not. Meaning all of the code that I showed you is not, and it's in the repository that I pointed to; I will be committing the nginx configuration file as well. And you can always do a view-source. A fair bit of it is in JavaScript, not even compressed. We didn't need to; gzip was good enough. Not even minified; gzip was good enough.

Okay, next question. Raise your hand up high. Next question. Okay, we have about another five more minutes for questions and answers.
So keep the questions coming.

Yeah, and I've got some stuff to show you. You said, looking back, that D3 or Knockout provides better animations. Why is that?

The point is, when you're rendering on the client side, you have to decide: do I take the HTML, scrap what I have, and redraw it, or do I just take the attributes and modify them? The advantage of modifying attributes is that if, let's say, I change the "top" style property, the element would ordinarily jump straight to whatever position I set. But if I add a -webkit-transition, or effectively any of the CSS transitions, it moves smoothly instead of in a jerky way, which looks nice. Not just that: with many of these frameworks I can start using JavaScript animation where CSS doesn't provide one. So that adds a bit of jing-bang.

On the question of Knockout versus Angular: Angular comes with animations, and the support we have for CSS transitions these days is pretty good. This is rendered in SVG in any case (I'm knocking off IE8 and below because they can't render SVG), and the vast majority of browsers today that can render SVG also support CSS transitions. Which means that if I change an attribute, I don't have to worry about how exactly it's going to get transitioned. So that differentiation between Knockout and Angular, in the sense that Angular uses JavaScript animations, is slowly vanishing. On the other hand, Knockout is small, so over time people will probably gravitate towards Knockout for this particular aspect. And I'm by no means saying that Angular is worse.

The question I have is: why wasn't there a load-balancing server?

We forgot.

Oh, you forgot? Or is that not available on Azure?

So firstly: was there a load-balancing server?
Yes, that was nginx. It turns out there was one server that we used, which was also a load balancer running nginx, and that balanced the load across four different systems. What we did not do was balance the load across multiple servers, for which we had planned and tested using Azure's Traffic Manager, which works beautifully as a network load balancer.

That is the equivalent of the Elastic Load Balancer?

That is correct, and that's Traffic Manager.

And what do you run on the server side? JavaScript, like on the client side?

Right; the framework we used there is the Gramener visualization server.

Hi, this side, up here. This is Mahesh. I see there is data, and there is rendering on the client side. So is that visualization server doing anything?

Firstly, there are a bunch of other visualizations. Secondly, what it's doing is acting like a server-side template. Number one, it is what takes the data from the database and converts it into the JSON files. Number two, it's what stitches together the arguments that come in from the URL and renders the appropriate version of the visualization. And it also stitches together the list of filters. You may remember seeing a list of filters on the right side; we didn't want to send a static list of filters up front, since it is quite dynamic, so we generate the list of filters on the server side. In other words, there is a database which contains the raw data, and there is a set of files that get sent to the client, which include HTML, CSS, JavaScript, and JSON. All of these are created by the Gramener application; the transformation happens entirely there, using a templating mechanism. So there is nothing custom we need to do to render a visualization for elections.

So the visualization server has all the abstractions to do that?

That is correct.
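The division of labour described above, raw data in a database with the server templating out the JSON files the client consumes, can be sketched in a few lines. The actual Gramener server is proprietary, so everything here (table name, columns, constituency names, vote counts) is invented for illustration:

```python
import json
import sqlite3

# In-memory stand-in for the results database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE votes (constituency TEXT, candidate TEXT, "
           "party TEXT, count INTEGER)")
db.executemany("INSERT INTO votes VALUES (?, ?, ?, ?)", [
    ("Varanasi", "A", "P1", 50000),
    ("Varanasi", "B", "P2", 30000),
    ("Amethi",   "C", "P2", 40000),
    ("Amethi",   "D", "P1", 25000),
])

def dynamic_file(constituency):
    """Render the per-constituency JSON snapshot the client polls for."""
    rows = db.execute(
        "SELECT candidate, party, count FROM votes "
        "WHERE constituency = ? ORDER BY count DESC", (constituency,))
    return json.dumps({
        "constituency": constituency,
        "results": [{"candidate": c, "party": p, "votes": n}
                    for c, p, n in rows],
    })

print(dynamic_file("Varanasi"))
```

The point of the pattern is that nothing election-specific lives in the rendering code: swap the database and the same templating step produces JSON for any other dataset.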
Well, I wouldn't say that it has all of the abstractions. We did have to customize the visualization server a reasonable bit for this election context. But yes, it has the underlying layer for us to be able to easily build what it takes to render that.

Sorry, my question is about spatial data. It looks like we have pretty decent data with respect to Lok Sabha constituencies and assembly constituencies, so you were able to use it. What about other data, like census data? Maybe not on the counting day itself, but prior to that, when CNN-IBN was putting out their telecast and trying to use literacy rates and populations to normalize data, were you pulling in census data, where you needed to map census codes to Lok Sabha constituencies?

Okay. So, actually, the map that you see on the side did not exist before we created it, and that was through the painful process of taking a PDF file, zooming in, and tracing the regions. It took a few months to get done, and the only reason we actually went forward and finished it was that we were being paid enough. To be fair, there is now an open-source version of the Election Commission maps on DataMeet. For those of you who may not be aware, DataMeet is a mailing list, a group of people in India who are very enthusiastic about data, and there are a lot of mapping discussions that go on there. So no, it did not exist.
We created it, and there are ongoing efforts on DataMeet, which many of you may be aware of, to create these census maps. Specifically, to the question of whether CNN-IBN or any of the other media companies are using this: by and large, no. Certainly not as organizations. A few interested and enthusiastic people in these organizations are playing around with these, individuals at the Times for example, but by and large on the media side, the amount of time and effort it takes today to create anything map-based is huge. In India today, for instance, there is a piece of work which has made that infrastructure much simpler for states, but not further down. It will take time for us to get there. The infrastructure in the open-source community for mapping is a whole lot better. Interestingly, we are getting as much support from outside India as we are getting from the open-source community in India, and no support whatsoever from the authorities; as I touched upon earlier, mapping is, from a government perspective, unfortunately largely closed. If we wanted a census map today, the only way is to try to take the various bits and pieces of it that have been crowdsourced, or to put together a massive commercial effort to create it.

But what about the mapping between census codes and spatial boundaries?

It has been attempted, though I do not know of many reliable attempts. I know of one effort, the exact URL escapes me, which takes a list of census codes and parliamentary constituency codes and says: this particular parliamentary constituency is made up of seven percent of this ward, eight percent of that ward, and so on.
It is approximate, because the underlying maps are approximate. So yes, I know of at least one such effort that exists, and given reasonably accurate maps it can be automated. But to my knowledge, a reliable mapping between census codes and constituencies has not been done.

I think, unfortunately, that is all the time we have for questions right now. If you have more questions, once again: sp.lk/hasgeek. You can ask any other questions right there. I know that you have many, many more on your minds; please take the discussion there, and if you see him outside, ask more questions then. Everybody give a big round of applause. That was a fantastic discussion right there.