Good morning, everyone. I am Kalyan. Welcome to PyCon India 2020. Our first keynote speaker is S. Anand. Anand is a co-founder of Gramener, a data science company that believes in storytelling and its power to influence decisions. He is recognized as one of India's top 10 data scientists. Anand, I am very pleased to welcome you today. The stage is yours now. Thanks, Kalyan. I'm going to be talking about how to make apps seem faster without optimizing, and to explain why this is the topic for my keynote, let me take you back a dozen years. This was 2008. I had just about learned Python at that time, and I was working on a project when one of our clients said, Anand, there's something I'd like you to do for me. They were buying raw materials from many countries, and currencies fluctuate. So he said, look, I want to figure out what my next month's cost is going to be. I know roughly how much I'm going to buy, but the currency could vary. Can you predict what that cost is likely to be overall? They were buying from China, Mexico, Canada, India, Japan, a variety of different geographies. It was pretty large data, not just for those days but even in today's terms, but I had solved similar problems before. So after about 20-odd years of coding, I figured, look, optimization is the least of my problems. I went ahead, and I'm going to show you what exactly I built, very briefly, without going into the details. I'm going to use the modern equivalent of what I did; the code in those days, obviously, was very different. So today, I would effectively load the purchases data using a library like pandas, which didn't exist then. The data effectively said: these are the currencies we're purchasing in, and this is the value in US dollars that we are purchasing. Then, for each one of those currencies, I tried to figure out what the best model was to forecast the prices. At that time, there weren't that many fancy models.
Autoregression was one of the few models that was reasonably reliable. It basically says that I can predict the price of a currency next week by looking at the price of the currency today, yesterday, the day before, etc., and figure out what weightage I should assign to today, yesterday, the day before, and how many days of lag to use. So what this code does is effectively go through each currency and read the historical currency data. It goes through various possible lags: maybe we should look at the last three days, or if you want a better model, maybe the last five days, the last seven days, the last 10 days. We don't really know which of these is the optimal model. If we take, for example, three years, which is 1,095 days, then it's likely to be an accurate model — it may overfit, but it's still likely to be accurate — but it will be slow. So I'd ideally like the best trade-off between speed, which is where optimization comes in, and accuracy. Then fit the model, figure out which of these seems to be the best model, and use that for the forecasting. That's the same approach I took in 2008. And once I found out what the best model was, I predicted for the next 30 days, figured out how much it's going to change after 30 days versus today, and wrote a print statement that said the sales value will move from the previous purchase sum to the forecasted value. If you didn't understand the details, what you need to know is basically that I wrote a program fairly quickly that would tell them there's a 4% increase in next month's purchase price for all of their commodities. And I took it to Gopi. This took about a day and a half for me to put together, test, run, build, et cetera. Let's run this. This is exactly what I did with him as well. Here I am. I'm going to show you what this program does.
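The lag-selection loop described above can be sketched as follows. This is a toy stand-in: in practice you would fit a real autoregression (e.g. with statsmodels), but here a naive "mean of the last `lag` days" predictor scored by mean squared error illustrates the same search for the best speed/accuracy trade-off. All data and lag values are hypothetical.

```python
def forecast_error(prices, lag):
    # Score a naive lag-`lag` model: predict each day as the mean of the
    # previous `lag` days, and return the mean squared error.
    errors = [
        (prices[i] - sum(prices[i - lag:i]) / lag) ** 2
        for i in range(lag, len(prices))
    ]
    return sum(errors) / len(errors)

def best_lag(prices, lags=(3, 5, 7, 10)):
    # Try each candidate lag and keep the one with the lowest error:
    # the speed-vs-accuracy trade-off described above.
    return min(lags, key=lambda lag: forecast_error(prices, lag))

# Hypothetical daily rates for one currency
rates = [1.00, 1.10, 1.05, 1.20, 1.15, 1.30, 1.25, 1.40, 1.35, 1.50, 1.45, 1.60]
print(best_lag(rates))
```

A longer lag uses more history (and more compute) per prediction; for trending data like this, a short lag tracks the series more closely, which is exactly the kind of result the search surfaces.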
It takes historical data and runs it through a forecasting pipeline, and while the program was running — this is exactly how it was in the original demo — I was just saying, yeah, this is what it's doing, blah, blah, blah. And Gopi was very patiently waiting. It was actually a reasonably fast program, in the sense that it crunched a hell of a lot of data. But you don't really want to stare at a blank screen for anything more than 10 seconds, or even that long. What you really want is for the results to start coming through quickly. And that's when I realized that it's not enough if I learn optimization, no matter what language it is in. What I really need to learn is also how to make apps seem faster without optimizing. And that's what this talk is about. All of the slides and the code for this talk, you'll find on github.com/sanand0 under the PyCon India 2020 repository. This is the one URL you need to know to revisit any of the contents from this talk. And if you want to catch me on Twitter, the handle is @sanand0. So let's dive in. How does one go about making apps seem faster without actually doing the optimization? Well, I was doing some research at that time, and over the years I've picked up a few tips around this. One of the best tips comes from a book on human-computer interaction by Alan Dix and others. They talk about three things an app should have for it to be usable: learnability, flexibility, and robustness. For my purposes, robustness was the relevant part, because it included four things that I thought were really important. The first is to prioritize. What do we want to show first? Let's make sure that people see the most important things first. Suppose I sorted the data alphabetically. So let's take all of the currencies. All of these currencies are sorted alphabetically, and I was processing them in this order. But they are buying $3.4 million worth of items in Chinese yuan and only $1,000 in Danish kroner.
Why would I ever bother processing Danish kroner ahead of, let's say, the Mexican currency, where they're spending $2.3 million? I should really sort this in descending order and predict the most important ones first, so that they get the best forecasts early. That's an example of prioritization. Second was just keeping them updated. What I was doing was running a verbal dialogue saying, yeah, this is what it's doing — effectively entertaining Gopi while the program stared at him with a blank cursor. We need to show progress. We need to tell the user what the state of the application is and how long it's going to take. We also need to recover from errors. The number of times I've pressed Ctrl+C, or had a network error, or was writing an application, debugging it, and then had to restart all the way from the beginning — I can't even count. What we want is for an application to be robust: if it gets stopped, or you stop it midway, you can continue from where you left off. And pre-compute. If you know that some part of the answer is not going to change, just calculate that up front. No need to recalculate it every single time. These are four principles that go into making an application more robust — making it seem faster without actually doing any optimization. And in this talk, what I'm going to do is, firstly, ask you to keep the word purple in mind. P-U-R-P is the acronym that will help you remember that by Prioritizing, Updating, Recovering, and Pre-computing, you can make things seem faster without actually being faster. I'm going to go through each one of these step by step, with examples, to show how you can optimize the same application that I shared. You're welcome to follow through on the code in more detail on the GitHub repo that I shared, but for now it's probably best if you focus on where I'm pointing to in the code.
I'm going to start with updating users — not the first one in the acronym. In fact, I originally wondered whether I should reorder the acronym to match the order in which I'm going to explain it, but then the acronym ended up being URPP, which isn't exactly a great acronym. So bear with me, please. I'm going to start with making sure that users know what's happening. And the obvious way to do that is print statements. So in this particular piece of code — which eventually, after heaven knows how long, prints that the sales will move from 11,046 to 11,077 — we just introduce a print statement: one here, as we loop through every currency, so that it prints the currency, and then one as we loop through all of the lags, to say this is the lag, in number of days, that I'm computing. Let's add that. Now, if I run forecastprint.py, the good part is that it constantly keeps me updated on (a) the fact that it's doing something, and (b) what it's doing. Even printing a dot, or any kind of progress, is good, because people know the application is working, but printing the specific action it's doing keeps us more informed. So in this case, I can see that it's processing Danish kroner. What it doesn't yet tell me is whether it's gone far or there's still a long way to go. I don't know the percentage completion. That's a gap. Also, it's a bit difficult now to scroll through — there's a lot of printing. Maybe I don't want this level of detail. Sometimes I do want this level of detail, and sometimes I don't. How do we solve for that? Well, the next logical step is to use a library like logging. But before we go there, the main takeaway to keep in mind is that if something isn't printed every three seconds, users are going to worry.
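The simplest version of this — a print per step — might look like the sketch below. The currency list and lags are hypothetical stand-ins for the real data:

```python
currencies = ["CNY", "MXN", "CAD", "DKK"]   # hypothetical processing order
lags = [3, 5, 7, 10]

for currency in currencies:
    print(f"Processing {currency}")          # tells the user something is happening...
    for lag in lags:
        print(f"  fitting lag={lag} days")   # ...and exactly which step it is on
        # the expensive model fitting would run here
```

Even this crude version removes the blank-screen problem: the user sees output every few seconds and knows the program hasn't hung.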
I'm talking about console applications, but on the web, if users have started something that's a long process and they haven't received an update in three seconds, they'll assume something is wrong and chalk it up to bad user experience. So what can we do that's a little better? Use the logging library. Now, the reason you want to use logging is, one, you can log at multiple levels. So, for example, let's take the currency. We can print the currency at the info level and the lags at the debug level. What this lets us do is, firstly, print a little more detail — I've configured it to show the times, and I'll quickly explain why that's important — but it also segregates the prints by whether they are for info or for debug. Now I can choose to say: instead of logging at the debug level, just log at the higher info level. In that case, it logs a little less. It just logs the currency that is currently being processed — not as rapidly updated, but at least I can comprehend the progress and see what the flow is. And I can also see that it took three seconds to do this step, and three seconds to do that one. So it's roughly taking three seconds per currency, which is also helpful. That's a feature the logging library provides by default, where you can specify a format and a datetime specification. But another reason why I always prefer logging over print of any kind when deploying in production is that it's very easy to redirect it to log files. It's not just when running it live that we want to see what happened; in production, we usually want to save the results elsewhere too. So that's the next logical step. But remember, we still don't have a percentage completion. That's the next step, and tqdm is a perfect library for that. It allows you to show the percentage completion. It just requires a couple of changes: after you import tqdm from the tqdm library, take your iterable and wrap it in tqdm().
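Before moving on to tqdm, the logging setup just described might be sketched like this — the format string and the currency/lag values are illustrative, not the talk's exact code:

```python
import logging

# Timestamps in the format make it easy to see how long each step takes.
# force=True (Python 3.8+) replaces any handlers configured earlier.
logging.basicConfig(
    format="%(asctime)s %(levelname)s %(message)s",
    datefmt="%H:%M:%S",
    level=logging.INFO,        # raise to logging.DEBUG to see lag-level detail
    force=True,
)

for currency in ["CNY", "MXN"]:                # hypothetical currencies
    logging.info("Processing %s", currency)    # shown at the INFO level
    for lag in (3, 5, 7, 10):
        logging.debug("fitting lag=%d", lag)   # hidden unless DEBUG is enabled
```

Switching `level=logging.INFO` to `logging.DEBUG` is the one-line change that toggles between the coarse and the detailed view.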
All I did was wrap tqdm() around the original array, and similarly around the list of lags. And what that does — in fact, let me start by only adding it to the currency loop. When I do that, it starts by showing me the percentage completion, which is currently 0%. Now it's moved to 6%. Now it's moved to 12%. I also know that it's taking about 3.2 seconds per iteration, and at this pace, there are still 40 seconds left. So I can use that to decide whether I want to step away from the system, or alt-tab and look at something else, set a timer, whatever. I know when I need to come back. And that is very powerful. So when you're showing progress, showing the percentage complete gives as much information as possible to the user. That's the next stage in updating progress. tqdm also supports this at two levels. If you put a second tqdm under the first, it will show the progress in two rows: the first row for the overall progress and the second row for the inner loop. Normally it would just show two lines, and the second line would constantly get updated, but on Windows — and currently, I think, in VS Code — there seems to be a bug where it's not able to move one line up. So basically, look at the bottom two lines and ignore the previous ones. The last-but-one line shows how many currencies it has processed, and the last line shows the progress of the lags. All of this works fine when we are dealing with console applications. What if I want to create a web application in which I keep computing things and show progress? WebSockets are a great way to do that. WebSockets allow bidirectional communication, from the server to the client and from the client to the server. So the server can, as it computes any result, push something — write a message — to the client and tell the browser, this is what I've done. Let me show you an example of this in action, and then I'll explain how it was created.
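Before the WebSocket demo, here is what the tqdm change just described looks like in a sketch, assuming tqdm is installed (`pip install tqdm`); the data and the `time.sleep` stand-in for model fitting are hypothetical:

```python
import time

from tqdm import tqdm

currencies = ["CNY", "MXN", "CAD", "DKK"]   # hypothetical data
lags = [3, 5, 7, 10]

# Wrapping any iterable in tqdm() adds a bar with percentage complete,
# iteration rate, and estimated time remaining.
for currency in tqdm(currencies, desc="currencies"):
    for lag in tqdm(lags, desc="lags", leave=False):   # nested inner bar
        time.sleep(0.01)   # stand-in for the expensive model fit
```

The `leave=False` on the inner bar clears it once each currency finishes, which keeps the two-level display readable.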
So I'm opening an application that prints the currency that's being processed and the lags it's running the calculation on. And it effectively looks like a series of print statements. Take a look at it. Looks like I have... oh, okay, sorry — I forgot the name of the application. This is currently processing it for fewer lags than I was expecting. Let's reload that. So it's processing and effectively doing the equivalent of printing for every single currency and every single lag. The only thing I did here was, instead of print, write a write-message call. This, of course, requires a WebSocket application that handles it. Currently, I'm using Gramex, which is an open-source framework that we've built — you can look at that on gramener.com. But you can use a WebSocket library directly, or Tornado, or pretty much any WebSocket library. The important thing here is that communication over a WebSocket is simply a matter of writing a message. How exactly does that work? Let's look at this a little more closely. I'm going to go to the Network tab, select the WebSocket section here, and reload this page. You'll see that a WebSocket has been opened, and this particular WebSocket — let me make it a little smaller — is constantly sending some text. This is exactly the text that is being printed by the server using handler.write_message, and it's being received by the client. What does the client-side code look like? Very straightforward. We create a new WebSocket at whatever the URL is. What I'm doing is taking the current URL, replacing the http scheme with ws, and then changing the .html — there's a forecastsocket.html — to forecastsocket.ws instead. So the WebSocket URL is simply the current URL with those replacements. And when this WebSocket is opened, we send a start message to the server.
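A minimal server-side equivalent, sketched with plain Tornado rather than Gramex (the handler name, URL, and currency loop are all hypothetical), might look like this:

```python
import tornado.ioloop
import tornado.web
import tornado.websocket

class ForecastSocket(tornado.websocket.WebSocketHandler):
    def on_message(self, message):
        # The client sends "start"; instead of one big answer at the end,
        # push a message per step so the browser can show live progress.
        for currency in ["CNY", "MXN", "CAD"]:      # stand-in for the real loop
            self.write_message(f"Processing {currency}")
        self.write_message("done")

app = tornado.web.Application([(r"/forecast/ws", ForecastSocket)])
# To serve it: app.listen(8888); tornado.ioloop.IOLoop.current().start()
```

Each `write_message` call is the WebSocket equivalent of a `print`: the browser receives it immediately in its `onmessage` handler.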
Out here, the way Gramex is configured, we've said that whenever a message is received, run this function called forecast. And forecast gets a handler and the message. Now, we aren't really doing anything with the message. What we have to do is just write whatever message we want to the handler. The browser then receives this in the onmessage function of that WebSocket. And what we do with that message is simply take the list of messages and append whatever is in the message data to the HTML. That message data contains literally whatever strings were being printed here. Now, this is an opportunity for us to make front-end applications as responsive as console applications — in fact, potentially even more so. And we'll talk a little more about that. But so far, the one principle I've talked about is how to update progress. There's one more thing that I'd like to highlight here, which is: when doing this, set expectations well. There was this application we were building for the Ministry of Finance a few years ago. The data was on a really, really slow network, and it was quite large. So I didn't want to take risks. Using the equivalent concept of WebSockets, what we did was set up a flow — ah, that was a bit fast; we've really optimized it now, but let me pretend I'm on a slow 3G network and reload this. What this does is load the data and show the progress of loading it. But I was worried about something, and you may also have noticed that it got to about 20% and then suddenly started showing the results. This is a hack. What I said in the application was that it can take up to two minutes to forecast the results. Now, in all probability, it would take only about 10 or 15 seconds, max 20 seconds. But it set the expectation, and it also set the progress bar for two minutes. So people expect that it'll take long, and then, if anything, they're pleasantly surprised that it ran faster.
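The "up to two minutes" trick can be sketched as scaling elapsed time against a deliberately pessimistic budget. The function name and the numbers are illustrative, not the Ministry of Finance code:

```python
def expected_progress(elapsed_s, budget_s=120):
    # Scale the progress bar to a pessimistic time budget.
    # If the job actually finishes in ~20 seconds, the bar jumps from
    # around 17% straight to done, and the user is pleasantly surprised.
    return min(elapsed_s / budget_s, 1.0)

print(expected_progress(24))    # 24s elapsed against a 2-minute budget → 0.2
```

The trade-off is honesty versus delight: an accurate estimate risks overshooting on a bad day, while a buffered one almost always finishes "early".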
This is one of those cases where we are actually tricking the user. Making the application genuinely faster is the topic for a different talk — that's optimization. If you want something to seem fast, one of the most powerful ways is to set low expectations. You can either set accurate expectations and take the risk of randomness, or plan for a certain amount of buffer. Plan for things to take a little longer, set expectations that they might, and if you beat those expectations, cool — the user's happy. Let's move on to the next topic: how does one go about recovering from errors? If we are midway through an application and we stop — we press Ctrl+C, the network stops, the application crashes, we have an error in the code — how do we continue from where we left off? Let's look at what we've done in this application. The forecast-recover application adds a library called sqlitedict. sqlitedict lets you create a persistent SQLite database in which you can store key-value pairs and use it just like a dictionary. So, for example, in Python I say from sqlitedict import SqliteDict — that loads the library. Let's say this object is a SqliteDict which I'm going to store at some test.db, and I tell it to autocommit, so that every time I put something in the object, it persists. So if I say x is 1, y is 2 — treating it exactly like a dictionary — and update with a is 3, b is 4, that's all. Now, this creates a test.db file, and this test.db actually contains exactly the same contents that I had. You could instead store this in a regular database. I suspect the text is too small for you to see, and I'm not able to make the screen bigger, but just recognize that this is a SQLite editor where I'm seeing the values in a pickled form.
As long as the contents you have are picklable, sqlitedict will be able to store them, and what that means is that the next time you import and load, the obj will be able to access all of the keys you had earlier. So if I convert obj into a dict, it contains x is 1, y is 2, a is 3, and b is 4 from the previous session. So that's the first thing we're doing here: just create a cache that we can recover from. Then, as we loop through, if the currency is not already cached, that's when we do all of the calculations. If the currency is cached, we just pick up the model we want to use from the cache. And here, we take the cache — which is where we're saving all of these models, or at least the best models — and store the model if it's not already stored. That's it. We just made these four changes — ah, a bit of an error here; this should have been changed here. Now, given this set of changes, how does the application work? Suppose I run forecast-recover, and it's doing the first round of calculations for the Australian dollar. It's doing the second round for the Brazilian — no, the Canadian dollar — and let's stop it there. Now, if I run it again, it skips the first few very quickly: Australian dollar, BRL, CAD are already done. It carries on with the Swiss franc, Chinese yuan, Czech koruna. And if I stop it and restart, it continues from where it left off. This allows us to debug applications remarkably faster. And if there's any error midway, we allow people to recover. This is what's called forward recovery — it allows us to go forward, in contrast to something called backward recovery, the equivalent of undo. Undo becomes important when we have write operations, but I usually deal with data, and read operations are the more important ones. It's forward recovery that ends up being the critical factor there. Now, the other thing is: what if there is a change midway? See, we've cached it.
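The recover-from-cache pattern above can be sketched with the standard-library shelve module, which offers the same persistent dict-like interface as sqlitedict (swapped in here so the example has no third-party dependency; the path, currencies, and model stand-in are hypothetical):

```python
import os
import shelve
import tempfile

cache_path = os.path.join(tempfile.mkdtemp(), "forecast_cache")

def best_model(currency):
    return {"lag": 5}   # stand-in for the expensive model search

# A persistent dict-like cache: if the run is interrupted, the next run
# skips every currency that was already computed and continues from there.
with shelve.open(cache_path) as cache:
    for currency in ["AUD", "BRL", "CAD"]:
        if currency not in cache:
            cache[currency] = best_model(currency)   # compute and persist
        model = cache[currency]                      # cached or fresh
```

With sqlitedict the only changes would be `SqliteDict("test.db", autocommit=True)` in place of `shelve.open(...)`; the loop is identical.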
Now, what that means is that the next time I run it, we're just going to zoom through and say: here's all the data, here's the answer. But suppose in the middle of the day we got new data for the Hong Kong dollar and we want to rerun it. How do we rerun just the Hong Kong dollar, instead of having to rerun everything? That's easy enough. We can check when the file was updated. So if the currency Excel sheet, which is where we're getting the data from, was modified at a certain time, we can see whether the previously recorded modification time — let's not worry about the exact structure of the code — was earlier or later. If the file was modified after the cached result, redo the calculation; if not, there's no need, and we just jump to the end. The way we do this is by caching not just the model. Last time we cached the best-fit model; this time we cache the best-fit model as well as the time at which it was computed, or the time at which the file was updated. The next time it's rerun, we check whether we have the model for that particular currency. It may or may not exist. If it doesn't exist, we assume it was computed a really long time ago. We get the first value, which is the file time, and compare it with the new file time. How exactly does that work? This is in forecastchange.py. The first time it runs, it does the same thing as before: it goes through every currency, because it hasn't yet created the cache. The second time it runs, we've cached at least a few of the values — Australian dollar, Brazilian, and so on — and the first three went through with no problem at all. Now suppose I got a new version of the BRL data — the Brazilian currency. What's it called? Real. Okay, the Brazilian real. So let's say we have new Brazilian real data: as of the 30th of September 2020, one USD is 5.65 BRL.
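The timestamp check just described might be sketched as below — an in-memory cache keyed by currency, storing the file's modification time alongside the model. The file name, currency, and model stand-in are hypothetical; the `os.utime` call at the end only makes the demonstration deterministic on filesystems with coarse mtime resolution:

```python
import os
import tempfile

data_file = os.path.join(tempfile.mkdtemp(), "currency.csv")
with open(data_file, "w") as f:
    f.write("date,rate\n")

fits = 0
cache = {}   # currency -> (file mtime when the model was computed, model)

def get_model(currency):
    # Refit only if the data file changed after the cached model was built.
    global fits
    mtime = os.path.getmtime(data_file)
    cached_mtime, model = cache.get(currency, (0, None))
    if cached_mtime < mtime:
        fits += 1
        model = {"lag": 5}          # stand-in for the expensive refit
        cache[currency] = (mtime, model)
    return model

get_model("HKD")                    # first call: computes and caches
get_model("HKD")                    # file unchanged: served from the cache

with open(data_file, "a") as f:     # new data arrives midday
    f.write("2020-09-30,5.65\n")
st = os.stat(data_file)
os.utime(data_file, (st.st_atime, st.st_mtime + 1))   # force a newer mtime

get_model("HKD")                    # file is newer than the cache: refits
```

Only two fits happen across the three calls: the stale-cache check skips the middle one entirely.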
Oh — if we run the forecast-change application, it actually recalculates the BRL, but not the Canadian dollar or the Swiss franc and so on. That's because only that file's timestamp has changed. So this is another way we can make things seem faster: by avoiding calculations we don't need to do, and by recognizing that applications aren't really meant to run once — they are meant to run multiple times. But secondly — and I think this is perhaps the more important point — there's no way I would have known this to be a scenario. I had no way of anticipating that we might get new and better currency data midway, except that, in this particular case, that's what Gopi told me. He said, look, we get currency updates midday. There are corrections that happen in the market, there are human errors, there are data errors. So the file you get in the morning may be accurate 80% of the time, but the remaining 20% of the time we want to make this update. And that only comes through feedback from people who know the domain. So one of the really effective ways of figuring out how to make your application seem faster, more robust, more responsive, is to ask people what scenarios they anticipate and what problems they face. There's nothing like feedback. Which brings us to the third part: pre-computing. What we're doing here is, when we run it the first time, we do the calculation in one shot and store it. Is there a way of improving that? Again, this is one of those cases where feedback was important, because when I spoke to Gopi about it, he said, look, Anand, you're ultimately using autoregression and you're trying to figure out what the best lag for it is — fair. But why do you need to recalculate what the best lag is every day? For example, if you find that for the Canadian dollar a two-week lag gives the best prediction, that's not going to change tomorrow. It's not likely to change in the next week, two weeks, or month.
The periodicity is something that generally stays stable. So this inner loop is really just redoing that computation — why not avoid it? Let's see how a program like that would work. I'm going to show you the output first and then explain how the code works. The first time, it goes through all of the lags, trying to figure out which one is the best. By now, it's figured that out for at least three currencies. The next time I run it, it only runs lag 5 for the Australian dollar, the 365-day lag for the Brazilian real, and the 10-day lag for the Canadian dollar, because it has already figured out that these are the best lags, we stored that, and there's no reason to rerun the search. So now let's do the next round of computation. Since we have stored the most efficient lags for all of these, it just does one run each. This is one of those cases where we are effectively doing the equivalent of partially training a model — a sort of temporal transfer learning. The learning that was valid yesterday is still valid today, and will be valid until we retrain it with new data. So pre-compute the things that don't really change much, and the way to do that is to split the code into parts: some parts that change often and some parts that don't. As long as you know the difference, you're good. Which brings us to the fourth — or, well, in our order, the first — thing we were looking at to make the app feel faster. Remember, I mentioned that these are currently being computed in alphabetical order, and there's no reason why they should be. What I really need to do is figure out which is the most important variable, and that would be the Chinese yuan, because that's where they're buying the most, followed by the Mexican — this is one of those cases where I should have done some pre-computation — the Mexican peso.
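The pre-computation of the best lag described above can be sketched as follows. The `fit` function is a counting stand-in for the expensive autoregression, and "best" is chosen by a placeholder rule, so the names and values are hypothetical; what matters is how the fit count drops after the first run:

```python
fit_count = 0

def fit(currency, lag):
    # Stand-in for an expensive autoregression fit; we count the calls.
    global fit_count
    fit_count += 1
    return {"currency": currency, "lag": lag}

best_lags = {}   # currency -> best lag, searched once and then reused

def forecast(currency, lags=(3, 5, 7, 10)):
    if currency not in best_lags:
        # First run only: try every candidate lag and remember the winner
        # (here the "winner" is just the smallest lag, as a placeholder).
        best_lags[currency] = min(lags, key=lambda lag: fit(currency, lag)["lag"])
    # Every later run fits a single model with the stored lag.
    return fit(currency, best_lags[currency])

forecast("CAD")
first_run = fit_count                 # 4 candidate fits + 1 final fit
forecast("CAD")
second_run = fit_count - first_run    # just 1 fit
```

The first call costs five fits; every subsequent call costs one — the "temporal transfer learning" idea of carrying yesterday's search result forward.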
So the Mexican peso is clearly the second largest, and I guess the Canadian dollar is the third largest, and so on. So what if we just sorted these in descending order and did that computation up front? Of course, if you're printing the result only at the end, then there is no point, because people have to wait all the way to the end anyway. But what if we could show the progress and let them see the result as the application progresses? So last month's cost was $11 million, and now we're forecasting an increase of about $7,000. The number fluctuates, but after a point it stabilizes, and Gopi would say: okay, look, by the time you got to 75%, it had more or less stabilized, so I don't really have to wait much longer. And that gives us the ability to cut it short early — just focus on the most important bits, because with things like the Danish kroner, if I'm only spending $1,000 there, who cares? So that's what the app I'm going to show you now does. When I click on start forecasting, it pulls the data from the backend and shows the progress. There's also a little panda — I'll talk about that — which gets happy as it gets close to completion. And as it runs, it shows the anticipated change in cost, which stabilizes after a certain point. All this does is take the data from the WebSocket. Let me explain how this works. We sort in descending order, and then, as before, we loop through all of the purchases. We compute the models, predict the best result, and take the forecast for the current currency. We put in a few additional parameters — the progress in percentage, the total value we need to show, the total forecast we need to show — and send it all over a WebSocket. On the client side, we receive the data from the WebSocket and write a little bit of code that says: there is a DOM element called total value.
Just put the total value in there, formatting it. There's something called total forecast — put that in. All of these come from the JSON-parsed data we are sending from the server, which is what drives this application. Of course, the pandas are mostly just for cuteness, but they're also a powerful way of indicating emotion through data. There's a library called comicgen that you can use — just do a search for comicgen and you'll find it — which has a series of characters, all of them programmatically creatable. The character I used is the panda: you can copy-paste this code and the character gets created. One of the parameters is the face, which continuously goes from the panda saying, oh, oh, to the panda smiling. That's all based on one parameter, face. So what I did in this code was set the comicgen character's face attribute to one minus the data progress — one minus, because face equal to zero makes the panda smile and one means it's unhappy, so I had to swap it around. The net result is that we have an application that seems fast, even though in reality, behind the scenes, it's taking something like 30 or 40 seconds, and we're also able to stop short midway. So remember: one of the best ways of speeding up an application is optimization, but that is not enough. You've also got to make the application seem like it's optimized. And to do that, prioritize by showing the most important things first. Update the users by showing them progress. Recover from errors so that they can continue from where they left off. And pre-compute, so that anything that can be pre-calculated is calculated up front. Remember purple — that's the acronym you can use to keep all of these items in mind. With that, I'm pretty much done with the bulk of this talk.
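The server-side loop described above — sort descending, forecast, and push a progress payload per currency — might be sketched like this. The spend figures and the flat 4% "forecast" are hypothetical stand-ins for the real model, and in the live app each message would go out via the WebSocket handler rather than into a list:

```python
import json

# Hypothetical spend per currency, in USD
purchases = {"DKK": 1_000, "CNY": 3_400_000, "MXN": 2_300_000, "CAD": 800_000}
total_value = sum(purchases.values())

# Largest purchases first, so the running forecast stabilizes early and
# the user can stop watching once it settles.
order = sorted(purchases, key=purchases.get, reverse=True)

messages = []
total_forecast = 0.0
for i, currency in enumerate(order, start=1):
    total_forecast += purchases[currency] * 1.04   # stand-in for the model's forecast
    messages.append(json.dumps({
        "currency": currency,
        "progress": i / len(order),          # drives the progress bar and the panda
        "total_value": total_value,
        "total_forecast": round(total_forecast),
    }))
    # In the live app: handler.write_message(messages[-1])
```

Because the yuan and the peso dominate the total, the forecast is already close to its final value well before the Danish kroner is ever processed.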
And for you to get access to the code, you can go to github.com/sanand0, and under that, my repository with all of the files I just showed you, as well as the data, is the PyCon India 2020 one. Before I go to questions, though, I have a question for you. And while you ask your questions — actually, I'm not entirely sure what the mechanism for questions is, so I'll let somebody from the organizers help me with that — I'd like you to fill in a short survey. I'm curious which of these resonated with you. Which of these four will you be using in your next application? If you had to pick one, would you pick prioritizing, updating the users to show progress, recovering, or pre-computing? And I'm also curious why you'd pick that. So go to bit.ly slash pycon survey, and you'll see the survey there. Do fill in this form; I'm very curious to see what the results are. I'll also be looking at the results live, to see what exactly people prefer, and it will be good for all of us to know whether one of these techniques resonates more with the audience than the others. So the link is bit.ly slash pycon survey. I'm also conducting a workshop later in October. It's an open and free workshop; anyone can join at any point. If you want to be informed of it, let me know. In the workshop, we'll go through the same set of techniques in more detail and actually build an application that seems faster, so that you can practice some of the principles that make an application more responsive. So with that, I will open the floor to questions. Oh — I'm not sure we take questions for keynotes? Oh, got it. Okay, fair enough. Well, in that case, we'll just give people a minute or two and see if the results are starting to come in. Interesting. What we see is that, by far, people would prefer to show progress. That's like 2x, 3x the others.
I guess that's a principle we are literally seeing as the screen progresses, because we are seeing the progress live. Almost 50-odd people are saying they would want to show progress. What's interesting is that the second most popular is recovery from errors. That's arguably not too surprising: people who debug, I'm sure, would want to recover from errors. But what definitely surprises me is that pre-computing is coming up close to last — actually last. Because one of the most effective techniques I've seen in the past is pre-computing: it looks like an application is doing work, but in reality it's not doing that much — it's already got the results and is just doing minor tweaks on top. Prioritization is at the moment at number two. Overall, updating users definitely seems to be maintaining the lead, with recovery as the clear number two. Incidentally, if you are wondering what technology is used for this survey, you can see that it's as low-tech as it gets. All it takes is Google Forms, which saves the results into a Google Sheet, and as the results come in, this chart just picks up the data and plots it. Incidentally, I set the chart so that it stops at around 500 rows, so by the time we get to 500, it will probably stop updating. It's good to know what people feel. Well, I am really looking forward to seeing how you and your users watch the progress of your applications improve. If you have any inputs, comments, whatever, you are more than welcome to reach out to me on Twitter — I'm @sanand0. If you have any suggestions for this talk, which I'm hoping to repeat in a few other places, please submit them as an issue or a pull request on GitHub. One of the best ways to contribute back, and to practice, is by trying it out and learning, and I do hope you'll get an opportunity to do that.
With that note, wish you all the best in making applications seem faster without optimizing. Thank you.