All right folks. Welcome to this session. Wednesday morning. Did you have your coffee? I hope so. Thanks, Erty. We have Leo here and he's going to talk about SQLAlchemy, performance in SQLAlchemy, I assume, from the title: from days to minutes, from minutes to milliseconds with SQLAlchemy. A warm welcome to Leo. Thank you very much. Okay, so, I'm Leo, a tech lead at Jiru, which is a fintech in Brazil. It's a company that provides credit for people who need a quick loan. I'm not a particular expert in SQL or SQLAlchemy or ORMs in general, but I've learned a few lessons optimizing some code at my company last year that I would like to pass along. So Jiru, as I said, is a fintech in Brazil. Our stack is mostly Python, Pyramid, SQLAlchemy, Postgres. There is some MongoDB, some Celery, some Java somewhere. We have been carving up this monolith that we started with into different services, and so some new services get written in different languages, but most of the time we end up picking Python, Pyramid, SQLAlchemy, and Postgres. And I really like SQLAlchemy. It has two aspects. There's the core, which is basically a DSL for constructing SQL queries using Python constructs, and then on top of that you have the ORM, which helps you map tables to classes and records of those tables to instances of those classes. And for a programmer who mostly knows Python, it's obviously a lot more comfortable dealing with the instances of an ORM than it is juggling SQL. We at Jiru use the ORM most of the time, and we use the SQLAlchemy core for writing SQL constructs directly very little. And I think SQLAlchemy is awesome. However, whenever you choose a framework, it's usually good if you know what the framework is doing behind your back, because frameworks still require you to make decisions about how to use them, and knowing the underlying patterns is essential if you're to make wise choices. So using the ORM is really comfortable. As I said, you declare your class, you get instances from it, and then you manipulate them through attributes and methods as if there were no SQL database behind it, and that's really comfortable. But it's also a problem, because the database is an external system. Most of the time you should actually treat it as if it were the API of a foreign system, because it is, really: it is an external system you're talking to through a TCP/IP connection or a UNIX socket connection, and the API to it just happens to be SQL. So what happens is that you write your Python code, and it looks like perfectly normal Python code, but it ends up doing bad-performance access to your database. It's not noticeable at low volumes, like when you're developing or when you've just gone to production, so you get lulled into this false sense of security that everything is going fine. But then, after a while, your database starts to crawl and your application takes too long. The fix for that, of course, is to let the database do its job. You need to be aware of the implicit queries that your ORM is doing whenever you access your instances and your classes, especially when you have relationships between instances that map to different tables in your database. Those are the ones that tend to cause the worst patterns of access. You should try to do approximately only one, or a constant number of, queries to your database per HTTP request or API request or background job.
And you should avoid as much as possible looping through instances of your model in Python code, because the database is a lot better at doing that. You should also be mindful of the number of round trips you do. You should try to do only a fixed number of queries per request, because every round trip costs time. But you should also be mindful of the amount of data you're pulling out of the DB when you do those requests. So I'm going to talk about specific cases here that I've optimized. We had this report that we ran, that we still run, about once a month. In the beginning it was really fast; as the years passed, it was taking over 24 hours to run. So let's talk a little bit about Jiru. I don't know if this is readable. Jiru is a credit company. It provides loans for people who access our website. We do everything online: you snap your documents, and after we're happy and do a credit analysis, we send money to your bank, and then you pay back your loan by paying bank slips. And you never actually have to show up in person; we are designed so that you cannot actually visit us to talk about your case. Everything is done online. An early funding model for us, to get money to lend to people, was that Jiru created a separate funding company that issued debentures. A debenture is like an official loan from a company that the market can buy, to give money to the company without becoming a partner, without it being a stock option. So the company would issue new debentures every six months, and these debentures had a contract that said that whatever the borrowers paid back on their loans would be the payback for the people who purchased the debentures, so the company was never insolvent. Of course, we still want to have a good credit model, otherwise people will not want to buy our debentures. So we issue the debentures every six months, and the debenture holders, by buying these debentures, put money into the company. With that money we grant loans, and the borrowers pay back those loans. And at the beginning of every month, we look at what we got and pay the debenture holders back, once a month. Of course, it's a lot more complicated than that, because part of the money is the amortization, which is the money that the debenture holder actually lent us, and then on top of that you have the premium, which is what you pay on top of what the debenture holder put in, so that they can be happy with their investment. And taxes affect only the premium, not the amortization. And you pay less tax the longer it takes you to pay the amortization. So we do dances with the numbers, like paying only amortization for a while instead of paying premium; then, because in Brazil you cannot pay only premium, we hold back some of the amortization, then start paying back the premium, and pay the amortization last, so there's a whole dance of numbers. But that only happened in the later debenture issuances. In the first debenture issuances, we paid the amortization from the principal of the loans that we granted, and the premium of the debenture holders from the premium of the loans we granted. And the entity-relationship model looks somewhat like this: you have the debenture holder; the debenture belongs to the debenture holder and to the debenture series; the debenture series has an account; all the operations from borrowing and the payouts go through this account, so we have a bunch of operations.
And if that sounds awfully complicated, I don't know, it gets a lot worse. So let's look a little bit at the code. How do we actually code that? First you have this base class from SQLAlchemy. We declare it by creating our own base class, which we called ORMClass here, and we added a convenient class method, actually a class property. So what we do here is take the SQLAlchemy DB session and make it easily available inside the class, so that you can say MyModel.query and then apply filters, order-bys, joins with other classes and things like that. It's a little convenience attribute on the class. And then there's this function from SQLAlchemy that wraps this ORMClass, and you get this magical Base class, which we then use in all the models that we declare. So, like we said, the debenture, which is the thing that we sell to the debenture holders, which is like the loan that the debenture holder is granting us: it has its database key; it has a serial number, which is very different from the ID, because the serial number repeats within every debenture series, whereas the ID is unique across all the debentures no matter what. There's a sale price, because if you buy at the beginning of the series, you buy close to the official debenture price, but if you buy later it costs you more, because that's the price of opportunity for being late to the party. And there's a sale date. Here we are continuing the same class. Those were regular attributes of the class, which get mapped to columns in the database, in a table called debenture. Here we are still in the same class, but now we see some relationships. We have this holder ID, which is also a regular integer column, but then we add a declaration that it's a foreign key into the ID column of the debenture holder table. And on top of that, we declare the holder relationship, which has as foreign keys the holder ID that we just declared. This way, I get a holder attribute on my debenture instance which fetches me the debenture holder instance transparently. But "transparently" means it's doing a query to the database, at least once. During the session, SQLAlchemy will actually cache that instance so that it doesn't fetch it again every time I access the attribute. But still, I need to be mindful: the first time I access this holder attribute, there's going to be a SQL query, unless I play some tricks on it. As I showed in the diagram, the debenture also has a relationship to the series, to the debenture series that issued this debenture, and again we have a series relationship mapped to that column. There's another interesting aspect here, which is this backref; we saw it in the holder declaration as well. What it does is create a debentures attribute on the debenture holder class that points to an iterable, kind of a list, of debentures that I can conveniently access from the point of view of the debenture holder. Same thing here on the debenture series: I get a debentures attribute there, which is an iterable of the debentures that point to that debenture series.
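To make that concrete, here is a minimal sketch reconstructing what those declarations might look like. The class and column names are my guesses from the talk, not Jiru's actual schema, and the session wiring is simplified; the lazy='dynamic' part is explained next.

```python
from sqlalchemy import Column, Date, ForeignKey, Integer, Numeric
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import backref, relationship, scoped_session, sessionmaker

Session = scoped_session(sessionmaker())  # the app's DB session, simplified


class classproperty:
    """Descriptor so Model.query works on the class itself."""
    def __init__(self, fget):
        self.fget = fget

    def __get__(self, obj, owner):
        return self.fget(owner)


class ORMClass:
    @classproperty
    def query(cls):
        # the convenience attribute: MyModel.query.filter(...).all()
        return Session.query(cls)


Base = declarative_base(cls=ORMClass)


class DebentureHolder(Base):
    __tablename__ = 'debenture_holder'
    id = Column(Integer, primary_key=True)


class DebentureSeries(Base):
    __tablename__ = 'debenture_series'
    id = Column(Integer, primary_key=True)


class Debenture(Base):
    __tablename__ = 'debenture'
    id = Column(Integer, primary_key=True)
    serial_number = Column(Integer)   # repeats within each series
    sale_price = Column(Numeric)
    sale_date = Column(Date)

    holder_id = Column(Integer, ForeignKey('debenture_holder.id'))
    holder = relationship(
        DebentureHolder, foreign_keys=[holder_id],
        backref=backref('debentures', lazy='dynamic'))

    series_id = Column(Integer, ForeignKey('debenture_series.id'))
    series = relationship(
        DebentureSeries, foreign_keys=[series_id],
        backref=backref('debentures', lazy='dynamic'))
```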
There's this lazy='dynamic' in those backrefs. What does it mean? It tells you what kind of iterable you get. If you don't say anything, SQLAlchemy will create lazy attributes that, the first time you access them, make a query to the database and bring back a Python list. But if you do as I did here and say dynamic, then instead the debentures attribute on the debenture series will be a query object, pretty much like this query object here, which allows me to apply filter, order by and other things like that. By using these dynamic relationships, I let the application lazily get a query object, apply other operations on top of it, like filtering, ordering, joining with other relationships, and only then, when I iterate over it, does it go to the database and fetch the data. So, I had to debug an issue that was found by the finance people: we were not paying exactly what we should have been paying the debenture holders. There was some discrepancy, and I started debugging this hours-long report. It took hours because, well, the report itself took about five hours, but it also depended on another process that cached some information, and this other process also took about four hours, and that was for each debenture series that we had; by that point we had about six debenture series, so the whole run of reports took more than a day. As I started debugging, I thought, this is taking too much time, so I enabled SQLAlchemy's logging to help me. If you take the standard Python logging, take the sqlalchemy.engine logger and set the log level to INFO, it will log every query. If you set it to DEBUG, it will log every query and the results, and when it logs the query, it logs the parameters used by the query as well. So I enabled the logging and started running the report, and suddenly I'm seeing gobs and gobs of the same query repeated over and over and over. So what I did, let me find the diff here. So let's see some code. That's not readable, right? The people in the back, can you read it? Nope. How about now? The people in the back can read it? Okay. Let's see if I can show the things. I think I'm going to switch to mirror mode. So this is going to be kind of hard to fit all the code in this space. But the kind of thing that was being done here: we have this total paid in. What does this method do? It gets all the money that was paid in on the debentures of that debenture series. It's a method of the debenture series class. You can call it specifying a period if you want, and it gets the debentures that are owned by someone, the debentures that were actually sold; this is a property that brings the debentures related to this debenture series that actually have owners. And it's adding filters here if you pass a start date or end date, and then it's doing a sum of the sale price of all those debentures. The moment the code actually goes to the database to fetch is the moment sum is called and iterates over this object. And what this is doing is taking a huge number of debentures, pulling all of their columns from the database, just to sum their price. Now, when you look at this, it's perfectly reasonable Python code, right? You're summing the sale price of a bunch of debentures; that's exactly what you want. But you are pulling a huge amount of information from the database just to get the sum of what is essentially a column.
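Two pieces of that, sketched in code: the logging switch, which is standard SQLAlchemy, and a reconstruction of the slow method (the method and attribute names are my guesses from the talk):

```python
import logging

logging.basicConfig()  # make the log records visible somewhere
# INFO logs every statement with its parameters; DEBUG also logs result rows.
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)


# The slow pattern, roughly as described (a DebentureSeries method):
def total_paid_in(self, start_date=None, end_date=None):
    debentures = self.sold_debentures          # a lazy='dynamic' query
    if start_date:
        debentures = debentures.filter(Debenture.sale_date >= start_date)
    if end_date:
        debentures = debentures.filter(Debenture.sale_date <= end_date)
    # Iterating pulls every column of every matching debenture out of the
    # database, only for Python to add up a single attribute.
    return sum(d.sale_price for d in debentures)
```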
And if, instead of sale price, we had summed something like d.series.something, not only would I be fetching a bunch of records with all their columns, for each iteration of the loop I would be fetching another record from the database, with all its columns, and then summing over it. That is called the N+1 select problem. So instead of doing that, what I did here: I created this function called get column sum for query. What's the query? The query is the debentures, so you see it's the same word here and here, but then I go to the class instead of the instance to get the sale price column. So I did not actually have to write SQL by hand to have SQLAlchemy do a performant query for me. What does this get column sum for query do? It's right here. First it takes the query, assuming the argument is a query. It drops whatever ordering it has, I'll explain later why, and replaces everything it was going to fetch with a single expression, which is this coalesce sum of the column. What does coalesce sum do? It's right here. It's using SQLAlchemy's func.sum, which is a representation of the database's sum function, around the column. Now, this doesn't need to be an actual column; it could be a column expression or anything that feels like a column to SQLAlchemy. So it's using the sum aggregate function from the database, and then it's calling coalesce with zero on it. Why is it doing that? Because in SQL, if you do a sum over a column of a bunch of records, but that sequence has no records in it, instead of getting a zero you get a null. Or if you have records, but that column is null in all of them, you're going to get a null back. But in most cases when I want a sum, I actually want a zero back in those cases. So I created this coalesce sum function to return the sum of a column, or zero if there's a null in there. Please ignore the filters and labels for a while; we'll come back to them later. So I replace all the entities in the query with just the coalesce sum of a specific column, and if you remember, that was the debenture's sale price. Now, the reason I drop whatever ordering the query had is that, since I'm calling an aggregate function, either I have to have a GROUP BY clause in the SQL and whatever ordering I have needs to be part of that GROUP BY, or I cannot have an ordering at all. So I make sure there is no ordering in this query, especially since I'm not using a GROUP BY in this case, which means the database will return a single record, and in this case a single record with a single column, which is the sum I'm asking for. Because of that, I call the scalar method from SQLAlchemy, which does exactly that: it gives me the value, not necessarily a number, that is in the single column of the single record of the query I just did. If the query doesn't produce a single record with a single column, it raises an error. But it's a very convenient method when I want just a number, which is the sum of a column. So, coming back here: instead of looping through a huge number of objects, with all their columns, just to get the sale price, I'm asking the database to give me exactly the sum that I want. And I got that, in the diff, with a single line that's very readable. And in this calculate total paid out, which is everything that I have already paid to the debenture holders, there was the same issue.
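Reconstructed from that description, the helper pair might look like this; a sketch assuming the SQLAlchemy 1.x Query API (with_entities, order_by(None), scalar) and PostgreSQL. The filters and label parameters come back a little later.

```python
from sqlalchemy import func


def coalesce_sum(column, *filters, label=None):
    """SUM(column), returned as 0 instead of NULL, optionally filtered."""
    expr = func.sum(column)
    if filters:
        # renders SUM(column) FILTER (WHERE ...) on PostgreSQL; more below
        expr = expr.filter(*filters)
    expr = func.coalesce(expr, 0)
    return expr.label(label) if label else expr


def get_column_sum_for_query(query, column):
    # An aggregate with no GROUP BY cannot keep an ORDER BY, so drop it,
    # then replace whatever the query selected with the single aggregate.
    return query.order_by(None).with_entities(coalesce_sum(column)).scalar()


# The one-line fix from the diff, roughly:
# total = get_column_sum_for_query(series.sold_debentures, Debenture.sale_price)
```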
But here, instead, it was looking at the payouts, which were the payment operations, and I replaced it with the same thing. Going back a little, digging in a little more, there was this whole method here, calculate total values involved, which was doing a huge number of those things, getting the sums of a bunch of operations. This was taking a very long time. And the operations were all the result of calling these methods here: payback operations, earning operations, earnings tax operations and things like that. So when I looked at those methods, let's search for one of them, for example earning operations: what it was doing was looking at self.operations, which is one of those dynamic query relationships that I showed, filtering by a specific tag, filtering by some specific dates, and returning them. And all these methods, payback operations, earning operations, earning tax operations, were doing the same thing. So what I did first was to factor out the date filters and create this class that's really just a record that collects filters. So a payback filter is this criterion: the source type is notes, and the tags include note payback. And with those, I refactored the other methods to get their operations by calling this operations in period method, which is where I put the date filtering of the operations, and then apply their specific filters on top. Why is that useful? Because then I could also write this other method here, called calculate operation summary, which returns a single record. What record is that? It takes all those operations in the same period, sets order by to none, and replaces all the entities with the summary columns. Why call them summary? Because just like a bestiary is a collection of beasts, a summary is a collection of sums, and that's exactly what this method does. It takes a coalesce sum of the same columns that we were looking at, gives them labels that match the variables that were being collected in that method above, and returns those as the columns that I'm going to put in the query. And because this is a coalesce sum and I'm not doing a GROUP BY, the result of this query is going to be a single record, which is why I call the one method, which returns that single record. So when I use this method, I get back this operation summary, which is a record that has as attributes the labels that I passed to the columns. And that's why we had this label parameter in the coalesce sum, so that I could give a specific label to my summing column. And what is this filter here? There's this feature in SQL where, if you have an aggregate function on a column, like sum or average, you don't necessarily need to compute it over the whole of the records, or the whole of the records of a group; you can actually apply further filtering to it. What does that look like? So that isn't readable, right? Is it readable now? Okay, one more, just to be sure. So yeah, this is nice to show as well. I have some of those models that I declared. I created the debenture holder, added it to the database, and flushed it so that it has its primary key. And here I show what a query for the holder looks like. In SQLAlchemy, you can print a query and it gives you the stringified version with the parameter placeholders in place. But it's nice to know what the parameters are going to be, so I created this function that formats and colorizes the output and tells me what the parameters are.
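I don't know exactly what the speaker's formatter did, but a bare-bones version using only public SQLAlchemy APIs could be (explain_query is a hypothetical name):

```python
def explain_query(query):
    """Print a Query's SQL plus the values bound to its parameters."""
    compiled = query.statement.compile()
    print(compiled)          # the SQL text, with :param_1 style placeholders
    print(compiled.params)   # dict of parameter names to their bound values
```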
So if I take the debenture query, but filter by created greater than today and holder equals a certain holder, that variable I created just above, it generates this query to the database. We can see that it has created here and the debenture holder, and I can see what parameters it's going to pass. So created is a datetime with this value, and parameter one here, which is the debenture holder, has this integer value, which is the primary key of the debenture holder that I was using as the filter. Now, the nice thing here is that I'm actually comparing against the relationship object, not the ID of the object, but it translates that into comparing the ID in the query. So here I created a debenture and added it to a series. The debenture series object is actually complicated, with its relationships, so I created a factory for it. And here I can check that the holder ID on the debenture is exactly the ID of the debenture holder that I created. So when I look at the debentures of the debenture holder, just like I said, it's a query that selects all the debentures that match that debenture holder. If I also filter by the debenture state and the sale date, then I get this other query here, with all these parameters filled in. So, going back to our optimization: the debenture series operations is this query over operations here, and when I ask for one of those filtered operation methods, it's the same query over operations, but with added filters in the WHERE clause, right? Ten minutes? Okay. So here is what I did with the operation summary: I replaced all the columns in the operations query with those summary columns. And since the formatter wasn't very good, I did my own formatting here; let me show that instead. Since I added those filters at the column level, instead of putting them in the WHERE clause, it's actually putting those filters along with the sum. This is very useful for things like: I want to know the percentage of clients that have a certain characteristic over all of the clients. I can do that with a filtered sum divided by the sum without the filter, and the database computes that for me. I don't need to do two queries, one with the WHERE clause and one without, and then divide on the Python side; I can have the database do it for me, on the database side. So here we have the operation summary. It creates a bunch of columns which have these filters, doing the respective sums for me. Here we have payback operations, and you can see these tags here and these tags there. So we have note paybacks and fees and earnings and things like that, filtered in the SELECT clause. In the WHERE clause it's just making sure it's selecting the operations with the right state, belonging to the right account, and in the right time range that I asked for. So with a single hit to the database, I selected subsections of all those operations, did a bunch of calculations on them, and got exactly the information that I wanted. And with these kinds of changes I got rid of the cache completely and brought the time of the report down from nine hours per series to four hours per series. Why four hours? Because the rest of the time it was not just a report; it was actually inserting the debenture payments into the database, because the next time I ran this report I wanted to run a difference, and anything that was not collected correctly, or any rounding error, should be paid in the next month.
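Pulling those pieces together, and reusing the coalesce_sum helper sketched earlier, the summary method might look roughly like this; the tags, columns and method names are illustrative, and the FILTER (WHERE ...) rendering needs PostgreSQL 9.4+ and SQLAlchemy 1.0+:

```python
def calculate_operation_summary(self, start_date=None, end_date=None):
    """One round trip: several filtered sums, computed side by side."""
    columns = [
        coalesce_sum(Operation.amount,
                     Operation.tag == 'note_payback', label='note_paybacks'),
        coalesce_sum(Operation.amount,
                     Operation.tag == 'fee', label='fees'),
        coalesce_sum(Operation.amount,
                     Operation.tag == 'earning', label='earnings'),
    ]
    query = self.operations_in_period(start_date, end_date)
    # no GROUP BY, so this collapses to exactly one row, hence .one()
    return query.order_by(None).with_entities(*columns).one()


# summary = series.calculate_operation_summary(start, end)
# summary.note_paybacks, summary.fees, ...  attributes named after the labels
```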
So, to optimize the insertion, let me locate it here. The report already had some optimization done before, which was to build the insert query for each debenture payout manually. But it was still generating one insert per debenture, every month. Of course, most of those inserts looked exactly the same, because all the debentures bought by the same holder on a certain date have the same calculations. So it was caching the calculations by sale date and holder ID, but it was still looping through all the debentures, and then creating those inserts one by one. Instead of that, what I did was loop only through each integralization. An integralization is a set of debentures bought by one holder on a specific date. So I created this criterion, which is the integralization columns: the holder ID and the sale date. I'm adding a count of the debentures so that I can do the proper calculations. I'm joining the debenture holders and the series here so that it fetches everything at the same time, and I'm asking for a distinct query so that the database only fetches one debenture per integralization. And because of the distinct, I need this ordered by the integralization columns and the debenture serial number. What does the query look like? Because I asked for a distinct, SQLAlchemy renders this DISTINCT ON query here, and then selects all the information from the debentures, but only one debenture per integralization. And then it joins that information, it's kind of hard to read here, but it joins that information with the summary information I needed, like the sum of the debentures paid and things like that. So it goes with a single hit to the database, fetches all the information, and then I can loop and it's all calculated. But the most important part is that instead of building the insert by hand, I take the debenture payout table and ask it for an insert query, and then, instead of a regular insert, I do an insert from select. And I select from the integralization debentures, replacing the entities with the columns that I need to populate one debenture payout. What does that look like? Here I do the same query, and what it produces is an insert into the debenture payout of all the columns that I selected, and what it's inserting is a select of all those parameters. Some parameters are constants, some others come from the select, and it maps all that information across. So instead of doing inserts one by one, manually rendering SQL from the Python side, I'm actually asking the DB to do the inserts for me. And that brought the report time down to 15 minutes, from nine hours per debenture series.
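The INSERT ... FROM SELECT step, reduced to its core; the table, column and variable names here (DebenturePayout, reference_date, premium_rate) are hypothetical stand-ins, since the real payout math is much more involved:

```python
from sqlalchemy import literal, select

sel = select([
    Debenture.holder_id,
    Debenture.sale_date,
    literal(reference_date).label('payment_date'),          # a constant column
    (Debenture.sale_price * premium_rate).label('amount'),  # computed per row
]).where(Debenture.series_id == series.id)

stmt = DebenturePayout.__table__.insert().from_select(
    ['holder_id', 'sale_date', 'payment_date', 'amount'], sel)

# One round trip; the database itself performs every insert.
session.execute(stmt)
```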
The last optimization I did, and this is the one that justifies "from minutes to milliseconds", is that I optimized a page rendering based on how much time it was wasting authorizing the user. Let me find it here. During the authorization phase, while rendering a specific page, it was checking whether the user had permission to see certain bits of information. This is server-side rendering, so it's not so fashionable these days. But it was doing user.admin.has_permission, except that admin is actually a property that does a query to the database. So it was doing a query to the database every time, getting an ORM record, and asking if it has a permission, which also goes to the database to check whether that permission belongs to the user. It was doing that all the time. So just looking at this code, I already said, well, just pull that has permission call out. But the biggest part of the optimization happened elsewhere. I replaced the property with reify, which caches the property so it's fetched only once. But also, instead of having those profiles be dynamic queries, I did the opposite: I pulled all the profiles at once, also through a SQLAlchemy relationship. So we have this all profiles relationship here that is not a dynamic query; it goes to the database once, when it's accessed, and pulls a list, so we have a Python list every time we access it. And then I created get matching profiles, which is plain Python that loops through this list. Now, the thing is, this is a tiny list. The roles of a user form a tiny list. So it actually makes sense here to pull the computation out of the database onto the Python side, because I'm minimizing the amount of data I'm moving around. And the other big part is that while fetching the user for authentication, no, it's not this, it's here: while fetching the user for authentication, I'm actually instructing SQLAlchemy to joined-load all the profiles, and to joined-load all the permissions of those profiles. So it does a single query pulling the user, all its profiles and all their permissions in a single hit to the database. And now, when I access the properties of this linked data structure, I'm not going to the database anymore. And that brought pages that took minutes to render down to milliseconds. Now, in conclusion, and yes, there is a conclusion, okay. To figure out what was happening, I used this Python package called slowlog, which is good for WSGI applications. When a request is taking too long, it starts dumping stack traces into a log file. It's perfect for seeing what is wasting time where, without you having to be there when things go slow; you can just go to the log file later and retrieve it. And that's how I saw that that function with admin was the slowest one. So, in conclusion: ORMs are very nice to get started with if you're a programmer and not very familiar with SQL; it's a good way to get started. I'm not dissing ORMs here, I love using ORMs. But you should understand your SQL. Read the SELECT documentation of your database and try to understand what every bit of it does. I figured out that you could apply a filter inside some columns by reading the SELECT documentation. Understand GROUP BY and aggregations, and how aggregations reduce the cardinality of your results. Learn about aggregate functions with filters. Learn about DISTINCT ON and window expressions; they will help you write SQL that fetches things very efficiently. And then study SQLAlchemy. Be aware of the underlying queries that it does. Push as much work as possible to the DB, but not too much, because sometimes your query is going to spend hours. And that's it. Thank you very much. I'll be around if you have any questions, just a few minutes. Welcome to the second talk of the morning. We have Samuel here, from the UK. From the UK, awesome. And he's talking about Python parallel programming, folks. A warm welcome for Samuel. So I guess I'd better start off by introducing myself and how I got to where I am. I built and run a SaaS platform called TutorCruncher, which is a monolithic Django app running on Heroku, using RQ for worker tasks. As we've grown, we've built a bunch of microservices outside that, using aiohttp and a bunch of other libraries.
As part of that process, over the last three or four years I've become a contributor to aiohttp and to RQ, and built a bunch of libraries of my own, in particular ARQ, which is an asyncio successor to RQ, or a successor I would say, and Pydantic, which isn't really relevant but is quite popular. So I wanted to give this talk because I got a long way as a developer without really understanding the landscape of how parallel programming works, within Python and also in general, and so I wanted to give a high-level introduction. I'm going to talk about the four levels of concurrency, or the main four levels of concurrency that I see. I'm going to demonstrate each of these levels of concurrency, why you might use them and why you might not use them, and I'm going to try to keep it mildly entertaining. What I'm not going to do is try to prepare you for a computer science exam on distributed computing, or read the spec to you, or talk about the protocols. This is going to be quite high level; you're going to have to bear with me on that. So why is parallel processing important? I think this graph kind of works. What's interesting is that Python was conceived pretty much on the left-hand axis of this, in about 1990, when most computers had one CPU and when CPU speeds were increasing really quickly. Around 2005 that effect plateaued, and suddenly we started getting multiple CPUs in computers, both in servers and in desktops. I guess that was partly because people wanted it, and partly because something new was needed to sell. At that point, Python had to adapt and implement parallel processing, but the interesting thing is that it didn't start off with that; it had to retrofit it later, and you can still see a few of those bugbears now, in the GIL and such, as I'll talk about later. The right-hand side of this graph, the uptick recently, may or may not be right. I think in this data they might have run different benchmarks, on high-end processors, more recently than before, so that might be why there's that uptick at the end, but I'm not sure. So no talk would be complete without a metaphor, and kind Tom, who works for me, has built this metaphor in Minecraft. The principle here is that we're thinking of a factory as like a computer; we're thinking of a process as like a conveyor belt within a factory; we're thinking of CPUs a bit like individual people working on those production lines; and then we're thinking of networking as the trucks coming to and leaving the factory. So the highest level of parallel programming is multiple machines, or computers, or virtual machines, or containers, anything where the code sees itself as running on a specific computer, and this is demonstrated here with multiple different factories. So instead of building one factory bigger, you have multiple factories all running independently, but perhaps networking between each other. In this case they're not networking between each other; they're just doing their own thing. You can imagine scenarios where we do that quite a lot: for example, front-end servers on a web platform would generally talk to the database and talk to the client and use things like cookies for state, but they wouldn't actually talk to each other about how many other machines were running around them. But quite often they do have to communicate, and that is where communication comes in. So, to get to an example, we're using RQ here.
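The slides aren't in the transcript, so here is a reconstruction of roughly that example: a worker function plus the enqueue script. URLs and years are illustrative; requests, rq and a local Redis are assumed.

```python
# worker.py
import requests

def count_words(url):
    # download the page, split the text, count the resulting elements
    return len(requests.get(url).text.split())
```

```python
# enqueue.py -- note we import count_words even though the worker runs it
from redis import Redis
from rq import Queue

from worker import count_words

q = Queue(connection=Redis())
for year in (2016, 2017, 2018, 2019):
    job = q.enqueue(count_words, f'https://ep{year}.europython.eu')
    print(job.id)   # a handle we could use later to fetch the result

# In two other terminals (the "workers"):  rq worker
```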
I promise you this is the smallest text we'll get to at any point in this presentation; I hope you can all read it. So RQ is a queuing library built with Redis, as the name indicates. In particular, it uses Redis's lists to do the enqueuing: to enqueue a job, we push it into one end of the list, and to execute a job, you pop it out of the other end of the list and then execute it. So the code we're going to use for most of our worker examples here is in the top file, worker.py. It's very simple: it just downloads a web page, in this case from EuroPython for one of the last few years, takes the text and counts the number of words, so it splits it and then counts the number of elements in the resulting list. Very simple; in reality you wouldn't need multiple machines, or even multiple anything, to do this, but that's our example. Below you see the code we would use to enqueue those jobs using RQ. We take a completely vanilla Redis connection; in this case I'm demonstrating it here on my laptop, so I'm not actually running it on multiple machines, I'm running it on multiple processes to demonstrate the principle, but bear with me. And then, for each of the last four years, we enqueue a job where we run count words. Now, one of the interesting things to see here is that even in our RQ example, which is running on the main machine, not the worker, we have to have access to the count words function, so we can import it and enqueue the job. So if we look at running that below: the first thing we do, on the right here, in our two workers, is call rq worker, and that starts the worker, which does a blocking pop from Redis, waiting for those jobs. We simply call our example here, that bangs those four jobs into the queue and prints out the result, which is what RQ gives us and which we could use to get the actual result later, and then you see those jobs being executed in RQ here. And if you can see closely enough, you can see the years and how many words; it's not very interesting, but there we are. So, the advantages of multiple machines: scalability is the big thing, you can add machines very easily; also, adding machines has a linear cost, if you have 10 machines and you want an 11th, it gets 10% more expensive; and lastly, isolation, which is demonstrated here with our factories, if one of your factories were to blow up, you can simply add another factory on the side, or in the case of my graph, pan to the left, because adding a new factory in real time was too hard. The disadvantages of multiple machines: well, mainly complexity. You have to set up all the networking between your machines. That's made very easy by platforms like Heroku and others, but it can still be a problem, particularly during development, and so, as you saw earlier, quite often we use multiple processes to simulate multiple machines. So, to go on to multiple processes: in our analogy, this is running multiple production lines within a single factory. Processes are an operating system concept, and they're designed to keep different programs isolated from each other whilst running at the same time. They were developed, I guess, originally for desktop applications, where you were running two completely separate things, but you can use them to run the same code in parallel. So here's our example. You see immediately it's quite a lot longer, and the other main difference here is that we're not using an external library for the queuing; we're using Python's standard multiprocessing library, Python's standard Process and JoinableQueue.
We have exactly the same code for counting the words; then we have our very rudimentary worker, which is just a loop that runs taking jobs out of a queue and either executing them or, if the job is None, that's our cue to quit. To enqueue those, we have to create our processes. The really interesting bit here is happening on line 20: what Python's doing in the background there is forking the main Python process to form multiple sub-processes, which at that point share memory, but any further changes to memory are copy-on-write, so we now have completely separate processes, and that new process is set off to run the worker function we just saw; the argument in this case is just an ID to tell us which worker we're running in. Enqueuing our jobs is as simple as calling put on the queue object that Python has helpfully given us. We can then wait for that queue to be empty, for all of the jobs to be finished; then we have to go about putting the None job into the queue for each of those workers, to stop them, and then we wait for our processes to finish, and you see there it's printing out our word counts as it did before. Again, not very interesting. So, the advantages of processes: they're really easy to run, no networking required; you get this OS-level guarantee that your multiple different processes are isolated, they can't share memory after they've been forked; and they're pretty fast to communicate, either by doing networking on one machine or by inter-process communication, all very quick compared to multiple machines, I should say. The disadvantages of processes are quite significant. You have very fixed limits to scaling: if we go back to our factory analogy and we want to add another production line into our factory, there's nowhere to put it; if we want to have four production lines, we need to build a whole new factory, I guess decommission our old factory and start running our new one, and if we want to go back to having three, then I guess we have to ignore our new four-production-line factory and go back to the three. And secondly, we could build a big factory, make it five times bigger or ten times bigger, but it gets prohibitively expensive to have a thousand-core machine, so it's not linear to scale, whereas you saw with multiple machines it was linear. And again, we don't have isolation: if our machine breaks, the whole thing's broken. So next we go down to multiple threads. Threads are a way of achieving concurrency from within one process. They come in two variants, kernel threads and user threads, or green threads; when we talk about threading in Python, we're talking about kernel threads. It's important to remember that kernel threads are the only way, from within a process, to run a task on two different CPUs at the same time. We can do lots of things that look parallel, but unless we have kernel threads, we can't be running things on two different CPUs at the same time. And in our analogy here we have three of these boxes that technically have faces, and they're supposed to represent the workers, so we're running multiple things in the same process. So our threading example looks suspiciously like our multiprocessing one. That's not a coincidence: Python has tried quite hard to keep the interfaces the same between processing and threading. So we have the same function as before, and exactly the same worker, except it says "quitting thread", and we're importing here from queue and from threading, to use those versions rather than the multiprocessing variants.
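A sketch of that shared pattern; the commented import lines are the whole diff between the multiprocessing version and the threading one, which is the point being made. URLs are illustrative and the None sentinel is the "cue to quit".

```python
from multiprocessing import JoinableQueue, Process
# threading variant: swap the import above for
#   from queue import Queue as JoinableQueue
#   from threading import Thread as Process

import requests


def count_words(url):
    return len(requests.get(url).text.split())


def worker(worker_id, queue):
    while True:
        url = queue.get()
        if url is None:              # the cue to quit
            queue.task_done()
            break
        print(worker_id, url, count_words(url))
        queue.task_done()


if __name__ == '__main__':
    queue = JoinableQueue()
    workers = [Process(target=worker, args=(i, queue)) for i in range(2)]
    for w in workers:
        w.start()                    # forks a new sub-process (on Unix)
    for year in (2016, 2017, 2018, 2019):
        queue.put(f'https://ep{year}.europython.eu')
    queue.join()                     # block until every job is done
    for _ in workers:
        queue.put(None)              # one sentinel per worker
    for w in workers:
        w.join()
```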
This is all basically the same except, obviously, line 21, where we create the thread: we're creating a secondary thread within the same Python process instead of creating multiple processes. Again we bang the years into the queue to run the workers, wait for them to finish, and then we get the results again. So, the advantages of threads: they're even lighter than processes, they're faster to create and faster to switch between, and they share memory, which can be an advantage but can also be a big disadvantage. And so the disadvantages are kind of exactly the same thing: they share memory, and memory locking is horrid. To use a Go proverb: don't communicate by sharing memory; share memory by communicating. We can do that with Python threading, and Python provides some primitives for communication between threads, but if you're not careful it can all go wrong, and you won't get a nice warning, it'll just burst into flames. The second and bigger problem is the global interpreter lock. From the Python wiki: the GIL protects Python objects, preventing more than one thread from executing Python code at once. So the whole idea here was that we would run stuff in parallel, and now we've heard about this lock thing that prevents us from doing that at all. Let me try to demonstrate that with another example. We've taken pretty much the same code, but instead of doing a network request, we're now doing something CPU-bound, in this case summing a bunch of numbers, and we're using standard Python sum to do it. And we're going to do that in two ways: once in a normal for loop, and once going through all the palaver of creating our threads and running it in parallel. What happens? Well, it's not very exciting: we get exactly the same time. In fact, it's even slightly quicker to do it without multi-threading, because we don't have the overhead of creating the threads. All is not lost: you can do this same task with multiple threads and make it quicker, using NumPy. NumPy is implemented in C, and C in turn can release the global interpreter lock, so here we can get the advantage of multiple threads. It's going to be quicker anyway because it's done in C, but we also see here that we nearly halve the time by doing it in multiple threads; I guess the "not quite half" is the overhead of creating those different threads. So anything where we can release the global interpreter lock, because we're executing in C, or where we're doing file IO, or networking, threading can help; but on pure-Python CPU-bound tasks it doesn't really help.
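A sketch of that demonstration (sizes are illustrative; NumPy assumed installed). The pure-Python sums gain nothing from threads because each holds the GIL; NumPy's C loops can release it, so the threaded NumPy run comes out close to half the time.

```python
import threading
import time

import numpy as np


def py_sum():
    sum(range(10_000_000))        # pure Python: holds the GIL throughout


def np_sum(array):
    array.sum()                   # NumPy can release the GIL in its C loop


def timed(label, fn):
    start = time.perf_counter()
    fn()
    print(label, time.perf_counter() - start)


def in_threads(target, args_list):
    threads = [threading.Thread(target=target, args=args)
               for args in args_list]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


arrays = [np.arange(10_000_000) for _ in range(4)]

timed('python, sequential', lambda: [py_sum() for _ in range(4)])
timed('python, threaded  ', lambda: in_threads(py_sum, [()] * 4))
timed('numpy, sequential ', lambda: [np_sum(a) for a in arrays])
timed('numpy, threaded   ', lambda: in_threads(np_sum, [(a,) for a in arrays]))
```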
So, lastly, we come to the fourth level of parallelism within Python, though not unique to Python, which is asyncio. I think this is really cool; I am obsessed with asyncio, I think it's wonderful, and I will try to persuade you that it's the way to go for lots and lots of the things we do. It's cooperative scheduling: we have one kernel thread, but within that we have some wonderful tools that allow us to seem like we're doing things at the same time, when in the background we're actually only executing one bit of code at a time. To do this we have an event loop that's effectively scheduling tasks in a way that keeps something happening all the time. I promise I won't keep pushing the metaphor any longer after this, but without asyncio, you see here, when we're doing networking, our thread has to stop, because it is waiting for the networking to come back and give us a response; that thread, and perhaps that whole process, has to stop and wait for the networking to have finished before it can go on and do something else. With asyncio, on the other hand, our thread can carry on processing as networking tasks are going on, because our event loop is doing a clever job of scheduling tasks to fill in the gaps. So, an example. First of all, you immediately see it's already shorter than our examples before; we don't have to do half as much faff and setup. We do, however, have to call our coroutine using, in this case, asyncio.run. If it was JavaScript, you could just set off your coroutine and hope for the best, and it would finish in the end and no one seems to mind; in Python, you have to either await a coroutine or set it off like this if you haven't got an event loop running. It's simply calling our coroutine, which I'll get to in a moment, putting the results, which are futures, into a list, and then, lastly, using the special coroutine asyncio.gather to wait for the results of those four coroutines, and once they've finished, proceeding. So how does count words work? And here we get to the big problem with asyncio: we can no longer use requests; we've had to rewrite this function entirely. In this case we're using aiohttp. We have to create our session explicitly, where requests created it implicitly; then we do our get request, and what we get back is a context manager, an asynchronous context manager; we get our response, we can then await reading the text of that response off the network, and finally we can do the same thing as before and print the number of words on our page, and you see the result again. So, asyncio: even lighter than processes and threads; we can quite happily have, say, thousands of web sockets connected to a single host, processing all of them without enormous amounts of CPU or memory usage. It's a lot easier to reason with, because you are explicit about where you're going to go and do some networking, where your current piece of code is going to release and do an await, and so other code might get executed when you're doing networking, and not otherwise. And there's technically less risk of memory corruption, because we're only ever running one bit of Python at a time. The disadvantages: we don't get any speed-up of CPU-bound work at all by using vanilla asyncio. But the real problem is that it's a whole new way of thinking, and in general you have to rewrite applications. It's possible in theory to adapt them, but in general I think you basically have to abandon an existing project and start again if you're going to use asyncio all over the place; you might be able to get away with using it in a few places, but in general it's a whole new rewrite. The point here is that the whole brilliant thing about asyncio is that it's explicit, but that means it can't be implicit. You can't have some library that wraps around asyncio; there was someone asking on python-ideas recently, can't we make it implicit; the whole point is that it's not.
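A sketch of the asyncio version just described; URLs are illustrative, aiohttp is assumed installed, and asyncio.run needs Python 3.7+.

```python
import asyncio

import aiohttp


async def count_words(year):
    # the session is explicit here; requests created one implicitly
    async with aiohttp.ClientSession() as session:
        async with session.get(f'https://ep{year}.europython.eu') as resp:
            text = await resp.text()   # reading the body is awaited too
    print(year, len(text.split()))


async def main():
    # schedule four coroutines, then wait for all of them to finish
    await asyncio.gather(*(count_words(y) for y in (2016, 2017, 2018, 2019)))


asyncio.run(main())
```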
The point where it gets really tricky is where all of these four levels of parallelism interlink with each other. So, machines: the RQ example I showed you, RQ actually does forking in the background to run its worker. The multiprocessing JoinableQueue that I showed you was in fact using a thread in the background to put things into the queue. asyncio has ThreadPoolExecutor and ProcessPoolExecutor, which I'll show you in a minute, and when machines are communicating with each other, because it's networking, you then want to go into the asyncio world, and ARQ and aiohttp do that. So all of these things interact, and it can get a bit confusing where we are. I want to talk about one of the uses of asyncio that I don't think enough people are talking about, which is as a sane way of doing multiprocessing and multi-threading: particularly multi-threading for file operations and multiprocessing for CPU-bound tasks. You get all of the performance improvements from threading or processes, but from the comfort of asyncio, and it's much easier to reason with; there's a sketch of this pattern just after this summary. So, an example. We're using our same do-counts as we had just now, in NumPy, so we know that it's a candidate for multi-threading, because it releases the GIL. But instead of just calling our coroutine, we now have to create this ThreadPoolExecutor, which creates a pool of threads in which to run our tasks, and the clever bit is run_in_executor, which returns an awaitable that completes when the task has finished within the thread. And there's a ProcessPoolExecutor, which has exactly the same usage, just a different name, and obviously creates multiple processes and does it that way. So we create this list of awaitables, again gather them, wait for them all to be completed, and hey presto, we get a time: again we get the speed-up of multi-threading, but from the comfort of asyncio. So, in summary. I think I've probably not taken up half enough time, have I? I don't have a clue. We've talked about the four levels of concurrency; we've said that they're all possible with Python, none of them are unique to Python, but they're all possible. asyncio: I think today Python is probably leading the way, in its ease at least; even though it came to the party late, I think it's kind of accepted as one of the best implementations of it; I definitely think it's cleaner than what's going on in other languages, except arguably JavaScript, but that has its own problems. They all have their strengths and weaknesses, and the key thing is to work out which one you want to use for a particular application, and to remember that they often interact with each other, so they're not unique; they don't get to stand in their high castle and be on their own, they're all interlinked with each other. But the real point I was trying to get across today is that there's this landscape of different processes, and "processes" is a bad word to use, of different tools out there, and you need to have a bit of an understanding of what they're doing. Just taking the first working example off the top of the page, putting it into an editor, pressing run and seeing what happens gets you a long way; it got me to a company that pays my salary; but it's not always the best way, and it becomes a problem when everything goes wrong and you're trying to understand what's happening and you have no grounding, because you've just taken the example and got it to work, which is definitely what I did the first time round. So thank you very much, and I guess we've got lots of time for questions.
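The executor pattern referenced above, as a sketch; NumPy is assumed, sizes are illustrative, and you'd swap in ProcessPoolExecutor for CPU-bound pure Python.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor  # or ProcessPoolExecutor

import numpy as np


def do_count(array):
    return array.sum()            # NumPy can release the GIL here


async def main():
    loop = asyncio.get_running_loop()
    arrays = [np.arange(10_000_000) for _ in range(4)]
    with ThreadPoolExecutor() as pool:
        # run_in_executor gives back awaitables that complete in the pool
        results = await asyncio.gather(
            *(loop.run_in_executor(pool, do_count, a) for a in arrays))
    print(results)


asyncio.run(main())
```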
Having said that, since we've got a couple of minutes, I'll do a little tiny bit of advertising of some packages I've built. ARQ is a successor to RQ, but it uses asyncio, so it uses the async bindings for Redis, and it allows you to enqueue tasks from an aiohttp application or similar. It also has some other useful features: it has this principle that every job has to be finished, so it might be run multiple times, but it has to be executed. It doesn't actually use a list, it uses a sorted set, which means you can enqueue tasks to be run at some point in the future, and if they get stopped, it automatically re-runs them when it comes back up. devtools I think is the most interesting thing I've ever written, and no one seems to care at all, so I'd love your feedback on it. It's basically a better print command that tells you the line where it happened and what you printed, and prints it in a pretty way. I use it all the time, but I've failed so far to persuade anyone else it's interesting. And Pydantic is quite widely used: data parsing using Python type hints. Thank you. Now, questions; lots of time for questions. We also have two microphones over there, you can probably see them, so please line up behind the microphones if you have questions, and we'll be able to take quite a number of questions. I see we have one question, so go for it. Maybe I didn't understand you well, but you said there is no good tooling to do machine-level parallelism; as I understand it, Celery is exactly the tool you can use to run your parallel workers, either on a single machine or on a lot of machines. So what I was saying is that, built into the standard library, there's no way of doing the cross-machine communication over HTTP or some other protocol. There are some great libraries, but they're not built into the standard library. I think that's actually one of the reasons they've been so successful: external libraries have to compete on being really easy to use and on iterating quickly and taking advice, whereas a standard library has to be slow-moving, has to be sure, and can't respond to advice half as quickly. So actually I think it's in some ways a good thing. Maybe multiprocessing would be way easier if there were the equivalent of requests: one library everyone used that was designed to be super easy. Okay, thanks. So I have two questions regarding RQ. I'm already using RQ, and I was going to ask, first, does it make sense for me to switch to ARQ, is that a drop-in replacement for RQ? That's the first question: I'm using RQ now, can I just switch to ARQ for my completely synchronous code, does it make any sense to switch to ARQ for this?
I mean, you could do, if you want its more advanced features, like running tasks at some time in the future and re-enqueueing the job if the worker shuts down. You might want to use a thread pool executor from within a particular job to do that job in parallel, but in general ARQ, actually the same as RQ, only runs per process; well, RQ runs one job at a time per process, and it expects you to run another Heroku worker, or whatever it might be, or another job in another terminal, to run multiple workers in parallel. ARQ will run up to 60 jobs at the same time using asyncio, but obviously if it's not networking, not suitable for asyncio, then only one is actually going to be running at any one time. That actually was my second question: so if I still have just synchronous code, it will still be running one job at a time unless I fork multiple workers anyway. Either you run multiple workers, or from within a job you call a process pool executor or thread pool executor. Maybe we're finishing early; never, no. Any further questions, folks? Oh yeah, go for it, I can give you the microphone. Is there any advantage to having a flattened list of coroutines that you want to run, as opposed to calling a couple of different coroutines which themselves gather a list of coroutines, or does it not matter, as long as it's running on the same event loop? It basically doesn't matter. I guess there is some overhead to running a coroutine, but it's so small that if you're worrying about that kind of cost, you should probably be trying C or Rust or something. In general, don't worry about it, that's not going to be your problem. I mean, at some limit, as I say, at some limit it will become one, but at that point you're going to do it another way. Any further questions? Otherwise I have questions I can ask. Can you tell us a little bit about a sample use case, a real-world use case, for, say, ARQ, like how you use it in your job, if you can talk about that? So we use it for sending emails. At TutorCruncher, the company I run, we send, I guess, about a million emails a month, not a great deal, but at points it gets to quite a high load. We are currently tethered to Mandrill, although they are awful and I hate them, because our 300 customers have all set up their DNS records to send emails from Mandrill, so moving over is going to be hell. For about 5% of the emails you try to send through Mandrill, you get back a 502 or 503 or just a broken HTTP request, and so we're using ARQ both to go and send lots of emails quite fast, but also to back off and retry those jobs when they inevitably fail, quite a lot. And so does ARQ, for instance, have the facility to re-queue the failed jobs for you? Let me try and get on the internet. So here's an example of ARQ, which is not actually that different from what we were looking at earlier. We have some tooling ready for setting up the things we're going to need when we're running jobs; a bit like aiohttp, where you have startup coroutines for setting up, say, your database connection, you can do the same thing here. You have startup and shutdown, where we can add to this context, which is the first argument to any job, any function we set up. And I'm trying to remember here if I have an example of retrying jobs. So basically there's an exception that you can raise which will retry the job, and that is what is raised if you shut down the worker and the jobs haven't had time to finish.
So the problem with RQ was shutdown behaviour: I rewrote the Heroku worker, which basically deals with shutdown behaviour on Heroku, because Heroku workers shut down fairly regularly, and that was killing us when generating invoices, for example, which is one of our slower jobs. So when I built, or rebuilt, ARQ, I built in this principle that your job might run twice, but it will always run at least once. Your job has to take care of the fact that it might run more times, but if the worker shuts down, the job will get re-enqueued. OK, and in case your job runs multiple times, I guess you get only one result? If it runs twice, it's your job then to use an idempotency key, or to use a transaction, or to do something in Redis to say "has this job already started?". That's your problem — there's a kind of principle that you can never have exactly-once. To take the example of sending your customer their invoice each month: they would get a bit confused if they received it three times, but that's still better than them not receiving it at all, and that's normally the case. Any questions, anybody? OK, so one last question, and then I'll ask the next speaker to please come up slowly and set up — there is no next speaker. Do you have any experience releasing the GIL in C extensions and stuff like that, writing those? If so, can you speak about how you do it, and whether there are tools one can easily use? So I haven't released it in C extensions myself. In Pydantic we've just had a big effort as a community — a lot of people did lots of work on it, which was really cool to see — to Cythonize it, which made it, I think, about 50% faster for lots of stuff. There are some tweaks you have to make to the Python, but it's still valid as normal Python, so in environments like Windows, where those binaries are not available, it still just works exactly the same. So not directly, no. OK, that's cool. If there are no more questions, we can thank the speaker. Thank you very much.

OK, next up is a presentation of a cool new hardware technology from Intel to accelerate deep learning — and yes, if you want, you can move to the front row, just to be cozy — but anyway, let's give Shailena a warm welcome. What about the rest of you guys — AI? So, deep learning, because my talk will be focused around deep learning, which is a very specific field of AI, and I'll go even more specific, on how to accelerate inferencing, which is one of the important stages in deep learning. It is a fairly advanced talk, so I will tone it down just so that it's easier to digest. So, this is the outline of what I'm going to introduce to you: what is Intel Deep Learning Boost? It's a fancy new hardware technology; I'm going to highlight what it is and why it is super useful, explain the new vector instructions that we included in our latest hardware, which we recently released in April, and show you in a live example why it is so nice. All right, cool. So what is Intel Deep Learning Boost? It is a set of new AVX-512 instructions in our latest hardware, which we recently released; the code name is Cascade Lake, the second generation of Xeon Scalable processors. These Xeons are really powerful, really good, really fast, and great for AI, and we are really proud that this new set of vector instructions allows you to do inferencing really fast. Those instructions are called Vector Neural Network Instructions, or VNNI for short, and VNNI in itself gives you close to a 2x boost on inferencing.
Now what is inferencing? Inferencing is prediction. You have your deep neural network, it has been trained on your data set, and the next step is to make the neural network think and make decisions. Making those decisions is what we call doing predictions — or, the technical term, inferencing. So we want to make the neural network decide faster, make decisions faster, infer faster, and that is what we are bringing to you: hardware technology to make decision making faster. So — yeah, that was the summary of what I just said. Let's go back to the deep learning foundations. What is involved in deep learning? There's a lot of math involved. If you look at a typical convolutional neural network, where you have a filter looping over your image, there's a lot of math: cells being multiplied with others and then the numbers being added together. Lots of multiplications, lots of adds, and this is heavy on a compute unit. Since we make processors, and what processors do is compute, we need to make sure that we can compute really fast. In traditional use cases, especially in high performance computing, the data type is usually float32 — 32-bit numbers — and we think we can do things faster if we change the data type; I'll say more about this in a second. So recall the convolutional network I mentioned — these are really popular. If you look at the animated part, this is a filter going over an image, and as it goes looping over that image: lots of muls, or multiplications, and lots of adds, and we want to make this faster. So why do we need Intel Deep Learning Boost? A key concept here is quantization. Now what is quantization? Consider the number 96.1924. If you represent this in float32, 32 bits, you take up a lot of space in memory and a lot of space in your registers to represent that number. But the bulk of the information is actually in the 96; maybe you don't care about the .1924 — it's insignificant compared to the 96, right? So if you represent this number as an integer, you spend fewer register bits on it: only 8 bits to represent the integer 96, and that's great. The benefits that come out of this: you use less power, less CPU energy, to handle that number; you lower the memory bandwidth; and you lower the storage — recall that earlier you had four boxes to represent the 32 bits, and now you have only one, so less storage for that. All of these add up to the benefit of higher performance; that's the idea behind quantization. And as you recall, we had the number 96.1924, and we are now reducing its precision to just 96, so we are losing a little bit of accuracy; what is important is that we don't lose too much accuracy.
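As a toy NumPy sketch of that idea (the tensor values and the single shared scale are made up; real calibration derives per-layer scales from sample data):

    import numpy as np

    weights = np.array([96.1924, -3.75, 14.2], dtype=np.float32)

    # pick a scale so the largest magnitude maps near the int8 limit of 127
    scale = 127 / np.abs(weights).max()
    q = np.clip(np.round(weights * scale), -128, 127).astype(np.int8)

    print(q)          # int8 values: one byte each instead of four
    print(q / scale)  # dequantized: close to the originals, minus some precision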
Now, a quick intro to these Vector Neural Network Instructions that we introduced. Recall that convolution — the filter going over the image, doing lots of multiplications and lots of additions; we have vector instructions in the hardware to do those multiplications and additions faster. So, on the first line over there, that's the first generation of Xeon Scalable processors, code named Skylake. If you have two 32-bit floating point numbers, you use one instruction to multiply them and you get a 32-bit output. If you do this in lower precision — int8 — you have two int8 numbers being multiplied and then accumulated, and you actually use three instructions: three instructions to multiply two low-precision integers and add them into a 32-bit output. So we said: you are using three instructions, you are spending a lot of CPU cycles to do that — can we do better? The answer is yes. In our second generation Cascade Lake processors we combined those three instructions into one, and this is the result: the same thing that you did with three instructions, you do with just one — effectively, fewer CPU cycles spent on this process of multiplying the int8 values and accumulating into higher precision. Now, you as a software developer, maybe you don't care — this is hardware; why do I need to know about this hardware? Is there software out there that will just do this out of the box, easy for me? The answer is yes — we thought about you guys. For software developers we contributed, or brought to the market, a product, a tool, that can handle int8 for you, so a quick introduction to it: OpenVINO. In a nutshell: when you have done the training of your model — in TensorFlow or Caffe or MXNet, whatever framework you have used — you have obtained your trained model. What OpenVINO does is take that trained model and send it into a component called the Model Optimizer, to make the model more efficient and more CPU friendly, let's put it that way. The result is an intermediate representation, marked as IR: a combination of two files, an XML file and a bin file, which contains the weights of your neural network. Traditionally, in normal OpenVINO, that model will be in float32, and then you do your inferencing on that model. That's the traditional way: first step, get the trained model; optimize it with the Model Optimizer; get the intermediate representation; and then do inferencing with the Inference Engine. The Inference Engine is the component of OpenVINO that lets you do inference on any type of hardware you have — be it a CPU, an integrated GPU, maybe an FPGA, or even the Movidius stick, that little blue stick you may have seen, which lets you do inferencing on the edge: for instance a drone that's flying, or a little robot going around — you can do inferencing on that mobile device, which is really great. So that's how nice OpenVINO is. Now, a step back: this talk is about low precision inferencing, so how does OpenVINO fit in there? For low precision inferencing, OpenVINO takes the 32-bit intermediate representation and uses a component called the Calibration Tool, which calibrates this 32-bit representation into an int8 model; once you have this int8 intermediate representation, you do your inferencing in low precision. That's the big picture. The calibration part is done once — we call it an offline stage; once you have this intermediate representation, you can store it on your robot or your drone — while the online stage is the live process that runs on the device.
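To pin the arithmetic down, here is a toy NumPy model of the multiply-accumulate pattern that VNNI fuses — it shows what gets computed, not how the hardware executes it (the real instruction chews through 64 byte-pairs per 512-bit register in one go):

    import numpy as np

    a = np.random.randint(-128, 128, size=64).astype(np.int8)
    b = np.random.randint(-128, 128, size=64).astype(np.int8)

    # int8 products accumulated into a wider int32 result: three
    # instructions on Skylake, a single fused instruction on Cascade Lake
    acc = np.sum(a.astype(np.int32) * b.astype(np.int32))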
Anyway, enough said about that — let me show you live results of the benefits of low precision inferencing. I will show you two cases: in one case I do inferencing with the float32 model, and in the other case I show you inferencing on the same data set, but in low precision, int8 — let's see the difference. I have an example here, which I will move to the screen — my mouse went over there. In this Jupyter notebook I'm going to show you inferencing in float32, and see how fast we do that, and inferencing in int8, and how fast we do that. Let me maximize the screen — view, toggle header — cool. This is a very simple algorithm: I'm just inferencing on cats and dogs, an open data set, and I'm using Intel OpenVINO for the inferencing. The Model Optimizer will convert my ResNet model into the intermediate representation — this is done, great. The next step is to declare the network and so on — great, I do that, it's done. Next, import matplotlib, which will take care of loading and plotting the results, and right now I'm processing the images — now it's done — and you can see that I'm proceeding with inferencing on my image data set. There are cats, there are dogs — really cute cats, really friendly dogs in there — but pay attention to the numbers: what is my rate? This is the 32-bit representation of my network, and I'm inferencing approximately 300 images per second. Maybe you'd be happy with that — can we do better? Yes, by leveraging the hardware technology. The Calibration Tool, if you recall, was the part that converts the 32-bit model into the int8 model, and I've already done that, so in the interest of time I'll proceed with showing you the int8 model. I define my target device; as you can see, the network is int8, cool, and I'm loading the plugin and loading the model and so on and so forth. Now let's proceed with inferencing: it's the same data set, same cats, same dogs, but as you can see I am inferencing faster — approximately 600, close to 700, images per second. We can put this in a table to show the difference between the two: inference speed in 32-bit, approximately 300 images per second, and in int8, low precision, close to 700. So what's the key message here? Leveraging low precision inferencing, boosted by our Vector Neural Network Instructions on Cascade Lake, we got almost twice the performance on inferencing — isn't that great? Low precision software techniques boosted by hardware: twice as fast; I think that's nice, and that's the key idea I wanted to share with you. If you have the chance, use Cascade Lake — Cascade Lake is available on Amazon Web Services right now, the only cloud that currently offers Cascade Lake with the VNNI instruction set. And for the deep learning people in the room: take a look at low precision inferencing, from 32-bit, the usual case, down to int8. There are corner cases where int8 won't work — for instance when you really care about precision, say you are looking at cancer cells in an MRI image, where very specific details are really key and important, then maybe not — but in other cases, like images of cats and dogs, or maybe language processing, sound and so on, where precision may not be that critical, int8 is a great boost; it gets stuff done faster. That's it — thank you.
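A condensed sketch of what the notebook was doing, written against the Inference Engine Python API of roughly that era — the file names, device and input shape are placeholders, and for the int8 run you would point it at the calibrated IR instead:

    import numpy as np
    from openvino.inference_engine import IECore

    ie = IECore()
    # the IR produced by the Model Optimizer: .xml topology + .bin weights
    net = ie.read_network(model='resnet.xml', weights='resnet.bin')
    exec_net = ie.load_network(network=net, device_name='CPU')

    input_name = next(iter(net.input_info))
    batch = np.zeros((1, 3, 224, 224), dtype=np.float32)  # placeholder image

    result = exec_net.infer({input_name: batch})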
OK, we have some time for questions; I will pass by with the microphone. Hi — thank you, first, for the talk. What's the trade-off in precision? If you compare the floating point 32 model and the int8 model, you showed the increase in speed but not the decrease in precision. Yeah, there is some loss of precision, and it all depends on the nature of the task you're doing. In my case I was classifying cats and dogs, and I did lose some precision — some — but it didn't affect my results. How much you lose depends on your use case; I cannot give you numbers, because that's very specific to the nature of the problem — you could have some kind of validation metric — but OK, in my case, for the cats and dogs example, my loss was not that great; not too big a difference, for me it was OK, totally OK. It depends on your use case. Thanks. So, if I understood correctly, in the demo you showed, you compared floating point without the special instructions against int8 with the special instructions. Have you tried benchmarking int8 without the special instructions, just to see whether the lower memory footprint helps? Of course — let me show you this slide again... this one, in grey — oops, OK, one more — OK. In grey, that's Skylake, the previous generation — still a great Xeon processor — and the one released this year is Cascade Lake, in yellow; the difference between these two is that Cascade Lake comes with the Vector Neural Network Instructions, VNNI. The comparison you're referring to is comparing the last pair: Cascade Lake versus Skylake. Yes — and there, in my measured results, I got close to a 2x performance boost; that was my real-world result. On paper, in theory, it should be 4x — if you do the math, calculating how many instructions you use to do the muls and the adds versus how one instruction does it all, it's 4x on paper; on my machine, when I measured it, I got 2x, 2.9x. And yeah, I think I also have it somewhere here... speedup... OK, let me move that, hold on — you see the difference there: the speedup was 2.3, comparing int8 and fp32; comparing the two int8 runs — int8 on Skylake and int8 on Cascade Lake — in another example it was also around 2x. Right now I'm comparing 32-bit on Cascade Lake against int8 on Cascade Lake; if you compared 32-bit on Skylake against int8 on Cascade Lake, then it's 4x. Any other question? You skipped the calibration part here — you said you had done it already; how much time does that usually take, and could you share what it's based on? So, the calibration takes some time, around 10 minutes-ish, depending on how big your model is, and because I don't have 10 minutes here, that's why I did it offline. For a model that was around 50-60 megabytes it was under 10 minutes, which you do offline, one time, and then you're done. The goal is that on the edge, in real life, you're not missing big quick decisions — this has to be fast. Sorry, Mike — your calibration is based on what? The calibration is basically taking the 32-bit representation of the weights and so on, and representing them in int8. You know, in a neural network you have the nodes, the weights and so on; these numbers have been represented in 32 bits, they take a lot of space, and going through all of that, doing the calculations between the nodes, takes a lot of CPU cycles. So the idea is to convert this whole thing — well, certain layers — into int8; actually, not all of the layers are calibrated to int8, just some of them. Right, we have one question here. The reduction from 32-bit floating point to 8 bits — is it always possible? Does the algorithm always succeed? And if it does, is it up to the person who created the model to evaluate whether it is good enough? Yeah, excellent question. To refer back to the last question: not everything — some layers; it depends. OpenVINO will figure out whether it can or it cannot, and it also gives you a report: it has calibrated this layer, it has done that successfully, others it didn't. But at the end of the day, when some parts have been converted to int8, you should see some better performance out of the box, compared with fp32.
So its answer can be that "I cannot convert this" — that could happen when the model is very complex, or when OpenVINO doesn't understand your neural network, maybe. Any questions? Five minutes. I'm really happy there's so much interest in this — that's great, I didn't expect so many questions; you guys are great. So I have a question: you said this technology is available on Cascade Lake — are they for sale right now, can we get them right now? And will this be a technology that also arrives on the desktop, or is it going to be just on the Xeons, the data center type of thing? So right now this Vector Neural Network hardware technology is only available on Cascade Lake, and future Xeon processors, of course. When it will come to desktop, I don't know, but it's on Cascade Lake now. Hardware we sell — you mentioned sale, right? Hardware is something we sell, but OpenVINO is open source technology; you can read the source code online if you want, and our AI software tools are free and open source — we don't sell these. Any other question? Yes, one question: are you planning to have this in MKL? Are you going to add this operation to the math kernel library — like, if you want to use it from, I don't know, standard Python, for the next generation of hardware? Well, MKL can treat int8, but will you have, say, a convolution operation directly implemented in MKL that's optimized using this set of instructions? Michelle may want to add something on this one. You're asking about MKL — there's also MKL-DNN. I'm sorry for butting in, but I also work for Intel, and for this question I think I know the answer. MKL-DNN is the extension of the MKL library for these deep neural network operations, and it already has support for int8 operations; we're adding new algorithms as the architecture evolves, and the VNNI instructions are already there. So I guess these instructions can be used not only through your tool — for example, a compiler could generate assembly, or binary code, that uses them? So, OpenVINO is one of the software solutions we provide that's already taking advantage of int8, and this process of converting models to int8 is actually kind of special, because it's almost automatic — it's almost like magic, you don't need a data scientist to tweak it manually — that part is OpenVINO. But the MKL-DNN instructions are available to anybody, and there are other solutions we're also working on at Intel — for example, a graph compiler called nGraph is also taking advantage of these instructions through the MKL-DNN implementation. So yes, many frontends can use this; the direct optimizations that Intel provides for libraries like TensorFlow will also use these instructions if you compile with the VNNI architecture extensions. So, yeah. One last very quick question — if not, we can thank the speaker. Again, the speaker — thank you very much, very interesting.

So, hello everyone, and welcome. Next up, Lillian Nandi will teach us about teaching coding to the next generation — please give her a big hand. I'd like to begin by thanking the organizers, and also you for being here; it's a great privilege to speak before an enthusiastic and knowledgeable audience. Now, in this modern Anthropocene era, education in computer science and computer programming is a must; it is the fourth R — arithmetic, reading, writing, and computer programming. And is it not said that
education is a process which discloses to the wise, and disguises from the foolish, their lack of understanding? And there is no better subject to illustrate this than computer science or computer programming. So I'd like to share a story with you, about a journey of developing a working, fit-for-purpose, successful model for teaching computer programming to the next generation, and I believe it is important, as a society and as individuals, to have some insight into this process. So, having joyfully accepted the position of head of department of computer science at an independent secondary school in the UK, after my initial euphoria had subsided I immediately began to think, with a great deal of trepidation, of the heavy challenges that lay ahead. The challenge was that I would be acting as a one-person band, single-handedly teaching all year groups, from year 7, who are aged 11 and 12, to year 13, who are aged 17 and 18 — about 110 students in all — and I would be teaching this newly introduced subject to school students not only in this particular school: it is a newly introduced subject in the nation, and also a newly introduced subject in the world. A number of people had resigned even before starting the job, and I was under no illusion that the job would be arduous, frustrating, back-breaking, political, with heavy red tape. However, I am of the belief that problems are important to solve, and they should not be insurmountable. The challenges of teaching computer science and computer programming are very well acknowledged: in The Economist there was an article which said the subject is so young that teachers and curriculum designers have little pedagogical research to guide them, and Satya Nadella, the CEO of Microsoft, recently said in an interview that the fact that most curricula in schools still don't recognise computer science like they do maths or physics is just crazy. Now, as mentioned, we should not think of these challenges as insurmountable. We are, after all, dealing with computers and technology, which is one of the world's most dominant and powerful forces in the current era, and also the world's positive, disruptive technological force: it is completely overturning centuries-old thinking and approaches in the way we tackle problems in the scientific, artistic and commercial worlds, and replacing them with radical, innovative and more successful approaches. So in turn, I thought, we can think of the teacher of computer programming as another positive, disruptive force, undertaking such a venture in the spirit of an entrepreneur, with a mindset of freedom and independence, which is to be welcomed, following in the footsteps of our heroes. Now, an entrepreneur requires a business plan, with a vision and a mission, objectives, methodology, unique selling points or USPs — so we too can mimic that model. Let us look at what we have under each heading. Vision and mission: what vision should we adopt as educators? What vision should we transmit to the young people, the students of the next generation? We need a vision which everyone can buy into, and I believe we are privileged in this subject to be able to formulate a lofty vision — how many other subjects can say the same? So the vision is: to create the next generation of Bill Gates, Steve Jobs and Elon Musk. We shall see later how this vision appeals tremendously to young people — and it's not just boys, it's girls as well, in equal measure. Let us look at the unique selling points, USPs — our sales and marketing angle. Now, we have to take note
of the fact that this subject of computer programming is competing against other, more established disciplines with long and time-honoured traditions, such as mathematics, Latin, history. We must also be mindful of the fact that there are a number of stakeholders in this — the educators, the parents, the children — and so the key here is to generate USPs which persuade all stakeholder groups; surprisingly enough, their interests can actually compete against one another, and I'm going to leave you to guess which group is the most enthusiastic and embraces this the most. The first USP is that computers are ubiquitous and prevalent in most, if not all, sectors of our modern society. Applications involve medical research, weather forecasting, robotic surgery, space exploration — in the UK we had the astronaut Tim Peake going to space — e-commerce, scientific research such as the experiments carried out at CERN, which is where the World Wide Web comes from, and it was Tim Berners-Lee from the UK who developed that, which appeals tremendously to our young people, especially in the UK — driverless cars, etc., etc. So computer science is now regarded as one of the leading disciplines of the 21st century — many of our children were surprised at this — and indeed, if computers hadn't been invented, science may have ground to a complete halt in the second half of the 20th century, some would say. As a consequence of this, coding, or computer programming, is now regarded by many as an essential skill for any aspiring, ambitious, self-respecting young person in an aspiring nation, and it has been dubbed the fourth R, along with reading, writing and arithmetic. In recognition of this new status and the huge significance of computer programming, governments worldwide have launched initiatives to have it taught in schools, starting from the beginning of the school career in kindergarten, through junior school, all the way to secondary school. The regions in red are where the intention is that computer programming is taught from kindergarten through junior school to secondary school, and we can see nations on all continents — America, Asia, Europe, etc. So this USP appeals to the educational community. Now, the second USP is the major role that technology plays in the world economy. The market capitalization of the FAANG stocks — Facebook, Amazon, Apple, Netflix, Alphabet — is bigger than the economy of some countries; indeed, if it were a nation, it would be the fifth richest nation in the world, and it would be eligible for entry into the G20. The FAANG stocks are in red. Now, this was devised to appeal to the parents and guardians, who can be most relieved to hear that their offspring are studying a subject for which decent organizations will be waiting eagerly for their offspring's services; but when I showed it to the children, they all gasped in absolute delight and glee that they were actually studying something of such value and enormity. And the third USP is the financial rewards for studying it. It should be noted that of the top 100 richest people in the world, a substantial proportion are involved with computers — 20%, about 1 in 5 — and when we look at the 10 richest people in the world, substantially more are involved in technology: we have Jeff Bezos of Amazon, Bill Gates of Microsoft, Mark Zuckerberg of Facebook, Larry Ellison of Oracle. And this really did capture the imagination of the young people, because they thought they could also join this club — and the girls as well. So, how do we approach the teaching of
it? Well, as a first port of call we can look at the national curriculum. Most nations in the world have a national curriculum, or something equivalent to it; in the UK we have a national curriculum for every subject — a document detailing what should be taught in that subject — and this subject also has a national curriculum. But after reading the document, and rereading it, and a great deal of soul-searching, it was decided that the UK national curriculum in this subject, although laudable in its intentions, had a few design faults and process faults, and would be used as one of a number of guides: the national curriculum would act as one of many guides, but not be taken as the definitive authority on the teaching of the subject in schools. What about resources, such as textbooks? Well, a decision was made to equip all students with a textbook and an exercise book, as note was taken of the fact that most serious academic subjects have both textbooks and exercise books associated with them. Now, it might surprise people to know that it's not uncommon in this subject to operate at school level without a textbook and exercise book, and this has not gone unnoticed by the students, who comment on it informally. Unfortunately, no appropriate textbooks were available for year groups 7 to 9 — ages 11 to 14 — so worksheets, a website and a YouTube channel, which I've entitled Little Anonymous, were all constructed to develop the resource base to be used by the students. The YouTube channel currently contains over 70 videos for high school children who want to learn about computer science and computer programming — 12 videos about Python programming so far — and I have discovered that my university lecturer friends are also using them too. There's a website and the YouTube channel, with 65 subscribers so far — you are welcome to subscribe. So how should we start teaching school children programming? Let us be more specific: how should we pitch the lessons to year 7 students, those aged 11 and 12, at the start of their secondary school career, in particular? Should we be teaching drag-and-drop languages, such as Scratch, as a first teaching language? To find answers to these questions, a study was made of what the young students learn in other subjects at this age, and note was taken of the fact that it's not unusual for these students, in maths, to solve simultaneous equations; in English, to analyse poetry such as Rudyard Kipling's highly esteemed poem "If"; in English literature, to study plays such as A Midsummer Night's Dream, by William Shakespeare; and at the moment they are being asked to write essays on the advantages and disadvantages of Brexit. From this information I surmised that young people at this age are comfortable with manipulating symbols and dealing with sophisticated texts, and therefore concluded that they should be able to cope well with a textual programming language, perhaps such as Python, as the first programming language. So a decision was made to choose Python as their first programming language, and the decision was made on this basis: I wanted to teach a programming language which is in demand by the employers of today, out of the dozens of choices, and at the time of writing Python was the most widely used teaching language worldwide in schools and universities. It's also very widely used commercially, by the likes of Google — and the children really like this — and there is a plethora of
resources, and many books being written on the subject, which means that the language has a strong support network and quite a strong infrastructure, and the language itself is under rapid development, with new libraries being released into the public domain at a regular pace. A Google search — that's in yellow — revealed a respectable number of hits, and an Amazon search revealed a respectable number of hits; admittedly there may be more hits for other programming languages, but still respectable. And a recent article in The Economist revealed that the number of people searching for Python on Google is going up at a tremendous pace, more than any other language, so we could say there is more growing interest in this programming language than in any other. But perhaps the most glowing endorsement of all: the number of Google searches for Python has outstripped those for the international model and TV star Kim Kardashian — what more could one want? So how should we introduce it? What about the teaching approach? A decision was made to employ a bottom-up approach to teaching computer programming, as opposed to a top-down approach. The justification for the bottom-up approach is that it's a tried and tested, successful, traditional method used in the teaching of computer programming to adults; foreign languages and mathematics have also traditionally been taught in this manner. A bottom-up approach is when concepts and ideas are learnt first, and these are then used to solve problems; a top-down approach is when a student is presented with a problem and then he or she tries to work out how to solve it. However, there was some trepidation as to whether the UK students would accept this bottom-up approach, as the modern trend in UK schools is to employ top-down approaches to studying subjects, and so it was felt that in order for this bottom-up approach to work, we would need explicit buy-in from the students, as they are relatively unaccustomed to it; therefore an explanation would have to be provided prior to any teaching. The explanation went like this: computer programming languages have an in-built grammar, and they can be thought of as analogous to human languages such as English, French, Chinese, German, Italian; and just as we communicate with each other in human languages, we communicate with the computer in computer programming languages, of which there are many, and we are going to be using Python. Then an explanation was provided that just as we have essays, they are analogous to programs; we have paragraphs, they are analogous to functions; we have sentences, they are analogous to statements; and we have words, they are analogous to keywords. And we would be taking a keyword at a time, learning its uses and its definitions, and then building up to writing programs. Now, this bottom-up, teacher-led approach to teaching computer programming appealed tremendously to the younger students and their parents, and the idea is that the basics are strong, so that the school children become not only confident but competent, and happy to tackle any given programming problem — it is teaching by strengthening the fundamentals. I was quite often asked by the students, "are you fluent in Python?", just as they ask "are you fluent in French?", and at parents' evening, parent after parent said how much their children were enjoying the subject — in fact, how much they loved the subject — and I felt a huge sense of delight, because now we were
competing with music and history and Latin and all the other subjects — and actually I think we were ahead of them, but I didn't tell anyone that until now. So let's have a look at some example programs written on a whiteboard by some year 7 students, children aged 11. These are written by Thérèse: she's got "for J in range 1 to 11, print J", which prints the first 10 numbers. It took them about 5 minutes to master the concept of a for loop, and this was en masse. Next one: "for J in range 1 to 101, step 2, print J", printing the odd numbers up to 100 — and they squealed in glee when they saw these numbers coming out. And then here we have a nested for loop, a loop within a loop, for program 3, and the children were quite comfortable with this concept too, once it was explained. Beforehand I had thought of it as a very advanced concept, but after they understood it just like that, and said to me "what's the big deal about it?", I thought, yes, maybe there isn't such a big deal here. This is from Jerome, aged 12, and he has written programs using functions. We explained the concept of a function, and how it can return a variable, and showed them an example program adding two numbers; they subsequently wrote programs to multiply numbers and divide numbers, and they played around with the concept of functions as well — and again, it didn't take them very much time to master this. Here, this is their examination, and this is Harry, aged 12; you can see he's quite comfortable with the concept of div and mod, and quite comfortable with terminology such as "iterative statement", etc. Here we have Charles, aged 11, and he has written a program to find the circumference of a circle, and then a program to generate the first 5 square numbers, and he can do this by hand quite comfortably. And here we have May Ling, and she has provided definitions, in an exam, of "algorithm" and "decomposition". She's written here that an algorithm is a set of logical steps to solve a problem in a finite amount of time, and I think it's quite good that May and her classmates understand the importance of solving a problem in a finite amount of time. Here we have Boris, aged 12; in his examination Boris has been given a program and asked to identify input statements, iteration statements and data types, which he can do quite accurately as well — and remember, Boris is about 11 or 12, whereas similar types of questions appear in the GCSE exams for 15 and 16 year olds, but they can answer them at 11 or 12, I think, if they are taught. And finally, not least, Kate, also aged 11: she's been given a function and asked to dissect the various facets of the function, which she has done accurately as well.
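The students' exact code isn't reproduced in the transcript, but plausible reconstructions of the whiteboard programs described above would be:

    # Thérèse, aged 11: print the first 10 numbers
    for j in range(1, 11):
        print(j)

    # the odd numbers up to 100, stepping by 2
    for j in range(1, 101, 2):
        print(j)

    # program 3: a nested for loop, a loop within a loop
    for i in range(1, 4):
        for j in range(1, 4):
            print(i, j)

    # Jerome, aged 12: a function that returns a value
    def add(a, b):
        return a + b

    print(add(2, 3))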
Now, they were asked a question — it's very important what their motivation is — an exam question: "computer science is said to be one of the most important subjects that a young person needs to learn about; why do you think it's important to study the subject?" This was asked of 11 and 12 year olds, and their reaction afterwards was "goody, I like writing essays" — and for three marks, they did write essays. So what did they write? Let's have a look at some. Here we have Thérèse, aged 11, and she wrote: "The technological advances of the modern age have been spectacular, from simple things like Apple watches to the more advanced artificial intelligence robots. With these changes comes a great responsibility for us, as the future generation, to understand this science, so that we can continue to innovate and create. We must understand the inner workings of machines and question everything around us, in order to build on the foundations that have been laid down for us. When I was younger I used to believe that robots would somehow rule the world; in some sense I believe that, because if we can build a chess-playing AI robot, who knows what else humankind can come up with. We must continue this legacy of computer science. More and more companies are beginning to use technology in place of humans, and although this closes up some jobs, it opens up many more: they need people to understand, look after and check this technology in case something goes wrong. We can be those people. The future is us, the young people. We can make a difference. The future starts now." OK, let's go on to Boris, aged 12: "I think it is vital to study computer science as we are growing up. Technology is expanding and is becoming an essential part of our everyday lives. Nearly all of the famous and successful billionaires have made their fortune from making programs which have become used everywhere, so now we have a chance to become as rich and as successful as them." So there we have Boris, aged 12 — a precocious, intelligent, ambitious Boris. A casual observer, on reading this, said to me: the problem is that the ambitions, aspirations and vision of these space-age young students far outweigh and outstrip those of their career guides — and that was my father, Joy Devnandi. So what inferences did I make, or can we make, from this? Well, year 7 found computer programming easier than year 8, who found it easier than year 9, who found it easier than year 10, who found it easier than year 11 — and the message is that starting properly from the beginning is better. Point 2: points need to be explained properly. Students are happier with this teacher-led approach, rather than a student-led or independent-learning approach, at this early stage — they have told me this many times — and terminology should be introduced in this way, pegged onto examples and operational definitions. There is a crying need for decent textbooks and resources, fit for purpose. And of course, the best students are the ones who are motivated to do well. Now, quickly moving on: afterwards I moved to another institution; this institution took children from 2 to 18, so I was now exposed to even younger children — children aged 9 and 10, who are in year 5 — and I thought, well, rather than starting them off on coding — and there was red tape around introducing them to coding, because many people thought they were too young — perhaps we could introduce them to the actual history of computing. So what we did was introduce them to figures like Alan Turing, Steve Jobs, Ada Lovelace, Grace Hopper, Bill Gates, and we did something like history lessons; they did research and wrote up projects about these people in Word documents, which they also, incidentally, presented for 5 or 10 minutes as well. As far as the institution is concerned, they are learning Word and doing ICT; as far as I'm concerned, they're learning about computer science — so it kills two birds with one stone. Now, this is Matt, aged 9, and he did his research on Alan Turing. He was absolutely enthralled by Alan Turing, and so were the rest of the children in his class, and we introduced them to the Turing test, and they thought it was the most wonderful thing ever. What was quite interesting is that, a few months before, I had
introduced it to 17 and 18 year olds, and they just ticked it off on their syllabus; but to a 9 and 10 year old it was the most wonderful thing ever, and they couldn't stop talking about it. OK, now the other thing about this school was that I was on break and lunchtime duty, so I got to know these children in the computer lab quite informally as well, and they expressed their love of all things computing to me all the time. So we had a little girl, Shona, come in at lunchtimes to learn more programming. I showed her a for statement which generated the 12 times table, and she said, "oh, this is a great way to learn your times tables, but I know mine up to 12 — let me write one which will teach me my 13 times table." So she did this, and then she looked at the screen and she held out her hand and said, "I love this, I love this." They appreciate it with their hearts and their minds; it's quite something to see. Then we have Tom, aged 12, who was in year 8. Now, Tom told me he does his own personal projects in his own time. "What do you do?" I said. He said, "I do C# and Unity." "What kind of things do you do?" I said. "I do artificial intelligence." So I said, "can you send me a few lines of code?" So he sent me his ten lines of code — and these are the "ten lines of code": you can see four A4-size pages there. So Tom sat in the middle of the classroom doing his own work whilst the others did other work — and I sat him deliberately in the middle of the classroom so the others could see and aspire to be like Tom, and indeed they were really trying their best to be like him. And then he talked to me about artificial intelligence, and said he wanted to give a talk on it. I said fine, so he gave me an outline of his talk — that's his handwriting there, which is very nice — and you can see what he's put in his outline: machine learning, visual path finding, AI in infant care, AI in cures, etc. He knew a lot. It took him about 5 weeks to prepare his talk; he stood up, he spoke for 10 minutes — I was trying to get him off, probably like you're doing with me, because I thought he was taking up far too much of my lesson — and then there were about 5 minutes of Q&A, and then I said to the other children, in future we will have more talks, and more of you can give talks like this, and they all nodded; they all want to give a talk, which is brilliant. And then we have Bob. Bob said to me one day that he hadn't done his homework, and I said, why haven't you done your homework? He said, "I'm very tired, you know, I'm really very busy — I've got a job, and I'm in a team of 30 doing development, and I've had a promotion; I'm trying to learn Java and C# and Python at the same time." And here is Bob's email to us: he's asking the school to install Java so that he can practise his Java at school at lunchtimes and breaks, and he tells us that he's always been fascinated by programming and that it's an amazing opportunity to encourage coding at school. He also wanted to be a marketing manager for my YouTube channel. And here, towards the end of term, Bob wrote me this little note: he said, "all of year 8 enjoy your lessons — I definitely do. I'm developing my skills in Python, learning the basics of C#, have become interested in pursuing computer science for GCSE, and you've inspired us to expand into 14 coding languages." So they know a lot, I thought. So, further conclusions: I thought year 5 were actually better than year 6, who were better than years 7, 8, 9, 10, 11. I think there are two curricula: there's the official curriculum, and there's the unofficial curriculum,
which is student-led, and the unofficial curriculum appears to be more sophisticated. A substantial proportion of what is taught at age 15, 16, 17 can probably be brought down 10 years or so, and so the idea is maybe to front-load the subject — and they're much keener at that age as well, when they're younger. So, I guess, as a child would say: grown-ups, watch out. And I leave you with a thought, and conclude my speech here, with special reference to education — which is general, global and eternal — from an event almost 100 years ago. The great Austrian theoretical physicist Paul Ehrenfest was trying to establish the city of Leiden in the Netherlands as the centre of theoretical physics, and, watching his great efforts, his great friend Albert Einstein described him as "the best teacher in our profession" and one "passionately preoccupied with the development and destiny of men, especially his students". And it struck me — and I'm going to ask you — does this observation by Einstein, on the teacher's preoccupation with students' development, carry any relevant message for the teachers of the modern age, when student development is sometimes considered passé, and machine learning and artificial intelligence are poised to take over our educational activities? So, thank you for your time, and any questions are welcome. Hi, thank you for the talk, it was really interesting. I just wondered what your opinion was — you mentioned comparing computer science to maths, English and, I suppose, science more broadly, as a kind of foundational subject. But I wonder what your opinion is — and it's somewhat encompassed in the few people you showed at the end, in terms of machine learning and what have you — in that it occurs to me that, unlike the other pillars of education, computer science is a subject which changes year on year, or, if you're a front-end developer, week to week. And that has ramifications — say, you talked about the lack of textbooks; well, who wants to write a textbook when they're only going to get a year of money from it, essentially? So I wonder what your opinion is on that, with respect to how you teach it, given that it's so fast-paced. It is fast-paced, but the fundamentals have been the same for decades, really. So first of all you teach the fundamentals, which take a few years to teach, and then you can move in step with whatever the current fashion of the day is, or the impending, predicted fashions of the day. I mean, really, what they need is a good fundamental base, in order to be able to pick up whatever they need to pick up in the future. OK, thank you. Other questions? Hello — did you use Jupyter or any other tools, and how was the acceptance by the students? I've only used IDLE so far, and that's because, well, first of all we just want them to master the fundamentals, and Python's IDLE is fine for that — but I do want to go on to other, more sophisticated IDEs as well; there is also a matter of administration, which we have to take into account, but in due course we need to go on to the other IDEs, and Jupyter, and everything. I've taught Python to some groups of small children — around 7 years old — and I thought one of the barriers was keyboard proficiency, and I was wondering if you have some ideas about how to motivate them to practise, or if there is a tool that can motivate them? Yeah — I don't know about the 7 to 11 year olds, but when I was teaching, to be honest, I found that the
younger they were — say, the 11 year olds whom I taught in earnest — the more motivated they were to practise. I wasn't persuading them to practise as such; they were practising on their own. Their general knowledge about the subject is now quite good, they are reading about it, and their parents give them, say, Raspberry Pis for Christmas and things; and also, if one friend practises, then another friend wants to practise too. So I didn't find the problem was with the younger children — I'm finding the problem with the older children, the 15 and 16 year olds. Thank you. I just have a comment, because you said that when teaching the younger children you taught them the history of computer science instead, because they were thought too young: my daughter, when she was 8 or 9 years old, learned at school how to encode movements of some items to create a building, a structure or something, and children at that age were quite good at it, and it was a pretty good way to teach them the imperative thinking required later in programming. So even at this age we can teach children quite practical things that can be used later in teaching them computer science. I actually totally agree with you — I have experimented with children of that age outside of school, teaching them little Python programs, addition, subtraction, and they actually cope quite well. It's not that I thought they were too young; it's the establishment that is quite orthodox in its thinking. If there are no further questions — I just want to know how the subjects in schools in the UK are affected by the parents' opinions, because I'm actually here from New York, and they have the PTA there, and it is very influential in what the school actually teaches. Of course, it depends — on where the schools are, I suppose, and on what the school is — but parents are really getting on the bandwagon: they want their children to be taught the subject, and they want their children to be taught it well, and in quite a few places they are putting a lot of pressure on the schools so that their children are not just taught, but taught well — because it's a new subject. So parental influence is becoming greater, and for the better. Any further questions?
Then let's give our speaker one big round of applause. [applause]

Hello everybody, welcome to the next session. Our speaker for this session is Sven; he's a software consultant from Hamburg, he does development for Arch Linux and also some game development in his free time, and today he will tell us how to become a wizard of the command line — so give a warm welcome to Sven. Hello everybody. So, you want to become a command line wizard, huh? Yes? OK. So yeah, this talk — sorry, too quiet; there we go; I'm quite a tall person, so I'm going to have to lean forward a little bit. In this talk I'm going to show you quite a few new command line tools, and in fact I'm going to give this talk on the command line, and the takeaway is supposed to be, at least, that you know some new tools at the end of this; you basically don't have to know anything beforehand — you don't even have to know how to get around. One thing to note: even though I'm going to be presenting many tools, and many alternatives to well-known tools, this is not meant to say that you have to replace your old ways of doing things; it's just that you can supplement your current knowledge with some new tools. These tools are mainly meant for interactive usage, not so much for script usage — for scripts you should always use the old tools that you're used to, because the new ones are most likely not going to be available on the systems that you're going to be scripting for. With that said, let's talk about me, just really quickly. I'm a command line enthusiast, as you might have figured; I'm an Arch Linux developer — yeah, and I use Arch, by the way — and I work as a freelance DevOps consultant. So, why even bother — sorry, this was meant to go like this — why even bother with the command line? The command line is not going to go away: it's been here for many decades, and it's going to be here for some more decades, so you might as well embrace it. It's efficient for many kinds of tasks; it's not so efficient for other kinds of tasks, but you should pick the right kinds of tasks to use the command line for — or pick the command line for the right types of tasks. And sometimes you just can't use a GUI: sometimes you're logged into some server — now, me working in DevOps, I log into many servers a day, and I really have to know my tools there — and you can't always have an X server; it's really not possible. And text is still the only truly universal exchange format, despite what other people might tell you. I'm just going to go ahead — we are now on the command line. All right, I want to draw your attention to the fact that we have fade-out effects and fade-in effects — watch this. All right. So, don't worry, you don't have to memorize this; this is just an overview, because this is essentially a completely live kind of affair that we're going to have going here — but please cut me some slack if something goes wrong, right? We're going to be looking at some tools and some common tasks: what you were probably used to, and what you might be using now. So, first of all: exa is a tool for listing files. Now, you might know that you can list files on Unix systems using ls and tree, but exa is pretty by default, and it has git support. Let's take a look at what that means. So normal ls looks like this — it's pretty plain. My ls looks like this, because I have some aliases to make it pretty, but usually, if you look at some default system, it looks like
this. But exa also looks like this by default. Now, this is not really so much fun, is it? So, for fairness' sake, let's use ls -l, and use my — oh shit, I don't want it to wrap; let me see what I can do — can you guys see that at the back? OK, great, then I'm going to keep it like this so it doesn't wrap. So we can see here what the ls -l output looks like: it lists my user, the attributes, lots of stuff. It's not particularly helpful, though; if you want to know the real sizes, you've got to do this, right, so that you get the human-readable sizes. And everything could be more colorful, right? The way nowadays is to make everything more colorful. So you've got exa -l, and now it's very colorful: you can see here that the attributes are all colorized, so you can easily skim them; the sizes are human-readable by default; and the users — my user, svenstaro, is yellowish, and the root user, which is not my user, is just kind of greyish — so the whole thing is more readable. Now, the way the files are colored is that executable files, the ones that have the executable flag — on this side here — are green, multimedia files are violet, and so on and so forth; there's actually some logic to that. But we can go even further: what if we had git support built in? So we have this git column here — I can actually show headers, so it might be a bit clearer, oops, header — we have this git column, and on the right side of the git column you have files which were changed locally but not yet staged, and on the left side you have files which were staged; we have N for new and M for modified. So this is pretty cool — ls doesn't really have that. But we also have tree: there's this Unix tool called tree, which shows you a file tree, and if you want the same in exa, you can do exa --tree, and now it looks essentially the same. But can your tree do this? It can — it was a trick question — but you have to pass extra flags; I actually had to write this down: -D and -p, like that. Not quite as colorful now, is it? So we have seen that exa can do this — but can your tree do this? No, it can't. This is amazing, right: you can basically check your tree structure and your git status at the same time, and you don't have to alternate between tools like git status all the time. That's pretty amazing. So that's exa in a nutshell. Let's continue: fd, for finding files. Now, a personal pet peeve of mine is that find is kind of sucky, to be honest, for interactive usage — if you just want to do a quick fuzzy search for something, you always have to put the single quotes in the right location, put the asterisks in the right location — that kind of sucks. But we have fd, which is like find, but it doesn't suck for interactive usage — and it's colorful as well; you might see a trend here. So let's find all of the READMEs in the CPython source code. I have this prepared: we have find, and I have CPython right here, and we have to give -iname, so it's case-insensitive, because people might write "readme" in weird kinds of capitalization — we go like that, "readme", and then we get all the READMEs in CPython, great. We can do the same thing with fd: just fd, then the pattern, then the CPython source directory, and it's the same output, but a lot less tedious to type — if you compare these two, quite a bit less tedious — and honestly, just look at it: it has colors.
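The pairs of commands being compared, roughly as demonstrated (the paths here are placeholders):

    # listing: long format with human-readable sizes, plus a git status column
    ls -lh
    exa -l --git

    # tree views
    tree
    exa -l --tree

    # finding READMEs: find needs flags and quoting; fd is case-insensitive
    # for lowercase patterns and colorized by default
    find cpython -iname 'readme*'
    fd readme cpython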
Let's continue: fd, for finding files. Now, a personal pet peeve of mine is that find is kind of sucky, to be honest, for interactive usage. If you just want to do a quick fuzzy search for something, you always have to put the single quotes in the right location, put the asterisks in the right location; that kind of sucks. But we have fd, which is like find, but it doesn't suck for interactive usage, and it's colorful as well, so you might see a trend here. So let's find all of the READMEs in the CPython source code. I have this prepared. We have find, and I have CPython right here, and we have to give it -iname, because people might write their READMEs in weird capitalization, so the search has to be case-insensitive; we go like that, readme, and then we get all the READMEs in CPython. Great. We can do the same thing with fd: just fd, the pattern, and the CPython source directory, and it's the same output, but a lot less tedious to type, right? If you compare these two, it's a good bit less tedious to type, and honestly, just look at it, because it has colors. But we can go even further, of course. What if we wanted to find all of the Python files in the kernel source code... actually, let's do something else: let's find all of the parser files in the CPython source. So we might want to go like this with find, and we only want to find Python files, so we do it like this. Now we can do the same thing with fd, like this: you have this -e, which means extension. And now we can keep adding extensions easily; for instance, we can also find all the RST files, it probably has RSTs, and we can keep adding extensions like that, so we can do this much more easily. To be honest, I have no idea how to do this in find, because you'd have to write a regex and maybe make a subgroup or something, so that's kind of annoying. Another nice feature of fd is that it actually uses our .gitignore to ignore files that we don't want. If we look at our .gitignore right here, there's an example that I made: there's this ignored file, and lo and behold, there's our ignored file. If we fd for it, fd won't find it; if we pass -u, for unrestricted, it will find it. That's a nice feature, because usually you don't want to search for files which are ignored by your .gitignore. So that's fd.
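The find-versus-fd comparison, reconstructed roughly; the patterns and the cpython path are stand-ins for whatever was typed on stage, but the flags are real fd options:

    find cpython -iname '*readme*'   # quoting and -iname needed for case-insensitive search
    fd readme cpython                # fd is case-insensitive for lowercase patterns by default

    fd -e py -e rst parse cpython    # -e restricts by extension; just keep adding extensions

    fd ignored_file                  # respects .gitignore, so this finds nothing
    fd -u ignored_file               # -u (unrestricted) searches ignored files too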
Continuing: ripgrep. Many people might have heard about that one; it's quite a common tool nowadays. It's insanely fast: it's like grep, but really, really quick, and it's also user-friendly, it has amazing colors, and it also uses your .gitignore. So what does this mean in practice? Well, how about we search through the whole Linux source code for 'buffer'? The word buffer comes up quite a lot in the Linux context. I'm just going to let it run for a little while; the general idea is that Linux uses a lot of buffers, and it takes quite a long time: it took like nine seconds, as you can see in my shell there. Let's do the same thing with ripgrep... oops... there... ah, sorry, because I copied this: we don't have to provide the recursive flag for ripgrep, it does that by default. It's quite a bit faster: two seconds against nine seconds. But ripgrep also does something nice: it ignores binary files by default, which grep doesn't do, and .git is basically a bunch of binary blobs that we don't really need to search. It also respects .gitignore by default, so we search a lot less stuff, and we usually don't care about all of that stuff anyway. Also, check how things look by default. I have this file here, the API helper; this is the output from ripgrep, and this is the default output from grep. You can see that the output differs quite a bit, in that we get line numbers; I think grep can do this as well, but this is just comparing the default output. And by default, grep actually doesn't do any recursive searching: if you don't ask for it, it just kind of sits there and waits for standard input, which is not very useful, to be honest; so it makes sense for recursive search to just be the default, and that's what ripgrep does. It also allows you to search in specific types of files: if we provide the file type py, for Python, and we search for buffer in the Linux source code, it turns out there actually are some occurrences of buffer in Python files in Linux; these are all the Python files in Linux, which do some helper tasks. That was pretty nice to do, whereas with grep you would kind of have to find all the Python files first, and that's kind of annoying. So that's ripgrep; it's pretty cool. Next up we have tokei; I don't know how to pronounce that. It's kind of like cloc... not many people have heard about cloc, actually. If you want to count your source lines on the command line, usually you would use this Perl script called cloc. Now, Perl is a language engineered to be the slowest possible scripting language, so it takes a long time to do anything in Perl, and I can demonstrate this by counting the number of source lines in CPython. It takes a little bit of time... there we go. Yeah, it's wrapping, sorry for that, but we can see it counted this, we get some results. Now, tokei, as an alternative: basically like that, it's pretty fast. tokei allows us to count anything we want; Linux takes a few seconds, but I did this with cloc and it didn't finish, it took minutes, to be honest. We can also see just how many C files there are in CPython: we can count only the C files, -t for type, and it turns out there are a few. Sorry, I keep saying files, but I mean lines of code, of course. So that's tokei, short and sweet, if you want to count lines of code. It's not colored; I think this is probably the only utility that we have today that's not colorful.
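The grep/ripgrep and cloc/tokei comparisons, again as a rough sketch; the timings are just the ones quoted on stage:

    grep -r buffer linux/     # needs -r, also searches .git blobs (about 9 seconds on stage)
    rg buffer linux/          # recursive by default, skips binaries and ignored files (about 2 seconds)
    rg -t py buffer linux/    # only search Python files

    cloc cpython              # the classic Perl line counter
    tokei cpython             # much faster
    tokei -t C cpython        # count only one language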
Next up: HTTPie, a personal favorite of mine. It's like curl, but super user-friendly; curl, you know, almost seems to have been made with the ability to be super user-unfriendly. If you actually want to send JSON with curl, I can never remember the syntax, but let's try a request. I had to prepare this, because I really can't remember it. So I have this local loopback HTTP server, and curl against it just gives us that. Now let's do the same thing with HTTPie; for some reason the command line tool is just called http, I can't imagine that ever conflicting with anything at all. And it looks like that. This looks much better: it's the same kind of output, but it shows us the response headers in a nice formatting, there are colors, and it knows it's JSON, so it parses the JSON and formats it as well. That's pretty cool. Now, for instance, we can also tell curl to show response headers and request headers and everything, and it looks like that, and I'm like, oof. These are the request headers, because they have these arrows pointing inwards, and these are the response headers and the response body. But let's use curl to send some data; I had to write this down, because it looks like this. So this is some JSON, and then we can't forget to set the headers, always set the headers, like that; and then don't forget the method, never forget the method. And it looks like that, and we get this back, because it's just a dummy HTTP server, but you get the idea. And this in HTTPie is like that. I'm not kidding, it's that simple. We can also show the request headers and everything in HTTPie using its print option; it's a slightly weird syntax, you get a capital H for, I think, the request headers, and a lowercase h for the response headers, and the same for the bodies, and then it looks like that. So I think this is pretty cool, and you can also set headers the same way, and you can see that we set the header. So that's pretty cool.
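The curl-versus-HTTPie exchange looked something like this; the URL and the payload are placeholders, and the --print letters follow HTTPie's documentation (H/B for request headers/body, h/b for response headers/body), since the exact flags were hard to catch on stage:

    curl localhost:8000
    http localhost:8000                    # formatted, colorized, response headers shown

    curl -X POST -H 'Content-Type: application/json' \
         -d '{"hello": "world"}' localhost:8000
    http POST localhost:8000 hello=world   # HTTPie builds the JSON and sets the headers

    curl -v localhost:8000                            # verbose: request and response headers
    http --print=HBhb POST localhost:8000 hello=world # choose exactly what to show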
Let's continue with bat. Now, bat is an interesting one: bat is what you'd get if cat and less had an amazing, magical unicorn baby. That's kind of what bat is. It has syntax highlighting support, it has automatic paging for long files, and it has Git support. Now, how does that possibly work? Well, I can show you. If I just bat something, like this changed file, you can see that it has these things... actually, let me show you this first: we have this API file, just a Python program which does something, and with cat it looks like that, not very interesting. But then we have bat, and it looks like that: we have syntax highlighting, and on the left side you can see these little squiggles, which mean that a line was changed, and these lines were added, you can see the pluses. And it won't use the pager for short files: if you just bat the staged file, it will simply not use the pager. It also gives us line numbers, which is nice. And that's essentially all there is to know about bat. Now, obviously, don't use bat in place of cat if you actually want to concatenate files, but if you just want to look at files, it's pretty cool. Also, if you bat binary files, it will just say 'binary', whereas cat is going to be like... yeah. So that's that. sd is a short and sweet one: it's basically like sed, but people can actually use it. Have a look at this: we have this changed file, and it says 'replace me'. With sed, if you actually want to replace that, we go sed -i, then the s, then 'replace', then 'something', and then g, because we can't forget the g, since otherwise it will only replace the first occurrence per line. So, can't forget the g, and then the file; we bat this again, and see, it's replaced. But with sd, you just go: 'something', change it back to 'replace', and then the changed file, and that will actually change it back. And if you don't want to commit to the change... we are a generation of millennials, we cannot commit... if we pass -p, we can preview it without actually changing the file. So that's pretty cool, and that's all there is to sd. And then we have hyperfine. Now, this is interesting. Usually on Unix, if you benchmark things, you would whip out time and then just run time a bunch of times. I can demonstrate that: I have this taxi simulation here, which is a very fast program that does something; in theory it has some output, but we don't show it. If we want to know how long that actually runs for, we can run this a few times, and we have some data. But then we have hyperfine, and now watch this: there's some spawn overhead it measures first, and then it basically runs the program a number of times to sort out the statistical errors, and it will also take note of the min and max times, and you can see they were actually quite considerable. We can tell it to do a few more runs, but the general idea is that the program runs quite a few times in order for hyperfine to figure out how long the program really takes. So we can benchmark even very quickly terminating programs properly, and now we are down to a delta of plus or minus ten milliseconds, which is quite a bit still, but then again the program terminates very quickly. You can imagine how this is useful if you want to benchmark some other tools. You can also specify a warm-up phase, if you're into that kind of thing, where it basically runs the program a few times beforehand and then completely discards those measurements; that's just in case you have some kind of I/O-demanding program where you want to be very sure that the thing you want to benchmark is actually in your I/O cache. Right, so you can do that; it's pretty cool.
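The bat, sd and hyperfine snippets, reconstructed; the file and program names here are placeholders:

    bat api.py                    # syntax highlighting, line numbers, Git markers, pager for long files

    sed -i 's/replace/something/g' changed_file   # the classic incantation
    sd replace something changed_file             # sd: just pattern, replacement, file
    sd -p something replace changed_file          # -p previews without touching the file

    hyperfine ./simulation                        # runs it many times, reports mean, min, max
    hyperfine -m 20 ./simulation                  # ask for more runs
    hyperfine --warmup 3 ./simulation             # discard a few warm-up runs to fill the I/O cache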
And now, yeah, we have a bonus, because we still have a little bit of time: mdp, which is actually the program we're using right now to run this presentation. It's basically a markdown presenter: you write some markdown files, it looks like this, and if you want to be very specific, it looks like this; this is my talk, actually, which is being run right now by mdp. It is like PowerPoint, but it has one percent of the features; but it's all the features that matter. And it's on the command line, and your PowerPoint can't do that now, can it? We have another bonus: genact. Imagine you want to pretend to be doing some work but don't actually want to do any work; what do you do? You can run genact with the cc module, and now we're compiling something, and you can run genact -m weblog and pretend you're watching some really... I don't know, some kernel compile, you get the idea. You can also just run it without any arguments, and it will go into demo mode and run a few iterations of every kind of program that it has; it has quite a few modules, actually, it has these, and you can run that, so you can do this, you get the idea. I'm glad you like it, because it's actually the only program in this whole list that I've written myself, so thank you, thank you for that. We have one more bonus, which is a very short and sweet one: asciiquarium. It's just like that: you have fish on your command line. And we have one more, and then I'm done, which is cmatrix; probably everybody knows this one, but still, for completeness' sake. Yeah, thank you for that. Thank you very much, Sven, for this very colorful talk. Are there any questions or comments? Does fd also have something like find's exec option? Yes, it does; in fact it has an exec option, so if you want to get rid of xargs, you can use that instead; it's pretty cool, actually, much less painful to use. When you have questions, you can also line up at the microphones on the side. There were those seconds shown after your command line, like after 'master', three seconds and so on; what's that? Oh yes, okay, so that's... yeah: if you run a command which takes more than a second to execute, my shell will keep track of the command and show you how long that command took, always, so you can't miss it. It's actually called Liquid Prompt; it's for the Z shell, yeah... it's a special prompt which has this thing built in, and basically it only shows you what you need to know; you can see, this is also my Git state currently. Is it an oh-my-zsh kind of thing? No, well, it's a special prompt for the shell; there's also a bash version of it. It's pretty cool, I can recommend Liquid Prompt. More questions? Maybe in the meantime you can tell us how this fade-out works. I actually don't know; I think it uses ANSI color codes, but I don't actually know; I actually wanted to look that up in the man page. Do you know if it has an option to print to the alternate screen, so when you close it, it doesn't show up in your shell? I see. So this is just a normal shell feature, I think, where you can press Ctrl-Z in order to background the task, and then you have this one sleeping process here, which is my presentation. You can type jobs to see which jobs are currently sleeping, and then I can press fg to foreground it again. That's pretty cool. Oh, sorry... well, I can show you outside, okay? So, since you're running graphics right now: what kind of window manager do you use? Well, I am running graphics right now, actually... I'm not gonna lie, I use i3; as you can see, it's just normal i3 stuff, I can do this. What terminal is that? This is just my terminal. Any more questions? Do you use any other tools related to terminal work which increase your productivity or ergonomics? For example, I think the best tool I found is a drop-down terminal; I use Yakuake. Maybe some other tips or tools you use for your work? Well, originally I wanted to show off Vim, because I also put this in the title of the presentation, or in the description text, and I figured we didn't have enough time for the whole Vim thing. So, I use Vim a lot, for everything really, I do everything in it. No tmux, though; I just use i3 and keep opening terminals. I don't like having two kinds of window management; I know there are tools to make that easier, but honestly I like to keep it simple, and I just have lots of i3 windows. In fact, I have ten open desktops right now, all of them full of windows, so I use a lot of windows. One question here: is ripgrep faster than the Silver Searcher? Yes; I mean, I used to use the Silver Searcher... so the question was whether ripgrep is faster than the Silver Searcher, which is another tool that looks a lot like grep and also has colors and the same kind of defaults. The basic philosophies of ripgrep and the Silver Searcher are much the same; it's just that ripgrep happens to be something like five times faster in the benchmarks, which makes some difference, I suppose, with very large repositories. Do you recommend using tmux? Well, I use quite a bit of tmux, but honestly not for development, more for... well, it looks nice, I don't know. I know tmux is very popular with Mac people, because they don't have a proper native window manager; I think nowadays they do. But generally speaking, I don't like window management in tmux so much as I like it in i3, and I don't want to manage windows in two ways. Can all these apps that you demonstrated be installed with pacman? Yes; in fact, I package most of them for Arch Linux, but they're also packaged for other distributions, and also for Windows and OS X, so they are all available, and they all have pre-compiled binaries for Windows which just work. How difficult is it for you to reproduce the same environment on another system? Pretty easy; I mean, it depends on how much control I get, but since most of the tools I demonstrated are written in Rust and have static binaries, you can just download the binaries and they will keep working; except for HTTPie, which sadly is Python, and distribution there is complicated. I know it's a bit of a weird thing to say here, but Python distribution is still not a solved problem, I think. But yeah, it's pretty easy to replicate the environment. Do you use anything else, like a status bar or any other kinds of displays? Oh yes, I use Polybar for displaying my system stats, but I like to keep the output to a minimum; I'm not into very flashy conky stats and stuff like that, because I never look at them anyway. Just keep it simple, and make sure that the system has enough battery time.
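The bonus tools and the job-control trick from the Q&A, as commands; the genact module names are the ones mentioned in the talk:

    genact -m cc         # pretend to compile something
    genact -m weblog     # pretend to watch a busy log
    genact -l            # list the available modules
    genact               # no arguments: demo mode, cycles through the modules

    asciiquarium         # fish on your command line
    cmatrix              # for completeness' sake

    # Ctrl-Z suspends the foreground program (here, the presentation itself)
    jobs                 # lists suspended jobs
    fg                   # brings one back to the foreground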
Thank you very much, Sven; give a warm hand to Sven. All right, welcome to the session; come in if you can... if you can... the topic is obviously very popular. Our speaker for this session is Vita. He's a CTO at Quantlane, he has a background in financial mathematics and is now trading stocks using Python; but today he's not talking about that, but instead about typing. So give a warm welcome to Vita. Who here uses mypy to check their production code? Wow, that's quite a lot. For those who don't, this might be a little bit steep in the beginning, but I hope I won't scare you away from mypy. My name is Vita, I'm a software engineer and now a co-founder of a company called Quantlane. We started five years ago, and what we do is we trade stocks, mostly in Europe, and we do that automatically and semi-automatically. We are based in Prague, and everything we do on the back end is in Python 3.7 at this point; we also happen to use a lot of asyncio, and I like to think we were very early to start using type annotations and mypy. So, static typing is still quite a new thing in the Python ecosystem and the community; we're still learning how to use it, and the tooling is still being actively developed. For those reasons it is sometimes a bit difficult, maybe not to get started with static typing, but to actually cover complex code bases with static types. But despite these challenges, I believe it is really worth it, because when it is done properly, it can help you avoid a lot of mistakes and find the bugs before you even run your program, before you even run your unit tests. I'm going to talk in two chapters. The first one is the high-level approach you might want to take when you have a big code base and you want to cover it with mypy; then we'll talk about a few examples of code that is not the usual hello-world function and how you might go about typing that; and in the end I'll remind you that it really is worth it, even though it will look a bit complicated sometimes. So, I mentioned before that we started using static typing quite early, at a point where we already had a couple hundred thousand lines of code, and mypy was very early back then; it was crashing on the code, and I don't mean spitting out typing errors, but actually crashing. So we had to start gradually and only cover our code step by step. And a big lesson we learned, unfortunately not in the beginning but over time, was that the default mypy configuration is quite lenient, and if you don't make it slightly stricter than the default, you might learn a few bad habits that will come and bite you later, and you will still have to fix your code and your annotations. So whatever the code is that you are going to run mypy on, I would recommend you to have full coverage, meaning there are no functions which have no type annotations, or only partial annotations; there are config options you can use for that. Second, there are options that restrict some forms of dynamic typing in your code; these are optional, but I want you to consider them. Some of these options are difficult to enable, but if you can do it, or if you're starting with a new code base, I would definitely use them. And this you really want to do with mypy and static typing: sometimes you fall into a trap where you think you know what you're doing, or you think you know what mypy is doing, and it might not immediately tell you that your understanding is not quite correct; enabling some warnings will help you with that.
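The talk doesn't read the exact options off the slide, so this is a guess at a mypy.ini in the spirit of what's described: full coverage, some restrictions on dynamic typing, and the extra warnings (all of these are real mypy options):

    [mypy]
    # full coverage: no unannotated or partially annotated functions
    disallow_untyped_defs = True
    disallow_incomplete_defs = True

    # optional: restrict some forms of dynamic typing
    disallow_any_generics = True
    disallow_untyped_calls = True

    # warnings that catch misunderstandings early
    warn_redundant_casts = True
    warn_unused_ignores = True
    warn_return_any = True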
Since covering an existing large code base is a huge amount of work, you want to go step by step, and you run mypy only on the modules you've already covered. You might even start with a single module, a very small step, and then you just keep adding on. You want to defend your progress by adding this check into your CI pipeline, so it runs before your tests do; and then of course you never want to make that list smaller, you only ever want to expand it. What worked well for us was doing an internal hackathon, where a couple of developers stayed at the office overnight and worked hard to increase our coverage; we're still not completely there, so we might have to do a few more sleepovers. When you go the opt-in route, you need to deal with imports, because you are covering maybe just a few modules, but those modules might be importing other code which you are perhaps not ready to check in the beginning; so there is a way to tell mypy not to complain too much about other modules. A word of warning: the follow_imports directive has another option, called skip, and the documentation warns you not to use it. We did, and it was a terrible idea; don't do it. At some point, when your opt-in list is sufficiently long, you might... you definitely want to switch to opt-out, meaning you run mypy on everything by default, except for some modules that you exclude in your config. Of course there might be dozens of these ignored modules in the beginning, when your exclude list is huge, but over time you work to make that list smaller and smaller until it disappears. The benefit of getting to opt-out is that any new modules you add to your project will be checked by default. Covering unit tests is a tricky matter: despite me recommending a strict configuration for mypy, I will backtrack on that for tests and just make mypy a little bit more lenient, for reasons explained in this code sample. By the way, I'm going to put the slides up online, so you don't have to take photos of everything; there will be a lot of configuration and code. When you use mocks and monkey-patching in your tests, which you often do, there is no way to explain that to mypy as of yet; it is a very complicated problem, so you just need to ignore those places where you monkey-patch. But despite these challenges, I would urge you not to ignore all your test files completely, because when you cover them with mypy, you will get some benefits: mypy will be able to check that your tests are using your tested code as intended, meaning the annotations are being respected. If you build your own Python packages, you should know that even when they do have type annotations in their code and mypy passes on them, if you then use that package, mypy will not follow those annotations by default; you need to tell it, and it's very simple: you just need a marker file, py.typed, added to the package, and that's it. But unless you do that, you don't benefit from the annotations in your packages. When you use third-party packages, which might not have type hints, there are a few options; this is something we don't really do, so I won't go into detail, but you might want to ignore all third-party packages, or better, again very explicitly, ignore just those that don't have annotations.
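Again a sketch, since the exact config isn't quoted: per-module sections handle the import noise, the opt-out exclude list and unannotated third-party packages (the module names here are hypothetical):

    [mypy]
    follow_imports = silent        # check imported modules quietly instead of erroring on them
    # note: follow_imports = skip is the option the docs (and the talk) warn against

    [mypy-legacy.reports.*]        # opt-out phase: excluded until converted
    ignore_errors = True

    [mypy-somevendorlib.*]         # a third-party package without type hints
    ignore_missing_imports = True

    # and for your own packages, ship an empty py.typed marker file (PEP 561)
    # alongside the code so mypy follows their annotations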
So now that you know generally how to approach a code base, we can talk about a few examples of what you might find in your code base. First example, a very useful, frequently used tool: generics and type variables. Who here has heard of these, or maybe even used them? Wow, that's really good. I think this is one of the most useful and needed features. Let's take the example of a weighted average, which is a very simple formula, a simple computation where you add up values and average them using weights; and critically, we want to implement this average as incrementally updatable, meaning you can keep adding values to the average and getting back the current result. So you might start by writing a very simple class, for example, that starts with some internal pre-computed values; you're able to add a weighted value to the average, and then you just add a simple method that can calculate the average at any time. You will notice that we are using floats as the data type in there, so we are explicitly saying that we can only calculate averages of floats. But imagine you also want to use decimals, Python's arbitrary-precision Decimal data type. As written, as annotated, that class will work for floats as expected, and of course it will not work with decimals, because you said your values were going to be floats. By the way, the reveal_type function provided by mypy is very useful: it's undefined at runtime, but for debugging what mypy thinks your variables are, it is a very useful function. So if you want to allow floats or decimals, a good way to do that is to parametrize your weighted average: you make it a so-called generic class, and you parametrize it by a type variable, which we called AlgebraType in this example, and we restricted that type variable to either be a float or a Decimal. Your class will then be very similar to the previous version, but you will suddenly have a small trouble with the number zero; by the way, this code block contains a little workaround for that, but it's not very important at this point, it works. And then, in the rest of your class, instead of saying float, you will be saying AlgebraType, so you've parametrized the type. Now, when you want to use that class, when you instantiate it, you need to supply a value for that type parameter when you create the instance, like this: the first two lines are a weighted average of floats, and you can see that it also returns a float; and the second part is a weighted average of decimals. What is nice is that once you create a weighted average of a certain type, you cannot change your mind and start mixing the types; that is desired. There is an even possibly nicer version, which is to say: actually, what I need to do with my numbers is to add them, multiply them and divide them, so I don't care if they are floats or decimals or something else; ideally I would just say they are real numbers. In theory that sounds great, but in practice the abstract number types in Python, the numbers module, don't really work that well, or aren't that useful yet. A theme of typing in Python is being quite pragmatic: it isn't an ideal world, it is a pragmatic world, and you need to be pragmatic too. Your type annotations often won't be perfect, they won't perfectly describe what you had in mind, but they will approximate it.
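A reconstruction of the weighted-average example; the talk names the type variable AlgebraType and restricts it to float or Decimal, and the cast here is one way to deal with the "small trouble with the number zero" that was mentioned, not necessarily the exact workaround from the slides:

    from decimal import Decimal
    from typing import Generic, TypeVar, cast

    AlgebraType = TypeVar('AlgebraType', float, Decimal)

    class WeightedAverage(Generic[AlgebraType]):
        def __init__(self) -> None:
            # a literal 0 is an int, not an AlgebraType, hence the cast
            self._weighted_sum = cast(AlgebraType, 0)
            self._total_weight = cast(AlgebraType, 0)

        def add(self, value: AlgebraType, weight: AlgebraType) -> None:
            self._weighted_sum += value * weight
            self._total_weight += weight

        def average(self) -> AlgebraType:
            return self._weighted_sum / self._total_weight

    floats = WeightedAverage[float]()
    floats.add(10.0, weight=2.0)
    # reveal_type(floats.average())   # mypy would say: float (reveal_type exists only for mypy)

    decimals = WeightedAverage[Decimal]()
    decimals.add(Decimal('10'), weight=Decimal('2'))
    # decimals.add(10.0, weight=2.0)  # mypy error: you cannot start mixing the types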
Another very important example: understanding the difference between nominal and structural typing. These are fancy-sounding words, but it's nothing complicated. Nominal typing you already know; that deals with class inheritance. Animal examples seem to be popular in computer science for some reason, so I went with one: there's a base class of an animal, and then we have a duck that, apart from whatever behavior an animal has, can also quack. Then suppose we want to make a function that accepts something that can quack and makes it quack. As annotated here, this works, because it's very trivial: you're just telling mypy that your function needs a duck, that a duck can quack, and that passes. However, imagine you wanted to have another animal that can quack, and you want that function to work for that animal too. So we could create a penguin, which possibly makes sounds close to quacking; but it would be wrong to inherit that from a duck, that would be very wrong, so you just inherit from animal. But then, of course, your make-it-quack function doesn't work, because it was told to expect ducks. So nominal typing means you use classes and the class hierarchy when specifying the types you need. In contrast to that, we can talk about structural typing, where you describe your types in terms of the capabilities they have. Here you're creating a thing called a protocol; in other languages you might have heard the term interface, or maybe trait. And this is actual code: those three dots are valid Python syntax, in case you didn't know. This really just tells mypy that there is an interface, or protocol, called CanQuack, which exposes a public method called quack. When we declare this and change our function slightly, so that it now accepts anything that can quack, then this will work for both animals. The interesting thing is that we didn't have to inherit from that protocol; the protocol is a class, but that is more of a syntactic convenience, and now any object that has a quack method will meet the requirements of this function. So this is very useful for duck typing, pun intended.
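The duck example as code; the return type of quack is a guess, and on Python 3.7, which the talk uses, Protocol comes from typing_extensions rather than typing:

    from typing import Protocol   # Python 3.8+; on 3.7: from typing_extensions import Protocol

    class Animal:
        ...

    class Duck(Animal):
        def quack(self) -> str:
            return 'quack'

    class Penguin(Animal):
        # quacks convincingly, but inheriting from Duck would be very wrong
        def quack(self) -> str:
            return 'quack?'

    class CanQuack(Protocol):
        # structural type: anything with a matching quack() method qualifies,
        # no inheritance from this class required
        def quack(self) -> str:
            ...

    def make_it_quack(animal: CanQuack) -> None:
        print(animal.quack())

    make_it_quack(Duck())      # fine under nominal typing too
    make_it_quack(Penguin())   # fine only thanks to structural typing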
Another example we encountered is when you want to somewhat define your own type without creating an entirely new type. A very simple example of that is when you have a function called place_order, maybe, that accepts a price and a quantity of some goods that you're buying; it doesn't matter what it does with them, maybe it saves them in a database or whatever. Wouldn't it be nice if we could annotate that properly? It would make our code more readable, because when you read those annotations, you clearly see: this is of type Price, and this is of type Quantity. And it would also make it harder to mix them up, so you wouldn't be able to accidentally pass a quantity in place of a price. I come from a company that trades on financial markets, and confusing prices and quantities, or buy and sell, is not a mistake you want to make. The first option you might think of is to alias a type: you say there is something called Price, and it really is equal to Decimal, and then you can use that Price as a constructor. That works, but unfortunately it doesn't create a new type; it is just a convenience for you, so you don't have to type so much. It does make the code easier to read when you start writing Price instead of Decimal, but as for type safety, you get absolutely no benefit. But what exists in the typing module is a function called NewType, which kind of aliases an existing type, but creates a real new type in the sense that mypy now understands that it is a Price, not a Decimal. What this does is: you can still create decimals, but then you need to wrap them in your Price type, and from that point onwards mypy knows it is a Price, not a Decimal. And if we define a function that takes a Price, you will see you can't pass a bare Decimal to it, and you can't even pass a Quantity to it, even though it really is another decimal underneath. So this is what we wanted; we are now able to differentiate between the types. This all works to a point: once you start modifying the values, you're back to the original type, because really it is just a Decimal under the hood, and once you start making operations, meaning calling methods on that type, it will return the original type. So there is a limitation you should be aware of. The only perfectly correct solution for defining a new type is to actually define a new class and implement your type and the behavior you want. That is nice and clean, but of course you will pay a runtime price, because your Price implementation will probably not be faster than the Decimal type that is already in Python. So that way lie a lot of interesting dilemmas about preferring static-typing purity versus preferring pragmatic runtime performance and simplicity. Now, in Python you can do a lot of metaprogramming, a lot of magical tricks, and mypy cannot always understand them. A good example of that is the dataclasses module that is new in Python 3.7; or, if you are familiar with Django, then Django models are an example of metaprogramming that mypy wouldn't be able to figure out by itself. So you are actually able to write plugins for mypy that help it understand magical code. This is even newer than mypy itself; there isn't much in the way of documentation yet, and there are just a few working plugins out there. We also had to write a plugin, and if you need to go that way too, you might find our plugin useful, because it's got twice as many comments as it has code. So this is a bit hard at this moment; you probably won't need it, but if you do, I'm sure this will all get easier over time. The final example I'd like to share is overloading function signatures: that means properly typing the case where your function might take different sets of types of parameters and return different types of results based on those parameters. A simple example is having something that's indexable, like a list is, but let's say it's your own type. Here we could have a series of numbers, and we want to be able to index individual numbers but also slices, and we want mypy to understand that when we use a single index, a single value is returned, and when we use a slice, a series is returned. We begin by creating a generic class; this should be familiar to you by now, like I said, it's a very common tool. What we need to do next is to explain the two versions of the square-brackets operator, which is called dunder getitem, __getitem__, in Python. These two look like method definitions, and they actually are, but they don't have any body; again, those three dots are exactly what you want to put in there. You annotate them with the overload decorator coming from the typing module, and all they do is explain to mypy what the possible ways of calling the method are. Then you just add the actual implementation, which in this case is very trivial; and that real implementation has to be typed so that it includes all the versions of the signatures that you mentioned previously. So this is a bit wordy, but it actually makes a lot of sense once you get used to it; the syntax is easy to understand.
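Sketches of the last two examples, NewType and overloads; the class and function names follow what was said in the talk, the rest is filled in:

    from decimal import Decimal
    from typing import Generic, List, NewType, TypeVar, Union, overload

    Price = NewType('Price', Decimal)
    Quantity = NewType('Quantity', Decimal)

    def place_order(price: Price, quantity: Quantity) -> None:
        ...

    place_order(Price(Decimal('1.5')), Quantity(Decimal('100')))  # ok
    # place_order(Decimal('1.5'), Decimal('100'))    # mypy error: bare Decimals
    # the limitation: Price(Decimal('1')) * 2 is a plain Decimal again

    T = TypeVar('T')

    class Series(Generic[T]):
        def __init__(self, values: List[T]) -> None:
            self._values = values

        @overload
        def __getitem__(self, index: int) -> T: ...
        @overload
        def __getitem__(self, index: slice) -> 'Series[T]': ...

        def __getitem__(self, index: Union[int, slice]) -> Union[T, 'Series[T]']:
            # the real implementation must cover all overloaded signatures
            if isinstance(index, slice):
                return Series(self._values[index])
            return self._values[index]

    s = Series([1, 2, 3])
    # mypy sees s[0] as int and s[0:2] as Series[int]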
I had a lot more in store for this talk, but I had to cut a lot to fit into the time limit, so I only have seven takeaways for you. One: try to make your life harder by using a stricter configuration than the default. Two: go bit by bit; don't take on too much in the beginning, go module by module, and get to opt-out when you can. Three: definitely learn to work with generics and type variables, because those are your friends; you will be meeting them quite a lot. Four: learn to use protocols, because they are very much in line with the dynamic spirit of Python, so you don't have to create classes for everything just for static typing. Five: be aware of NewType, because it can add more semantics to your types; it can be a Decimal or an integer underneath, but you can call it UserId or Price or Quantity. Six: writing plugins is hard, but it is so important for mypy to spread and become more popular that I'm sure it will get easier eventually. And the last one, overloading: it looks like boilerplate, but it's not really that complicated, and it is useful. So, you might be thinking that typing is complicated; after this talk, I certainly do. But there are good reasons for that. One is that we have to learn and understand new concepts as developers, and that is great, because they force us to think about our code more, and in ways we perhaps didn't before. Another reason is that the tooling is still quite young; it's developing very fast, very actively, but there still are a lot of issues to cover. And if I may, just one final sentence: once this becomes more popular and more prevalent, once we learn how to use it, all our code will be much less error-prone, and it will be more fun; I can already see that in smaller projects. Thank you. There is time for one short question, if somebody is close to the microphone. Thank you for your talk, I have a question: beyond primitive type annotations, do you think that optimizing for a human, where Python says readability counts and that kind of PEP 20 thing, versus optimizing for a machine, mypy in this case, is a zero-sum game, or is it something else? That is a very good question. Zero-sum game... at this point we are making concessions to mypy, definitely: we are adding code and structures to the code that we maybe would not add otherwise. But I don't think the gap is too big; if it is done correctly, you might even be making the code easier to read for humans, when you alias your complicated type annotations and use them judiciously. So I don't think we lose all that much, very little, and we gain a lot. It's a very hot topic, but we are running out of time, so thanks again, Vita, for this nice talk.