Good morning, everyone. This is my first visit to Portland, and I'm really happy to already be getting to know the city, and some of you who live in it, where I will be bringing the big PyCon conference in 2016 and '17.

I'm going to, for a moment, think about software along two axes. We're going to be very conventional and just use the horizontal axis for time, and I will graph vertically the idea of complexity: having more code, having more complicated code. In general, when we start writing code, our day, our month, our year, our project looks something like this. Over time it becomes more complex. Sometimes there are bumps in that trajectory, but that's generally the shape of things. I get up in the morning, I write a test or whatever, and I start trying to implement a feature. I write and I write and I write new code until I reach this magical state called "working code."

When I was a younger programmer, I would stop there, declare victory, and move on. It turns out that this was a mistake. Once my code is finally working, it is probably better described as barely working. I mean, what have I done? I have finally crested that great border in the state space of programs, from all of the states my code could be in without having this feature I've just added, into one of the states in which it now possesses the feature. But unless it's the kind of feature I'm very familiar with, what are the chances that I will have crossed that magical state-space border at the best and cleanest and most readable way to implement this feature? In my case, a very, very small chance, again, unless I'm already familiar with it and have done the trick before.

My guideline for implementing what are, to me, really new features is that once new code works, I'm probably about halfway done with it. Maybe I should put it aside and reread it tomorrow, to realize how difficult to understand the code I've just written is. If I've spent an hour putting together what is to me a novel solution, I generally find that an hour later in the day, or tomorrow, really helps clean it up and make it better code. So I tend to start, of course, at zero complexity, because I have a blank screen or an empty procedure in front of me. I work and produce working code, and that moment is a magnificent one, because I can check in; I can plant a flag, I can plant a stake and say: I at least now have this feature working. But I do really find that the code itself needs more time, it needs more work, and that the complexity can then be ratcheted down through refactoring, through caution, through cleanup, and that it's only then that I typically reach what I would call good code.

Web framework history, I think, follows a similar pattern, both in the industry at large and also if you narrow your attention to Python itself. In the very early days, of course, you just served the web off of your file system, and threw scripts into that file system if you wanted to run them. But once we got the idea of putting our software, and not the file system, in charge, we started writing code, and we wrote more code and more code: magic to produce forms, and magic to persist our objects, and magic to let URLs travel across our object persistence hierarchy. And when we had finally built that mountain of code all the way to the sky, we gave it a name, and it was called Zope. It worked; I mean, we had reached this point where we could now go out and accomplish things on the web. But just like when I'm sitting and writing a new feature for a few hours, we didn't necessarily land on the simplest answer to the web first. Once Ruby on Rails had inspired our community with the idea that the web could be radically simpler than we had made it, right out of the gate, just a few months later, Robert Brewer, I believe, came out with CherryPy, which still has a loyal following today, and Django very quickly followed.
It's interesting: there have been a lot of other attempts at web frameworks since then, but two of the ones left standing are those two first responses that our community made to Rails. I put CherryPy a little higher on the complexity scale, because its pattern was that you build controller objects, and your URL kind of surfs down the attributes of the controller objects with its path components. Django kind of went all steampunk and just has you compare your URLs with regular expressions; you treat them as strings, and that is a simplification over having to teach new programmers how to write objects.

Of course, the community's experience didn't stop there. Just as Django had looked at Zope and said, "I think the web is simpler than that," so of course the micro-frameworks then looked at Django, shook their heads, said, "I think the web is simpler than that," and produced Flask, and produced Bottle, and the other smaller, more boutique micro-frameworks that exist. I should, for the sake of the record, say that I have at times used a micro-framework. I generally find that if I really need to add an ORM, and a templating language, and a cross-site-scripting vulnerability package, then by the time I've put all of that in there, I'm kind of back up at Django's complexity level. The micro-framework lets me reach that complexity level gradually and on demand, instead of it all being something installed on my system or server to begin with. So I actually generally find that micro-frameworks are more valuable because they let me handpick the pieces I put together, not because your web app is going to be noticeably simpler when it's finished in a micro-framework than if you had used Django.

And finally, I should mention that a few programmers did stagger alive out of the collapse of Zope and all of its code, and they have themselves looked at Zope and thought about how they can solve those same very CMS-y, very content-centered problems more simply. A number of the other small web frameworks have faded away as everyone I know in that community has now gone over to using Pyramid. It's worth looking at if you want to see a whole different approach to what your URLs mean.

So that is kind of the trajectory that I think we've taken. Django was a first, very successful experiment in how the web could maybe be simple. What, then, is a web framework? What is this task that we can do in a complicated way or a simple way? I think of a web framework this way: a web framework is a denormalization engine. It gets our data, hopefully orthogonal and independent and unique, in our back-end storage, and it mashes it and mixes it up into a single place, into a document, that is delivered to a consumer, to a user. A web framework denormalizes data so that it lands in documents, which of course raises all of the normal problems of denormalization. As soon as my username exists not just in the database but also on a web page that I have open in front of me, I now have a copy of my data, and that raises all of the normal problems of caching. What is there that will keep that username in the corner of my screen equal to the value in the database if it starts changing?

And so you have the full range of options, all the way from "as long as the tab is open, it's just going to be out of date," all the way to using Meteor.js, where they promise automatic update of all the data on the page when it changes on the back end. The full spectrum of solutions to having a cache, to having a copy of data, arises, and the full range of simple-to-complicated answers to that is inherently present in what a web framework does, which is to make copies of your data, usually denormalizing orthogonal data on the back end into some more complicated form.

I have found, especially as mobile devices have gotten better, bandwidth has improved, and storage has increased, that a question that increasingly commands my attention is the question of how much data, in a particular case, you should deliver to the client. You can give a client a web page with the absolute minimum, just the exact data, and no more, that the user has asked for. On the other end, you can just give the entire database to JavaScript and let the front end deal with it. If you've ever used the Sphinx documentation system to build a static site, and noticed that it has a search field even though the site is static: if you look behind that, during the document-building process Sphinx builds an index of all the words in your documents as a JSON blob and stores it to a file. That search field does not go talk to a search engine; the JavaScript reads in your entire full-text search database as a JSON blob and does the search right there in the front end. So even a normal Python project might span this full spectrum of solutions, between very carefully designed web pages that give exactly what the user has asked to see and nothing more, and other parts of your solution that hand essentially a database to the front end and let it do its own searching.

Imagine for a moment, because I think it's always fun to think of what we have accomplished and what the future might be like in this industry: what if hard drives kept getting bigger?
I'm told that they shall not do so, because of things like physics. The way that we store data with magnetic polarities, I'm told, is starting to run into density problems down near the atomic level, and problems with how finely we can build and purify platters, and so forth. There are researchers looking at completely radical other ways to store data, which we know will eventually be reached, because I believe it is stated in Star Trek: The Next Generation that their drives store data on an atom-by-atom basis. Obviously that will take a long time, and we've been told that we can't expect the rate of improvement to keep going, because we've essentially been refining a single solution for thirty years, and it will really slow us up when we have to pivot to some medium denser than a magnetic platter.

But just for fun: what if hard drives kept growing? In the last thirty years, from 1984 to 2014, the biggest three-and-a-half-inch hard drive has gone from 20 megabytes to the eight-terabyte drive that Seagate announced a week or two ago, beating out Western Digital, at least for a few weeks, for those clients that need maximum storage. What if, over the next thirty years, they were able to keep going? How big is the ratio between 20 megabytes and 8 terabytes? Well, if you do the math, I believe you'll find that if we did this all over again over the next thirty years, at about the time of my retirement I'd be able to buy an off-the-shelf 3.2-exabyte hard drive. I could put Facebook on it five times. Google would need five of these. Estimates vary, but some people think the NSA would need as many as twenty or thirty in order to operate.

If that happened, and if I live to see the day, I'd be 70 years old. I mean, can you picture me as an old coot, waving my new three-exabyte hard drive at the kids bicycling down the street, and saying, "I remember when we used to have a room full of computers to store this much data"? And the children will laugh at me, because they will be young, and because I will be old.

For the record, because I want to look like a good futurist here if this actually happens: I don't think that a 3.2-exabyte device would be pure storage. I think we would probably go ahead and sacrifice some of that storage to putting an array of a few thousand small microprocessors on it, so that instead of a single bus trying to read all of those exabytes out to do something to reduce the data, you'd have small on-board processors, a cluster, so that every few petabytes could have their own dedicated processor to deal with them.

But even if the rate slows up, we have a lot of storage potentially in our future. And even today, questions begin to come up as I sit waiting for a page to load. Why don't I have Stack Overflow on my hard drive? Would the daily diff really be so large that my cable modem couldn't handle it in the middle of the night? Why isn't Wikipedia on my hard drive? Why isn't the IMDb, which actually is not that big; they have it as text files, and you can download it and process the data yourself. Why do I sit and wait for all of those banner ads to load when I just want to see the name of that guy who was in that movie with that girl? In that future, we will wonder: why not put the Library of Congress's full media holdings, video and audio and books, on my hard drive?
There will be questions like this. I went and looked at the IMDb database and added up the number of minutes of footage in all of the movies and television shows in history. You could have it all, in high definition, in, by my calculation, about 0.008 exabytes. Maybe one of the selling points for these hard drives is that they'll just come with an "all movies ever" partition on them, one you can blow away if you care enough, but that you'll probably never notice if you're not buying the drive for consumer purposes anyway. Maybe all you'll need is updates as new movies are made. I know Hollywood will continue making movies over the next thirty years; maybe it'll be 0.02 exabytes by then.

You might think I'm being silly, but we have already taken this step and made this choice with our version control. You think that the latency of browsing between Stack Overflow pages doesn't add up to much. But do you remember what it felt like when you were never again waiting a few seconds for Subversion to come up with an answer for you? Actually, that's not what you remember. Using Git at first felt novel; it then felt quite normal. What you remember is going back to Subversion for the first time. If the day came when you simply had Wikipedia, IMDb, and Stack Overflow all on your hard drive, I believe that when you went back to your mobile device and browsed the next day, you would have that feeling, let's call it the Subversion feeling: that feeling that your whole life is once again spent waiting.

So in one very important area, how we manage our projects, we've already made this flip. We've already made this pivot from the idea that an application should just give me what I need at the moment, to: how about you just give me your entire database? Yes, I'm not likely to look at the first hundred commits to Django. How about you give them to me anyway? Just give me the whole history, and on the client I will deal with the question of how much of it I look at. Incidentally, it does occur to me that there are very nice privacy consequences to my just having Wikipedia on my hard drive, with only my processor knowing which part of it I'm looking at.

A big win of Django, as I have used it in projects, is that Django is flexible. It is not opinionated about where my views land on this spectrum between share-the-minimum and share-everything. Now, it's true that there are helpers: Django has class-based generic views that are opinionated about the relationship between a URL and the part of the database I'm trying to get to from it. But those are the exception in Django, something you opt into when they make sense. Unlike some of its predecessors, Django the framework itself is not strongly opinionated about what a URL means. I begin a function definition, and I am invited, if I need to, to invent from the ground up what information will live there and what it will be.

Let me tell a story about a Django project I did recently, where a very modest and small version of this pivot wound up radically simplifying the application. It was a project for the New England Wild Flower Society. They were given that name in 1900, when this conservationist organization was founded, and though they're still known by it, they actually do all kinds of conservation work: study of native plants, study of the way that invasive species and garden species threaten New England's native biomes and all of the animals and other creatures that depend on them. And they have a big audience of students who use their resources, amateur botanists, and teachers, whom they wanted to serve better. They went and found
the Boston-based Python consultancy Jazkarta, a group that started back in those heady days of Zope and Plone and those other big solutions, but that has recently been using Django on quite a number of their projects, with very great success. Nate Aune told me that all of his developers were busy; he had several of them working on this project, but they needed one more, and so I got pulled in, and I got to help with this really, really fun project.

It tries to solve the following problem. Botany, in particular botanical identification, is traditionally done with what are called dichotomous keys. "Dichotomous" describes something that branches, but every time it branches, it branches not several ways but exactly two ways; the "di-" in "dichotomous" tells you that you're always splitting into two choices, over and over again, before you reach all of the limbs and twigs at the end of the process. Dichotomous keys are kind of a Choose Your Own Adventure series for botanists. They start up at a high level, like "plants," and the botanist designing the key asks: what are the two biggest groups in the world of plant categories? And they develop a question that, if answered correctly, and only if answered correctly, will let you look at the correct half of the plant kingdom among whose millions of species your plant lies. And then the next question tries to split that into two big pieces, and then again, and again.

They actually did put their dichotomous key of New England native plants on their website. Here's one example. This is the top level for gymnosperms, all the things like pine trees and ginkgos that are fairly primitive, in that the seeds grow right out visible in the open. All you've got to do when you hit the gymnosperms is start by determining whether, on the plant you're examining, the seeds are borne singly, partially concealed by a red fleshy aril, and whether the abaxial surface of the flat leaves bears pale yellow longitudinal stomatal lines. You should see middle schoolers' faces light up when they're presented with this kind of question.

This helps the expert. They look at that and they immediately picture, yeah, that group of plants, and they know which way to go when they reach this particular page of the Choose Your Own Adventure. This is not, however, something that tends to help students, and the New England Wild Flower Society wanted to address that. They wanted to focus on something that would allow and support a more user-directed search, a search that does not descend step by step into the classifications biologists find useful. The kind of search in the project they called Go Botany would simply ask you to choose a filter. It would let you look through a menu and say: all right, the thing I have in front of me has red flowers, and that would narrow the set of New England species to just those. You could then say: okay, it has smooth leaves, and suddenly your result set has narrowed down considerably more. I'm on a trip to Vermont, and this is where I'm seeing the plant, and that can narrow the range of species further. They wanted the users to have, over on the left-hand side of the screen, a set of very common questions, and then a "Get More Questions" button that lets you ask: I have a really good leaf in front of me, give me more leaf questions; give me more stem questions. Interestingly enough, I'm told that the early user tests proved that zero percent of users ever noticed the "Get More Questions" button.
I don't know if they ever fixed that. But as you can see, when you select one of these filters, it gives you the multiple-choice question, and when you select one of the options and hit "apply," the page adjusts in order to show only the species that match. In fact, if there are few enough species on the page to make this possible, it actually animates the throwing off the screen of the ones that didn't fit, and the tightening up of your result set to the ones that do. It does that with JavaScript.

In our first iteration we did something that seemed so obvious in retrospect, and wound up being one of those suboptimal moves. In my case, it was one of my first times writing a web API, and we thought that to build a web API, you build this kind of obvious one-to-one correspondence between the action the user is going to be performing and a URL that performs an analogous API call. We thought that you could just study and make a list of the things that the user would be doing, just build a URL query that would do each of those, and that that would constitute good design. Let us explore whether that is the case.

If the person selects red flowers, smooth leaves, and a plant that's present in Vermont, well, we thought, I will support a query URL where you simply concatenate, one after another, all of the questions that the user has answered so far about their plant, and submit that whole question back to the server for evaluation, to get the new list of plant IDs that should be present on the screen. We noticed that that meant doing a three-way query against our database, by which I mean a three-way join, with the same table appearing three times. And then of course the student chooses a fourth filter, and now you're doing a four-way join, a five-way join, a six-way join. I mean, if it's a big group of plants, you can get up to ten-way joins, twelve-way joins. Notice that these searches get a little bit expensive.

So we started wondering: what if we put some caching headers in, and maybe put Varnish or a CDN, a content-something network, in front of our app? Is it delivery? Content delivery network, thank you. Will that caching help reduce the server load, by letting it keep parroting back a particular search result for an hour if two students, or three students, or four students, or the same person working again through trying to identify a plant, happen to reach the same state in their search process? The answer: no.

And the reason is this. Okay, a first problem, one that was easy to address, is that the exact same search, depending on the order in which the user chooses these filters, can come out in, in fact, n-factorial different ways. That can be solved through a careful practice called canonicalization, where instead of allowing your application to have several different names for the same thing, you insist that if two users create queries with the same meaning, they will necessarily produce the same text, the same query. A very simple trick with this kind of URL is to restrict your JavaScript so that you always order those filters alphabetically as you assemble the URL. Now, whichever order the user has clicked on these in, you get exactly the same URL, and you have the possibility of pulling the cached version of that result. In our case, we probably only would have needed to do that had we gone down this road; with our front-end JavaScript, we were kind of our own only client. The interesting thing is that if you're letting third parties build and deliver these URLs, how can you make sure that they are hitting the normalized, cached version of the URL, and not one of the other n-factorial ways it could be written?
One possibility, had we gone in this direction, would have been to redirect the other URLs, so that people who fail to properly canonicalize at least get quickly moved over to the URL that you might have cached with a result. One of the interesting things we noted about this transform is that, had we implemented it, it is purely textual and does not need the database. The fact that the query at the top should really be the query at the bottom has no bearing on whether this is an actual URL or a 404; this is a transform based purely on the text of the URL itself. This is the kind of logic that could be pushed out into a front-end caching and normalization layer, without even hitting the server that's busy answering canonicalized URLs with content.

And one of the interesting things is, you might think: well, then I'll have an app sitting there answering 301s all day. But 301s, if you read the standards, can also be cached. Varnish will sit and spit back a 301 all day if you've let it cache it; a CDN like Fastly will do the same. Caching non-200 results can be a big deal, and yet I find we are often busy setting all the right cache headers on our 200 results without even thinking about all of the others.

In the really, really great Reddit thread that The Onion did several years ago when they switched off of Drupal, called "The Onion Uses Django, And Why It Matters To Us," it was just a free-for-all of them answering all of these skeptical web developers about why this was a really, really good move that they made. One of the most interesting paragraphs to come out of it was their answer about where the biggest performance difference they made came from, which actually didn't have to do with Django specifically. They said the biggest performance boost of all was caching 404s: sending Cache-Control headers to their content delivery network on their 404s, a view you might very rarely think to customize. They went into Django and gave it a custom 404 view.
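Here is a framework-free sketch of that trick; this is my own guess at the shape of it, not The Onion's actual code. In Django you would do the equivalent inside a custom handler404 view, for example by calling django.utils.cache.patch_cache_control on the response:

```python
def not_found_response(path):
    """Build a (status, headers, body) triple for an unknown URL."""
    headers = {
        # Permission for the CDN to remember this miss for an hour,
        # instead of asking the origin server again on every request.
        # In Django: patch_cache_control(response, public=True, max_age=3600)
        "Cache-Control": "public, max-age=3600",
        "Content-Type": "text/html",
    }
    body = "<h1>No page at %s</h1>" % path
    return "404 Not Found", headers, body

status, headers, body = not_found_response("/1997/old-link.html")
assert status == "404 Not Found"
assert "max-age" in headers["Cache-Control"]
```

The point is only the header: a 404 with no Cache-Control forces the CDN back to the origin every single time.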
On that 404 response they stuck a Cache-Control header telling their content delivery network: no, no, don't come back to us every time for this URL; we give you permission to remember, for a while, that we don't have this particular URL. The Onion, having gone through so many different early versions of the website since the late '90s, had a lot of URLs that had just been mistyped into popular web pages elsewhere, URLs that made no sense, URLs that the very first generations of web spiders had accidentally put together by misunderstanding how relative URLs work, still out there from a site that old. They had a lot of old links that spiders were hitting every few minutes, but that don't exist, will never exist, can't be deciphered, and can't be redirected to real content anymore. And they found, by giving their CDN permission to cache those, that their outgoing bandwidth was reduced by 66 percent. Because think about it: once you have good caching on your 200s, your web server might not have much left to do. Once all of The Onion's web pages were out there in their CDN, being delivered from locations geographically close to the customers, they found that most of what their server was then doing was serving 404s, and they dropped their load average by 50 percent by letting non-pages be cached by their CDN.

So if we had wanted, we could have somewhat improved things, even for third-party clients, by letting these normalizations be cached. But even if we do canonicalization and cache the pages behind these URLs, how often will they be revisited? The answer turns out to be: fairly rarely. What are the chances that two students will, in their careers, land at exactly the same query, especially once five and seven and nine different features of a plant have been entered? Imagine a group of plants that have a hundred different filters you can choose, in use by students who maybe apply only five filters before they've narrowed the plants enough. Those are both underestimates, but they already give the number "100 choose 5" for the number of properly canonicalized URLs those students could hit as they wander out into the state space of this game that they're playing with biological classification. That's 75 million, and that's only if they go five searches deep; students often go deeper. And in a state space of 75 million, I mean, sure, there are a lot of oaks, and it's very obvious to note their leaf shape, so some of these URLs are going to be hit more often than others. But you wind up with a very, very big state space that your users are wandering through at the URL level, and a lot of different information to cache. So a moment came when we collectively kind of stepped back.
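As an aside, that 75 million figure is easy to check with one line of Python:

```python
from math import comb

# 100 available filters, students applying only five of them:
# the number of distinct (already canonicalized) five-filter URLs.
print(comb(100, 5))   # 75287520, about 75 million
```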
I think it was actually during one of those free-wheeling, forty-minute Skype sessions that we always intended to be a stand-up meeting, but that somehow often became a little bit more, that one of the developers asked a question that caused us to think this way.

This URL is asking our app, is asking Django, is ultimately asking the SQL database, to do an n-way join against the plant-versus-feature binary relation at the bottom. It looks something like this: you name the same table three times, make sure that you're using the plant ID to build triplets of rows from those three tables, and then limit each of those images of the table to only the rows for one of the filters that you're looking at. And it's the database that's doing this work for your user. It's looking at the table of all plant species and their associated features, this binary relation. It is using a WHERE clause to grab, let's say, all the species that are trees rather than something else. And then it is doing the join; it is doing the intersection, to look for the plant IDs in common between those three, those four, those five different virtual or computed tables in your SELECT statement.

And the developer suggested the following pivot. Instead of doing what we had planned, doing a straight comparison on that database table to get lists of matching plants, then doing an intersection between those lists of plants to get only the plants that fit all five of your filters, and, having generated that answer, handing it back to the client: what if we do, in one sense, exactly the same thing, exactly the same computation, but we just draw the line somewhere else? What if we instead have the server stop once it has generated the list of plants that are trees, or the list of plants with red flowers, give those to the client, and let the JavaScript do the dreaded joining? Let the JavaScript take the step that actually produces this very, very large, Choose Your Own Adventure state space that the user is exploring.

We found that we actually already had a JSON payload for each particular feature. This is what you get behind that little pop-up letting you choose the leaf shape; this is what the JSON looked like behind it. When the person clicks on "leaf edge," the JavaScript goes and hits the leaf-edge URL, and previously it got back some text, and maybe a little image with which to illustrate those options. We now just went into that JSON, and we gave away the database. For each of those options, we went ahead and just told the client: if they select smooth leaves, here are all the species in New England that have that feature; if they select jagged leaves, don't come and ask us, we're telling you up front, here is the complete list of species that match that filter. A single line of code, if you use Underscore.js, will take those lists of numbers and fairly efficiently mix them down to a single list of species. It even kind of works on mobile. Going through lists of integers and looking for matches is actually something that computers are good at, and we found you don't even notice the time it takes the JavaScript to do this quick operation.

All of a sudden, our search URL was gone; it disappeared from the project. Instead, if the user, in any order, went through the color of the flower, the shape of the leaf, the state that they found it in: Vermont, I mean, the geographic kind of state, not the other kinds.
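In our front end this was a single Underscore.js intersection call; here is the same computation sketched in Python, with invented species IDs standing in for the real per-filter lists:

```python
# The per-filter lists the server ships down ahead of time, keyed by
# the option the user might pick; the IDs are invented for illustration.
red_flowers   = {101, 204, 305, 412, 519}
smooth_leaves = {204, 305, 777, 888}
found_in_vt   = {204, 305, 412}

# The "dreaded join," moved to the client: the species matching every
# selected filter are simply the intersection of the per-filter ID sets.
matches = red_flowers & smooth_leaves & found_in_vt
print(sorted(matches))   # [204, 305]
```

Each additional filter is one more set in the intersection, so the client-side cost grows linearly with filters chosen, while the server keeps answering the same n fixed URLs.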
I mean not like other kinds of states if They went through these in any order Exactly the same three URLs would be pulled once and They would contain exactly the same information Whether they were fetched very early or very late in that particular students search process Suddenly the data we were delivering was independent of where the user was in their search and the front end could put together that list of 55 matching species By doing its own work to combine those possibilities it had been given if we have in features We have in URLs end of story no exponential space to have to cash And actually we kind of noticed at the end that we didn't need Django We could actually have generated it generated it as a static site because except when the biologist update the database of New England plants these little lists of matching plants never changed during the day Very very stable data that can have very strong cash ability properties We found that giving our front-end client more data meant exponentially work less work for our servers All of which was an early lesson for me In thinking about this problem thinking about this issue of of data Very often in a project. I'm just racing to try to build Obvious URLs that correspond to what the user wants to see and visit and this was a first experience in stepping back and Thinking about the relationship the the the pipeline that I'm building as a data pipeline To think of my web app as kind of a data shovel Trying to get the chunks of data that make the most sense and Throw them at the client fast enough that it can put in front of the user the results that the user expects If I'm correct that very often The solution to building a really responsive really efficient really easily cashable app is To put data together in interesting ways. 
I think that as web developers we should have our eye on what is happening in the science and data space, because after being a fairly obscure language in many parts of the world for decades, Python, now that it's, what, almost a quarter century old, is suddenly the big breakaway hit, or one of the one or two big breakaway hits, in the area of data processing.

The larger scientific disciplines are very conservative; their scientists still use Fortran. This is very interesting: they're not comparing Python to Ruby, or even to C, very often. Python is being presented by the younger grad students as a competitor to Fortran, and the professors are old, so they shake their heads and keep writing Fortran, talking about how they used to have a machine room full of machines and so forth. But smaller and more agile sciences, for example astronomy, because it's a fairly small and close-knit community compared to science writ large, have largely moved over to Python within the last two or three years.

Recently I saw, I believe, Jake VanderPlas, who's doing a lot with the IPython Notebook and with SciPy, with scientific reproducibility and statistics, give a talk where he showed a slide full of images of telescopes that are under construction, and he pointed out that every single one of them has an API that's in Python. If you get time on one of those research telescopes once it is built, what you give them is a Python program, one they can go ahead and test before your precious few hours on the telescope come up. That is the program that will point the telescope and take all the pictures you need in return for your whole research grant, or whatever it is you send them.

Astronomers, by the way, are people who care about big data. Right now they are building an array of telescopes in the southern hemisphere that, when complete, which will take a few years, will be generating an exabyte of images every day. They're having to wonder about things like: how do you get enough
power up to the observatory to reduce the data to the point where it can go back down the mountain every day? They're basically going to take a picture of the entire sky, at high resolution, every three days. Essentially, a digital Sloan Sky Survey every two or three days, and that's too much data for them to look at. What they're going to do is just give us a pipe where we can do whatever research we want on all of the blips and motion that happen every night in the sky. It will be other researchers who have the time to build the algorithms that look for near-Earth asteroids, or for stars momentarily dimming as one of their planets crosses in front of their disc, or whatever. Astronomers are already dealing with big data.

And it's scientists who are pioneering a lot of this work, and scientific consultancies that are figuring out how you can take Python, as kind of one of the simplest possible languages, throw it at a scientist, a grad student who's not a programmer and doesn't want to be a programmer because they're busy with astronomy, and get them up and running and solving interesting problems.

I think it behooves us to pay attention, and I've been doing some work recently, just on the side, to learn a bit about NumPy arrays: adding up millions of numbers with a single line of Python code and no apparent loop. And about pandas DataFrames: essentially having a spreadsheet inside of an object in Python, with columns and rows of data that you can query and group by and sum with single method calls. With these vector-oriented libraries you're able to do queries like this. It might look a little crazy; we'll read it from the inside out. person is a DataFrame, like a database table, but in RAM rather than on secondary storage. person.age is one column of the table, a list of numbers.
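Before reading that expression from the inside out, here is a minimal runnable sketch of both claims: a loop-free sum over a million numbers, and a tiny, made-up person table queried with a vector of booleans. The data is invented, and numpy and pandas are assumed to be installed.

```python
import numpy as np
import pandas as pd

# One line of Python, no visible loop: add up a million numbers.
total = np.arange(1_000_000).sum()  # 0 + 1 + ... + 999,999

# A tiny made-up stand-in for the talk's "person" table.
person = pd.DataFrame({
    "name": ["Ada", "Ben", "Cy", "Di"],
    "age":  [34, 19, 21, 17],
})

# Comparing a whole column against 21 yields a vector of booleans...
mask = person.age >= 21

# ...and indexing the table by that vector keeps only the True rows.
adults = person[mask]
```

The final expression, person[person.age >= 21], is the shape of query the talk goes on to describe.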
We're imagining, in this case, getting a single member in Python and asking whether it is greater than or equal to 21, which yields one of two values, True or False. So if you take a million ages, a column of a million ages, and compare it to 21, you get a million Trues and Falses in a tight little vector of booleans. And the rule is that you can take either a column, like the age column, or a whole table, like person, and index it by a list of Trues and Falses that is as tall as the table is, and it selects out the rows for which that boolean value is True. So in a single line of code you can take your spreadsheet, your DataFrame, and begin to reduce it down to just the rows you're interested in, using a very concise notation that has no explicit loops; they all happen for you behind the scenes.

And Continuum Analytics, who spend their life helping sciences and businesses use Python for big data, have been writing Blaze, which, given that syntax, will not only run it, as it naturally will, against NumPy or pandas; they can also translate it into SQL, they can also translate it into MongoDB, they can also translate it into Apache Spark, or any of a growing list of back ends. They are so often getting into situations where they need to run an algorithm they're familiar with against a newer, unfamiliar back end, or they have a customer whose legacy data is on Oracle but whose new data, the data they want to join it with, is somewhere else. They are building not an ORM but a data-mapping engine: in the same way that an ORM lets you build an object from one database row, they're letting you write single lines of code that ask about enormous amounts of data that might be spread across many back ends or, if you're using Apache Spark, might be auto-sharded across an entire cluster of machines.

Python is becoming an important tool for large data sets, and I think it behooves us, especially as shoveling data at the front end becomes more common and more possible, to watch how these
data tools might fit into our projects. And Django is going to be a great web framework to be on top of, to be along for this ride with.

It allows such flexible normalization, providing common patterns if I happen to have an object per web page, but letting me go wild if I have something more interesting than that. A big deal. Especially early on in its success, it had no dependencies, which, for people who still install stuff by hand, especially if their organization has them on an old Red Hat Enterprise server, is a big deal.

URLs are simply text. Everyone else Django was competing with thought that your URLs should name objects in RAM, or their attributes. Django was the first one that said: get the URLs out of my code. Let's say that the URLs live on the data side of the house. I was very suspicious of that at first, because it felt so wild. I mean, if you have /person/21, what guarantees that a /person page even exists, if you didn't traverse an object on the way to person number 21? If you're from the world where URLs visited a file-system hierarchy, it felt like the intermediate parts of the URL had to land somewhere first before returning your data. Django really achieved a simplification by saying: nope, URLs are text. Views can be simple procedures.

It privileges relational databases. We should give Django so much thanks for the fact that generation after generation of Python web developers has now learned a SQL back end. As I think of how many NoSQL back ends have come and gone over the time Django has been here, people using Django are still on old, reliable, industrial-strength relational databases by default, and it is a wonderful default.

And Django was really the first web framework,
I think, in Python, given the fact that finding the simplest possible thing is hard, to be first to market with some real simplicity. I think that solution is going to last for quite some time, and I'm happy to be along for the ride. Thank you very much.