So when we at Datacratic heard that there was going to be a meetup that was Montreal Data, Montreal Python, Montreal Docker, we knew we had to be there. But I only have 300 seconds, so get ready. Today I'm really excited to do the first-ever demonstration outside the walls of Datacratic of our new product. It's a machine learning database: MLDB.

The data part is pretty easy: MLDB loves data. MLDB is essentially an end-to-end system designed to collect large volumes of sparse data, to train machine learning models on that data, and then to deploy and keep updated those models for real-time or batch processing. So MLDB loves data.

MLDB loves Docker. MLDB is essentially a REST API delivered as a Docker image. To run MLDB, it's just `docker run`, some port mapping, MLDB, and that's it. Then you connect to whatever port you've specified, and that's basically our Docker and DevOps story with MLDB.

And last but not least, MLDB loves Python, in three different ways that I'm going to talk about. Number one: REST APIs are awesome for interacting over the web, and REST APIs are awesome for integrating distributed systems, but REST APIs are not great for their network latency. If you want to call a REST API in a tight loop, you're in for some pain and some latency. So what we do with MLDB is we allow you to take a Python script and ship it to MLDB over REST, and we'll execute it in-process for you. We'll do the REST calls inside the MLDB process, bypassing the network stack. So that's one way in which Python is essentially the scripting language for MLDB.

The second way MLDB loves Python is that we love the IPython Notebook, the Jupyter interface, for data science. It's an awesome way to have a sort of conversation with a library, to do some visualization, to do data science.
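That script-shipping mechanism is essentially a plain HTTP POST of source code to the server, which then executes it in-process. A minimal sketch of the client side follows; the host, port, and route here are assumptions for illustration, not MLDB's documented API, and the request is built but not actually sent:

```python
import json
import urllib.request

# Hypothetical MLDB host and route -- the real endpoint names may differ.
MLDB_URL = "http://localhost:8080/v1/types/plugins/python/routes/run"

# The script we want MLDB to execute in-process, so that its own REST
# calls bypass the network stack entirely.
script = "result = mldb.perform('GET', '/v1/datasets')"

# The script travels as JSON in the POST body.
payload = json.dumps({"source": script}).encode("utf-8")
request = urllib.request.Request(
    MLDB_URL,
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(request) would send it to a running MLDB container.
```

The point of this shape is that only one round trip crosses the network; every call the script itself makes happens inside the server process.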
So with MLDB we ship a library called pymldb, which is essentially a set of Python libraries that help you interact with MLDB. We've integrated pretty deeply into the IPython Notebook using IPython cell magics, and I'm going to give a quick demonstration of that alongside the shipping of Python code.

So, given I only have whatever is left of my 300 seconds: this is a t-SNE visualization of the Reddit universe. Every dot in this plot is a subreddit. You've got all the musical subreddits up here; you've got all the My Little Pony subreddits down here. I presented basically how to do this using scikit-learn back in December, and this is the notebook that ships with MLDB on how to do it inside MLDB. So I'll show you very briefly how to do this with the Jupyter instance that ships inside MLDB.

First thing you do, after you've pip-installed pymldb, is load the MLDB extension, and then you basically use the %%mldb cell magic to have an HTTP conversation with the server. I'll walk you very briefly through this. We're going to delete the data and the machine learning pipelines in case they're already there, and then we're basically going to use three algorithms: SVD (singular value decomposition), k-means for clustering (we don't have DBSCAN yet), and t-SNE for visualization. We use all the same tricks as scikit-learn to make it really, really fast.

The next thing we need to do is load some data. This is how you can ship Python code over to the MLDB process: you just say %%mldb py, and this little script here, which downloads a file from the internet, decompresses it in memory, and loads it into an MLDB dataset, will run inside the MLDB process.
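The shape of that in-process loading script is roughly the following. This is a hypothetical sketch: it fakes the download with an in-memory gzip blob so it runs anywhere, and the dataset-recording call at the end stands in for whatever object MLDB injects into scripts it runs in-process:

```python
import csv
import gzip
import io

# In a real MLDB script this blob would come from urllib over the network;
# here we fake the download with an in-memory gzip payload.
raw_csv = b"subreddit,users\naww,1000\nmontreal,500\n"
downloaded = gzip.compress(raw_csv)

# Decompress entirely in memory -- nothing is written to disk.
text = gzip.decompress(downloaded).decode("utf-8")
rows = list(csv.DictReader(io.StringIO(text)))

# Inside MLDB, each parsed row would then be recorded into a dataset
# (the record_row call below is illustrative, not MLDB's exact API):
records = [(row["subreddit"], int(row["users"])) for row in rows]
# e.g. dataset.record_row(name, [["users", count, now]]) for name, count in records
```

Because the whole thing runs inside the server process, the parsed rows never have to cross the network back and forth.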
So it'll be quite fast: we're basically using the same REST interface as you can use over HTTP, but we're bypassing the network stack. This runs in about 10 seconds, and then all we need to do is run our pipelines. We get output at the bottom, and we can query it, because you can query MLDB using a dialect of SQL, to get out the cluster labels, x, and y, and we can make our little subreddit map using Bokeh. This is all done in the IPython Notebook; you can run the notebook that ships with the Docker container, or you can run one on your own machine.

So those are the first two ways in which MLDB loves Python: you can send Python code to it, and we have some awesome hooks into the IPython Notebook, or Jupyter, depending on what you want to call it. The third way MLDB loves Python is that we provide an SDK, so you can build plugins for MLDB in Python. You can essentially embed some Python code into MLDB to serve up new REST routes, to build REST APIs and predictive APIs, and to build new UIs.

So I have a quick little demo here called Titanic, which will compute your chances of surviving the Titanic disaster based on your characteristics as a passenger. I'm running out of time, but basically here you have the likelihood of surviving, and you can see that the more you paid for your ticket, the more likely you were to survive; how old you were, not too much relationship there. What I'm demoing here is that as I'm sliding the slider around, my browser is making repeated HTTP calls to MLDB, to the predictor inside. This scales up to thousands of requests per second for real time, and for batch scoring we can do millions of scores per second.

So this is basically my demo of MLDB and where we're at as a project. This is a major new product for our company, and this is the first time I'm demoing it.
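The slider demo boils down to building a scoring URL from the passenger's features and hitting it on every change. A minimal sketch of the client side, where the route and parameter names are made up for illustration and no request is actually sent:

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical plugin route served by the Python SDK inside MLDB.
BASE = "http://localhost:8080/v1/plugins/titanic/routes/score"

def score_url(base, features):
    """Build the GET URL the browser would hit each time a slider moves."""
    return base + "?" + urlencode(features)

url = score_url(BASE, {"fare": 72.5, "age": 30, "sex": "female"})
# Each slider movement issues one such GET; because the model is served
# in-process, per-request latency stays low enough to sustain thousands
# of requests per second.
```

This is the same pattern whether the caller is a browser widget or a batch job looping over passengers.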
I'm pretty excited. We're essentially in closed alpha right now, so I don't want to broadcast the URL here, but come talk to me or anybody from the Datacratic team. Anybody from Datacratic, raise your hands! Yay. All of us here on the tech team are based in Montreal. We'd be happy to talk to you, happy to meet with you in person and get your feedback. So come talk to us; we'd be super happy to chat.