Hi, I'm Viral Shah, and I'm one of the creators of the programming language Julia. I have two colleagues out here as well: there's Shashi Gowda and there's Rohit. So there's a strong Julia contingent this year compared to last year. There are three of us, 3x what we had last year, when it was just me. But actually you were there as well. I think about three years back it was just me, then last year it was two of us, and now there are three of us. So as you can clearly extrapolate the exponential growth of Julia, next year maybe all of you will be contributors to Julia and talking about Julia. All right, I think my computer just decided to go to sleep. Okay. So I'll be talking a bit from this corner out here while I do demos, so just focus on the screen. And by the way, thanks for coming out for the first talk in the morning. It's early, it's not easy, and I had a hard time getting here myself. So a good morning to everyone. Can we do a show of hands on the kinds of programming languages people like to use? How many people out here have tried Julia in some form or shape? Wow, that's pretty cool, actually. For those who haven't tried Julia, or even for those who are using it in some form or shape, what other things do you usually use? MATLAB? R? Let's go with R first. All right. There are always fewer R users than I think there would be in the audience; I expect all hands to go up or something. Okay. So MATLAB, probably nobody. Oh, there's a couple of people. Okay, three people. Python, I would expect a lot. Yes. Okay. But Python with mathematical computing, like NumPy: who does import numpy? Okay. So I'm guessing mostly general-purpose Python users, and then a few people doing machine learning, NumPy, all that stuff.
Okay. And then how many of you actually deploy any of these things into production systems? Or how many of you rewrite these things into C or Java or Fortran? A lot of rewriting. Okay, a few rewriting, I guess. And is it fair to say that the others are basically using these for personal desktop exploration? Like you have a dataset, you load it, you explore it, and the final output is maybe a slide rather than a system. Actually, people keep telling me that if I just make it easy to generate PowerPoint from Julia, then I'll get a much larger user base. Anyway. So for today's talk, I actually interacted with a lot of you out here in the audience, and I decided to slightly tweak my talk. My original topic was going to be parallel computing with Julia, but there was a lot of interest in just knowing about the state of Julia as well. So I'm going to make it about where Julia is going, how we built towards a flexible parallel computing system with Julia, and what the roadmap for parallel computing is. What I won't be doing is giving a tutorial on how to do parallel computing in Julia, but hopefully I'll give you the inspiration and the motivation for how we are doing things, and show you a lot of pointers, so that when you go back you can immediately start doing some of these things. So think of what I'm going to show you as a guide: for getting started with Julia, or if you've already started with Julia, for doing parallel computing with Julia, or for becoming contributors to Julia in any form or shape. All right. The work that I'm talking about is done by tons and tons of people. The people highlighted in red are people who are based in Bangalore and have contributed significantly to Julia. I just noticed that I don't have Rohit's name on this slide, unfortunately, but there's Rohit as well. I'll edit it right after this. All right.
Today the Julia language repository on GitHub has about 350 contributors to the core language itself. In addition, there are many more packages and all kinds of fun things that I'll talk about. All right. The way I have this talk set up is that I have a few slides, then I have a lot of demos, and then, if I finish early, which almost never happens, but in case I do, we'll just do random impromptu things where you can ask me to type whatever you feel like into Julia and we'll see how it breaks, or how it handles it. All right. I'm sure a lot of us saw these headlines, right? India to have 70 supercomputers. The government has approved a very large budget, thousands of crores, some number with 10 or 11 zeros behind it, to be spent on all these supercomputers. And probably we will see the traditional scientific computing use cases on these machines, right? When you hear of supercomputers, with those fancy-looking, space-like installations, usually they get used for climate modeling, or some atomic energy thing, or for structural computation: when you're sending something into space, you want to figure out if the structure is going to withstand it, the fabrics, the materials. That's your traditional scientific computing: predicting some structure, some genomic thing, all kinds of interesting and fun scientific use cases. But I suspect that most of you out here are not actually doing traditional scientific computing. Maybe a hand or two, if there is someone in the audience at all. Okay. That's what I would have thought. Today the world has moved beyond this stuff, right? Twenty years ago, this is all that existed. But today, if you put up a slide like this, people will say, okay, 70 supercomputers is fine. But the question I keep getting is: will it run Spark?
Right. And everyone is curious, right? I mean, there is a whole world of things going on out there where people are democratizing big data; there's a new tool coming out every day. And presumably everyone wants to churn it, load it, do something with it, produce a visualization, produce some insights. But there is this entire other world which has existed for three or four decades, where there is a lot of learning about performance engineering, about mathematical techniques, and all kinds of beautiful libraries that have come out of it. Are these worlds going to stay apart? Are they going to combine? Is there any leverage possible from one to the other? So let me ask you my favorite question. I've been asking this in my last several conference talks. If you had 1000 processors today, right now, accessible from your laptop, which you can easily imagine getting on AWS, how many people have actually done something with on the order of 1000 processors? We have only one person: that would have been at Walmart, or previously at a large bank. There was another hand I thought I saw. Zynga, okay, so the large distributed server farms. Walmart, okay. So, similar. Maybe characterize quickly, in one line, what you do with these. Is it a single application? Okay. So order management, or data analysis, understanding the behavior of customers at some level. So I think the key thing is that people are moving increasingly from scientific use cases, where you're simulating science, simulating two galaxies colliding, to now applying the same amount of compute to humans, to how humans think, right?
Every e-commerce company out here is putting thousands of cores, tens of thousands of cores, to understand who is going to buy which pair of shoes in which city. And Flipkart has that amazing visualization running right outside, right? You can see people all over India buying all kinds of stuff. That's where the new generation of compute is actually going: humans. But if you had these thousand processors, forget what you actually do for your job, what would you really like to solve? Sorry? What problem would you like to solve? Which stock to buy? Okay, that's a good one. Which stock to buy. Let's just go on, give me some ideas, and I'll build them into demos when I come back here next year. I don't think the thousand cores will help you with that one, though. I wish they could. If I had a thousand free processors, I would just mine Bitcoin. Okay, that's a good one. Maybe that's connected to your thing, right? If you mine Bitcoin, your bank balance will never go empty. So, all right, what other applications? Hidden Markov models? Okay, so training maybe a speech recognition thing or something. So, you know, increasingly this is actually going to be a challenge, right? Everyone talks about Hadoop and Spark and big data and all that, but the fact is that today we have more computational power at our fingertips than we know what to do with. It's actually hard to come up with a demo of what to do with a thousand processors. I will show you the least imaginative demo, as I go ahead, of what I did with a thousand processors the first time I got them. I hope you can do better. We'll come to it at some point. So what is Julia all about, at its most basic level? This is Henri Poincaré. People may remember him from some high school or college mathematics, a hundred years ago.
At its simplest level, mathematics is the art of giving the same name to different things. And essentially, if you think about a programming language, about computing, about anything, it's basically about finding the right level of abstraction. If you have the right level of abstraction in which to express your programs, the computer can actually generate high-quality code and execute the tasks that you're giving it, right? So at some level, what mathematicians have been doing for the last hundred years and what we as computer scientists are doing nowadays have a unifying common thread underneath. And when we talk of parallel computing, whether it is of the colliding galaxies or of understanding human behavior, at the end of the day we can only hope to crack these problems when we get the right levels of abstraction. Without the right levels of abstraction, you're just going to be going off doing all kinds of things without understanding them. And maybe that's fine for a business need, right? If I have to just get something done, it's okay, I can just download some machine learning thing; as long as it works, if it produces something, I get my job done, that's fine. But at the end of the day, we want to go one level deeper; we want to understand what we are doing. And it's important to keep that scientific touch at the back of our minds. At least that's what I feel. All right. So if you have never tried Julia, you could try it out right now: go to juliabox.org, log in with your Google username and password, and actually type in some of the things that I'm speaking about and try them right away. Shashi is one of the authors of JuliaBox, along with Tanmay, who's not in the audience today. Where else is Julia being used?
So, when I talk of abstraction: one of the fields where Julia is being used is quantitative economics. And this is essentially being done by Professor Thomas Sargent at New York University. You may not have heard of him, but he was a Nobel Laureate in 2011. He's moved his entire research group to Julia, and his entire textbook, which is a standard economics textbook, is now in Julia and Python; you can see the two logos at the bottom. They're working with the Federal Reserve Bank of New York, which builds models of the economy, large distributed parallel computing stuff, and it's all going to be in Julia. Here's where the Julia ecosystem stands today. About a thousand downloads daily. The Julia website has received 1.6 million unique visitors since its launch, which makes me feel like everyone who might ever want to use a tool for data science has essentially come and visited us at least once. We have a staggering number of GitHub clones and visitors to GitHub on a regular basis: 500 clones of the Julia repository every fortnight. These are people who are either following Julia on the bleeding edge on master, or are contributing. JuliaBox has 10,000 users, and we estimate about 50,000 to 100,000 users out there, doubling every nine months. A lot of people in companies are actually using Julia quietly, either as individuals or as our customers. Some of them are actually in this audience; they may or may not want to identify themselves. All right, this is the obligatory performance slide. I don't want to spend too much time on it. I've shown it the last two or three years, but this is just a quick recap for people who may not have seen it before. And it's actually not even very visible; this projector is actually really bad, it seems.
But even if you couldn't see any of the numbers and text on these slides, what you basically need to see is that the line out here represents the ratio of performance with C. So if a particular language implementation has a dot here, that means it is as fast as C. And this is 10 times slower, 100 times, 1000 times, and 10,000 times slower. So for example, there is a parse-integer benchmark right there on the top in yellow, which is Octave, and that is almost 10,000 times slower than C. But Julia is almost as fast as C, so it might as well be Julia. And if you look, the first one is Fortran; Fortran is almost as fast as C unless you do some string computations. That's Go, that's Java. And then increasingly you see that the standard deviation starts increasing; that's JavaScript out there. The more compiled languages are on this side, the interpreted languages are on the other side. Julia is actually the only one which has a very good clustering around the C line, on these sets of benchmarks, right? Of course, never believe benchmarks; just use them for what they are worth. We had JuliaCon last year, and this year it was 3x larger. We had a nice cake which said JuliaCon on it, which was kind of fun. And an amazing hackathon: a room full of 50 or 60 people simultaneously hacking on Julia and making commits. That was a lot of fun. I hope some of you can make it to the next JuliaCon. Or maybe we should do a JuliaCon in Bangalore this year? Show of hands if you want to see JuliaCon in Bangalore. Wow, everyone here would come. Awesome. All right, it's happening. All right, once more, a show of hands for the photographer. Sweet. Okay. All right. Now I turn a little bit to the world of parallel computing. This is my past life. My PhD was in high-performance computing; that's how I got into this.
I always wanted to do scientific computing, to try to understand how the world works, how nature works. It turned out, after all these years, that it's actually harder to figure out what humans do than what science does. But at the bottom of it all, there was a system that I built as part of my PhD thesis. It was called Star-P, and it was basically a parallel extension to R, MATLAB, and Python. Primarily it was MATLAB-driven. And it had all this complicated stuff. This is the work of a marketing guy, okay; no reasonable technical person would actually make this Visio kind of diagram. But we had a company that we were part of, and we made all these complex components. There was a client, there was a server, there were all sorts of libraries, and, I mean, it's unreadable. But this is how all these complex diagrams usually look, right? I think the only goal out there is to show that we have some complicated stuff which you won't understand, so you'd better pay us a lot of money for it. That's what I usually walk away with. That was 2003 to 2009. And we learned a lot from that experience. Actually, I still think that it was one of the best parallel systems put out there, one that even Julia hasn't fully matched yet. But Julia is open source, as you all know. The lessons that we took away from our experience building a parallel system before are what we are trying to incorporate into the Julia design now, and this is what motivates the rest of my talk. The biggest problem was that you cannot build on top. When we started with Star-P, we said, you know what, this sequential language stuff is solved, right? That's Python, MATLAB, R. Everyone knows this stuff. We should just accelerate these things and build on top of them. And there is an industry of people doing that even today. But we did that for a decade back then.
We tried really hard to improve the performance of Octave, of Python, of R, by adding parallel primitives to these languages, by extending their object systems. And finally we realized that that is not the right way to go. If you want a good high-performance parallel computing platform, you cannot build it on top of something that was not designed for it. So having an open-source, high-performance base language is essential. And after 2009, when Interactive Supercomputing was bought by Microsoft, my colleagues Jeff Bezanson, Alan Edelman, Stefan Karpinski and I, the four of us, started this project. People often ask me why we started writing a new programming language, and this is the reason. I mean, I spent 10 years doing something, finally figuring out that it was not the right approach. And then we started bouncing ideas around, and one week later we had a repository and there was code. Everyone wonders how you start something like this, how these things happen, and the story is actually not that interesting. You're sort of unhappy with the state of affairs, and then you start doing something, and then maybe others think it's a good thing, and it builds on. If it does not build on, you throw it away and start something else. That's how I think about it. So that was the first lesson: we need our own language. We had a sales guy in that company whose truth about performance was: give me 2 to 10x performance with as many cores as you can throw at it today, and I can sell it. Don't give me the theoretical crap. How many of us have seen those scale-up diagrams that all go like this, or like this? All of us have done that, right? All of us have written them in papers. But in the real world, it doesn't matter. People want problems to be solved.
So it was a fascinating place to work. We did amazing stuff, but I think the fundamentals were wrong. That's why we started work on Julia. Julia has now progressed enough, five years down the line since we started in 2009, and I'm happy to announce that we actually have a company now called Julia Computing, which focuses on helping enterprise users who want to deploy Julia in production. We have close to a dozen clients now. And if anyone is looking to contribute to an amazing parallel computing language, or just a great language ecosystem, talk to me. Okay, that's the end of my blurb on Julia Computing. This is the world that I was hinting at when I started, right? There are these things on the left, which, as we saw in the show of hands, everyone is using on their desktops. There are these things in the middle, where everyone's also doing quite a bit, with the cloud. I mean, AWS: how many people are using AWS out here? Pretty much everyone. Azure? Oh my God, that's not good for Microsoft, I guess. All right, so everyone's using AWS. So are we. And, you know, you can get those thousand processors; you can do Bitcoin mining, or speech model training, or deep learning models. And then there is that world of which the government of India is going to buy 70 more, on the absolute right. And these worlds are all disconnected, right? I mean, what are we going to do about it? Keep that at the back of your mind. We think Julia is actually the right bridge. It takes all those learnings from the right-hand side, it already encompasses everything on the left-hand side, and it lets you use what's actually coming out in the market, which is the middle one. All right.
For the rest of this, I'm actually going to do some demos. Unfortunately, because I did not have a reliable internet connection, I have not planned to do the cool big parallel demos, but I'm going to give you a good feel for where Julia's performance is and the kinds of things you can do. I'm going to do an SVD, a recommendation system, and image seams. I think stock price analytics is going to be done by Shashi in a talk in the afternoon. And then I have a saved demo, a notebook, which I can show on this machine, and then a roadmap for parallel computing. So that's how the rest of my talk is structured. All right, so let's switch to some demos. That's the JuliaBox screen. Did anyone try it out and was able to get to it? Okay, so a couple of people were able to get to it. All right. So this is my first demo. Is this... nothing is visible, actually. I don't expect you to read this stuff; it's just there. This is a set of homeworks that are actually given out in an undergraduate linear algebra class. How many of you have taken an undergraduate linear algebra class in your engineering? I think everyone here is an engineer and everyone's taken it; even if you're not raising your hand, I know you've taken this class. All right. So you might remember some terms like singular value decomposition and eigenvalues, and Tim Poston actually did an amazing refresher for us yesterday, so I think I don't even need to say anything. We made these homeworks for the students at MIT who are learning this. In fact, Shashi actually implemented a homework API within IPython notebooks. This is an IPython notebook, by the way, now rebranded as a Jupyter notebook: Ju for Julia, Py for Python, and R for R. So IPython notebooks, now called Jupyter notebooks, actually natively support Julia.
One of the first things you do in these classes is all these exercises, and they're kind of boring. Probably most people just ask their friends and write in some answers. But you never actually understand what's happening in the singular value decomposition, right? At least I remember that when I did my classes, I just made up whatever I could remember from memory and wrote it down in my exam. I don't think I did very well, as my exam scores showed, but that's not the point here. All right, so this is an image. Does anyone recognize this picture? Gilbert Strang, that's right. He has an amazing set of videos. If you actually care about any of this stuff, about machine learning, about linear algebra, he has an amazing set of lectures on the MIT website. Just Google MIT OpenCourseWare and Gil Strang and you'll get the most amazing linear algebra lectures. When I went through those several years ago, it changed my life. It was that good. So that's Gilbert Strang. Now, what are you going to do with Gilbert Strang? You're going to do an image compression of Gilbert Strang. Here is rank-zero Gilbert Strang. So this is a simple SVD of this image. Forget the code; it's not visible out here. But everyone knows you can do a singular value decomposition and break up a matrix into three factors, U times S times V transpose, and the product will give you the original back. The number of singular values, the way they are spaced out, their structure, their magnitude, tell you a lot about how much information content is there in your dataset. This is what you do when you're applying your models for machine learning, for principal components analysis: essentially the same machinery running under the hood.
I have these beautiful sliders out here that are part of Shashi's Interact.jl package, which works with the notebook. What I'm doing is, every time I move the slider, I'm recomputing the image using a few of the factors from the singular value decomposition. So when k is one, I'm saying: just give me one factor, one feature from my decomposition, one piece of information with which I can reconstruct the image that I had given originally. And this is what the rank-one Gil Strang looks like. It does not look like Gil Strang at all, actually, right? It just looks like some random mat kind of thing. But as we start going in, we say: now, instead of one feature, give me five or six features. You can kind of see a human face. And then I think if you go to about 20, you can say that, yeah, this is definitely him, right? So I had an image with almost 300 features; the original matrix had 300 singular values. But I'm able to get a pretty darn good approximation with just 20 features. This is essentially what you do when you write your movie recommendation systems, right? You take the large recommendation matrix, users rating movies, and then you say: if I just had 20 features, 20 pieces of information from this, could I actually make a prediction? And it turns out you can, in many cases, if the theory holds. And here it's 20 pieces of information reconstructing an image. So this is a very simple two-minute guide to image compression: we did an SVD of that image, and we compressed it and recreated it. The point here is, I wanted to highlight a few things. Julia is incredibly fast; I can actually do these things in real time. I can recompute the SVD in real time for images like this.
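The rank-k reconstruction behind those sliders can be sketched in a few lines of Julia. This is an illustrative version, not the notebook's actual code (which isn't visible in the talk), written against current Julia where `svd` lives in the LinearAlgebra standard library; a random matrix stands in for the grayscale image:

```julia
using LinearAlgebra

# Stand-in for the grayscale image matrix (the real demo loads a photo).
A = rand(300, 300)

F = svd(A)   # A == F.U * Diagonal(F.S) * F.Vt, up to floating-point error

# Keep only the first k singular values/vectors: the "k features".
rank_k(F, k) = F.U[:, 1:k] * Diagonal(F.S[1:k]) * F.Vt[1:k, :]

A20 = rank_k(F, 20)   # the "rank-20 Gil Strang" approximation
```

Moving the slider just calls `rank_k` with a new `k`; the approximation error `norm(A - rank_k(F, k))` shrinks as `k` grows toward the full rank.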
There's also a part about data exploration and data science that often gets missed. I think some of the users of R who might have used Shiny might appreciate this: actually building these sliders and exploring your data visually, right? When I move it and I see a graph changing, or I see something varying with time, I see data at different levels of aggregation. I think this gives a lot more intuition into what the data is about and what the underlying structure is, just being able to play with stuff. Tim Poston gave a talk, right, where he talked about geometry, being able to feel things in 3D, being able to manipulate your environment. All of us did that as kids, and I think there is actually real value to what he said, as these sliders and this SVD approximation show. You can just kind of... I don't know, I could do this all day. All right. A second demo. This is a demo of a recommendation system. Now, everyone here would have written a recommendation system. I am 100% sure of that. I mean, you wouldn't come to Fifth Elephant if you did not write some recommendation system of some kind. Or maybe not; I don't think everyone needs to spend their life writing recommendation systems. But let me show the code. The tragedy out here is that you can't actually see anything; it doesn't even matter how big I make it. I'm an Emacs user, and that is also the reason for my RSI, all that Control-X and Control-C stuff. All right. So I don't think I want to show you this code right now anyway, since you couldn't see it. But this is essentially a recommendation system written in Julia. This is one of those Netflix Prize algorithms. It's not the Franken-algorithm ensemble that won the prize, but it's one of those algorithms that got 95 to 98% of the way with a very elegant solution.
It's basically the alternating least squares model. This work was done by my colleague Abhijith, who is not here this year. And this is the core of the loop. It's not doing an SVD. When you do an SVD, you end up getting negative values, right? When you actually look at the features, you might get additive and subtractive components. So it might say: add this thing, then remove a little bit of this, then add something else, remove something else, and it will reconstruct your image or your data or whatever. This alternating least squares method being used here is essentially like the SVD under the hood, but with a constraint that says only additive things are allowed. And people believe that additive combinations are actually how the human brain works. For example, when you recognize faces, you don't think that this face is this set of eyes minus this nose plus those ears, right? You recognize individual features in a face. There was a beautiful paper about using the same kind of tricks for how face recognition works. But in this case, you're applying it to recommending movies. And this alternating least squares method is literally just 20 lines of code. Then there is this recommend function, which is doing the recommendations. The code is all online, it's open source, and it is reasonably good quality. It is also very easily parallelizable, but I will leave it to the audience to figure out what the obvious parallelization trick is. The hint is that this part is actually iterating over all the users and over all the movies independently. So with what I show you in the rest of the talk, I think it should be an easy exercise for any of you to take this code and actually parallelize it.
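For a flavor of the technique, here is a bare-bones alternating least squares factorization in Julia. This is a hedged sketch, not Abhijith's actual code, and it omits the nonnegativity constraint described above: it factors a ratings matrix R into user and movie factor matrices, R ≈ U * M', by fixing one side and solving a regularized least squares problem for the other, back and forth.

```julia
using LinearAlgebra

# Alternating least squares: R ≈ U * M'. Each update is a closed-form
# ridge-regularized least squares solve. The per-user and per-movie
# solves are independent of each other, which is the obvious
# parallelization hinted at in the talk.
function als(R::AbstractMatrix, k::Int; iters = 10, λ = 0.1)
    nu, nm = size(R)
    U = rand(nu, k)
    M = rand(nm, k)
    for _ in 1:iters
        U = (R * M) / (M' * M + λ * I)    # fix M, solve for U
        M = (R' * U) / (U' * U + λ * I)   # fix U, solve for M
    end
    return U, M
end

R = rand(50, 40)        # toy "users × movies" ratings matrix
U, M = als(R, 5)
err = norm(R - U * M')  # reconstruction error of the rank-5 fit
```

Recommending for a user then just means ranking the not-yet-rated entries of that user's row of `U * M'`.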
If you do that, please send a pull request; I can use it in my next demo. All right. Let's just see how big this code is: it's about 97 lines of code. But really, when you build a recommendation system, that's great; I think everyone out here could figure out the math, or just download that library and build that thing, right? That's not a big deal. What would be really interesting is if you could build all the UI layers around it and actually build a real application and deploy it, either facing the world or just as an internal deployment for your colleagues or your friends or family, whatever the purpose may be. What's in the background now is an output of that recommendation system. Let me show what it looks like. This is a visualization that we built on top of it. Here we have four users; I've just randomly taken four users, one, two, three and four. I have a slider out here with a count, and then I have these movies that come out: top 15 recommendations for Bob. And I could just make it one recommendation, and it's doing it in real time. I'm using a small subset of the data, so the recommendations are kind of junk, because I've kept it small enough to actually show you something in real time; otherwise, this would take a long time to train the model. But you see how the mathematics, the machine learning, the data have all come together in a beautiful visualization, right? I've traditionally been a scientific programmer all through my career, and I always knew that I wanted to build this stuff, but JavaScript, HTML, CSS... I had no patience to chase all of that. Thanks to Shashi's Escher package, we are able to build these things. The code that produced this is actually 20 lines of Julia.
I'm not going to show that code, because I want to save time in my talk and I want you to go to his talk instead. But this is the kind of stuff you can do: I can change the user and get a new set of recommendations, make it fewer, scroll down, and it all works beautifully. And the other important thing is that the fonts, the quality of the rendering, everything is just nice out of the box. I don't have to go and change some defaults. By the way, I hate it when open source software ships with bad defaults. Why can't you just have good defaults? Why ship something bad by default and then make the user do work to get something good? It should be good by default, and that's at least what we try to do with Julia. All right. So that was a recommendation system with visualization. Here's the third demo, an interactive image notebook. I'm not going to spend too much time on this, but essentially it's carving out image seams. We've taken this image from the Fifth Elephant website, and what the algorithm does is: given a point at the top of the image, it finds a seam through it, and if you remove that seam, you can compress the image, not in file size, but visually. I'll skip over the code; all of you can search for "seam carving" on the internet and figure out what it's doing, everyone's smart enough here. I just like to do these things. So here I'm saying: start from the first point at the top-left corner of the image and find a seam. That red line is the seam. And then I can find a seam out here, or out here, and this is updating in real time.
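For reference, the core of a seam-carving algorithm fits in a few dozen lines of Julia. This is a generic sketch of the standard algorithm (gradient-magnitude energy plus dynamic programming), not the notebook's actual code, operating on a grayscale image stored as a `Float64` matrix:

```julia
# Energy of each pixel: magnitude of the local intensity gradient.
function energy(img::Matrix{Float64})
    h, w = size(img)
    e = zeros(h, w)
    for i in 1:h, j in 1:w
        dx = img[i, min(j + 1, w)] - img[i, max(j - 1, 1)]
        dy = img[min(i + 1, h), j] - img[max(i - 1, 1), j]
        e[i, j] = sqrt(dx^2 + dy^2)
    end
    e
end

# Dynamic programming: cumulative minimal energy from top to bottom,
# then backtrack to recover the seam (one column index per row).
function find_seam(img::Matrix{Float64})
    e = energy(img)
    h, w = size(e)
    M = copy(e)
    for i in 2:h, j in 1:w
        M[i, j] += minimum(M[i - 1, max(j - 1, 1):min(j + 1, w)])
    end
    seam = zeros(Int, h)
    seam[h] = argmin(M[h, :])
    for i in (h - 1):-1:1
        lo, hi = max(seam[i + 1] - 1, 1), min(seam[i + 1] + 1, w)
        seam[i] = lo - 1 + argmin(M[i, lo:hi])
    end
    seam
end

# Remove the seam: every row loses exactly one pixel.
function remove_seam(img::Matrix{Float64}, seam::Vector{Int})
    h, w = size(img)
    out = similar(img, h, w - 1)
    for i in 1:h
        out[i, :] = [img[i, 1:seam[i] - 1]; img[i, seam[i] + 1:w]]
    end
    out
end

img = rand(10, 8)          # stand-in for a real grayscale image
seam = find_seam(img)
smaller = remove_seam(img, seam)
```

Removing the lowest-energy seam repeatedly is what shrinks the banner while leaving the elephant and the text intact.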
This was originally a homework assignment at Brown University, and it was done in Python. They gave students very tiny images, about this small, because in Python you could not run this algorithm in real time, or in any meaningful time for a student doing the homework. But in Julia, you can just keep doing this interactively and watch the performance. All right, so these are the seams. Now what do we do with this image? As I go through this, I'm removing seams from the image one after another, and so this is the Fifth Elephant banner completely compressed by seam carving. You can see that if you just randomly removed columns, you might not keep the elephant at all. But here you can see: there's the elephant, and it still says "The Fifth Elephant". It didn't remove words from the text; it removed all that empty space, which looked nice visually but probably wasn't adding much value. If I wanted the image to get skinnier, that's exactly the stuff I would throw away, and maybe this is about the right amount of compression for this one. All right, now I get to my simple parallel example, my least imaginative use of 1000 CPUs. I got them on Amazon for $10 an hour using, what's the thing where you bid and get those CPUs? Spot instances, yes, thank you. So I got these spot instances, and we're actually going to unveil this feature in JuliaBox, where you can just, well, virtually swipe your credit card, since you can't actually swipe it on the internet, and using Amazon credits you can get access to these computers through JuliaBox.
I think a lot of the people out here are comfortable booting up AMIs and doing cool stuff on AWS, but in the real world, either people don't want to do that, or more often they cannot: there are all these steps to go through, installing the software, configuring it, and so on. In JuliaBox, we expect you'll just put in your details, get the CPUs, and get charged by the hour. At least that's the dream. So we add the workers. In this case we added 1024 workers: 32 nodes of 32 cores each. The point I'm trying to show is how you can use this interactively; you will supply the use cases. So I call nworkers() and I've got about a thousand of them. Julia has this cool function called peakflops(). If you want to figure out how fast your CPU actually is and you have no idea what that number should be, you just run peakflops(). For those of you logged into JuliaBox, just type peakflops and open and close parentheses. In this case it came out to 1e11, so that's 100 gigaflops. Does that sound right? Yeah, it probably does. All right. And then here's my simplest parallel program. I'm doing a pmap, a parallel map: run peakflops on a problem of size 2000 (that's just an input argument to peakflops) for i equals 1 to nworkers, where nworkers is 1024. This one line has actually got the cluster, configured it, booted up Julia everywhere, set up all the processes and the remote procedure calls, and now it's just saying: pmap peakflops onto everything. After it runs, the results come back, and on each process the result is around 1e10.
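Scaled down from 1024 cloud workers to a handful of local processes, the same pattern looks roughly like this. It's a sketch using today's Distributed standard library (where `peakflops` now lives in LinearAlgebra), not the exact commands from the demo:

```julia
using Distributed, LinearAlgebra

addprocs(4)                          # the talk did this with 1024 remote cores
@everywhere using LinearAlgebra
@everywhere BLAS.set_num_threads(1)  # one BLAS thread per worker process

# Run the peakflops benchmark on every worker in parallel, then sum
# the per-worker flop rates into an aggregate figure for the cluster.
flops = pmap(i -> LinearAlgebra.peakflops(500), 1:nworkers())
total = sum(flops)
```

With 1024 workers, summing per-worker rates of roughly 1e10 is exactly how the demo arrived at its aggregate of about 1e13 flops.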
You might be wondering: why did I get 1e11, 100 gigaflops, there, and now I'm getting something an order of magnitude less, 10x less, out here? The reason is that peakflops is internally multithreaded. Even if you are running only a single thread of computation, it tries to use all the processors on that shared-memory machine, so it was probably running across all 16 or 32 cores of the node. But when we run in parallel, we disable that multithreading, because otherwise you have n processes, each of which tries to use n threads, and now you have an oversubscribed set of threads and terrible performance. So that's what we did. And then finally, you just sum it all up and you get 1e13, so 10 teraflops. For $10 an hour, you can get a 10-teraflop machine. That's really cool, really amazing, right? You could actually start factoring small RSA keys, keys that were generated maybe five years ago: with compute like this, and enough dollars thrown at it, you can do it. That is why the government raised the minimum RSA key size from 1024 to 2048 bits. I think anything in the 256-to-512-bit range might be crackable now; I don't know, I haven't tried it. If there's a crypto expert here who knows these things, maybe raise your hand, because I'd love to build a demo that cracks RSA keys in real time. All right. And here comes my least imaginative demo. Someone talked about doing Markov chain Monte Carlo, right? Large Monte Carlo simulations are very common: large finance applications, risk management, asset pricing, a lot of physics, all kinds of stuff.
At the end of the day, what's happening under the hood is coin tosses: generate a random number, predict some path of the future. And on these 1000 cores, the first thing I did was write a for loop that goes from one to 10^12, one to a trillion, just adding up random Boolean values. The point I'd like to make is not that this is an imaginative program; it obviously isn't, but it's representative of a lot of interesting computation. In most of the interpreted languages I have used for data science over the years, I would not dare write a for loop over a million of anything, forget a trillion, if I'm using Python or R; maybe MATLAB is a little better nowadays, but I haven't used it in the longest time. What this shows you is the ease of use that has been built into all these pieces. So hopefully this is interesting enough to get your juices flowing, and you'll do something interesting with it. All right, I'm going to wrap up with a roadmap for parallel computing. I have a better, blown-up version out here; let me go full screen. This is how Julia development actually happens: it's all on GitHub. Ideas get proposed, people put things together, and things get implemented piecemeal. This is issue number 12044; if you just Google "Julia 12044", you'll probably land on this issue. This is how we are building our entire stack for parallel computing. People ask whether Julia can be used for data processing, for big data, for this or that. But Julia is not actually a framework. For example, Hadoop is a framework.
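The one-liner described above, shrunk from a trillion iterations down to something that runs in seconds on a laptop, could look like this with the Distributed standard library (the talk predates the rename of the old `@parallel` macro to `@distributed`, but the shape is the same):

```julia
using Distributed
addprocs(4)

# Parallel reduction: each worker sums random coin tosses over its
# chunk of the range, and the (+) reducer combines the partial sums.
n = 10^8                     # the talk used 10^12 across 1024 cores
total = @distributed (+) for i in 1:n
    Int(rand(Bool))
end
frac = total / n             # should be very close to 0.5
```

The same skeleton carries over directly to real Monte Carlo work: replace the coin toss in the loop body with one simulated path, and the reducer with whatever statistic you are accumulating.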
Spark is a framework, right? Hive is a framework: something in which you load your data, meant for a specific purpose. Julia is a general purpose programming language. You can do math, you can do distributed computing, you can build user interfaces, you can write games, you can do all kinds of stuff in it, and a data processing framework is just one of the things you could build with parallel Julia. So this is the way the stack looks, and it's not only a roadmap for the future: about 95% of it exists today. At the lowest level are your byte streams, TCP/IP, or if you come from the other world, MPI and ZeroMQ; a messaging API and remote procedure calls on top of that; then pmap, the kind of stuff I showed you; and then topology-aware smart iterators and DAG workflows. So that's my last slide. How can you help: documentation, tests, give a talk. And wrapping up, I have a lot of stickers; if you want to put one on your laptop, please come see me outside. Thank you. [Host] The talk was really interesting, and we forgot to ring the bell, so you have about five minutes for questions while we get the other speakers set up. Do you guys have any questions? [Audience] The performance and other aspects that Julia talks about have a lot of parallels to what Golang tries to do with parallelization. So why Julia and not Go for data science? [Viral] I have not used Go myself, so take my words with a pinch of salt. But from what I've read about Go on the blogs, I think Go is a fantastic language with a fantastic design for building large-scale distributed computing systems. But it's not a language that is mathematically or numerically focused for desktop data exploration. All those algorithms I showed, I don't think they exist in Go.
In fact, I believe someone from the Go community is trying to wrap Julia inside of Go so that they can get access to all these things from Go. All right, any other questions? [Audience] A follow-up question: it's interesting that you said Julia is a general purpose language whose first adopters may be the data science community. How far along is a framework for wrapping a REST API around Julia code? [Viral] It already exists. In fact, the company I founded with my co-founders, Julia Computing, commercializes Julia for enterprise deployments, and that's exactly what people do. There's an open source package, JuliaWebAPI.jl, and it's pretty trivial to use: you provide your function, make a function call, and it becomes available over a REST API. And JuliaBox is going to have a private version; the one I showed you was the cloud version. In the private version of JuliaBox, you get complete auto-scaling behind an nginx proxy: you just write your Julia function and have it served at scale, in pretty much five lines of code. So that's where we're going. All right. Thank you.
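JuliaWebAPI.jl's exact interface isn't shown in the transcript, but the underlying idea, turning a Julia function into an HTTP endpoint in a few lines, can be sketched with nothing beyond the Sockets standard library. This is a toy illustration of the concept, not the package's real API, and not how you would deploy in production:

```julia
using Sockets

square(x) = x * x   # the function we want to expose over HTTP

# A hand-rolled one-endpoint server: GET /square/7 returns "49".
server = listen(ip"127.0.0.1", 8099)
@async while true
    sock = accept(server)
    reqline = readline(sock)                 # e.g. "GET /square/7 HTTP/1.1"
    x = parse(Int, split(split(reqline)[2], "/")[end])
    body = string(square(x))
    write(sock, "HTTP/1.1 200 OK\r\nContent-Length: $(length(body))\r\n\r\n$body")
    close(sock)
end
```

A real deployment would of course use JuliaWebAPI.jl or a proper HTTP server package behind the nginx proxy mentioned above, rather than hand-parsed request lines; the point is only that serving a Julia function takes very little code.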