So I've been going through the speakers' list ever since it was out, and one thing that stood out was that I'm, like, the youngest speaker here today. So I think this photo over here describes me very well. It's a great honor to share the stage with some of my role models, and it would not be an understatement to say that I'm here just to hear these guys speak. I come before you as a cat amongst giants, and I hope you will forgive any mistakes that I make.

So you're probably wondering, who is this guy? Introducing myself: my name is Sameer Deshmukh, also known as v0dro on GitHub and Twitter. I come from the incredible country of India, which is home to the lonely deserts of Rajasthan, the serene and snowy Himalayas, the emeralds in the Indian Ocean that are the Lakshadweep islands, and the monument that is described as a teardrop on the cheek of time, the Taj Mahal. Specifically, I'm from the city of Pune, which is a city of about six million people. It's also known as the Oxford of the East because of the around two million students who are studying in the city in about 700 colleges. I'm also an undergraduate student in computer engineering at the University of Pune.

Most of my family has consisted of doctors. In fact, my great-grandfather was a doctor, and my father is also a doctor. But somehow I turned out to be an engineer, and looked like this not too long ago. I also play in a band called Cat Kamikaze. You can check out our music on SoundCloud. I also love cats, and I like the Japanese too, so it's Cat Kamikaze.

I am on the core team of the Ruby Science Foundation.
It is also known as SciRuby. The Ruby Science Foundation is committed to making Ruby a viable language for data science and scientific computing, which is something that Ruby has been lacking in for a while now. It's very popular for web development, but not as much in this domain. So we basically make tools, infrastructure and open source libraries for this purpose. I was first introduced to SciRuby as a summer intern for Google Summer of Code 2015. Later that year, I also received the Ruby Association Grant from the Ruby Association. This year I'm an admin for GSoC 2016, handling the managerial duties for SciRuby and also mentoring a student for improving data science tools in Ruby.

So my talk for today will be scientific computing in Ruby. This talk will go over a few tools that you can use in Ruby for data science and scientific computing, and I will follow up my slides with a short demo of these tools.

The first tool that we'll go through is called the IRuby notebook. IRuby is a browser-based Ruby REPL shell that is mainly used for interactive computing. So this is what an IRuby notebook looks like. As you can see, it's running in your browser, and there's an input shell in which you can put in Ruby code.
It renders output very nicely in the form of HTML, JavaScript or CSS. You can also put a form inside the notebook and accept input from users, which you can feed into any Ruby script that you have in the notebook.

The next gem is called NMatrix. NMatrix is an n-dimensional array object similar to NumPy in Python. It helps to interface Ruby with a few high-speed C libraries such as ATLAS or LAPACK for linear algebra calculations. So this is roughly what an NMatrix allocation looks like. NMatrix is not a dynamic object like, say, Ruby's Array; it is of a fixed size. So in the first argument I'm saying that the NMatrix should be of size 2 by 2, that is, an NMatrix of four elements. In the second argument I'm specifying the actual elements, and then I want a matrix of 32-bit floating point numbers, so I specify the dtype option as float32. NMatrix currently supports 8 data types, ranging from 8-bit integers to 128-bit complex numbers. It also has three storage types, two of which are sparse storage types and one is the normal dense storage type.

By itself NMatrix is more of a Ruby wrapper over C arrays, but what makes it truly shine is that it exposes a very powerful C API which can be used by anybody for interfacing these internal C arrays with a variety of high-speed C libraries. So you can basically create a plugin with NMatrix as a dependency and use this internal representation and the NMatrix API. Currently we have plugins for four major libraries: ATLAS and LAPACK are mainly used for linear algebra, FFTW is for fast Fourier transforms, and the GNU Scientific Library, or GSL, as you know, has almost every mathematical function one can think of. NMatrix is also coming to JRuby by August, and it will have the exact same API as NMatrix for MRI. It will use JBLAS for linear algebra.

The third library is called Nyaplot. Nyaplot is an interactive plotting tool. It basically generates interactive HTML
and JavaScript plots that run in your browser and let you interact with them. So this is a very simple sine wave plot generated by Nyaplot. Nyaplot also sports a plugin-based architecture similar to NMatrix. It has Mapnya for creating map visualizations, Nyaplot3D for three-dimensional objects and Bionya for visualizing relationships between genes.

The fourth library is called Daru. So for most Rubyists from all over the world, Daru stands for Data Analysis in RUby. But in the national language of my country, that is Hindi, daru also means alcohol. I hope Daru the library becomes as important to Rubyists as daru the drink is for Indians. I hope both of them become equally important at tonight's conference party.

So Daru is basically a library for analysis and cleaning of data. You can say that it's primarily meant for data scientists who are interested in analysis of data. It provides some powerful indexing functionality, and it also lets you read and write data from many different sources like CSV files, Excel files, SQL databases and even Active Record. It works very well with wild data. It so happens that almost 60% of a data scientist's time is spent in cleaning data and bringing it to a state that can be analyzed by conventional means, and Daru tries to make this very simple by providing functions for doing that. It also has a lot of statistics functions, which work with wild data as well. Daru is able to leverage the libraries that I've mentioned just now, Nyaplot, IRuby and NMatrix, for things like efficient storage of data, visualization and interactive computing.

So Daru primarily performs its functions via two data structures.
The first one is called a Daru vector. It's a one-dimensional heterogeneous array object, and each data point in the vector can be labeled independently with Daru's indexing functionality. The second object is called the Daru data frame. This is a 2D spreadsheet-like data structure that can be indexed both on rows and columns.

I'll switch to my demo now. So this is an IRuby notebook. As you can see, you can put either Markdown or code over here. Now I'm going to create a form. So I can put in the number of rows that I want for a matrix, say four, columns four, and I want the number three inside it. So it returns a hash, and I use this hash to create a matrix.

Now a short demo of NMatrix. I want to create a four by four matrix consisting of 64-bit floating point numbers. So I specify the shape as four by four and I specify 16 elements over here. Now I want to create an NMatrix with 8-bit integers. Just for demonstration purposes, I'll use 129 over here. If you store 129 in a signed 8-bit number, it wraps around and will be reflected as minus 127. You can access any element in the NMatrix with a simple Ruby Array-like API. So I want the 0th row and the third column, which is minus 127. You can also assign numbers: the 0th row and first column I want to be 56, and it does that. But as I said, it cannot be expanded like a Ruby Array, so this gives a range error.

Now I want to use the ATLAS plugin for performing some fast computations, and we'll compare this with the Ruby Matrix class to know how much the speed actually increases. So let's benchmark this. Over here, I'm creating NMatrix objects of these sizes, five to two hundred, and I do a dot product on them for matrix multiplication. So as you can see, it's instantaneous. For size 200 the time is 0.01 seconds. And I'm doing the exact same computation on a Ruby Matrix, and the time turns out to be quite a lot.
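
As a rough way to reproduce the slower half of this comparison without NMatrix or ATLAS installed, the sketch below times Ruby's own Matrix class by itself. The sizes and the `random_matrix` and `time_matrix_products` helpers are assumptions made for illustration, not the code shown on stage; the NMatrix side is left out because the talk does not describe its plugin setup.

```ruby
require 'benchmark'
require 'matrix'

# Sizes mirror the range mentioned in the demo (5 up to 200). The exact
# values used on stage are not in the transcript, so these are assumptions.
SIZES = [5, 50, 100, 200].freeze

# Build an n x n Matrix filled with random floats (hypothetical helper).
def random_matrix(n)
  Matrix.build(n, n) { rand }
end

# Time one matrix multiplication per size and return { size => seconds }.
def time_matrix_products(sizes)
  sizes.each_with_object({}) do |n, timings|
    a = random_matrix(n)
    b = random_matrix(n)
    timings[n] = Benchmark.realtime { a * b }
  end
end

timings = time_matrix_products(SIZES)
timings.each do |n, seconds|
  puts format('%3d x %-3d  %.4f s', n, n, seconds)
end
```

Timings depend on the machine and the Ruby version, so no expected figures are given here. The point is only that the pure-Ruby multiplication time grows quickly as the size approaches 200.
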
So as you can see, it's orders of magnitude faster for the same computation.

Now let's do a small example where I'll solve a system of linear equations using NMatrix. By the way, you can also use the IRuby notebook to render LaTeX equations right in your browser, like this. So I'm representing the coefficients of this equation this way in an NMatrix of 32-bit floating point numbers, and the right-hand side in an NMatrix called rhs. Now I use the ATLAS solve function, and it gives me the solution.

Now let's head on to a demo of Nyaplot. I want to plot a simple sine wave similar to the one that I showed in the presentation. Here I'm creating the data for the x and the y axes. I create a Nyaplot plot object and add a line graph to the plot object.

Now I want two waves on the same plot. I want a sine wave, so I create data for that using the Math.sin function, and I want a cosine wave, so I create the data for that with the Math.cos function. The sine wave should be a line plot, which I add by specifying the line option, and the cosine wave should be a scatter plot, which I add with the scatter option. The color over here is red and the color over here is green. So as I said, it's an interactive plotting tool. You can actually zoom into this graph, and if you hover your mouse pointer over one of these dots, it will show you the exact coordinates on the x and y axes.

So now we want to plot two graphs which interact with each other and change according to the input that you put into one graph.
So let's create some hypothetical data. I have four band names here, and I create a Nyaplot data frame, which is an internal data structure for storing data. There's a popularity between zero and a hundred assigned to a particular band in a particular country. Now I want to plot a histogram of the popularity of each band, and the next plot that I want is a bar graph of the number of times each band occurs in the data frame. So I just do it with add_with_df, and I add both these plots to the Nyaplot frame. As you can see, both these graphs have been displayed here.

Now what I want to see is the number of bands that have popularity between zero and twenty in any country. I can just select it this way, and as you can see, the bar graph has changed to reflect that. So if you see, Megadeth is three. In order to verify that this is correct, let's filter this data frame, and you can see that Megadeth occurs three times here, one, two and three, and each time the popularity is less than 20. So this bar graph reflects exactly that.

So next up we'll have a small demo of Daru. The Daru index provides indexing functionality for the rest of the Daru data structures. I'm creating a very simple index over here, and you can select an element in the index with the square-bracket operator, or you can even select multiple indices by separating them with commas.

So now you want to create a data frame. This is a very simple case of a data frame. I specify a, b and c as the hash keys, and you can see that they've been reflected as the names of the columns of the data frame. Indexing also supports more types of indexes, called multi-index, datetime index and categorical index. Multi-index is used for hierarchical indexing of data. So now, over here,
I'm creating a multi-index with tuples, and I use this multi-index for indexing the data in this vector. Now I want to select the sub-indices of a comma b, so I specify that here and I get CDP. And now I want the sub-indices of just a, so I specify a and I get BQ, which is exactly what is here.

The datetime index is another index, used for indexing data on timestamps. Here I'm creating a datetime index from a range starting from the beginning of 2011 up to the end of 2013, with dates separated by three days each. I create another hypothetical vector and I index it on the datetime index. As you can see, each date over here is separated by three days. Now I want to select the data from the 1st of January to the 10th of February, which can be done like this. As you can see, the last date is the 9th of February. Now I want all the dates in 2012.

And now let's take a hypothetical data frame which has average temperatures of these given cities, and I want them to be ordered by name and temperature. So I specify the order option as name comma temperature, and it needs to be sorted on the temperature, so I specify temperature as the argument to sort. The data frame also has a where-clause-based querying syntax similar to Arel or Active Record. You can actually specify where the temperature is less than or equal to 25, and it will show it to you instantly. Plotting is also very simple. Over here I'm specifying a bar graph where the x-axis is the name and the y-axis is the temperature, and it gives me a graph.

That's it for my demo. I have cool SciRuby stickers, so you can come say hi to me and I'll give you a sticker. I'd like to thank RedDotRubyConf for having me over. This is the first time I'm stepping out of my country, and I was quite blown over by the amazing infrastructure of Singapore and the very hospitable and inviting nature of its people. So hats off to Singapore for being an amazing city, and hats off to you guys for being an amazing audience.
Thank you so much.

Thank you, Sameer. Any questions from the audience, please?

So it can export the diagram to either an SVG or pure HTML, but if you export to an SVG you won't be able to interact with it. It will just be a static diagram.

So Sameer, I have a question regarding your scientific work. Do you do any scientific research with the libraries that you have, or are you planning to? Do you have any suggestions for any of us that are simple to implement?

Well, you can start with just plotting.

Okay. Thank you so much. Thank you.