Hello everyone. Thanks for coming to the last talk of the day, and thanks for being brave enough to stay here until the evening. My name is Svit Aranek and I work for Red Hat. Specifically, I work as a QE for JDG, which is JBoss Data Grid, a product built on top of the Infinispan project. During this presentation I would like to give you a really brief introduction to machine learning, neural networks, and related topics. I will mention quite a lot of stuff which may seem a little bit hard, at least at first sight. So before we get into it, let's start with some motivation for why we actually want to do this. What you can see here is a plot which shows how the amount of data produced on the Internet grows over time. It's probably not surprising that the amount of data grows exponentially. What is more interesting are these two colored areas: this one, which grows roughly linearly, is an estimate of how much data is structured, and this exponentially growing contribution is data which is unstructured or semi-structured. Could you give me some examples of how to understand unstructured data? Anybody? Go ahead. [Audience: For example, when someone chats with someone else?] Yeah, so it can be text and other things; it's written here. Basically it's most of the stuff you produce on the Internet. If you upload something to Facebook or Twitter, or write some blog post, it's text which can contain numbers, and it can also contain images. An image itself is basically unstructured data, because you don't know what's in it unless the user describes it, tags it, and so on. The same applies to video. So, anybody in the room who stores terabytes or petabytes of data just for fun, raise your hand. Nobody. Obviously nobody would like to store a huge amount of data without knowing what's in it.
So, to get something useful from a large amount of data you store, whether it's data for an application you run for users or just usage statistics, for example what users do on your web page, you need to process this data to extract some useful information from it. Here you step into the whole area usually called machine learning, or that's one of the approaches for doing it. It's a really huge topic, which starts with trivial techniques like parsing text or computing averages and histograms, and continues with, for example, linear regression and things like that. But there are classes of problems where these simple models will completely fail. In the rest of the talk I'd like to focus on this hard part, which, for example, can be an image as unstructured data, where you want to recognize what's in the image. This is a typical example where, if you want a good success rate, you typically have to apply a more complicated model. So this is what I will describe in the rest of the talk. How to address these hard problems? You have probably heard about deep learning. It's a buzzword, but it's actually not something completely new: it's basically applying deep neural networks to these problems. What is a deep neural network? One is sketched here, the simplest possible deep neural network. The typical definition of a deep neural network is a neural network with two or more hidden layers. Here is the input, where, for example, you feed in your image; these are the hidden layers; and here is the output, which is typically, for example, some classifier which will tell you that this picture is a dog, this picture is a cat, and so on. So this is a model which is able to solve these problems. That's one part of the problem. The second part is that in recent years there is more and more pressure from users to process data almost in real time.
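The forward pass of a network like the one sketched above can be written down in a few lines. This is a minimal pure-Python sketch (no framework, all names are my own for illustration): an input vector passes through two hidden layers with ReLU activations and an output layer that scores each class.

```python
import random

def relu(v):
    # ReLU activation: negative values become zero
    return [max(0.0, x) for x in v]

def dense(v, weights, biases):
    # One fully connected layer: one row of weights per output unit
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, biases)]

def forward(x, layers):
    # layers is a list of (weights, biases) pairs; ReLU on every layer
    # except the last one, which produces the raw class scores
    for i, (w, b) in enumerate(layers):
        x = dense(x, w, b)
        if i < len(layers) - 1:
            x = relu(x)
    return x

# Tiny random network: 4 inputs -> 3 hidden -> 3 hidden -> 2 classes
random.seed(0)
def layer(n_in, n_out):
    return ([[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
            [0.0] * n_out)

net = [layer(4, 3), layer(3, 3), layer(3, 2)]
scores = forward([0.5, -0.2, 0.1, 0.9], net)
predicted = max(range(len(scores)), key=lambda i: scores[i])
```

A real classifier would of course learn the weights from data instead of drawing them at random; the point here is only the layered structure.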
Nobody wants to wait for feedback. If you upload something somewhere, you want to get immediate feedback, for example similar posts and things like this, and you want it immediately. And as this is happening on top of a really huge amount of data, this is usually no longer called just big data; it starts to be called fast data. So basically the problem is that you have a really huge amount of data, and you need to process it almost in real time, as fast as possible. There are, again, some techniques for doing that. For example, you switch to stream processing; this is pretty obvious. You can also keep the data in memory as much as possible while you do some computation, like a regression model or averaging. You have probably heard about Apache Spark, right? This is a typical example which does that. Compared to a series of Hadoop MapReduce jobs over HDFS, Apache Spark can give you a pretty nice speedup. The main thing is that it tries to keep data in memory, but there are lots more optimizations under the hood. Another hint for how to be fast is to keep the data in memory not only while you do some computation, but, as you typically do something with the results afterwards, across your whole application stack. I won't go into the details; I spoke about this topic here a year ago, so if you want to learn more, you can check my talk from last year. There I also gave a brief introduction to Apache Spark, so if you haven't heard about it yet, that can serve as an introduction too. So let me do a really short summary of where we are now. We actually have a problem, because we have data which can be pretty complex, which requires some complex processing to sort out, and at the same time we want to do that as fast as possible, in the real-time case. It might sound a little bit impossible, or at least hard to do.
So in the rest of the talk I will try to show you that it's not impossible, and actually, if you choose the right tools, it can be quite easy. There are many tools you can use. I chose a couple of them which I like, so it's my personal choice. It can very likely be done with other tools, so if you ask why I chose some of these tools, the answer is usually that I have used them, and I like them. For example, I will give you a short introduction to TensorFlow; I chose it because I worked with it a little bit in the past, so I like it. There can be some discussion about what's better and what's not, but it's probably not important here. The first tool I'd like to introduce is TensorFlow. It's a toolkit for building deep neural networks and other computational graphs. The second one is Infinispan; here the reason I chose it is that I work on it. And the last one, to have the complete picture: because with this kind of data, like pictures, videos, and so on, it's usually not possible to keep everything in memory all the time, you typically need some storage layer where you offload the data from Infinispan. My choice is Ceph, which is, in my opinion, a quite interesting technology. It could also be HDFS or whatever you want; there are standard solutions for this. One reason I chose Ceph is that I find it really interesting compared to, for example, OpenStack Swift or S3. The second one is that if you search the Internet for how to use, for example, Apache Spark with Ceph, you typically don't find any good answer, so I'll try to show that with Infinispan it's easy to use as well. Now I will give you a quick introduction to each of these three projects, and then I will show how they can work together. Let's get started with TensorFlow. As I said, it's a tool primarily focused on building deep neural networks. There are many such frameworks.
If you are a little bit interested in neural networks, you have probably already heard about Caffe, Theano, CNTK, Deeplearning4j, or TensorFlow, and if you are interested in a more complete list with a short overview of what each framework can do, there is a quite nice summary on a Wikipedia page. If you check it, you can actually see that the market here is quite competitive, and many frameworks can cover basically everything the other frameworks do. So I'm not saying that TensorFlow is the best, because I never did any research about that, but I like it. It's a library made by the Google Brain team. If you are interested, I would recommend you read their white paper; it can give you a more in-depth introduction to how it works, its architecture, and so on. It's actually the second generation of Google's machine learning system; the first one was called DistBelief. It was open sourced about one year ago. Obviously, it's used by Google, for example for speech recognition, photo processing, and many other projects. And it's, of course, used by many projects outside Google as well. I would like to mention just one here: the Mozilla DeepSpeech project. I'm mentioning it because here at DevConf there is a talk about this project on Sunday by Tilman Kamp. So if you are interested in machine learning, I guess that could also be an interesting talk for you. So let's get a quick overview of the basics of TensorFlow. TensorFlow provides you with pieces from which you can build, for example, your neural network, and it's stored in a computational graph. There's a trivial sketch here. The nodes represent mathematical operations like matrix multiplication or addition, or predefined functions like ReLU which you can apply on top of a matrix, and the edges are just the inputs to the operations.
So, from the pieces TensorFlow provides for you, you build this graph, typically some neural network, and then you run it in a session. You create your model, push it into a session, and run it. A session is basically the client's representation of a particular TensorFlow runtime. There are two kinds of values in the graph I'd like to mention. The first one is what is called a variable in TensorFlow, which has a predefined size and type. The second one is a placeholder, whose value, most importantly, is not known at the time when you create the graph, and which typically serves as an input to the graph. So, for example, if you process an image, the image will be a placeholder, and when you run the graph, it will be fed in as the input. The last thing I will mention, and use later on, is a checkpoint. It's typically a file where you can store the variables and, if you want, the whole state of your graph. The typical usage is that you train your neural network, store it into a checkpoint, push the checkpoint somewhere, then load the pre-trained neural network and only apply it to the input data coming into the system. Here is the graph represented in Python; the main API of TensorFlow is provided in Python. As you probably know, data scientists like Python quite a lot, so that's probably the reason behind it. I will talk about how to deal with that from other languages a little bit later. If you know a little Python, I think it's readable, and hopefully even if you've never seen Python before: here we define two variables, which are two matrices, and one placeholder; then we multiply w and x, which is here, add b, apply a ReLU function on top of that, and then proceed with some other operations.
So, as I said, this could be a very, very simple model, and then, as I mentioned, we create a session and run this model in the session. Here I load some input, which can be whatever I want to process, and then I put it into a dictionary. I have here a placeholder called x, so I provide that x is this input, and I run it all in the session. And basically, that's all. Is anything unclear about this? Because this is basically how it works, and I hope it's not difficult. As I said, besides the building blocks, it also provides some useful functions: cost functions typically used for neural networks, optimizers like gradient descent, and so on. All of these are just building blocks, and from them you can build whatever neural network or any other computational graph you want; it's up to you how you do that. So hopefully this is clear. [Audience: May I ask a question about the previous image? My question is how this is related to the neural network itself. Can I say that the nodes from the hidden layer are like the...] Yes, so b, W, and x are like the inputs. Typically x is the input, and these are the variables: this is the bias and these are the weights, which you tune when you train the neural network. So basically these are the parameters, and you tune them during the training. Clear? Good. I would quickly mention a couple of other features which I find really cool in TensorFlow. The first one is that you can store the model in protocol buffers format. I'm not sure if you've ever heard about protocol buffers from Google; Google provides bindings for many languages, so you can reload the model in other languages. Another nice thing is that you don't have to care about low-level details.
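The split between building the graph and running it in a session with a feed dictionary can be illustrated with a toy graph engine in plain Python. To be clear, this is not the TensorFlow API, just a sketch of the idea: variables carry fixed values, a placeholder's value arrives only at run time.

```python
# Toy computational graph: build first, run later with a feed dict.
# This mimics the TensorFlow 1.x style (tf.Variable / tf.placeholder /
# session.run(..., feed_dict=...)), not its real API.

class Placeholder:
    # Value unknown at graph-construction time; supplied at run time
    def run(self, feed):
        return feed[self]

class Variable:
    # Value (e.g. trained weights) fixed when the graph is built
    def __init__(self, value):
        self.value = value
    def run(self, feed):
        return self.value

class MatVec:
    # Matrix-vector product W x
    def __init__(self, w, x):
        self.w, self.x = w, x
    def run(self, feed):
        w, x = self.w.run(feed), self.x.run(feed)
        return [sum(a * b for a, b in zip(row, x)) for row in w]

class AddRelu:
    # relu(y + b)
    def __init__(self, y, b):
        self.y, self.b = y, b
    def run(self, feed):
        y, b = self.y.run(feed), self.b.run(feed)
        return [max(0.0, u + v) for u, v in zip(y, b)]

class Session:
    def run(self, node, feed_dict):
        return node.run(feed_dict)

# Build the graph once: relu(W x + b), with x as the input placeholder
W = Variable([[1.0, 0.0], [0.0, -1.0]])
b = Variable([0.0, 0.5])
x = Placeholder()
y = AddRelu(MatVec(W, x), b)

# Run it later, feeding a concrete input for the placeholder
sess = Session()
out = sess.run(y, feed_dict={x: [2.0, 3.0]})
```

The same built graph can be run again with a different feed dict, which is exactly why the placeholder is the natural place for an incoming image.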
You write your model, then you can run it on a CPU on your localhost, train it, play with it, and when you are satisfied, you can push it onto a dedicated cluster where you can run it on a lot of GPUs and the like. It also supports CUDA, so you don't have to deal with these quite low-level details. Another important thing is that it supports clustering. It uses gRPC under the hood and is able to distribute the load over a cluster. This is a trivial example of how it works. Again, it's quite easy, and it takes away a lot of the hard work which you would otherwise have to program yourself; it's done in the framework. So you can very easily distribute the load over your cluster and run it in a distributed fashion. Later on, I will run it from Java. How to do that? For example, you are fine with building your model in Python, but you don't want to run Python in production. As I said, the main API is in Python, and it provides only a very limited API for C++. There are two issues filed to provide an API in Java, but unfortunately, these two are still not done. Fortunately, we can reuse the C++ API via JNI, and don't worry, you don't have to do this yourself. There are a couple of libraries which already implement it, and the one I use, which works really nicely, is JavaCPP Presets. Using this library, it looks in Java something like this. You build your model in Python, store it into a protobuf file, and then in Java you just define your graph: here you read the protobuf file, create your session, and load the graph into the session. Then you define an input tensor; it's basically a matrix. If you've never heard of tensors before, a matrix is one special type of tensor. So here it's called a tensor, and you load some input data into it, and here you run the session. It's similar to Python, but, as usual in Java, it's a bit more verbose. Basically it just provides the input; I'm not consistent with the naming here.
What in the previous slide was x is here called images. So I have in my model a placeholder which is called images, and I provide an array of tensors with only one item, which is this input. Hopefully, again, it's readable, and even if you don't want to read it, the takeaway is that loading the model in Java is just a few lines of code. It's not something very difficult. So let's move to Infinispan. Anybody in the room who has never heard about Infinispan? Still, I expected that everybody had heard about this nice piece of software, but if not: it's a data grid platform written in Java, basically an in-memory data grid which tries to store everything in memory. It's a key-value store which is highly available; there's no single point of failure. It's elastic. Also, if you provide some schema, you can do searches. It's transactional, and it has really lots of cool features; I'm not able to go through them here. There was a good talk about Infinispan this morning by Sebastian, and if you missed it, I will refer you to infinispan.org. But to understand what follows, you don't have to know all the cool features of Infinispan; just keep in mind that it's something which can run in a cluster and keeps all the data in memory, so it provides a layer which is fairly fast and where you can store your data. The only thing I will mention from the Infinispan feature set is something called the cache store abstraction. It addresses a typical use case: you don't want to keep all your data in memory, so you want to offload some data into permanent storage, or, even if you can keep everything in memory, you want a backup in case something happens. This is the way you can offload data from Infinispan to some permanent storage and load it back; it works both ways. There are various implementations of cache stores.
You can store the data into databases, various cloud providers like OpenStack Swift or Amazon S3, LevelDB, Cassandra, and also Ceph. What I forgot to mention here is this configuration piece, which is how you configure the cache to keep only a limited number of items. Here I define that I want eviction of items, that I want to keep only five items, and I can choose the strategy that will be used. This LRU is the least-recently-used strategy, so it means that when I have five items in the cache and add a sixth, one item will be moved from Infinispan to permanent storage, and it will be the item which was least recently accessed, the one kept in memory untouched for the longest time. [Audience: Are there fancier cache replacement policies as well than ancient LRU? In particular, for larger cache sizes it makes sense, like multi-queue eviction policies or whatever.] There are, I think, three policies implemented by default, and, if I'm not mistaken, I would have to check, I think you can define your own, but I'm not completely sure about this; at least you can choose from several predefined strategies. Now a few words about Ceph. Ceph is a distributed object store. The really nice thing about it is that it provides various ways to access the data layer. You can access it as object storage, as a block device, and also as a regular POSIX file system, and all of this applies to one cluster, so you don't have to run a dedicated cluster for object storage and a dedicated one for a shared file system. Everything can run on top of a single Ceph cluster. Again, similar to Infinispan, it's highly available with no single point of failure, and they say it can scale to the exabyte level. I personally never tried; you can try it at home if you'd like, but I would believe them when they state this. And of course it's open source. Here's a high-level architecture overview.
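The eviction behaviour just described, keep at most five entries in memory and move the least-recently-used one to a backing store, can be sketched with an ordered dictionary. This mimics the configured behaviour; it is not Infinispan code, and the plain dict here just stands in for the permanent storage.

```python
from collections import OrderedDict

class LRUCacheWithStore:
    """Keeps at most max_entries in memory; least-recently-used entries
    are moved to a backing store (a dict standing in for Ceph or a
    database) and transparently loaded back on access."""

    def __init__(self, max_entries=5):
        self.max_entries = max_entries
        self.memory = OrderedDict()   # in-memory data container
        self.store = {}               # stand-in for the cache store

    def put(self, key, value):
        self.memory[key] = value
        self.memory.move_to_end(key)  # mark as most recently used
        if len(self.memory) > self.max_entries:
            # Evict the least recently used entry to permanent storage
            old_key, old_value = self.memory.popitem(last=False)
            self.store[old_key] = old_value

    def get(self, key):
        if key not in self.memory:    # miss: load back from the store
            self.put(key, self.store.pop(key))
        self.memory.move_to_end(key)
        return self.memory[key]

cache = LRUCacheWithStore(max_entries=5)
for i in range(6):                    # the sixth put evicts entry 0
    cache.put(i, "image-%d" % i)
```

After the loop, items 1 to 5 are in memory and item 0 has been passivated to the store; reading it back loads it into memory and evicts the new least-recently-used entry.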
This is how you can access it, and actually there's a fourth way, provided by librados. At the bottom here is what they call RADOS, which is basically the distributed object store itself. The first way to access it is librados, a library which you can bind into your application and talk from your application directly to the RADOS distributed object store. A few months ago, I implemented a cache store for Infinispan which can store data into Ceph, and it uses librados. I haven't done any performance measurements so far, but it should be faster than the two typical ways you would otherwise use, because it goes directly through librados and JNI calls. Integrating it with Infinispan is again pretty easy: you just add this piece of configuration, for example to the Infinispan server, where you define the IP address of the cluster and your user credentials, and that's all. Again, the integration is pretty easy. Before we go to the demo, let me recap what I've tried to show: writing the model in TensorFlow is up to you, you save it into a protobuf file, and loading it into Java in some Infinispan client is pretty easy, just a couple of lines of code. And the integration of Infinispan with Ceph is, again, a few lines of configuration, as everything is done for you. Now to the demo, where I try to put everything together. I tried to keep it as simple as possible, so I use the hello-world example of neural networks, the MNIST data sample. If you've never heard of it before, it's basically a set of handwritten digits, and the goal is to recognize which digit is written in the picture. The second reason I chose it is that the TensorFlow guys did quite a nice job and there are tutorials which use the MNIST data sample on the TensorFlow documentation pages: there is a tutorial for beginners and also a slightly faster-paced tutorial for experts.
Basically, I copy-pasted the code from there, so the code of the demo contains no comments, because it's really thoroughly commented there; if you go through the demo code later on, you can go there and read step by step what it does. And even if you don't want to go through my demo, I would recommend you check those pages, because it's a really nice introduction. So here's the high-level architecture of this demo. I have a Qt client in C++ which loads images from the test set. If I click on one, it sends it to Infinispan; this can represent some IoT device or a GUI application, and the code can be ported to mobile phones, so it could also be a mobile phone. There are various ways to push data into Infinispan; in this demo, I use the REST API. When the image arrives in Infinispan, it is sent to the TensorFlow client. That's basically an application which runs my TensorFlow model and which uses an Infinispan listener, so once it connects to Infinispan, as soon as Infinispan gets any data, it sends it on to TensorFlow. Once the image is classified, the result, with the number it found in the picture, is sent back to Infinispan, and from Infinispan it is sent to a Node.js server, and Node.js sends it to the browser. And behind the scenes, Infinispan is set up with eviction, so the data in the cache are moved from Infinispan to Ceph. I will show this part and this part. All the interesting stuff is happening here, but there's not much I can show you there; basically the output is that it all works together. As you can see, I intentionally chose different languages, like C++, JavaScript, and so on.
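The REST push in this pipeline can be sketched in a few lines. Infinispan's classic REST API maps a cache entry to a URL of the form /rest/{cacheName}/{key} with a PUT request; the host, port, and cache name below are made up for illustration, and this is not the demo's actual client code.

```python
import urllib.request

def build_put_request(host, cache, key, payload,
                      content_type="application/octet-stream"):
    # Infinispan's classic REST API maps a cache entry to
    #   PUT http://<server>/rest/<cacheName>/<key>
    url = "http://%s/rest/%s/%s" % (host, cache, key)
    req = urllib.request.Request(url, data=payload, method="PUT")
    req.add_header("Content-Type", content_type)
    return req

# Push one MNIST image (raw bytes) under its index as the key;
# "mnist-in" is a hypothetical cache name
image_bytes = bytes(28 * 28)   # dummy all-zero 28x28 image
req = build_put_request("localhost:8080", "mnist-in", "147", image_bytes)

# urllib.request.urlopen(req)  # uncomment to actually send it
```

On the server side, a listener attached to that cache would then hand the new entry to the TensorFlow client.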
As you can see here, there are about five different languages involved. Your application can be simpler, but what I'd like to show is that even in this small demo you can integrate several languages together quite easily, and it's not a big pain. How much time do I have? Five minutes. So, I have an Infinispan server running. Here is my application which runs TensorFlow with an Infinispan client. Here a Node.js server is running and connected to Infinispan. I also have a VM which runs Ceph, so here I can show the pools which exist there. Here you can see the pool which will store the data from Infinispan; it should be empty now, so there's nothing here. And here I have my Qt application, which loads test data from the MNIST sample, so as you can see, it shows the handwritten digits. When I click on some number, it will send it to Infinispan with the ID of the image, which is just its index in the array, and as I showed you in the slide before, it will get processed and hopefully appear here. So let's try. Wow, it looks like it works. So I can click some more. Now there still shouldn't be anything here in Ceph, because I keep five items in Infinispan, but when I click somewhere here, now there should be two items stored in Ceph, because they got evicted from Infinispan. And there they are, with IDs 102 and 147; these are the last two that got evicted. So basically, as I said, there is not much to show here, because all the interesting stuff is happening under the hood, but the outcome is that it works, and it works in real time, and all the pieces you see here can run in a cluster, so you should be able to scale it as you need. I don't have much time, actually one more minute, so I won't go into the source code, but it's not very long; you can download it from my GitHub and go through it, and you can see that putting everything together is really just a couple of lines of code, and it's actually pretty easy, because all the hard stuff is done for you.
So let's go to the summary. If you forget everything else from this talk, I would like you to remember three points. The first point is that building a pipeline for processing complex data in real time, or almost real time, can actually be quite easy if you use the right tools. The second is that one of these tools is definitely TensorFlow, which is a really powerful machine learning framework. And the last one is that Infinispan is real middleware, because all the pieces in this demo were glued together by Infinispan, so it is not just some dumb cache, but a real piece of middleware which you can use as a backbone for your application stack. So, questions? [Audience: Why did it not behave like LRU but like most recently used? Basically it evicted the second-to-last and the second entry you hit.] No, it evicted the least recently used ones, I mean the last two shown there; the most recently used were on top. [Audience: It had the first five in the cache, so the top two had to be evicted together.] No, no, it's the other way around: when I click something, it appears at the top, so those two were evicted in the right order. Okay. Are there other questions? [Audience: How did you train your model? I didn't get that part.] I didn't mention it here. As I said, I wanted to keep it simple, so I just used the example from TensorFlow. It's a really simple neural network; there is no regularization, dropout, and so on, so I'm quite surprised that it correctly identified all the pictures; it was just luck, because the success rate is only about 90%. So basically I used the one provided by the examples, trained it with the MNIST training set, and then used the test sample.
And this is basically the typical approach which you would probably use: you train your model on your internal cluster and then push an already-trained neural network into production, unless you want some approach like online stochastic gradient descent, where your network keeps being trained as users use it. But I would say it's quite a common use case that you train on some data and then push an already-trained neural network into production. [Audience: So it can only analyze some specific types of patterns and writing?] This example takes only this concrete data sample. If you want to apply it to some random data, like if I take my mobile phone, write some digit here, and check it, it would require quite a lot of work, because you have to center the image and do a lot of things like that, so it would be much more complicated than this simple demo. [Audience: So if there is a really long line of text, it would slice out each and every letter and try to analyze it, right?] Yeah, it's definitely possible and doable, and I think, if not in TensorFlow directly, then on Udacity there is some online course by the TensorFlow guys, and there is, I guess, some example where you train your model on the whole alphabet. [Audience: It's not just images, right? You can give it text to find patterns in it, right?] Yeah, but it's not that simple. I recognize the numbers; if you would like to take some text and split it up, it would again be one level more complicated than this.
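The preprocessing step mentioned above, centering a handwritten digit before feeding it to an MNIST-trained model, can be sketched like this. This is a toy version in pure Python on a 2-D list of pixel intensities; real preprocessing would also resize and normalize the image.

```python
def bounding_box(img):
    # Rows and columns that contain any "ink" (non-zero pixel)
    rows = [i for i, row in enumerate(img) if any(row)]
    cols = [j for j in range(len(img[0])) if any(row[j] for row in img)]
    return min(rows), max(rows), min(cols), max(cols)

def center_digit(img):
    """Shift the digit so its bounding box is centered in the frame."""
    h, w = len(img), len(img[0])
    top, bottom, left, right = bounding_box(img)
    dy = (h - (bottom - top + 1)) // 2 - top
    dx = (w - (right - left + 1)) // 2 - left
    out = [[0] * w for _ in range(h)]
    for i in range(top, bottom + 1):
        for j in range(left, right + 1):
            out[i + dy][j + dx] = img[i][j]
    return out

# A 5x5 frame with a 2x2 blob stuck in the top-left corner
img = [[1, 1, 0, 0, 0],
       [1, 1, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]]
centered = center_digit(img)   # blob moves to the middle of the frame
```

Since MNIST digits are size-normalized and centered, a model trained on them performs much worse on raw input that skips this kind of step.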