So, my name is Bhushtak, I work for ThoughtWorks, Pune office, and I'm a big Scala enthusiast. I became a developer just so I could program in Scala; I've been doing that for the past three years, and before that I was doing something else: I was a business analyst at ThoughtWorks. Scala is my passion, and I've realized that when you use Scala frameworks and tools, there are a lot of other things you have to learn. We use Play very heavily, and Play is built on futures, promises, and all of that. They call it a non-blocking architecture and say it's similar to Node.js. It was pretty confusing initially. Async I could understand, but what is non-blocking? What is async? Why is it important? We had to deal with all of that, because we actually delivered a project. This talk is a summary of my understanding of those jargon terms, and my attempt to classify them under different headings. There will be a demo, so bear with me while I program in IntelliJ, but before we start, let me set some context. Quick question: how many of you are using Scala for real work? So here is the situation on the project I work on. It's a web application, and to serve a single page, a single request first comes to the front-end server, this box here. The front-end server then goes to n back-end services, up to 50, if not 100. So to serve a single page, there are 50 to 100 back-end calls, which happen either on the same machine or on different machines. There were reasons why this architecture was chosen: it was flexible, modular, and so on. If you are using microservices or a service-oriented architecture, this is a very common scenario.
And if you notice, a lot of Java frameworks, at least in my experience, provide out-of-the-box support for scaling by increasing the number of threads, but only at the level of requests. For example, in the region where this arrow ends, you can scale: if you are expecting 1,000 concurrent requests, you can configure your thread pool to be 1,000 or more, and this scales well. But there is very little support if you have to scale within a request. So all of our discussion is going to focus on the life cycle of a single request, and you can compare with your current approaches: how would you do this in frameworks that don't have these abstractions? To simplify the demo, I have dumbed it down further: let's assume the request is about squaring a set of integers, and there is a back-end service that does the squaring. It's a difficult task, apparently, so I go to a service to get it done. But because I have a set of numbers, I can go out independently, get all the results at once if possible, merge them, do some aggregation, and give the result back. That is the context of our demo. Now, I understand that sometimes the tasks you call back-end services for are not so independent; you can't start the next task until you have some data. But let's assume that in our case these are completely independent tasks which can be fired off in parallel. So we'll start the demo, but we'll begin with the very basic primitives so that we all agree on the vocabulary. I'm going to write a test for a blocking implementation of square, a single square method, not for a set, just for one number. I'm putting in some log statements so that I know in which order the execution happens, and then I implement the blocking call, again with some more logs.
And to simulate that this is a network call, I'm putting in a Thread.sleep with a delay; when you do IO over the network, something similar happens. This is a blocking call, so it is equivalent to holding a thread for that much time. Then you compute the result, log, and return it. Now I'm going to run my tests in the IDE. What do you expect? What will be printed first? When will "call returns" be printed? Well, it will be printed at the end, because this is a blocking call: the call has to happen and return first. So you will see "call begins", "call ends", and then "call returns". This is as expected; this is every normal function you've ever written, without the reactive abstractions. Now I implement another test, for squares, which does exactly the same thing, but for a bunch of numbers, one to ten. I want to delegate to my back-end service, the blocking call for a single number, and get the squares of all the numbers so that I can do further computation. This is the place where we are trying to mimic calls over the network. To do that, I need an abstraction which works with a sequence of numbers: it takes a set of numbers and gives me back a set of numbers. How do you implement that? It's trivial; if you're using Java 8, you already know what map does. I take the numbers, map over them, and delegate to the blocking call, so I get all the squares and return them. If I run this, what order of events do you expect? It's not very surprising: they happen one after the other. There is no concurrency, no parallelism; I want to make ten calls and I make them one after the other, so the prints come in that order: first one begins, then ends, then two begins, then ends.
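The blocking baseline described above can be sketched in a few lines of plain Scala. The names (`blockingSquare`, `squares`) and the shortened delay are my own; the talk's actual code is not shown, so treat this as a minimal reconstruction:

```scala
import scala.util.Random

// Blocking "back-end" square: Thread.sleep simulates network IO,
// holding the calling thread for the whole round trip.
def blockingSquare(n: Int): Int = {
  println(s"call $n begins")
  Thread.sleep(Random.nextInt(100)) // shortened random delay for the sketch
  val result = n * n
  println(s"call $n ends")
  result
}

// Sequential version: map issues the calls one after another,
// so total latency is the sum of all the individual delays.
def squares(ns: Seq[Int]): Seq[Int] = ns.map(blockingSquare)

println(squares(1 to 10))
```

Running it prints the begin/end pairs strictly in order, exactly as the talk predicts.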
And this keeps going. I've made the delays random, less than 5 seconds, 5,000 milliseconds, which is why you see some randomness, but it's not very surprising. I'm just going to stop this. Well, let's take the first step towards async. I want an abstraction which allows me to spawn this task in parallel without holding the thread where I'm making the call. How do you do that? Let's again write a test, a single square test, but this time the call, instead of returning an integer result, will give me a Future of integer. Future is a reactive abstraction; that is the first jargon term we are learning, and we'll discover its properties by looking at the example. Because it's a reactive abstraction, I need to hold my test until the abstraction completes, and that is what the last line does; before that, I make the call. How do I implement it? Well, in Scala I have a Future constructor for creating this future. In C# it is called Task, and I believe in Java 8 it's CompletableFuture; you can experiment and see very similar results in all of those languages. I'm cheating a little bit, and you'll soon see why it's cheating: I'm just delegating to the blocking call. I'm saying the only problem was that my current thread of execution was getting blocked, and I don't want that, so let's spawn it on a different thread. How does it spawn on a different thread? Every Future operation, including the constructor, expects a parameter which the compiler is passing implicitly for us here, a Scala feature that is not essential for the demo. This is called an execution context, and it is nothing but a thread pool. Who provides it? There is a default, but we can create our own, so let's try doing that.
First of all, I just use the Java API to create a thread pool: I call a factory method on Executors to create a thread pool with a given number of threads, here four. Using this thread pool, Scala allows me to create the execution context, which is then passed in, and the future is spawned on that thread pool without blocking my current context. So if you now run the test, this should work. What do you expect? When will "call returns" be printed? We get the same logging, because we are delegating to the blocking call, so "begin" and "end" are still there, but something different happens this time, and you should not be surprised, because the Future guarantees that it will not block your current thread. The call returns immediately, because a future takes no time to just schedule the operation, and then you see the begin, and then the end. So we have achieved something, maybe a first step, but the real test is whether it works on the original problem: squaring a set of numbers by making many back-end calls. How do you implement that? Very similar: you just use map, but when you use map, you get multiple futures. Map, for each integer, gives you a future, so you get a sequence of futures, and I want a single future containing the whole sequence. There is a library operation that does exactly that, Future.sequence; we won't worry about its internals, but this program compiles. So I have done my job, roughly, by delegating to the call I just created, and now I would like to run it. If I go to the squares test and run the async version, what do you expect? Initially you saw all ten calls happening sequentially.
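The async-but-still-blocking step above can be sketched like this. The helper names and the daemon thread factory are my own additions (the factory just lets the demo JVM exit cleanly); the structure follows the talk: a four-thread pool wrapped as an ExecutionContext, a Future around the blocking body, and Future.sequence to merge:

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

// A four-thread pool, wrapped as the ExecutionContext that every
// Future operation picks up implicitly. Daemon threads are a sketch
// convenience so the process can exit.
implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(
  Executors.newFixedThreadPool(4, (r: Runnable) => {
    val t = new Thread(r); t.setDaemon(true); t
  }))

// "Async" square: the constructor schedules the body on the pool and
// returns immediately -- but the body still blocks one pool thread.
def asyncSquare(n: Int): Future[Int] = Future {
  Thread.sleep(50) // simulated blocking IO
  n * n
}

// Future.sequence turns Seq[Future[Int]] into Future[Seq[Int]].
def asyncSquares(ns: Seq[Int]): Future[Seq[Int]] =
  Future.sequence(ns.map(asyncSquare))

println(Await.result(asyncSquares(1 to 20), 10.seconds))
```

With 20 items and 4 threads, at most four sleeps run at a time, which is exactly the bottleneck the next paragraph demonstrates.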
One would start and end, then two would start and end, but now we see something different. Some calls started together: two of them. I would have expected four, but because the thread pool is not being utilized fully, only two calls happened together. Let me run it again; hopefully this time we actually see four concurrent calls. Okay, so at most four concurrent calls can happen, even though this time I have increased the sequence to 20 items. Even though I have 20 items to send to the back end for squaring, only four are in flight at a time. Why is that? The answer is obvious: I'm scheduling this on a thread pool which has only four threads. And even though I made my task async, it's not enough, because it delegates to a blocking call, and that call holds the thread until it is done. So just wrapping a blocking call, like JDBC, inside a future is not enough. Maybe it is the first step, but it is not enough. And that is the first jargon point I want to make. When people say async, that gives you only half of the information: it says that your interface, your API, is async, meaning it is callback-based or future-based, or promise-based or task-based in other languages' terminology. But it does not guarantee that it is end-to-end non-blocking, because internally it may be delegating to blocking calls, and you have to either trust the author of the driver, library, or framework, or confirm it yourself. That is the first point. Now we want to rectify this; we want to really do it in a non-blocking way, and then we'll compare: async is about the API, and non-blocking is something else. How do you do that? Let's go back to the single square test, very similar, but this time with a different operation, non-blocking. Let's go there and implement it.
Okay, how will you implement that? If you're in Java, how will you emulate it? What is the blocking operation here, by the way? There is a single blocking operation causing trouble so far: me making the network IO, which is the Thread.sleep. If you have to find a non-blocking equivalent, what do you do? There is no direct equivalent; you need the dual of blocking. How would you do it in JavaScript, for example? Right, setTimeout. What you essentially do is use a callback: you schedule a task to run after some time, give control away, and after that time the task gets run. In Java you do exactly the same, and that's what we'll use. So this is the non-blocking implementation; let's add its own logging. What I'm doing is using a scheduler of my own, which gives me back a future after scheduling, and the scheduling delay is the same random delay. What do we schedule? The same operation: the square and the logging. How does it work? It uses Java 6 and 7 era APIs to schedule; let me pull all of that out here. This is the Java call: I have a thread pool, a ScheduledExecutorService from the Java API, which has a schedule method. It returns a ScheduledFuture. That is a Java future; don't confuse it with the Future we just saw, and it's not even similar to the CompletableFuture of Java 8, but it gives me some notion of a future. I'm going to completely disregard it, because I want the whole scheduling to return me a Scala Future, which I can compose and work with. How do I do that? There is a pattern, and that's where we introduce the next jargon term: promise. The separation between the future and promise terminology is particular to Scala.
In many other languages they are put in the same data structure. The difference is that a promise can be written to, once, while a future is completely immutable. A future is a deferred operation which will complete sometime in the future, either with success or failure, but a promise is a container you write to when your event happens, and associated with each promise there is a future. So there is a pattern here. Promises and futures are duals of each other: promises are for writing, but they don't compose, and they're not supposed to leak into your API. They're mutable, dirty things which have to be hidden inside your combinators, and what you expose is the associated future. That is the pattern: you create an empty Promise of T and, in the last line, return the future associated with that promise. And in the scheduling callback, the Java-style callback using Runnable or Callable, I fulfill that promise: I say, complete this promise with the block. The block is the Ruby-like block of code that is passed in; it is executed after the delay on the scheduler's thread pool, and once that is done, I complete the promise. As soon as I complete the promise, the associated future completes, and whoever depends on it can see that. Sorry, go ahead. Right, Clojure also has a similar separation between promise and future. So I'm going to use this now. So much for promise and future; let's go back and see how it is used. Well, we already looked at it, so let's go back to the test and see if it works: square test, single square test, non-blocking. This is a single element, so it behaves very similarly to the async variation, but after this we'll check it with the collection of operations. I begin my task in a non-blocking fashion, my call returns, and then the call ends some time later.
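The promise pattern just described can be sketched as follows. The names (`nonBlockingSquare`, `scheduler`) are mine, and the daemon thread factory is a sketch convenience, but the shape is the one from the talk: an empty Promise, a scheduled callback that completes it, and the read-only future handed back to the caller:

```scala
import java.util.concurrent.{Executors, TimeUnit}
import scala.concurrent.{Await, Future, Promise}
import scala.concurrent.duration._
import scala.util.{Random, Try}

// One scheduler thread is enough: it only fires timers; it never
// sleeps through the delay itself.
val scheduler = Executors.newScheduledThreadPool(1, (r: Runnable) => {
  val t = new Thread(r); t.setDaemon(true); t
})

// The promise/future pattern: create an empty Promise, hand its
// read-only Future to the caller, and complete the Promise from the
// scheduler callback once the simulated IO delay has elapsed.
def nonBlockingSquare(n: Int): Future[Int] = {
  val promise = Promise[Int]()
  scheduler.schedule(new Runnable {
    def run(): Unit = promise.complete(Try(n * n))
  }, Random.nextInt(100), TimeUnit.MILLISECONDS)
  promise.future
}

println(Await.result(nonBlockingSquare(7), 5.seconds))
```

No thread is held during the delay: the caller's thread returns immediately, and the scheduler thread only wakes up to complete the promise.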
So the caller is not blocked. Let's go back to multiple squares and repeat our test here. What does it do? The same, one to 20, but using the non-blocking version, which does exactly the same thing as before, except this time we delegate to the non-blocking method, so we can assume all our operations are truly end-to-end non-blocking. Let's run it. I go to the squares test and run the non-blocking version. What do you expect now? I'm squaring a set of 20 integers, and I claim that because I used a truly non-blocking API, what should happen? Well, there is no contention for resources; no resources are required apart from a sliver of a thread. One thread is enough, and we have four, which is redundant anyway. That's why all the operations start together, and with some random delay they complete, and finally I get the output. And why do I say four threads are redundant? Because if you go to the config, change it to one, and run the test, you will see no difference, since I'm hardly using the thread at all. There is no heavy CPU-bound operation; it's purely IO-bound, because we are mimicking the IO by scheduling after a delay. It still works fine, and this is truly non-blocking. I think this concept is easier for Node.js people to appreciate, because in the Node.js environment there is exactly one thread. They have no option but to think in this paradigm; they can't cheat and delegate to a blocking call, because that would halt the whole thing. But in the Java world, because of thread pools, there is an orthogonal direction of scalability available, and people often confuse the two. Non-blocking and scaling with threads are different things, and they serve different purposes.
So I hope I have brought out the difference between blocking, non-blocking, and async, how futures and promises help us achieve it, and why just wrapping blocking calls like JDBC inside a future is not enough; you have to do slightly more than that. At this point I can conclude the first part with some lessons, and then maybe we can take questions before we proceed. Async and non-blocking mean different things: async is about the API, and non-blocking is about the implementation. Non-blocking is a guarantee that someone has looked at the code and it really doesn't block, especially on the JVM. An async API can be implemented over blocking calls; that was the cheating we saw. But for a non-blocking implementation, and this is just an insight, a sync API is not feasible; it's simply not possible. If someone says the implementation is non-blocking, the API has to be async, meaning callback-based or future-based. You can't write a method getNextPerson which returns a Person and claim the implementation is non-blocking, because that design is guaranteed to block: when you call next on an iterator, it blocks. You don't even have to look at the implementation, because the API is blocking; it gives you T rather than a Future of T, rather than calling you back via onNext. To summarize, you can put this in a nice table of API versus implementation: all combinations are possible except one, a non-blocking implementation with a sync API. The async API with a truly non-blocking implementation is the combination we are aiming for with futures. So that's the first part. Any questions, any comments so far? Okay, let's continue. The second part is about demoing the JDBC scenario; we talked about JDBC, right?
We said that just wrapping blocking calls in a future is not enough, but you still have to deal with them. If you're writing a Play application, for example: we did not use JDBC on our project, but it's very common that even in an otherwise end-to-end non-blocking Play app, you will use things like JDBC. What do you do? The implementation is already done; you have to live with it. And if you don't understand how the scheduling happens and what the execution context is, you can land in trouble: just wrapping the JDBC calls will not be enough. To highlight that, I want to go back to a test. What I'm going to do here is mix non-blocking calls and blocking calls. They are async but blocking underneath, so when I say async here, you can also read blocking. I'm mixing blocking and non-blocking. And I'm saying, well, most of my application, the part that doesn't use JDBC, is going to be completely non-blocking, which is great; the other part I'm wrapping in futures, and that should also be fine, right? At the very least, the part of my application which is truly non-blocking should keep functioning. Well, the moment you mix the two, you may be surprised by the results. Let's run this test, but before running it, let me change the thread pool back to four. Okay, some observations. Is it complete? Okay, it's completed; it should refresh, the IDE is taking a while. So what do you see? All the non-blocking calls were triggered together, they started at the same time, because they are non-blocking. But immediately after that, like a typical application, we mixed in some JDBC-like code, which is fake async. And because of that, all four of our threads got used up: we are working with a single thread pool, and we are trying to schedule both kinds of activity on the same pool.
So these four activities took all the resources, and even the non-blocking calls, which hardly require any resources, are left waiting, and they have to wait a long time, because all the resources are hogged by the blocking part, and the blocking part takes its own sweet time, one by one, sequentially, until it comes to an end. Only then, when a thread becomes available, do all the non-blocking calls finish, and then we get the result. This is not what you would expect. To give you an analogy: suppose you have a website, and your homepage is all nicely end-to-end non-blocking, but some part of your application has a form with a database connection over JDBC. If there is load on that part of your website, your whole website comes to a standstill: even your homepage will not load, because you did not take care to segregate the blocking and the non-blocking parts. So what do you do? Because you have already done the first step of wrapping everything in the future abstraction, you only need a minor tweak. The main culprit here is the blocking call we implemented, this one, and it is still using the same thread pool, because I'm creating a single object here and both of them take their thread pool from it. So let's separate them. How? I create a blocking config, a different instance of a thread pool, and I say: don't use the default one, use the blocking config. Now if I do this and run the same test, it should be all right, because I have segregated their paths. The blocking calls will still block threads, but their resources are now separate anyway.
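The segregation described above can be sketched with two execution contexts. The pool sizes, helper names, and daemon factory are my own illustrative choices; the point is only that blocking and non-blocking work never share a pool:

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

def pool(n: Int) = Executors.newFixedThreadPool(n, (r: Runnable) => {
  val t = new Thread(r); t.setDaemon(true); t
})

// Two lanes: a small pool for truly non-blocking work (roughly one
// thread per core) and a generously sized pool for blocking calls
// such as JDBC, so the slow lane can never starve the fast one.
val nonBlockingEc: ExecutionContext = ExecutionContext.fromExecutorService(pool(4))
val blockingEc: ExecutionContext    = ExecutionContext.fromExecutorService(pool(40))

def fastWork(n: Int): Future[Int] =
  Future(n * n)(nonBlockingEc)

def jdbcLikeWork(n: Int): Future[Int] = Future {
  Thread.sleep(100) // simulated blocking database call
  n * n
}(blockingEc)

// Mixing the two no longer lets the blocking calls hog the
// non-blocking pool: each kind runs in its own lane.
implicit val ec: ExecutionContext = nonBlockingEc
val mixed = Future.sequence((1 to 20).map(fastWork) ++ (1 to 20).map(jdbcLikeWork))
val results = Await.result(mixed, 10.seconds)
println(results)
```

With a shared pool, the sleeps would delay even the trivial `fastWork` futures; with separate contexts, the fast lane completes immediately.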
So if there is load on the form that makes a database call, well, that part will be slow, but your homepage will not even know about it, because it has its own lane on the highway, and that's what you observe here. Not only did all the operations start together, they finished at their own pace, without having to wait for the blocking work to complete. I think that is an important lesson, and I would like to summarize it with the next set of conclusions: blocking IO is contagious. Just wrapping it inside an async API is not enough; what you actually need to do is segregate the blocking and non-blocking work onto separate thread pools, using execution contexts or whatever equivalent your paradigm offers. Once you segregate them onto different thread pools, it's like giving them different speed lanes, so they don't block each other. And this is a very common pattern. In fact, the recommended practice is that if this were truly a JDBC situation, I would go to the config and size this thread pool at, say, 300 threads on a real production server, or more, depending on the load; those are the usual guidelines. So I will not shy away from giving a large thread pool to scale my blocking calls, while keeping roughly as many threads as cores for the non-blocking part, and that is the recommended way of mixing the two paradigms. In our demo, for example, I can use 40 threads, and if I run now, the situation is much better: I had only 20 numbers, so all of them are happy, there are absolutely no issues, there is no waiting, and it finishes much faster. So that is lesson number two. Any questions so far? Okay, let's move on to the next topic, which is about Rx. We looked at our API, which squares a sequence of numbers, but these sequences are finite: I know it's a set of 20 and no more.
So I can afford, in a non-blocking style, to synchronize all of that and get back a single future containing the value. But if it is an infinite stream, for example if you are connecting to the Twitter streaming API and Twitter is throwing tweets at you and you have to deal with them, this API will not work. Do you agree? This API will not work because it is finite and deterministic; you have to know where your items end. A Twitter stream has no end; by nature it is infinite. So we need a different way of programming, but along similar lines, extending the same ideas further. How do you do that? Let's start with some basic building blocks, going back to square once more. To demo this, I'm going to use an abstraction called Observable. Have you heard of a library called RxJava? How many of you? Many of you, right. RxJava is a Java library for Reactive Extensions. Reactive Extensions, like so many cool things in functional programming, started at Microsoft; the same person who did LINQ, which we heard about in the morning talk, also did Reactive Extensions. And after many, many years, Scala and Java are catching up: Netflix adopted the whole paradigm and created RxJava, which you can use to achieve something similar to what we just discussed, but for infinite streams, while still working in a very functional style with combinators. So what I do here is start with the same logging, but instead of expecting a Future, I expect a single-item Observable. Same as non-blocking, but instead of a Scala Future, I'm expecting an Observable, and that Observable is, roughly speaking, a possibly infinite stream. How do I implement it? Trivially, because I have a library.
I just delegate to my call, and there is a library function to convert a future into an observable; if you take RxScala, which is a wrapper around RxJava, you get this out of the box. So you just create it, and it works in a test very similar to what we just saw. What I'm doing is making a call to streaming, printing, and at the very end doing a blocking call so that my test holds and doesn't exit. Let's run it. I expect the call to return before the call ends: "call returns" first, and then the call ends. Now, how does it work with multiple items? I have to go to that test and stream the multiple items. Okay, let me just get rid of this. It's very similar to the earlier one: I'm streaming, but the streaming is implemented slightly differently. First do the mapping, as always. But after mapping, I get a sequence of observables, so I need some library support to flatten it out and return a single observable. What do I do? We have a couple of options, and we'll try some of them. Like Future.sequence, there is another idiom: you take the factory for creating observables and wrap all these singleton observables into a nested observable. If I assign this to a variable and look at the type, it is Observable of Observable of Int, because it's like a future of a future. We couldn't do a future of futures this way, because a future holds exactly one value, but an observable can hold n values; it's the collection equivalent of the single operation a future does. And at the end we say concat. Concat flattens it; it's similar to flatten, and we'll use flatten too, but they have different semantics. Concat preserves the order, like we preserved with futures. So let's use this, look at the output, and then discuss further. When I run this, there is one big difference.
I'm not printing the list at all so far; I'll print it later, unlike in the previous cases. Well, it worked exactly like the non-blocking version, but there is more information available to me. What is it? As things complete, they are being emitted as signals, and I print them. And how do they get printed? Because I subscribe. That is the part I glossed over, because I didn't want to go into too much detail of the API. With a future, if you want to do any side effect, like printing, or at the framework level in the controller, you put a callback at the end, not everywhere, just at the end. That callback is, for example, onComplete, and that's the callback I was using with futures: onComplete, print the value, which is a sequence. Here, what is the equivalent of onComplete? It is subscribe. Because it's a potentially infinite stream, there is no single event carrying the whole collection. Each time an item is emitted, an event is triggered, and I'm saying: on each event, run this operation on my subscriber. My subscriber is nothing but a lambda which just prints, and that lambda is called on each event. So the underlying mechanism is very similar, but instead of the one-shot behavior of a future, you get streaming behavior, and that's why you subscribe. Because of this, instead of seeing the entire list printed, as you saw last time, you start getting events for each item as it completes, which is very nice, because now this is truly non-blocking and I don't even have to wait for the whole list to complete; I can keep chaining operations further and further. There are cool ways of rendering your web pages with this, even doing some server-side magic where the web page itself is streamed, and Facebook has a pattern for that.
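The subscribe mechanism above can be made concrete with a toy push-based stream in plain Scala. This is emphatically not the real RxScala API, just a minimal sketch of the Rx contract: zero or more onNext calls, then exactly one of onCompleted or onError, after which nothing more is emitted:

```scala
// A toy push-based stream, only to illustrate the Observer contract;
// the real RxScala Observable is far richer than this sketch.
final case class Subscriber[A](
  onNext: A => Unit,
  onError: Throwable => Unit = (e: Throwable) => throw e,
  onCompleted: () => Unit = () => ()
)

final class ToyObservable[A](emit: Subscriber[A] => Unit) {
  def subscribe(s: Subscriber[A]): Unit = emit(s)
}

// Each squared value is pushed to the subscriber as soon as it is
// ready, instead of being collected into one Future[Seq[Int]].
def squareStream(ns: Seq[Int]): ToyObservable[Int] =
  new ToyObservable(sub => {
    ns.foreach(n => sub.onNext(n * n))
    sub.onCompleted()
  })

val received = scala.collection.mutable.Buffer[Int]()
squareStream(1 to 5).subscribe(Subscriber(onNext = n => received += n))
println(received)
```

The subscriber here is just a lambda collecting values, mirroring the talk's printing subscriber: the producer pushes; the consumer reacts per item.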
I think they call it firehose or something similar; there is a very famous blog post from Facebook, and you can implement it using this paradigm, either with Rx or with Play's own abstractions. The idea is that the whole HTML page you create on the server can be streamed in parts, using chunked responses, and rendered progressively in the browser. You don't have to wait for the entire big page to be created at once, which is great, and it opens up a lot of other possibilities. But notice that because I used concat, the order is guaranteed: one comes first, then four, then nine, then 16. Sometimes the order is not important; I just want the values as the operations finish. Concat will hold results back: even if the 10th number has finished squaring, concat holds it until its turn comes and only then emits it. So it buffers, and memory usage goes up transiently. If you want to avoid that, you can go to the implementation of squares and, instead of concat, say flatten. Flatten gives the semantics we just talked about: it does not care about the order, but it works more efficiently without keeping anything in memory. Oh well, 20 happened to finish first, so it was emitted first, and so on. And there is no waiting at all, absolutely none. If this matches your use case, maybe you go with this abstraction. Futures also give you a nice way of handling errors. We didn't look at it, but in some situations, especially when you're working with a collection of items, error handling with futures is not so straightforward; you have to do some wrapping, which we can discuss later. I would say error handling is slightly better in the streaming abstractions, and that's what we'll look at next. So let's introduce an error.
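The "wrapping" the speaker alludes to for future-based error handling is usually lifting each result into a Try before sequencing, so one failure doesn't sink the whole aggregate. This is a standard-library sketch with my own names (`lift`, `square`); the specific exception and message are illustrative:

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
import scala.util.{Failure, Success, Try}

// One poisoned input: with plain Future.sequence this single failure
// would fail the aggregate future and lose the other 19 results.
def square(n: Int): Future[Int] = Future {
  if (n == 12) throw new IllegalArgumentException("12 cannot be squared")
  n * n
}

// The usual wrapping: lift Future[A] into Future[Try[A]], so the
// sequence always succeeds and failures are inspected per item.
def lift[A](f: Future[A]): Future[Try[A]] =
  f.map(Success(_)).recover { case e => Failure(e) }

val all: Future[Seq[Try[Int]]] =
  Future.sequence((1 to 20).map(n => lift(square(n))))

val results = Await.result(all, 5.seconds)
println(results.count(_.isSuccess)) // 19 successes, one failure
```

This recovers the partial results a raw Future.sequence would have discarded, which is exactly what flattenDelayError buys you for free in the streaming world.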
So I go to square and I say that if the value here is 12, then you should throw an error saying that 12 cannot be squared. Otherwise, you just square it. Okay. So what will happen now? Well, this is a stream, so it will continue emitting events and you deal with them. But as soon as the exception happens, the semantic definition of a stream is that it will end. It will end. The end condition of a stream is either an onComplete event or an onError event. So onError will be called, and after that no event will be emitted, ever. That is a guarantee by the implementation. So we will not get all the events back. We will possibly get all the logging, because the tasks start concurrently, but we will not get all the events back. Let's see if that is the case. So I'm going to go back and run the streaming once more. Well, unfortunately the error happened right at the beginning, so I didn't get a single event back. Let me see. I did not get a single event; these are all logging statements. So what I will have to do is run this one more time. First, let me just stop this. There's some problem here. Let me run this once more. Well, I got some events, not bad, at least for the demo. So I got one event for sure: number 19 was lucky enough to complete even before the exception happened, right? Because they are being sent in parallel, and the delay is randomized, sometimes the error gets the shortest delay and that's why it blows up first. It may blow up later; it will happen at random, right? But you don't want this. So I got the square event for 19, but none other. The rest just happened, but I did not get the events back. So how will you handle this situation? You can go to the implementation and say that instead of flatten, you say flattenDelayError.
If you do that, this will work similarly to flatten, but if there's an error, it will delay it and emit it right at the end, so that you can maximize the results you receive. And now if I run this, you will see that even if the error happens first, it will always be emitted last, so that you can at least make use of most of the information, which is 19 items out of 20, and then finally the error will be emitted, right? There it is: 12 cannot be squared. Well, you can do slightly better if you don't want to stop here. You can take some action. For example, you can say onError, and then onErrorResumeNext, and you can do the error handling there. There are different ways. You can say that if this is the type of exception, then continue; if this is another type of exception, then do something else. I'm going to do something very, very simple. I'm saying, oh, exceptions are expected, and if an exception happens, just resume with an empty observable. So this is a stream which will continue, and as soon as it encounters the error, which is going to be at the end because I flattened it and delayed the error, it will switch to this new observable which doesn't have any elements. So it will terminate, which means that I get an observable with 19 elements, without any exception. So let's run this now. Cool. So there's no exception. It succeeded, and the order was not important, so we got the results. Well, I want to go to the next topic, but at this point I think it will be good to summarize the thinking here: a Future[Seq[T]], the abstraction that we used earlier, cannot deal with infinite streams, and you need a better abstraction. For example, a Twitter stream. An Observable, which is the name of the API in RxJava, is a better way of dealing with these situations. But there are issues. There are issues here. So there are reasons why it may not be appropriate in every situation. The whole streaming model of Rx is push-based, right?
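The same recover-to-empty idea can be sketched with plain futures (not the Rx onErrorResumeNext API itself; here each failed element independently contributes nothing, so 19 of the 20 results survive):

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Squaring fails for 12, as in the demo.
def square(n: Int): Future[Int] = Future {
  if (n == 12) throw new ArithmeticException("12 cannot be squared")
  n * n
}

// A failure recovers to an empty Seq, loosely analogous to resuming
// with an empty observable at the point of error.
val results: Future[Seq[Int]] = Future
  .traverse((1 to 20).toList) { n =>
    square(n).map(Seq(_)).recover { case _: ArithmeticException => Seq.empty[Int] }
  }
  .map(_.flatten)

val squares = Await.result(results, 5.seconds)
println(s"${squares.size} results survive; 144 is missing")
```

The difference from the stream version is that an observable is a single sequence of events, so one onError ends the whole thing unless you delay or resume it; with independent futures, each element already fails in isolation.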
So there is a notion of a subscriber, which is an observer. Our callback was an observer. And there is a callback there called onNext, which was being called. So if you have a chain of operations, if you're chaining map, filter, flatMap, the way you've been seeing since yesterday, right? If you're chaining all these operations, instead of an iterator being pulled, here callbacks will be called. So it's a different model. It's not iterator-based, it's observer-based. It's a push-based model, which means that if you have to deal with memory issues, you have to block. So if I'm the map stage of the pipeline, and my onNext callback is called, and I now need to do the mapping, which is an expensive operation, I have two options. First, block, which means do the operation inside the callback. That's a bad idea, but we can do it just to prevent running out of memory. And if each stage does that, the slowest component in the pipeline will determine the throughput of the pipeline. You can appreciate that. But there is another option: don't do the heavy processing inside the callback. Inside the callback, just add to a local queue, and then the queue will be handled in a different execution context, in a non-blocking flow. Well, that is very good, because it will give a huge amount of throughput, but you know what the problem is? You will have to deal with memory issues. You will go out of memory, and there is no way to control that. There is no way to control that. So there is no easy way out. The only way out is that your queue has to be smart. It should be able to spill over to disk and provide all the fault-tolerance guarantees and whatnot. Or you can deal with a better abstraction, a better protocol. And that is what I'm referring to here: if you want to see what kind of initiative is being undertaken to take care of this, you must look at Reactive Streams.
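The first option, blocking to stay within memory, can be sketched with a bounded hand-off queue (a toy producer/consumer, not Rx internals; the capacity of 8 and the -1 sentinel are arbitrary choices for illustration):

```scala
import java.util.concurrent.ArrayBlockingQueue
import java.util.concurrent.atomic.AtomicInteger

// Bounded queue between a fast producer (think: the onNext callback) and a
// slow consumer. put() blocks when the queue is full, throttling the
// producer; an unbounded queue would instead risk OutOfMemoryError.
val queue    = new ArrayBlockingQueue[Int](8)
val consumed = new AtomicInteger(0)

val consumer = new Thread(() => {
  var item = queue.take()
  while (item != -1) {   // -1 is a sentinel marking end of stream
    Thread.sleep(2)      // simulate the expensive map stage
    consumed.incrementAndGet()
    item = queue.take()
  }
})
consumer.start()

(1 to 100).foreach(n => queue.put(n)) // blocks whenever 8 items are waiting
queue.put(-1)
consumer.join()
println(s"consumed ${consumed.get} items without unbounded buffering")
```

The producer ends up running at the consumer's pace, which is exactly the "slowest stage determines throughput" problem; the alternative of an unbounded queue trades that for uncontrolled memory growth.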
So Reactive Streams is a protocol for dealing with a push-based streaming abstraction, very similar to RxJava, without having to deal with either blocking or running out of memory, because it's a dynamic push-pull model. There will be a way to signal demand, and then the push will happen based on that demand. So it's awesome, and many projects are implementing it. RxJava's next version will implement it. Akka Streams, which will eventually run on an Akka cluster, will implement it. And in that consortium there are four or five other players which will implement it. Eventually, their hope is that it will become part of the JDK. If that happens, that will be great, because whole pipelines across applications will be able to work with high throughput and without memory issues. Okay, so that is the third part, and now I will go to the summary and then to the Twitter demo. In summary, if you want some intuition about Observable: it's similar to Future, but a Future deals with a single value and an Observable deals with multiple, possibly infinite, values. If you want to draw an analogy with standard Scala or Java collections, you deal with an Int versus a Seq[Int] depending on whether you have one value or many. In the same way, you deal with a Future versus an Observable depending on whether it is one value or many. The pattern is similar, the intuition is similar, but one is for multiple, potentially infinite, items, and one is for a single value, even if that value is a sequence. So after this, let me go and show you the real-world example. I hope my internet works now. I'm going to go to the Twitter client. Okay, so Twitter has this streaming API where, if you set up your OAuth credentials, you should be able to connect programmatically and get a connection to their live stream. Now, on this live stream, there is no way to say next, because it's pushed onto you. So it has to follow the push-based model, and we have to use an Observable or something similar.
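The core idea of signalling demand can be sketched as a toy (the real org.reactivestreams interfaces — Publisher, Subscriber, Subscription — also cover cancellation, error signalling, and strict concurrency rules; this hand-rolled class is only an illustration of request-driven push):

```scala
import scala.collection.mutable.ListBuffer

// A toy demand-driven publisher: the subscriber signals how many items it
// can take via request(n), and the publisher pushes at most that many.
class DemandPublisher(items: Iterator[Int]) {
  private var onNext: Int => Unit = _ => ()

  def subscribe(handler: Int => Unit): Unit = onNext = handler

  def request(n: Long): Unit = {
    var remaining = n
    while (remaining > 0 && items.hasNext) {
      onNext(items.next()) // push happens only under outstanding demand
      remaining -= 1
    }
  }
}

val received = ListBuffer.empty[Int]
val pub = new DemandPublisher(Iterator.from(1)) // conceptually infinite
pub.subscribe(n => received += n)
pub.request(3) // take 3 items
pub.request(2) // then 2 more, when we are ready for them
println(received.toList)
```

Because nothing is pushed without a prior request, the subscriber never has to block inside its callback and never builds an unbounded queue, which is exactly the gap this protocol closes.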
That is what we are going to do. So I have done some plumbing. I'm using Play's libraries, which do not have out-of-the-box support for Observable. So I have done some plumbing, and that plumbing will give me, if I make my Twitter client get a JSON stream for that URL, an Observable of JsObject. So this is a stream of JSON. This is a stream of JSON which will start flowing, and I should be ready to handle it, and I am going to handle it in my subscription. So I'm saying that in the attached observable, for each tweet, print it and write a separator. For now, this tweet is going to be just raw JSON, because I am not doing further processing. But let's see if it works. And I'm doing a Thread.sleep so that the test doesn't terminate, and there is a bug, so I'll have to click stop before 15 seconds, but let's see how it goes. So if I run this, well, it's working. So I'm going to let it run for a while and just stop it now. Okay, cool. So what did you see? This was an infinite stream; if I didn't have the test exit, it would keep going. For example, I could attach this stream to a reactive abstraction on the UI side. By the way, all the reactive jargon we are discussing today applies to the server side and programming for IO. The reactive abstractions that you looked at in Elm and others are more for UI programming. But they can be connected. For example, if I had an Elm-like system and I just wanted to attach this output, it would basically be live tweets on that white space. I would keep seeing the live tweets as a stream that continues infinitely. But you see that there are two kinds of tweets: deletes and created_at. The created_at ones are the real status updates, which I'm interested in. I'm not interested in deletes. So how will I process them? Well, let's process them by filtering. I can use a Java 8-like filter API: I pass a lambda, and that lambda goes from JSON and does the right thing. It looks for an ID.
And if the ID is not present, it will return false, so those items will not come out. The only things that pass the filter and come out are tweets which are status updates. So this is my higher-order function on a collection, like Java 8, and this is the lambda, which I'm passing by referencing the appropriate method there. So now if I run this, you will see only created_at statuses. You will not see deletes, because I filtered them out. My criteria are not very robust, but they work; I mean, they have worked so far. Well, I'll let it run for a while. Let's stop it. So you see only created_at, right? So if you want to process this stream, the whole streaming abstraction now becomes a processing pipeline over a collection, as you saw in so many talks. And all the similar abstractions are available on these kinds of APIs, like Observable and Reactive Streams. What I want to do next is filter out only those tweets which proclaim that they're in English. How do I do that? Well, I have to first do a map and then do a filter again, because I want to quickly convert into a domain model. I don't want to keep working with raw JSON. So how do I create a domain model? It's trivial. You create a domain model Tweet which has an ID, text, name, and so on, and you have a factory which takes the JSON and creates the domain model, just like in Java. And I'm using that factory in my map operation. So at this point, what I get here is a stream of my domain models, not a stream of JSON. Let's verify that quickly. My domain model has an intelligent toString method, so you should be able to recognize that it's no longer JSON. Well, yeah, it's working. So Tweet is my domain model, I get all the fields parsed properly, and I can start working with that. And now, using that, I can eliminate... oh well, I already used filter.
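A dumbed-down version of this filter-map-filter pipeline, with raw "JSON" records stood in for by Maps and a hypothetical Tweet model (with RxJava the identical chain runs over an Observable instead of a List):

```scala
// Hypothetical domain model; field names are illustrative, not Twitter's schema.
case class Tweet(id: String, text: String, lang: String)

val raw: List[Map[String, String]] = List(
  Map("delete" -> "123"),                             // a delete event: no id
  Map("id" -> "1", "text" -> "hello",   "lang" -> "en"),
  Map("id" -> "2", "text" -> "bonjour", "lang" -> "fr"),
  Map("id" -> "3", "text" -> "world",   "lang" -> "en")
)

val english: List[Tweet] =
  raw
    .filter(_.contains("id"))                         // drop deletes
    .map(j => Tweet(j("id"), j("text"), j("lang")))   // JSON -> domain model
    .filter(_.lang == "en")                           // keep self-declared English

english.foreach(println)
```

The factory from the talk plays the role of the `map` step here; once you are in the domain model, every later stage of the pipeline works with typed values instead of JSON lookups.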
So only the tweets which proclaim that they are in English, even though some are not (you saw some Arabic tweets there), will come out, because they specified their lang parameter as English for whatever reason. And then you can keep composing this and go to whatever extent you want. It's a beautiful abstraction: if you don't have issues with out-of-memory and blocking, then RxJava is awesome. All the major languages have bindings. For example, Clojure has a wrapper, Scala has a wrapper. All the modern JVM languages have a wrapper around RxJava, and it's pretty cool. If you want to play with something like this, you will have to set up your OAuth, which is a complicated step if you haven't done it before. If you want to play with something similar without having to do authentication, there is another test. This code I'm going to push. So there is another API: a web-based chess website, implemented in Play and Scala, and it exposes an infinite stream. It gives an API where you can see the moves as they happen. This is a live stream of events, live moves which are happening. And it's a chunked HTTP response; that's why it doesn't have a Content-Length, and that's why it doesn't terminate. It keeps giving you chunks, and the browser understands that, so it will keep going unless my browser crashes; it will not stop. This doesn't need any authentication, so if you want to play with this abstraction, this is a better starting point. I'm just going to close that. Okay, to conclude, I'm going to push this code to this repository, and I'll put the presentation somewhere you can access it. So, something about us, ThoughtWorks and me: we like to talk about Scala and Scala-related things. We have been teaching Scala for a while now. We have an initiative called Scala University. You must have seen that outside.
As part of Scala University, we conduct four-day free workshops for everyone, and we want to do it in all the major cities where our offices are. Out of the four days, two days are on a weekend and two days are weekdays, so you have to invest. There is also a coding test that you have to clear. It's not very difficult, but you have to show commitment and finish it. We have done it in Pune, Gurgaon, and Chennai, and we want to do it in Bangalore next, hopefully in December. So I encourage you to go to the Scala University site, where there is a registration, which is just a show of interest. It's not actually registering for the December one; it's a show of interest, saying that you are interested in this city and this topic. And then we'll consider that feedback when deciding when and where and what to do. So that is Scala University. And thank you. I'm conducting the Scala deep-dive workshop tomorrow, so those of you who are attending, I'll see you there. So, questions please. Yes, I mean, those who want to. I wouldn't even try to say that there are differences, because in my mind they're orthogonal. Actors could be the implementation mechanism. If you look at the RxJava source code, underneath they are using threads and thread pools and all the dirty tricks of concurrency, right? Which is abstracted away from you. Well, you could use actors to do that task more efficiently, with better error handling, but still provide all the combinators which are required in the API. For example, there has to be a retry. I don't know if it is there right now, but I remember that if this is an observable, there has to be something like retry. There is a retry. So the combinators and the underlying implementation are slightly orthogonal. The underlying implementation may even allow your streams to be really distributed and fault-tolerant and scale well, and almost replace other systems.
For example, if Akka Streams is implemented properly on an Akka cluster, I can see it replace or become an alternative to something like Storm or Spark Streaming, and the things which people think of only in the context of big data will just merge, right? Anyone who is using Akka Streams will try to use those abstractions and will get similar guarantees and performance. So it's slightly different in my mind, but that's why I'm saying it's orthogonal. Akka is coming up with something called Akka Streams, and I encourage you to just go and look at the API, which is very similar, even better, right? It is very composable and functional at the top level, and the underlying materialization of that flow happens using the distributed actors of the Akka cluster. So they are slightly orthogonal: one is for materializing and one is for composing. Any more questions? Cool. I'm done.