I'll get started now; it's about three. This presentation is on Dataflow, which is basically a way to program in, eventually, the declarative concurrent model. I don't mean "eventually" as in the library can't do it yet; I mean I'll eventually get to explaining it. It's not very well known, but at the end of the presentation I'll show you some references if you want to read further.

So this is me. I work on Engine Yard Cloud, and I use "larrytheliquid" as my handle for pretty much everything.

This is the outline. First I'm going to talk about why I'm giving this presentation and why I created this library. Then I'll gradually explain the concepts, starting with some basics. Not super-basic fundamentals, but at least some theory, and then I'll expand from there. Don't worry, it gets into really cool stuff. At the end I'll have some tips.

So, the purpose. (Sometimes I just pick images that don't really mean anything.) Ruby had its success early on. Pre-Rails, it had a pretty good hacker community around it, with people doing pretty innovative stuff. Post-Rails, of course, there was a big influx of people, and people were pretty happy: Ruby is awesome, go give talks about Ruby, because it's so cool and we can do all these things with it. At least from my perspective, that seems to have died down a bit more recently, or at least since Rails became the norm. At this point we have other languages that have copied Rails or Ruby's useful libraries: test libraries, web frameworks, that kind of thing. And we also have languages such as Clojure, Haskell, Erlang and whatnot that are innovating in pretty cool ways.
And I think part of that is because they still have relatively small user bases, so they don't have the backwards-compatibility problems you would otherwise have. Ruby is getting to be a decently big community now, so you have a lot of people depending on old versions of libraries and so on. But I want to encourage more people to play around with ideas, with other programming paradigms in the same vein as Dataflow, to keep the hackers in the Ruby community instead of letting them leak out into other communities where really cool innovation is happening. Not that it isn't happening in Ruby, but you don't want to just rest on your success, right? So that's the purpose.

Okay, we're going to start by talking about lexical scope. Here we have a variable, then we define a method, and the method closes over the variable. The really useful thing to note is that you can look at this method definition, and it doesn't matter whether it's defined inside a class or in a module mixed into some other class: you can look at it and always know what it returns. That's a really powerful property, because you can look at the lines of source code in one place and not have to worry about them changing underneath you.

Contrast that with traditional dynamic scope. If you're familiar with the Lisp communities, it's talked about more there: you basically have a block of scope that reassigns a variable used inside it. It was the default scoping model in early languages, before people learned that it pretty much sucks.
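
The slide being described is not captured in the transcript, so here is a minimal sketch of the lexical-scope property, under a few assumptions: the names `greeting`, `Lexical` and `Host` are hypothetical, and `define_method` is used because a plain `def` in Ruby does not close over surrounding local variables, whereas a block does.

```ruby
# A local variable defined where the method is written.
greeting = :foo

# `def` would not see `greeting`; `define_method` takes a block, and blocks
# close over the locals visible at the point of definition.
Lexical = Module.new do
  define_method(:foo) { greeting }
end

class Host
  include Lexical

  def initialize
    # Unrelated object state: it cannot change what #foo returns.
    @greeting = :something_else
  end
end

result = Host.new.foo
puts result.inspect  # prints :foo
```

Wherever `Lexical` is mixed in, `foo` returns whatever `greeting` refers to at the definition site, which is the "look at the source and know what it returns" property the talk is pointing at.
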
You can view object orientation, and the way instance variables are used, as a sort of dynamic scope, in the sense that the block, your current environment, is the object's state. So when you look at this method definition, it might seem very simple, and everyone of course knows what it does, but in fact you have no idea what it returns at runtime. When you step back and meditate on this a little, it's actually kind of scary that this really simple piece of code has so much underlying complexity that you can't tell what it's going to return. If it's mixed in as a module to a class, that class may already have instance variables set. And if it's already in a class and other methods were called, those may or may not have set the state.

As Rubyists we're used to writing code like this, but we don't think much about the mental overhead it takes to track all this state and what actually results from it. That's a lot of work, and even though it may be on the back burner for you by now, and you don't consciously think about it anymore, it's still wasted cycles that could be spent on other things.

Of course, using this is not a bad thing, because what it provides is modularity. Your ability to reason about the program is reduced, but in favor of modularity. In a functional program, if you wanted to introduce something as an argument, you'd have to introduce it to all the other methods that call this method, so that the argument can be threaded through. And that's bad, because now any time you want to add something you have to change all your methods, right?
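
To make the "instance variables as dynamic scope" point concrete, here is a small sketch. The module and class names (`Fooable`, `Fresh`, `Preset`) and the `clobber!` method are hypothetical and only stand in for whatever the slide showed.

```ruby
module Fooable
  # Looks trivial, but the result depends entirely on the state of
  # whatever object this module ends up mixed into.
  def foo
    @my_var
  end
end

class Fresh
  include Fooable   # never sets @my_var
end

class Preset
  include Fooable

  def initialize
    @my_var = :foo
  end

  # Any other method may change the state that #foo reads.
  def clobber!
    @my_var = :bar
  end
end

fresh  = Fresh.new
preset = Preset.new

before = preset.foo
preset.clobber!
after = preset.foo

puts fresh.foo.inspect  # prints nil
puts before.inspect     # prints :foo
puts after.inspect      # prints :bar
```

The same two-line method returns three different things depending on where it is mixed in and which methods ran first, which is the reasoning cost the talk describes.
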
So it's a trade-off, but one that I think Rubyists especially, and users of most other mainstream OO languages, should think about.

The other thing here that makes reasoning harder is mutability. You can see this class has an initialize method, and later, after you instantiate the object, you call the foo method. But you're not guaranteed it's going to return the symbol :foo, because any number of other methods could have been called that changed that state. Same deal: it makes reasoning a bit harder.

Now it might seem like, okay, here we must know this is always going to return the symbol :foo. While that's true in the synchronous world, it's not true in the concurrent world, because you can have something like this: a thread in the background somewhere that's constantly changing your state. You think, okay, this is going to return :foo. But right after it gets set, this thread starts executing and resets the variable, and then this actually returns "shazbot", which is a reference to Tribes, a pretty sweet video game. I guess I'll have to Google that. Yeah, I sort of started playing video games and then stopped, but obviously I wasn't young enough to start, or whatever.

Okay, so we have a savior, and that savior is the declarative model.
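
Before moving on, the background-thread hazard just described can be sketched as follows. This is an illustration only: the `Holder` class is hypothetical, and two queues are used to force the unlucky interleaving so that the outcome is reproducible rather than left to the scheduler.

```ruby
require 'thread'

class Holder
  attr_accessor :my_var

  def initialize
    @my_var = :foo
  end

  def foo
    @my_var
  end
end

holder = Holder.new
go     = Queue.new  # main thread tells the meddler when to run
done   = Queue.new  # meddler tells the main thread it has finished

# A background thread that overwrites shared state.
meddler = Thread.new do
  go.pop                     # wait until the main thread has set :foo
  holder.my_var = :shazbot
  done.push(true)
end

holder.my_var = :foo         # "this is going to return :foo"
go.push(true)                # ...but the other thread runs right after
done.pop
result = holder.foo
meddler.join

puts result.inspect  # prints :shazbot
```

In real code nothing forces this ordering, so the method would return :foo on some runs and :shazbot on others, which is exactly what makes it hard to reason about.
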
In its most ideal, abstract sense, the declarative model means that rather than writing how an algorithm works, you write specifications, blueprints of what you want the algorithm to do or to give you back, rather than how to do it. In the most general, super-awesome sense you'd have some fancy artificial intelligence that could figure out how something should actually happen according to the context. More conservatively: if your specifications themselves can mean something different at runtime, how in the world are you going to implement an algorithm for them? That's why, when you talk about declarative programming, you start talking about mutable state not being part of it, and about your programs remaining referentially transparent. That basically means you can do value substitution based on the arguments that get passed in, so every time something is called with the same arguments, you know it's going to return the same thing.

The declarative model is a superset that includes functional programming, so this of course applies to functional programming as well, and that's why you hear people talking about referential transparency in the functional programming world. But it's more general than that. I might mention why later, but maybe not; it's a subtle detail and not that important.

First of all, we're going to step back from the idea of concurrency and just deal with declarative synchronous programming. If you've dealt with languages towards the purely functional edge, you'll have noticed that you're not allowed to rebind variables once you set them. So we set my_var to "bound", then we set it to "rebind", but now
we're out of the declarative model. That's not allowed; in a language like Erlang you'd get a runtime error.

Okay, here a few new concepts are introduced. Just so you know, up here there's an implicit "include Dataflow" that gives me these methods at the root level. So we have this local method. You might not be used to declaring variables before you use them; you might think that's a static-language thing, who does that? Well, it turns out it's really cool and really useful even in dynamic languages. We say local, we declare my_var, and then we call a method on it. Except what is my_var? It's an unbound variable. In a language such as Java, if you do something like this you get an exception or a runtime error, something like that; it isn't allowed. In C++ it's actually pretty interesting, because you just get a pointer to some random piece of memory, so you have no idea what you're going to get back. It's kind of like Russian roulette; it's pretty funny. In Dataflow, what happens is that the thread sleeps. That's pretty useless in a synchronous model, but as we'll see later, it's really powerful in a concurrent model.

Of course we can't change Ruby. We can't have the equals sign refuse to reassign variables; that would break a lot of things and not be very Ruby-ish. And stepping back a bit, Ruby's mutability is what allowed it to succeed and what makes frameworks like Rails and ActiveRecord possible. So I'm not attacking those concepts. I'm just saying that people should think about which paradigm makes sense for a given problem. You don't have to start at this really complex, hard-to-reason-about, super-mutable Ruby implementation of OO and
metaprogramming level. You can start with a really nice declarative model and then work your way up as the problem requires. Those ideas are taken from a programming language called Oz; if you Google for "Mozart Oz", that's where all of this comes from. Almost the entire presentation is directly copied from this obscure language called Oz. It's really cool.

Anyway, back on track. We're not going to use equals, because Ruby wouldn't make sure our semantics get followed. So we're going to use unify: unify my_var, "bound". Unify may seem kind of weird; when you see unify, think bind. So this is like my_var = "bound". I use the word unify because I may implement more complex pattern matching in the library later on, which would let you match partially across a unification and things like that. For now, just think of it as setting the left side equal to the right side, but realize it's much more powerful conceptually. Then we unify again and try to rebind it, and now we get (I guess you can't see the comments very well) a Dataflow::UnificationError: "bound" is not equal to "rebind". So that's not allowed. Of course you could skip this safety mechanism by just resetting the variable yourself, so in order to remain in the declarative model you have to go through its declarative interface. But that's not a bad thing, because it lets you use both models, so again you can choose the model appropriate to the problem.

This is just an alternate syntax, for instance variables. You can declare my_var; it's kind of like an attr_reader, except it can be unified. So you declare my_var and then unify it, and when you access my_var it's actually a method call, but it looks like the same syntax, the same way that attr_reader
works.

Now we're going to go into the declarative concurrent model, and this is where things get really interesting. It sort of seems like magic; when you see it, it's really exciting to think about what's actually happening in the back end. This is the cover of SICP. That book doesn't actually use dataflow, but it has a wizard, so magic kind of makes sense. We'll hope that dataflow is as powerful as lambda. Probably not, but it's cool.

Here's the other really important point I want to make, especially for the Ruby community. Rubyists are well known for being test-heavy: let's test everything. I personally think that's great, especially because of the mutability of the language. But even if you're not considering mutability, you still want to at least integration-test to make sure your end result is what you want it to be. A really cool feature of Dataflow is that it allows you to use threads inside your specs and also to spec threaded code. Normally when you deal with threaded code, most of the time you mock it to be synchronous, because you don't want to deal with all the different paths that can happen. You're delaying something in a thread, and how do you intercept the result? So you say, whatever, it'll just be synchronous, and you mock it out or stub it. With Dataflow you don't have to do that; you can actually test the concurrent, threaded code that's going to be running in production.

What happens here is we start a new thread, and in that thread we're going to unify my_var to "bound". But that thread is executing separately, so let's say the scheduler stopped it and is going to reschedule it five minutes later, because let's pretend it's really
dumb or something. So we do my_var.should. In the Ruby implementation of dataflow, method calls are a form of activation, so my_var is going to suspend because it's not yet bound. The calling thread just sleeps, and whenever the unification happens, it wakes back up, gets the value, and now this is true.

That's a very simple example, but you can do really cool stuff with this. We're going to assert that our sentence ends up being "all your base are belong to us", and we start three threads. There are different relations between these variables: your middle uses your tail, your tail doesn't use anything, and sentence uses your middle. All the dependencies are automatically resolved for you and the spec ends up passing. And because we're staying in the declarative model, once we write a test and get it passing once, we know it's going to pass all the time. There's no concept of race conditions. That's really powerful. If you do something that leaves the declarative model, then of course you're going to shoot yourself in the foot.

Question? Which one wasn't declared in this one? Oh, the only reason it wasn't here is because I declared it the other way, which is sort of like an attr_reader. So it's the attr_reader style versus more imperative-style inline code, where you're basically creating a scope. It just creates an unbound variable, a fresh variable. If you try to access it, the calling thread suspends, and whenever it does get bound, it resumes.

Yeah, so that'll actually turn into to_s calls. The cool thing about Ruby is that a lot of syntax features get turned into method calls, so anything that ends up being a method call will work with this library. Of course, if you have a true keyword like defined?, that isn't going to do anything, so
that's just going to go right through. Internally this is implemented with a proxy that proxies all the methods through, so if you have a defined?, the Ruby interpreter is going to say, oh, I'll tell you what this is, it's an expression, or whatever. But for most things, the cool thing is that Ruby is built around this common interface where almost everything ends up being a method call, even though there's syntactic sugar around it. You can think of it conceptually as a precompiler that turns syntax into method calls.

Yeah, so if the third thread never runs, then your calling thread will just hang there. But that's why you test: if your tests ever work once, they're always going to work the same way. If there's some error in the logic inside that thread, and it raises an exception so that the other ones never end up unifying, then your tests will just hang, and you could write a helper around your tests that applies a timeout or something. So that's why the tests are important. As long as you get your tests passing once, you're golden.

Another cool application of this is asynchronous work. If any of you have ever used Merb, there's a thing called run_later. You pass it a block containing code that you don't want to run inside the request. But how do you know the result of that code? In the C++ world there's this concept of passing references to methods, and a Rubyist might cringe: passing references? Methods should just return stuff; that sucks. Well, the reason it sucks is that those references deal with memory directly, so it's not declarative, because anyone can write to that memory. But it's actually not a bad
thing. If you look at this call, we say worker.async and pass it a freshly created dataflow variable. What the worker is going to do is spawn a new thread, calculate the result, and then unify the dataflow variable that we passed in with the result. So you can have work being done asynchronously, and your tests get notified whenever the result is ready to be verified. In this case I wrote worker.async with output = nil, so nil is the default parameter. That way, in production, if you don't want to verify results, you don't have to. That's pretty sweet.

Here's a little abstraction around this. It isn't in the library right now, but it's on my local hard drive and will probably be there tomorrow morning. It's called flow. It's basically an abstraction around Thread, so if you wanted to override it with a thread pool or something, you could. What it lets you do is pass in that variable. As Rubyists, we're used to the idea that a method call returns whatever is on its last line. In the same way, you can have an asynchronous action that returns what happened on its last line: you do all your stuff, the end result gets bound to the output variable, and you can just assert on it. As you can see, you start to be able to build these really cool abstractions around this really simple concept.

Here's another example. We can also have anonymous variables. Rather than using local, if we're dealing with an existing Ruby data structure and just want to fill it with stuff, we can use Dataflow::Variable.new, because they're just objects. So we have these keys and values; we map across them with domain and var for key and value, create a new thread for each iteration of the map, and in that new thread we make an HTTP request
call, and then we just return the variable. What this basically lets you do is issue the HTTP requests all at once.

It's so little code, and it's an awesome abstraction, so why not take it a little further? This already exists in the current version of Dataflow: it's called need_later. I think it's probably going to be the most used method in the library. You give it a block and it returns a future, that is, a dataflow variable. You can do whatever you want with that dataflow variable, and whenever you try to call methods on it, it will suspend, or, if it's already been bound, just work. So this is a super powerful abstraction. Does that abstraction make sense to everybody? Okay, cool. And you can see it just fits on one line, even though there's really cool stuff happening there.

Okay, so here's chunked sequential processing. We have an array of 100 items, and we slice it up by 10. We're using the Ruby 1.9 / 1.8.7 enumerator syntax, so we can call each_slice and then map after it. So we have ten chunks of ten items. We just sleep 1 as an example, but you could do heavy computation in here or whatever you want. Then we add up everything in each sub-chunk, and at the end add up the results of the ten groups. If you time this, it takes about 10 seconds, but if you wrap a need_later around it, it takes about one second. So you get awesome concurrency, and it almost feels like it's for free. That's what I was talking about before: it seems a little bit like magic.

Okay, so this is a Matrix reference. We're going to leave the declarative model and extend it a little
bit with the ability to do asynchronous stuff. You should stay in the declarative model as much as you can, and it's powerful, but it can't handle, for example, a client-server scenario, because you don't know beforehand which clients are going to connect to you; they can connect at any time. There's an inherent race condition in the client-server model. So we can extend dataflow, or declarative concurrency, by putting just a little bit of state in there with this concept of ports.

A port is basically something you can send messages to. You initialize it with a new unbound dataflow variable that gets bound internally as a stream. A stream is sort of like a cons cell: basically a linked list with a dataflow variable as its head and another stream as its tail. In this synchronous example it's not that interesting: we send 1, we send 2, and then if we take 2 from the stream it should equal that array. That's fine, whatever, but it becomes cooler when we deal with it in an asynchronous fashion.

So here's a little echo server. We create our new port and start up a new thread. Streams are enumerable, so we each over the stream and print out each message as we receive it. Then (this should have a %w here) we go over an array of x, y and z, each over that, and inside a new thread we send each letter. What that means is that when we take all the stuff we sent, there was a race condition, so we're not sure what order the messages actually appeared in. That can make testing a little more difficult, so you have to know the problem a bit better. In this case it's really easy, because we just sort afterwards and you
get the correct output all the time. But again, this is a little outside the declarative model, so you don't have all the same guarantees. Still, it enables you to do the client-server model.

Another cool thing you can do, and this is one of my personal favorites, is this idea of a future queue. We're all familiar with a queue that you can push to and pop from. Here we unify queue to a new FutureQueue, and then we pop stuff off the queue before it even exists. You're popping stuff off a queue that isn't even there yet. Then we push stuff onto it, and then we do a regular synchronous push and pop. Basically, whenever something comes in, the 1 gets bound to first and the 2 gets bound to second. It's kind of weird when you first think about it, like, whoa, what's going on here? But it's pretty cool, and it's the kind of abstraction you can build with dataflow. I don't know if it's useful, so in my local copy I've only included it as an example file rather than a library class. Sure, if you like it I can put it in the library; right now it's an example file. If we have time I'll show the source code at the end, since it fits on a slide.

At first you might think this is weird, but really it's the way we think already. Say you're at an ice cream shop where you have to stand in line, and you grab a ticket. Now that you have your ticket, what are you going to do? You go talk to your friends, and when the line is done you go get your ice cream. Or you stand in line, and now you're bound there; your thread sleeps, and you wait until it's your turn. It's the same way: you pop
that ticket off the queue, and eventually it's your turn. It's your standard customer-service-line queue example. It's a queue, but it's a future queue. It's pretty cool.

No, you do not, because it's implemented with ports rather than plain dataflow variables, and I'll show the implementation later. Yep.

Okay, so another cool thing: we built ports on top of dataflow, and now we're going to build actors, Erlang-style actors, on top of ports. You can look at these little actors, and if you squint hard enough it kind of looks like Erlang. We have a ping and a pong, and we case on receive. receive is actually a method call; Actor subclasses Thread, so that's how that method call works. The result of receive gets used by the case statement, which handles the messages. Then you start off the chain by sending ping, and it'll go ping, pong, ping, pong, ping, pong. So it's pretty cool.

So, CTM, the book that all this Oz stuff and these really cool programming ideas are based on. I got in touch with the author after I wrote this library and said, hey, look, I have a Ruby library that uses your ideas, so they can actually be used now. He was pretty happy, and he said, well, you should implement laziness too, because laziness fits in with declarative concurrency really elegantly. So I said okay; it's just three lines of code or whatever. So we have might_get_used, and we set it equal to a by_need. Everyone's familiar with laziness, so sorry to bore you, but he wanted this in, so I said, yeah, all right, whatever. Then, if it's an even number, we use might_get_used; otherwise we don't. The cool thing is that internally this is a dataflow variable, so it actually fits in the declarative model. You can have a by_need and then unify to it, and you'll get a unification error if the two end up not being equal, and if they are equal you won't. Don't
ask me why that's useful, but it fits in the model very nicely. It's pretty cool, once you start looking at all this stuff, how it all matches up. It's like a puzzle; it all works together. It's pretty awesome.

All right, now we're going to get to tips. This is another really cool thing about this library, and it goes back to what I talked about originally, and to what was asked earlier: you have string interpolation going on that delegates to a method call, and it just feels like dataflow variables are part of the language, like dataflow is part of Ruby. That's because of Ruby's interface of delegating everything to methods, which is really awesome. So with our library, as Rubyists, we can just say my_var.should == "bound". As soon as I released this there were Groovy, Scala and Python copies of the library, but in all of their libraries you have something like my_var.wait, because they don't have a uniform interface like that. What that means is that their code is not going to be modular. With all the libraries you have, the Ruby standard library, any gem, whatever, you can just take a need_later block or a by_need block, pass the result to them, and it'll all just work, because it's all using the same interface. So you maintain modularity, whereas those other libraries will have to rewrite all their existing code in order to support dataflow, unless they only write new code. So that's really cool.

Another little tidbit, for debugging. Originally somebody wrote a post about this library saying it sucks for debugging, because the debugger ends up getting its thread suspended when it tries to inspect the variables. So, since inspect hopefully isn't used for any sort of
logical purpose, I have a special method on the proxy called inspect that gives you something like "Dataflow variable: unbound", and the ID, the __id__ that isn't overridden by the proxy, is included there so you can keep track of things in a complex situation, just for convenience. That'll be in by tomorrow too; the regular message with the ID is already there.

The other thing I have locally is for when you don't want to include the module and overwrite things, say if you've already defined local or whatever. Then you can just call it on Dataflow, the module. It's kind of ugly, so you should just include it, but it's there for people who want it.

Okay, use cases. One is if you want a nice architecture in terms of the order of program execution, and you want it to just resolve automatically at runtime. This is especially apparent if you have something like a game. My friend forked this library on GitHub, and he has an example where he used DRb and Dataflow to write a little rock-paper-scissors client. He basically sets up, in new threads, all the different conditions that could happen, all the different combinations of rock, paper and scissors. It makes sense because you can just read it step by step, and whichever one ends up getting used gets resolved through unification and ends up working. So it's really cool for that. If you have a complex UI or something and you don't want to deal with all this on-click hook garbage, just set up threads and they'll end up getting used whenever they're needed.

Another cool thing is that parallel chunk processing example I showed. I did an example of that with JRuby and got something like a free 40% improvement on adding up billions of numbers or something
like that, so that's cool. And of course that only would work on JRuby, or now I guess MacRuby once it's fully supported, because they don't have GILs. But you can still use this library if you want to get rid of latency problems. Most people do web development, so we'll get into web development. Worker daemons: you can do cool stuff in worker daemons with Dataflow. Most of the time you don't want to make web requests inside of existing web requests, because client-server is already an implicit concurrency model that web apps have. Everyone's like, "Oh, I need concurrency for my web apps," but that already exists. You can already horizontally add servers; the model is already there. So you probably shouldn't do this that much. That being said, web development is going more and more towards reusing all these little REST APIs, these little microservices. So you can just issue all these need_later requests to these microservices and then just render them as regular variables in the view, and the view of course is going to try to to_s stuff, so it'll all just work out. So that might be a cool application for web dev guys. It's pure Ruby, so it'll work on anything. JRuby: awesome GC. Since you're creating an extra proxy object every time, it's nice to have a good GC for that. No GIL, native threads, and JRuby also has a tunable thread pool. Pretty sweet. Rubinius: it's useful for Rubinius because Rubinius has a lot more stuff implemented in Ruby. The key feature there is that that means there are a lot more hooks. method_missing is like a hook, right? So for Array flattening, Rubinius calls a hook method internally, whereas that's not true in the other implementations, which do it natively. So it looks like we're almost at it. Yeah, we're pretty much out of time, but you know, the future queue fits in
the slide. We have two ports, one for pushing and one for popping, and we have a thread, and we're just looping over stuff. Barrier is a new thing. It takes a list of arguments, and you want to wait until they're at least unified. That way the next unification will just bind on the result of that dataflow variable. It's probably too much to explain in a few seconds, but trust me that it works, and anyway it's only about a page of code. And like the port example, if you look at the port source code, there's some backwards compatibility stuff for non-1.8.7 Ruby, but it's also very little. Actors, very little. Dataflow itself is a little bit complicated, but also very little. So that's the end. You can sudo gem install dataflow. You can do it tomorrow to get some of the extra niceties that I added hacking last night. It's on GitHub. I'm on dataflow-gem, auto-join; it's only me in there, if you want to talk to me. And CTM, it's awesome. If you want to read about a really cool programming language, it's Oz, and you'll see, you'll be like, "Oh man, that guy's a douche, he copied everything from that book." That's it.
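For readers without the slides, here is a minimal sketch of the idea behind the need_later use case and the barrier described above. It is written against Ruby's standard Thread, Mutex, and ConditionVariable only. The names SketchVariable, sketch_need_later, and sketch_barrier are hypothetical, made up for this illustration, and this is not the Dataflow gem's implementation or API. In particular, the gem proxies method calls on the variable, while this sketch needs an explicit call to value (or to_s).

```ruby
require 'thread'

# Hypothetical single-assignment variable (an assumption for illustration,
# not the Dataflow gem's class). Readers block until a value is bound.
class SketchVariable
  def initialize
    @mutex = Mutex.new
    @bound = ConditionVariable.new
    @is_bound = false
    @value = nil
  end

  # Bind once. Unifying again with an equal value is a no-op;
  # unifying with a different value raises.
  def unify(value)
    @mutex.synchronize do
      if @is_bound
        raise ArgumentError, "already bound to #{@value.inspect}" unless @value == value
      else
        @value = value
        @is_bound = true
        @bound.broadcast # wake every thread waiting in #value
      end
    end
    value
  end

  # Block the calling thread until some thread has bound the variable.
  def value
    @mutex.synchronize do
      @bound.wait(@mutex) until @is_bound
      @value
    end
  end

  # String interpolation (what a view would do) waits for the value.
  def to_s
    value.to_s
  end

  def inspect
    @mutex.synchronize { @is_bound ? @value.inspect : "#<SketchVariable unbound>" }
  end
end

# Hypothetical helper in the spirit of need_later: run the block in a new
# thread and return a variable that the thread binds when it finishes.
# If the block raises, the variable is never bound and readers wait forever.
def sketch_need_later(&block)
  variable = SketchVariable.new
  Thread.new { variable.unify(block.call) }
  variable
end

# Hypothetical helper in the spirit of barrier: wait until all are bound.
def sketch_barrier(*variables)
  variables.each(&:value)
  nil
end

# Two slow "requests" run concurrently; sleep stands in for network latency.
first  = sketch_need_later { sleep 0.05; 20 }
second = sketch_need_later { sleep 0.05; 22 }
sketch_barrier(first, second)
puts "first=#{first} second=#{second} sum=#{first.value + second.value}"
```

Because both blocks start before either result is read, the two waits overlap instead of adding up, which is the latency win described for the microservice case, and it applies even on an implementation with a GIL since the threads are mostly sleeping or waiting on I/O.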