Thanks. This is the third Conj and the fifth year of Clojure being a public thing, and I couldn't be happier to see everybody here, a lot of good old friends and new friends. I'm so excited about the vibrancy in the community and, obviously, the creativity of everybody involved, so congratulations on what you're accomplishing.
Now, what I've been accomplishing is something I call TBD, and I'm a little bit frustrated because my thing leaked. You know, it's like one of those Apple keynotes. So, TBD: what does it mean? To Better Do. And that should have a little trademark. So To Better Do is a new massively parallel, concurrent, AI-driven to-do list application, and our trademark is "putting the personal back in PMAP". That's all I have. There'll be a GitHub repo tomorrow with nothing in it, and that's probably all there will ever be.
No, so today I'd like to talk about the language of the system, which is a title that may not convey anything in particular, but hopefully it will make some sense by the end.
So one of the things I think happens to us all, especially as enthusiasts of languages: some people use their languages like it's just a tool or whatever, and then you find something that you really like and you become enthusiastic about it. You look forward to enhancing it, or making libraries for it, or making things to interconnect with other things, and you sort of define your world synonymously with the world that's implied by your programming language. It's impossible to avoid this, right? Because the semantics of a language eventually pervade your brain. We say things in these conferences that make people from outside the Clojure community ask how come you can say that and nobody objects. You know: it's all data, it's all data.
Oh, yeah, I know it is. I hear you, I hear you.
So a programming language sort of defines the world. And I'm going to say language here, and I really mostly mean the language and the corresponding runtime, because we have a lot of languages where, at the bottom, the primitives are kind of the same (there's control flow and things like that), and the runtime enhances that with a bunch of other things. But we get involved in this programming language as a world. And then, of course, if it's a functional language like Clojure, we get even more involved with, wow, this functional part: this is the good world, this is the world I really want to live in, and everything else is sort of like the ick. So I have the good world, and we want to minimize the ick. We call it I/O or something like that, and by painting it as I/O we almost would like to make it somebody else's problem. Haskell is really good at this: there's a monad, and it's like, stay out; it stays over there. We don't really force that, but by convention and discipline we try to do that.
But it's important to note that it's never been Clojure's approach to imagine that that part of your application was not important. I mean, the whole existence of the state model is there because actual programs need to do interactions with the world. They need to affect the world. If you're not affecting the world, I don't know why you're writing software. So it really, really is important.
So if we look at what constitutes a language (and again, sort of language plus runtime), we get all of these facilities. This is in no particular order, but some of the things that really matter when we start talking about the bigger picture, as being either present or missing, or where the analogies either hold or don't, are things like a memory model, right?
So we have this presumption in Java. Maybe in Clojure you're isolated from this, but as the author of Clojure, and as the author of the primitives that guard state and memory transitions, the existence of a memory model in Java is super critical. It's a big, big promise, and the fact that it's present, that it's true for all libraries that run in the same runtime, written in Clojure or not, and that it's based upon a shared resource management structure like a garbage collector, is a gigantic suite of facilities that's common to your language, to other things written in the same language, and to things written in other languages.
Calling conventions: who even knows what a calling convention is anymore? C programmers remember calling conventions, because you had all these choices, right? And maybe even in the absence of who's pushing what at the stack level, we still have conventions around deciding whether we pass values or references, even in Java, though that's sort of disappearing. But that would be one aspect of it.
Resource management: like I said, mostly in the memory space. We know that eventually the runtimes and the languages stop helping us with resources outside of memory.
There's all kinds of coordination, right?
We have monitors, we have volatile and things like that to interact with the memory model to help us coordinate things. And again, that's sort of embodied in the primitives in Clojure, right? swap! and things like that are coordination primitives that rely on coordination primitives down underneath.
And of course, probably the biggest things that we derive from languages as we touch them, the ones that are more fun (I mean, again, aside from the primitives for control flow), are the tools for abstraction and/or type stuff. Of course, some languages emphasize this more than others, and Clojure probably does not emphasize it nearly as much as some others. So that's what we talk about when we talk about a programming language.
Typically, when we talk about a system, we're talking about something bigger. In particular, I'm talking about something bigger than a program. The roots of the word "system" are in "stand together", and the interpretation I would take is that one leg of the stool is not a particularly useful thing, and the stool with two legs is dangerous, but when you compose enough of the pieces you end up with something that performs a useful function. And it's actually these systems that most of us deliver. How many people have, as the main product of their effort, a single program that doesn't interact with any other programs? How many people think mostly what they do is build systems or parts of systems? Right.
So we do that, but the programming language has pretty much stopped before the system. In other words, the system is this composition of things whose language doesn't know anything about systems.
It doesn't say anything about systems. It's an ensemble of programs.
Of course, there are lots of ways to build systems, and I'm going to try to narrow the scope of that. In the old days, any two programs could talk to each other any particular way, and, you know, that's a system, and it is still a system. I think over time we've gotten more disciplined about how we build systems, and now we tend to think of systems as compositions of programs that offer services to other programs. That's an analogy we can draw out of what we do inside programming languages: you get libraries that give you services as you consume the library, and then, in the process space, you have services that you can call, and they have certain APIs, and you call them, and that's what happens.
But there are many things about systems that are very different. In particular, there's no global supervision anymore. A lot of what we get inside the language is not there. Right? There's no global resource manager. There's nothing watching everything. There's nothing that knows everything that's going on. There could be more than one process in the same box. It could be more boxes.
There's no person in charge of the internet making sure everything is okay. And the question is, how do we connect these pieces? The purpose of this talk is that there's a way to talk about how we connect these pieces that draws analogies to the way we talk about how we connect pieces inside programming languages. It informs the design of systems, and I think it goes the other way too: systems should help inform the design of languages, or the use of languages.
So when we say language, what do we mean? The root, again, is "tongue". It's obviously about communication. Everybody knows the old saw about programming: you think it's about talking to the machine, and in a certain sense it is, but it's certainly also about talking to other programmers. Right? You write a program; the other programmer could be you, later. Ten years later, you look at your code, like, well, who said that?
But I think it does split out a little bit. In all cases, all the programming languages and all the use of language we're going to talk about are somehow about programs talking to programs, or rather programmers talking to programmers. But inside a programming language there's also the other aspect, which is the programmer talking to the machine: you know, machine, make this happen, do this stuff. A very interesting, different characteristic of the communication that occurs between programs in a system is that the language that's used there is the language for programs to talk to programs. Almost definitely.
It's extremely rare to see the interface on a service be one that's oriented towards people, or at least towards human interaction. It's fundamentally oriented towards a program talking to a program, and that's going to become really important as we move forward.
So one way to think about these two things is as stacks: stacks of specificity and hierarchy and encapsulation. At the bottom of a programming language is a bunch of language primitives for control flow, for memory acquisition and things like that. On top of that we have core runtime facilities and core libraries and/or libraries from third parties. And then finally we build our application libraries and our applications on top of that. That's all the inside-the-program view.
If we look at systems, I think it's a little bit harder to tease out what the primitives of systems are. But certainly if you start with the communication side, you end up with two very evident pieces to the language of systems, right? One is the protocols (UDP, TCP, HTTP, WebSockets, all these things), the negotiated transfer primitives that we have. And the other is the formats: what do we say over these protocols?
And I think that's pretty evident and straightforward, although I will talk more about formats, but not at all any more about protocols. The analogy to the next level up, though, I think is an area where we're particularly weak in having good language for it. That's where the focus of this talk is going to be. And finally, at the top, we end up with either portions of applications or entire applications acting as services and/or consuming each other as services, and that's a system. Of course, there's a joining here, because those things that are the applications on the right were written using the stack on the left. But the stack on the left usually doesn't have a lot to say about the stack on the right.
So the first thing we have to talk about is: say what? Again, we talked about protocols and formats, but formats are huge, right? How many different ways do we have to talk over these wires? What are we sending? XML. JSON is probably the big winner right now. Protocol buffers. And then, of course, quite common in this room would be EDN and Clojure data. But there's also Avro and Hessian and BERT. How many know what all of these things are? Not too many. How many of those people could make a matrix as to why one is better or different than another? And yet this is actually pretty important, right? This is what we're going to be saying from one process to another. It's a huge thing, and it's full of decision points.
I think one of the things that's really cool about it is that all of these things are representations of data. What's not up here? What key Java technology for things talking to other things is not here? RMI, right? RMI. Yeah, big winner. How about DCOM? CORBA, anybody? Okay, they're not even on this list, right? They all lost, and they all lost for really good reasons. So we're not even going to talk about that.
We've already reached a point where every single one of these choices is a data format. So already we've got this great premise: the way services are going to talk to each other is by conveying data. Not through some hyper-linguistic or extended linguistic thing where there are all these extended verbs and there's a notion of a program object being on a different machine and things like that. We're just going to talk with data.
So we have to split out what about the data is good or bad. What are the decision points? One is extensibility. Given this format, if I have a new thing to say to you tomorrow, is there a way for me to encode that? If there's not, it's not extensible. Which of these things on the list is not extensible? JSON. There you go. That's really not good, and there are at least a couple of problems we'll get to later. And there are two notions of extensibility: one is to new types, the other is to new versions. There's a sense in which, for instance, protocol buffers are really mostly about being extensible to new versions. You can make things go to new types, but an existing consumer can't really be aware of those; it can be tolerant of new versions, though.
Self-describing: which of these things is self-describing? XML, kind of sort of. What else? Not protocol buffers. Avro, EDN, Hessian, and BERT, and Erlang's transfer format, which is what BERT is a flavor of. What does it mean to be self-describing? It means that if I have a decoder that understands the rules of the format, I can read anything that you said. I don't need to know anything else out of band. I don't have to get a description any other way. That's not true of protocol buffers, right?
Somebody starts streaming you protocol buffer stuff, and it's like gobbledygook if you've never seen the schema. And where's the schema in the protocol buffer stream? It's not in the stream. It must be transmitted out of band. So we get to this other part, which is schemas in or out of band. Of the ones that are self-describing, one of them has schemas. Which is that? Well, that's optional, though; but one has a schema that's required for reading. No, protocol buffers are out. Avro. Avro has a prelude schema thing. So then you have this question of whether the schemas are in or out of band. Avro has schemas; protocol buffers have schemas. Avro's are in band; protocol buffers' are out of band. But both of those have more requirements on schema interpretation than something like EDN or XML. Of course, XML you can definitely read; you may not understand it, but you can read it without anything. If you have schemas, they're sort of optional.
Why does it matter whether schemas are in or out of band? I mean, it's on the slide. If you have out-of-band schemas, what can't you have? You can't have these things: generic processors and intermediaries. It's really interesting that Google came up with protocol buffers. Imagine if the internet was built with protocol buffers. How good would Google search be? It'd be bad, right? Because they're in the intermediary business. They're taking advantage of the fact that any HTML processor can read any HTML. If everything was a negotiated contract, it simply wouldn't work. So you really have to understand: it's not to say that protocol buffers are bad. I'm not saying that, right? What I'm saying is that there's a spectrum of choice and trade-offs.
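To make the in-band versus out-of-band point concrete, here is a minimal Python sketch. It is an illustration only, not anything shown in the talk: JSON stands in for a format a generic processor can read with no prior agreement, and a fixed binary layout packed with the standard `struct` module stands in for a format whose schema travels out of band (it is not protocol buffers). The message contents, the `SCHEMA` layout, and the function names are all hypothetical.

```python
import json
import struct

# A message a generic processor can read: the decoder needs only the rules of JSON.
wire_json = json.dumps({"id": 7, "title": "buy milk"}).encode("utf-8")


def generic_intermediary(payload: bytes) -> list:
    """Knows the format, but nothing about this particular message."""
    return sorted(json.loads(payload).keys())


# Out-of-band schema: a hypothetical layout agreed on somewhere else
# (little-endian unsigned 32-bit id, then an 8-byte title field).
SCHEMA = "<I8s"
wire_packed = struct.pack(SCHEMA, 7, b"buy milk")


def decode_with_schema(payload: bytes, schema: str) -> tuple:
    """Only works for a reader that was handed the schema some other way."""
    ident, title = struct.unpack(schema, payload)
    return ident, title.decode("utf-8")


print(generic_intermediary(wire_json))
print(decode_with_schema(wire_packed, SCHEMA))
```

Without `SCHEMA`, the twelve bytes in `wire_packed` carry no field names or types, so an intermediary that was never given the layout has nothing to work with. That is the trade-off being described, not a verdict on either approach.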
That spectrum is really important here. It's as important as choosing a language when you pick your programming language. But picking any programming language still leaves you with this decision when you move up to the system level. Of course, a lot of times this is not your choice, right? You're consuming a service where somebody else has made the choice. And that highlights the next problem in this space, which is that there's nobody in charge. When you use a programming language, the programming language kind of says, well, we're all going to pass arguments like this, and we're going to define our types like that, and everything else. With no one in charge, systems struggle against this set of independent decisions, which may or may not compose, and the format problem is the first place this comes up.
So this schemas-out-of-band thing is really tricky, and that's one of the things with JSON. You can't put dates in JSON, right? How do you put dates in JSON? As strings. And how do you know they're there? Out of band. You go back to the napkin, right? It's like, if the key has the word "date" in it, then the string is a date. There we go. And there's another aspect of that, which is that it's not merely out of band. If you get a protocol buffer schema out of band, it's not a napkin, right? It's very straightforward. People's use of JSON is extremely context dependent, and a lot of times that context is not captured anywhere except on a napkin. It's like, okay.
Well, we've all agreed to send this, and you know this is coming, and therefore you're going to go to the last-edited field, and you happen to know that last-edited is a string that has a date in it. So that context sensitivity is really bad.
So, obviously, in this room we don't have to talk about the value of values. We like values. I think the only thing to do here is to, again, think about the differences between programming languages and systems with values. We definitely have values in systems, at least at one level: on the wire, right? We just looked at all the popular formats for transmitting stuff. They're all data formats. They're all values. We're not passing a reference to a guy that you're then going to call back on his RMI interface to go get more stuff, and have this big chattery communication with objects. We just convey the data that we care about. So that's fine. Those are ephemeral, and they're usually nameless.
In programming languages, values are usually nameless too, right? We have the same notion. We can pass values. We get a value as a return from a function; we just have it, and we start processing it. I mean, Java is not a particularly strong language for values, because almost everything is a reference type. But in languages that really have them as distinct things, a lot of times values are completely anonymous. You have an array of structs; none of the structs have names.
If, however, you want to have a value in a system that is not ephemeral, that means either it's large (it's so large I don't want to put it on the wire and send it to a hundred people; I want to put it somewhere and let people know where it is), or I want to have memory in a system, I want to remember a value. In both those cases you end up incurring a new thing, which is that your values need to have names. And that's a definite change versus your programming language.
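A small Python sketch of that change, under stated assumptions: the dictionary `store` is a hypothetical stand-in for some shared storage service, the helper names are invented for illustration, and using a random UUID as the value's name is a choice made for this sketch. The point is only that the large value is written once under a name and the message on the wire carries the name, not the value.

```python
import uuid

# Hypothetical stand-in for shared storage that every consumer can reach.
store = {}


def put_value(value) -> str:
    """Store an immutable value under a fresh name and return that name."""
    name = str(uuid.uuid4())  # assumption: a random UUID serves as the name
    store[name] = value
    return name


def get_value(name: str):
    """Look a value up by its name."""
    return store[name]


# A value too large to send to a hundred consumers in every message.
big_report = tuple(range(100_000))
report_name = put_value(big_report)

# What actually goes on the wire: small, and it only names the value.
message = {"report": report_name}

# Any consumer holding the message can obtain the same value by name.
received = get_value(message["report"])
print(len(received))
```

Because the stored value is never changed after it is written, the name always means the same thing, which is what separates a name for a value from a reference to a place.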
That change is one that really matters, because until we start becoming more cognizant of when we're manipulating values, and that this is a name that names a value, we're going to keep making these icky, messed-up systems that don't distinguish references from values. For instance, how do you know when a link is a permalink? You only know the link is a permalink because on the web page where you got it, it said "this is a permalink". So when designing a system, you need to be more considerate of this and call it out.
So that brings us back to names, and again, here we have this difference. Inside a program we have all these great scopes. I'm in a local scope; I have a let, and nobody knows about this. Now I'm in a function, and that's also sort of cool. And this function is in a namespace; that's also sort of great. And then the namespace is on GitHub. And then what happens? Then we're all fighting for cool names on GitHub. We've used up all the characters and all the stars and robots and names of food.
So it's really critical once you lift up to the system level, where nobody's in charge anymore. What's true of most system names? They're global. I mean, they're potentially global, and you really need to think about that. You really need to be considerate of the fact that as you start building systems, as your names start escaping out of your processes, they are global names. And the really tedious things, like Java's com.whatever.whatever, that stuff matters, right? Because what's com.whatever? Where did that come from? Somebody who's in charge, right? There's a somebody in charge there. In the absence of that,
it's a free-for-all. And so those DNS names and whatnot become critical, and using fully qualified namespace names that are truly global names is an important discipline for doing systems.
But it's also interesting to think about how different the names are. What are most of your names in a program? Especially in a Clojure program, 99% of your names are one of two things, right? They're either locals or what? The names of functions. We have a huge number of names dedicated to functions in our programs. That's where most of our names go. They're mostly verbs. What happens in systems? Who likes to work with a system that has a ton of verbs? That's really interesting, right? Why is that? There are all of these inversions as we get to systems, aren't there? We had lots of names for verbs; now hardly any. We had this global control; now we don't have global control. And we are going to have a lot of names in systems, but they're going to be used for other things, probably not verbs: machines and things like that, storage locations. And then these values are going to need names too, another critical thing.
So systems look like this: every process has a number. No, obviously they don't. I'm being lazy: this is from Google Images, something that has circles and lines. It's faster than me trying to learn how to do that in Keynote. Does anyone know how to make a line connect to a thing and stick? Like, I move the thing and the line is just sitting there. Dude, can you make them connect? No? I can do it there, but then it's like two things. And then the internet has this picture. So ignore the numbers; the numbers are not important. But most systems have this shape, right? It's fundamentally hierarchical. It's not like everybody's calling everyone and it's this big, big nightmare, right?
It's generally some things call other things, which call other things, and come back, come back. There's some sharing across; there may be a couple of lines across at a level, and there may be one guy at the top. From your perspective that's you, who gets to consume all this stuff. Maybe that one doesn't serve anybody else; it depends on how it's situated. But the critical thing here is that while each of these things in its bubble might make a ton of sense (maybe they're written in Haskell and it's proven that they're correct, or something awesome), as soon as you start drawing lines between them, what happens? All sorts of new implications about what things mean have emerged from the connections of these things.
And it's different in some way from consuming libraries. You might look at this and say, well, this is not different from libraries; when I have libraries it's the same thing: they wrote the library and did whatever, and then I'm consuming it. But what did the library and you share? A ton of stuff. All that runtime stuff. You share all kinds of presumptions about memory, coordination, locking, threads, garbage collection, the whole nine yards. What do you share between these things? Some wires. Routers and things like that.
So the question is: where are the semantics of a system? What does this mean? How can we define the pieces such that we can get a grip on what this is? Usually it's hierarchical, but that's not enough to really understand it, and this is where I think we really run into trouble. This is where the problem is, right? What does that look like?
It looks like object-oriented programming, right? All these objects are connected, and they send stuff to each other and whatever. And it's possible, right? It's possible that this system built out of all these processes is exactly like objects at scale. Every process is like an object, and it's stateful, and it sends things over to other guys, and then they change, and the whole thing is really exciting. Because service is an arbitrary notion. What does it mean to be a service? You send me stuff and I do stuff. One thing that's sort of telling is that there aren't a lot of verbs, which is kind of good. The fact that services don't have a lot of operations is helpful for saying, well, maybe they're not like objects. But there's nothing stopping them from being objects.
So in what way is this not object orientation? How do we keep it from being object orientation in the large? Because if we've spent all this time doing functional programming in the small only to build object-oriented programming in the large, then our system in the large is still going to have the negative attributes of object orientation.
I think one way to think about this is to think about machines and production lines and things like that. What we're trying to do in the next few slides is to think about a way to organize this. Obviously, we're saying change happens, right? We know that this is a dynamic system. It's producing stuff. It's affecting the world. That's the point of it. So we're not going to try to deny that. But what's the way to organize it such that we don't end up with an object mess? One way to think about it is this production line thing. So what does a machine do? A machine
applies forces to accomplish work. Now think about a car factory. I'm in a car factory. Well, people go in there every day, and they work real hard, and they mutate the state of the car factory, and then they go home. Right? That's like objects. That's like an object-oriented program, right? Maybe you know some stuff. No, it's not like that, right? There's one end of the factory and something comes in there. What? Raw materials, parts, things: iron and tires and stuff, right? And then something comes out the other end. What? Hopefully cars, right?
So this notion of flow, I think, is the key to keeping a system sorted. There's a bunch of characteristics that you can combine that will, even though technically a certain percentage of them are not functional, accomplish something in a way that is not place oriented. You've heard me talk negatively about place orientation, right? That's "we all went into the factory and had a good time and went home, and the factory is now better". That's place orientation, and this kind of flow orientation cures that.
So what are the things that we have in flow? We have transformation, right? One of the things we're going to be doing is transforming values. I'm going to take the lugs and whatever things go in the tire, I'm going to screw them together, and now I'll have a wheel instead of the parts of a wheel. We're going to move things from one place to another. We're going to route them: maybe it needs to go here or there. We're going to have decisions about that.
We may remember things, right? And again, the word "remember" is a term that is not incompatible with functional programming in the way that "update" is. I think the critical thing in making systems out of these parts is that, as much as possible, you keep them separate. In other words, when you make a transforming-moving-routing-remembering thing, it's really going to be hard to keep that from being something you can't take apart and reason about or combine with other things. Even though each of these steps, I think, has a sound use, if you were to put them all together in one thing, it would not be sound anymore. So you want to keep transforming separate from moving, and moving separate from routing, and routing separate from remembering. This is the difference between flow and places. But move and route and remember are not strictly functional. That's okay. We know we need to affect the world.
So, transformation. This is the thing that's easiest, right? We know transformation. It's just functions. It's basically straightforward. The only thing here is that generally there might be some input to the function which is now not just a local input from a call in a programming language; it's coming over a wire, and there's output over the wire. The thing that gets a little bit trickier sometimes with functions at the system level is that sometimes you need to convey information off the wire. I need to put it in a database so that you can see it later, and I'm not going to actually put some huge thing over the wire to you in every message. In that case you now have this stranger view, where I need to run this function and what I have is not the value, but what? The name of the value.
And I'm going to try to distinguish the name of the value from a reference, because they're actually different. So sometimes you work to and from storage. Otherwise, though, it's still functions. This is not hard.
Now we get to moving things around. I think one of the things in Clojure that maybe I didn't make clear enough, because I didn't need to wrap them, is that the queues in java.util.concurrent are awesome. If you're not using them as part of your system designs internally, you're missing out. And in the large, queues also rule, because they have this really great characteristic: they're completely decoupling. What happens with a message? A says something to B. When A says something to B, what does A need to know? B. Right, that's a problem. If A puts something on a queue, who gets it? Don't know. So that decoupling is really good, both in the identity of the consumer and in the availability. If I put something in a queue and the thing that's supposed to consume it is not running, do I care? Not usually. There may be backflow and some other kinds of considerations, but the availability of the consumer is also something that you don't care about. Again, with a directly connected message, A said something to B; if B is not around, that's now a problem for A. If A puts something on a queue, presumably, if you can make the queue more available than B, you get this independence both in the identity of the consumer and the availability of the consumer, which is extremely strong.
The other great thing about conveyor belts and queues is: what do they do? What's their job? Move stuff. What's their other job?
There's no other job. That's all they do. So it has that characteristic we had from before. I mean, when you get to pub/sub, you really end up with routing and moving; they're both on the slide. But that's really strong. Queues are extremely important. Queues are decidedly different from messages, for those reasons: messages need an available consumer, and you need to know who you're talking to. It's architecturally completely different.
All right, now, memory. This is the part that's really tricky, because you do not have a ton of great options for memory that are not place oriented. There's a new thing that's kind of good for this, but you don't even need to use that. The key point I want to make here is that the epochal time model, the one that's behind Clojure, works in systems. It works at the system level. I'm going to show you the picture again later, but the basic idea is that we have reference types and we have values, and the reference types only ever contain values. They only ever point to values, and they have semantics about how they transition from one value to the other. There's nothing about what I just said that is about Clojure, that is about memory, that is about locking. There's a little bit
that's probably about CAS, but not CAS on the chip. It's a very, very general notion, and Datomic implements that notion in the large, but you can also implement it yourself. You're going to need to combine a couple of things: you need to combine naming values with some sort of reference and some sort of à la carte coordination. So this is my old slide of the epochal time model. Clojure implements this. We know atoms are this, refs are this, agents are this. And we can do this ourselves. We're going to say we have a reference. It takes on different states over time. Each of the states is a value. You're able to obtain the value out of the reference as an independent thing. And we just said before, about values in systems, that they're going to need to have what? Names. They're going to need to have names. That's what's different. And then we can transition from values to values. We can see this in action in the way Datomic uses ZooKeeper and things like Riak or S3. Riak and S3 don't have the semantics required to do the state succession. They don't have what you need to do that. You need something along the lines of either CAS or versioned updates or something like that. But ZooKeeper has that; it has versioned updates. So you can combine them, and you can implement something like refs in ZooKeeper that point to values that you store in something like Riak or S3, or any store that doesn't otherwise have the consistency or the ordered transitional semantics. You can pull tools like that out right now and do this for yourselves. So the important thing to note is that the Clojure state model is available at the systems level. You do it this way, and the only thing you have to do is put names on your values. What's a good name for a value? UUID. Is that Stu? No? That's good. Stu is always my spoiler. Yeah, UUID. What's not a good name? Fred.
"I got this from wherever," or anything like that, because what starts to happen when you have those kinds of names is that people start to care about them. What should you care about in a value name? Nothing at all. Also, in a lot of places where you're going to be putting values, you really want to be conflict-free. You don't want to have to coordinate: oh, is this Fred27 or Fred217 or whatever. You just don't want to be there. So UUIDs are a good thing to use to name values. You don't care, because that's not the identity. What's the identity? The one over here, the reference, which you're going to have very few of. So for instance, in Datomic you could have hundreds of millions of items. You know how many refs you're going to have in ZooKeeper for a database? Three. You know this now, right? You've built systems in Clojure. How many refs do you end up having? How many atoms? A tiny, tiny amount. That's the best thing about Clojure: showing people how little of that you actually need. It's the same thing here. But the strong names, the globally qualified, namespaced names, will be the identity names. It's really important that they be like that. The value names you want to be conflict-free, tear-off names that anyone can create without coordination, and that's what a UUID is about.
All right, of course, my favorite topic: errors and error messages and whatever. There's a really important paper at the bottom here, and if you read this paper over and over again, which I recommend, you're going to see a couple of facts about systems. It's another way in which systems are really different from programs. In a program, are you really afraid that some object you're going to call is not going to be there? No. The whole program tends to be around or not, all together. It succeeds or fails all together.
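The reference-plus-named-values arrangement described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Datomic's implementation: `ValueStore` stands in for a store like S3 or Riak, `Ref` stands in for a coordination service that offers compare-and-set (the role ZooKeeper plays in the talk), and the class names and the `swap` helper are hypothetical. Everything runs in one process so the example is self-contained.

```python
import threading
import uuid


class ValueStore:
    """Stand-in for a blob store: values are written once under conflict-free names."""

    def __init__(self):
        self._blobs = {}

    def put(self, value):
        name = str(uuid.uuid4())  # tear-off name, created without coordination
        self._blobs[name] = value
        return name

    def get(self, name):
        return self._blobs[name]


class Ref:
    """Stand-in for a coordination service: holds the NAME of the current value
    and only changes through compare-and-set."""

    def __init__(self, initial_name):
        self._lock = threading.Lock()
        self._name = initial_name

    def deref(self):
        with self._lock:
            return self._name

    def compare_and_set(self, expected, new):
        with self._lock:
            if self._name != expected:
                return False  # someone else moved the ref first
            self._name = new
            return True


def swap(ref, store, fn):
    """Apply a pure function to the current value; retry until the CAS succeeds."""
    while True:
        old_name = ref.deref()
        new_value = fn(store.get(old_name))
        new_name = store.put(new_value)  # new value, new name; old value untouched
        if ref.compare_and_set(old_name, new_name):
            return new_value


store = ValueStore()
ref = Ref(store.put(("a",)))            # tuples are used as immutable values
snapshot = ref.deref()                  # a reader holds on to a name
swap(ref, store, lambda v: v + ("b",))  # the ref moves on to a new value
```

After the `swap`, the name held in `snapshot` still resolves to the old value, which is the point: readers keep a stable value while the one reference moves forward. Note how few coordinated things there are, a single `Ref`, compared with the number of stored values.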
We get all confused because we live in this bubble. It's like, well, errors are when I made a mistake. That's not right. That's just programmer-convenience thinking. In the real world, failures happen all the time. The things that you depend on are possibly not there, all the time. A large system is in a state of partial failure almost continuously. The math is against you for having all of your 10,000 machines always working all the time. So parts of your system, when you look at the whole thing, will not be working. It also means that those things that are not working will not be available. Those failures are going to be uncorrelated. They're going to be completely independent. You are still fine, but somehow the thing you're talking to has become unresponsive or unreachable or whatever, and it starts to give you a whole new way of thinking about dealing with failure. Because the things you're talking to are unreliable, you have to use timeouts. You have to retry. If you're going to retry, well, you have this open question: I might not have heard back from you, but you might have heard my original request and done it. So I need to know that my future requests are idempotent. Who is worried about that when you're working on stuff in memory inside your program? You don't worry about these things at all. But the thing is, as soon as your program becomes part of a system, these error modes are going to go right through your program. You're not going to be able to deny them.
You're not going to be able to convert them into something else. You can't fix them. They go right through you. As soon as they go right through you, you realize that distributed error modes are the only error modes. Everything else is just programmer-convenience error handling stuff; it's not really what the system's error modes are about. So I definitely recommend that you read the paper, because you can't think about it often enough, and it really is difficult to internalize. You'll still write systems where you presume the best, and then you're like: ah, the best thing is not going to happen sometimes.
So the other thing about systems is that they're dynamic, and they're dynamic in a whole bunch of different ways. They're dynamic in membership: we just said some machines come and go. Sometimes they'll come and go on purpose, not because they failed, but because somebody started some more machines. They'll come and go for capacity, as people try to scale. They'll also come and go for capability: the system will be running, and all of a sudden somebody wants to do something new, and they'll start up new stuff. Systems that can become dynamically capable of doing new things are really strong systems. It's the kind of system that you want to pursue. And so all kinds of new terminology come to bear at the system level that you don't have inside. You can't scale one box, but you can scale a system. Discovery is not usually the same notion either. There's somewhat of one, maybe, if you're talking about injection and things like that, but the true notion of discovery is a distributed thing. Elasticity is the same kind of thing. So we know that systems are dynamic. That has implications for the programming languages.
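The timeout, retry, and idempotency reasoning from the failure discussion above can be made concrete with a small simulation. This is a sketch under assumptions, not a recipe from the talk: `FlakyService` is a hypothetical in-process stand-in for a remote service whose replies are sometimes lost, and the request-id deduplication shown is one common way to make a retried request safe.

```python
import uuid


class FlakyService:
    """Hypothetical remote service: it does the work, but the reply may be lost."""

    def __init__(self, lost_replies):
        self.lost_replies = lost_replies  # how many replies to drop before answering
        self.balance = 0
        self._seen = {}  # request id -> result, so a repeated request is not re-applied

    def deposit(self, request_id, amount):
        if request_id not in self._seen:
            self.balance += amount
            self._seen[request_id] = self.balance
        if self.lost_replies > 0:
            self.lost_replies -= 1
            # From the caller's side this is indistinguishable from "never arrived".
            raise TimeoutError("no reply before the deadline")
        return self._seen[request_id]


def deposit_with_retry(service, amount, attempts=5):
    request_id = str(uuid.uuid4())  # the SAME id is sent on every retry
    for _ in range(attempts):
        try:
            return service.deposit(request_id, amount)
        except TimeoutError:
            continue  # we cannot know whether the work was done, so ask again
    raise TimeoutError(f"gave up after {attempts} attempts")


service = FlakyService(lost_replies=2)
result = deposit_with_retry(service, 10)
```

Here the first two attempts time out even though the deposit was applied on the first one. Because every retry carries the same request id, the balance ends at 10 rather than 30. A real client would also wait between attempts, which is omitted to keep the sketch short.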
So there's a holistic approach to this, and there's a great example of the holistic approach, which is Erlang. Erlang is a language of the system. It takes the approach of saying: I am only going to be building systems, I know that up front, and I want these semantics inside the processes. I don't want a different set of semantics. I don't want my bubble semantics and my system semantics. I don't want my bubble interfaces and my system interfaces. Just so you're not worried, this is not where I say we should all switch to Erlang. Everybody's like: oh my god, did he change his mind already? It's only been a couple of years. No. So there's nothing wrong with the holistic approach. In Erlang, the fundamental units of programs are services. They call them processes, but they're little services. They have communication capabilities, and they follow all the things that we talked about before. In particular, it's not like RMI. Those little services are not like objects. They send what? Messages, which are data. It is, though, a custom communication mechanism that they use, and there's a very specific model baked into the language. They basically said: we are going to do actors. We are going to do asynchronous send only, receive asynchronously, no synchronous communication; RPC you have to build out of pieces, and things like that. So there's a very, very specific model here, which I think is extremely well suited to making communications programs. But what's the trade-off with the holistic approach? Is Erlang a great number-crunching language? No. Is it really expressive in certain kinds of domains? No, right?
It's definitely good at some things and less good at other things. It doesn't have a rich type system. It doesn't have a rich abstraction model or other things. So the trade-off of a holistic approach is that you put all your eggs in one basket. I think the fact of it is that you're never going to be able to dictate to everybody to use Erlang or any one thing. You can't say we're all going to do our programming in this one language. That's the whole king-of-the-world thing. Inside Ericsson, maybe they can do that, because everybody's going to do Erlang. But in the world on the whole, I don't think you can sell holistic approaches. You can't convince everybody to use the same language, even if it's better.
So that leaves us with the heterogeneous approach. We have to have some sort of cross-language notion of how to talk about things, how to express the semantics of systems, and what the language of systems is, one that crosses languages and runtimes and platforms and things like that. And as I said at the beginning, we know parts of that language are protocols and formats, and I think the third part, the thing that fills in this box, is what I'll call simple services. So a simple service is a service.
It's its own process. It does communication using data. It has a very small surface area in terms of the API: if the API is mostly data, it should have an extremely small number of verbs associated with it. And it should do mostly one thing. You'll see that a lot of the facilities of programming languages and runtimes are now available as services. So we have queues: we have the java.util.concurrent queues, and then how many message queues are out there? Tons. Tons, all with different characteristics, and you'll make different choices, but there are plenty of message queues that are dedicated to that. Now, unfortunately, this says simple, and if I knew how to use Keynote, that would be blinking and on fire. I saw fire was good. That's super important. And I think one of the challenges for this approach is that invariably people would like their service to do some more, and making it do a little more often breaks the simple part. So for instance, queues usually have very, very icky durability things; once they start to get into that space, all of a sudden, wow, this is not simple anymore. Coordination things like ZooKeeper are extremely interesting. If you've not used it or something like it, it's very cool to think: all I have over here is just coordination. And if you can constrain yourself to that. Of course, again, ZooKeeper is durable, and you could try to treat it like a database, and now you're trying to make it do more stuff and not using it simply, because it does do more. But if you treat it simply, it's a fantastic little utility just to do that part of the Clojure state model, or the epochal state model. Control flow: you have things like Amazon Simple Workflow, and Storm. We just saw an example of Storm before. Look at Storm. What is it? It is what I've been talking about.
It's this flow model. Although, again, it sort of says: this is the recipe that crosses all the pieces. As opposed to saying we're going to compose queues plus arbitrary consumers of queues and other queues, Storm says: I want to wrap around your whole thing, and I want you to play this coordinated game. So again, it's less simple than it could be, but as an architectural strategy it's an example of what I'm talking about. It's flow-oriented. We're used to memory services. memcached is a beautiful thing. People are like: oh, memcached, blah blah blah. Most of the problems with memcached come from people using it to solve horrible problems with place-oriented databases. That's a sucky problem. That's not a suckiness of memcached. memcached is brilliantly simple. It does exactly one thing. I know, of course, they keep trying to make it do a little bit more, but the one thing it does, it does really well. So that's shared memory. Redis is another popular example. Again, hopefully they'll keep it simple, and to the extent they do, it's the kind of thing you can compose together. And of course storage has exploded. S3 is global shared memory. It's an awesome thing, except what? Shared memory is dangerous, right?
But we know how to make shared memory safe. Clojure has shared memory and uses it. In fact, it's quite fundamental to Clojure that you have shared memory, and shared memory is important. You just have to be careful in using it. If you combine references with immutable objects, you can use S3 just as safely. You can use a key-value store just as safely, in exactly the same way. The only trick there is that the transitions of the refs need help from things like ZooKeeper. But moving up the stack, DynamoDB has that semantic built into it, and a lot of the memory caches, like Infinispan, have it built in. So you can get both together, like we have in memory, in systems. I think we want more of these, and we want them to be smaller still and to do as little as possible.
So I think one of the problems we have here is that there is something we really like inside our programming languages, an important tool, which is the interface or the protocol. It's the thing that abstracts away from us the details of what we're talking to. Where's the interface for S3? In a different audience there'd be people gripping the arms of their chairs, like: no, we've solved this. We use WSDL, and then I use a BPEL thing, and I draw these pictures, and I have diagrams of systems. And we're just naive in here, because we like to build things out of smaller parts, and we should be up there. No. I mean, there are things like that, but they don't get used. Amazon did not use WSDL. Maybe they tried. Did they try early on? Remember? Were there any schemas ever? There used to be. Right now it's just: read the docs, try it, and when you get it right, you won't get a 404. So you just don't see it anymore. And what you've seen now instead is that S3 is so dominant that when OpenStack wants to have the same kind of service,
they don't have any abstraction to tap into, to say: we also implement that abstraction. What do they have to do? They have to directly imitate the protocol of S3. This is not a great place to be. The same thing happens with memcached. People are like: oh, memcached is cool. And other people are like: well, I have this other cool, distributed, redundant memory cache. And it's: well, I use memcached. But this is better! And what do they have to do? Mimic memcached on the wire. This is really a bad thing, and I don't know what the answer is, because I don't think WSDL and things like it are the answer either. But it leaves us in a difficult place. This is an area that we can repair inside the programming language. There are all kinds of variants of put-stuff-in-a-place like S3. Some of them mimic S3 and some of them don't. But something like jclouds can go and isolate you from that. So it's superimposing an abstraction. Now, there are two ways to think about doing this. That superimposition of abstraction happens where? Is it a service? Who knows what jclouds is? All right, a fair amount. So jclouds is a library. It's a Java and Clojure library that has an encapsulation both over the EC2-like elements of cloud services and over the storage, and let's just think about the storage right now. There's this thing called BlobStore, and it abstracts away the details of connecting to S3, or connecting to OpenStack's storage, or to whatever VMware sells, or whatever another vendor has. So they're giving you an abstraction inside the language. If we don't want to do this inside, what do we end up with? What's the system version of this? A proxy. And that you tend not to see. Why? It has a hop. It has a hop, and it's tricky. We don't have interfaces, and I think we're suffering.
So what can programs tell systems? What can our systems learn from our programming?
One is that we need more values. Values need to be first-class. We need to name them. We need to start using that epochal time model in our systems designs. You can do it yourself today. I just showed you three ways to do it. You just have to choose to do it. You have to take this flow orientation. This is something you may or may not be using. People talk to me a lot about Clojure, like: I love Clojure, I have the functional part, I think I'm getting a grip on it, and every time I try to get to the state, even if I use the state stuff from Clojure, I still end up struggling with a model for the whole thing. The model is this flow model. Just flow values around. Use queues inside your application. It's not like this trivializes everything you need to do, but you can do a lot by just emulating this inside. And of course, if that's your best practice inside, it's nice to convey it out. This is the way you're going to get more reusable things, and things that are easier to compose. I do think we're struggling with any kind of abstraction. We know it's good, but we don't know how to do it at the system level. And I think the biggest thing we suffer from here is, well, yes, how does somebody else provide a service like S3 and let you use it. But the B side of it is: what if you're trying to be a service, and you're trying not to build durability into yourself? You'd like to be playing this game well and saying, I'm componentized. In a programming language we totally know how to do this. You say: I'll work with anything that implements this interface, or anything that implements this protocol. We have a way to say that, and the person who wants to compose you with something else has this recipe for doing it. Now, what's the systems way to do that? What's the systems way of saying I'm parameterizable in my storage? It's really difficult, and a URI is not enough, right?
I mean, you need to know what method to talk over. So what ends up happening right now is that your service needs to embed something like jclouds, or an implementation of some abstracting thing, and you need to individually support what your users are going to need, or provide an extensible mechanism. But you're doing it inside yourself, as opposed to saying at the system level: I have a way to say this is an interface that I use, so that you can plug in the kind of storage you want with me. So we're suffering there.
What do systems tell programs? There are great papers, great old papers, that say: do not try to make your distributed system like your programming language. And they're totally right, especially at the time they were written, which was when objects were hot and people were trying to do CORBA and things like that. Terrible, terrible idea. But some things are important. Functional programming is important; I think it's not done a lot in systems. What can systems tell programs? Well, one thing is this machine-like thing. Maybe it's easier to see when you have wires. It's quite obvious that the only thing I can send over the wire is a value, in XML, say, so I've chosen to use that. In this audience I don't need to say this, but in Java people have a real question. They don't tend to send data structures around in their interfaces the way we do, and they have this real choice: I can send a data structure, or an object that has all these verbs and knows how to do stuff and changes and dances, and I might as well send that. It's only one argument.
It's a lot easier, and I don't have to type; in fact, IntelliJ will just type it for me. So I think in Clojure we're kind of spoiled, because we do this all the time. But it is something to raise if you're trying to talk to somebody else who's building a system, about maybe bringing this architecture inside their program. You can make the rationale from the systems level: this makes sense in systems, so you explain to me why it doesn't inside the program, because I don't understand why it wouldn't. The other thing is that programmatic, program-to-program interfaces rule. Where do we suffer when we don't do that? When we only define a human interface, or we define a human interface first, where do we suffer? Every single time we do it. Every single time. Anybody ever try to write a program that manipulates any Unix program? Yeah. Is it fun? Yeah. You have to write parsers, you have to figure out how the command lines work, and all this other stuff. I tried to manipulate Git from a program. It's terrible. I just did it. It's not fun. What else is an example of that? SQL. In both these cases they wanted to support this: some person is going to be sitting at the computer, and they're going to want to do stuff, and they're going to type and go, and it's got to work. And there's nothing wrong with that. That use case is important. You want to make that happen. But when the only interface you define is the one for that, you end up with no programmatic interface. So what do we have in SQL? We have all this. It's simple. People will say WHERE and blah, and that's really great. And what do we have for programs?
String building. We've got nothing. We have nothing to work with. So build your human interface on top of a programmatic interface, because programmatic interfaces are all you've got at the systems level. Nobody is typing into Amazon AWS services, like: oh, I'm going to use S3. They don't do that. So you want to have the programmatic interface underneath. The systems failure model is the only failure model. You have to look at all of your error handling from that perspective. As soon as you do, you realize there aren't going to be a lot of places for the I-made-a-mistake flow. It's got to be dominated by the system-is-partially-unavailable flow. Systems are dynamic and data-driven. It might be good, as I said, to use a language that is also dynamic and data-driven. Again, in this room I don't need to say that.
So I think people are building some great libraries. I'd love to see more people build some services, some simple services. I think this is a tremendous opportunity area for Clojure. Clojure is really, really well suited to building these things, and if you build these things, it's going to give you inroads into your organizations. Oh, can I build this new thing in Clojure? I don't know. Well, I built a service; do you want to use it? Oh, well, yeah. What does it do? It does this. Oh, it's nice. It's simple. It does this one thing. And we're seeing some of that, like the Riemann thing. Who even knows? Well, it's this cool logging thing, but it does one job. It does it really well. It's a service-like thing. There are tons of opportunities. We just saw a bunch of things that were done, and Storm is really great, and things like that. But there's lots more, and when you build something like that, you're going to end up with something that's much more reusable than a library. Now, some things will have to be libraries, and libraries are great. But I'd encourage you to build systems.
I'd encourage you, when you do it, to avoid custom formats. Of course, again, in this room I don't really need to say that. There's a good format, we all tend to like it, and we will try that first. Even though you don't necessarily have a means of expressing the abstraction of your service at the system level, design it anyway. There's always all this stuff about premature abstraction, whatever. Definitely a danger. But by the time you're writing a service, there's nothing premature about abstraction. The thing has got a surface area this big. You're going to spend time on that. There's no problem spending time on that. It's never not worth it. It's never going to be: oh, it's overkill, you wrapped this thing with that thing. Down in the small, in a program, you can over-abstract. Up here, you can't, unless you just start making a lot of new layers in front of your service. You want to have some abstraction. Consider a second implementation over your interface. Maybe you've decided that for speed you're going to use Avro or something like that. But if you also design an HTTP interface, you'll sort out your abstraction just by that exercise. It still doesn't give somebody the ability to say, I'm going to make something like it with the same shape, but it will make your service better. And the other thing is to design your service to be composed. Again, I think this is a challenging area. Don't keep adding stuff inside yourself. You're going to make a little monolith. You're going to become a stack yourself. You don't want to become a stack. You want to allow people to plug in. If you need to store stuff, consider using something like jclouds. Now, you don't need to do storage. Disks are terrible. Who wants to write and program disks?
It's a solved problem. So as soon as you get to the "oh, I need to put something somewhere," plug in something like jclouds, or anything, or you can roll your own, or whatever. It has to make sense for your thing. But make it so that somebody doesn't have to say: oh, I'm taking you on, and I'm taking on the fact that you store stuff over here. Don't do that. Let them say: this is how I want you to store. Let them make things composable. Let them say: this is the kind of queue I want you to use, this is the kind of storage I want you to use. To the extent you can do that, you'll build components that can become parts of systems that are built of services that are simple. And that's it.
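The closing advice, let the caller say what kind of storage to use, is the in-language half of the problem the talk describes, and it can be sketched with a structural interface. The names below (`BlobStore`, `InMemoryBlobStore`, `NoteService`) are hypothetical and chosen for illustration; this is not the jclouds API, only the same shape of idea in Python.

```python
from typing import Protocol


class BlobStore(Protocol):
    """The only thing the service assumes about storage: two verbs over data."""

    def put(self, key: str, data: bytes) -> None: ...

    def get(self, key: str) -> bytes: ...


class InMemoryBlobStore:
    """One possible implementation; an S3-backed class with the same two
    methods could be passed in instead without changing the service."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, data: bytes) -> None:
        self._data[key] = data

    def get(self, key: str) -> bytes:
        return self._data[key]


class NoteService:
    """Hypothetical simple service: does one thing and owns no storage of its own."""

    def __init__(self, store: BlobStore) -> None:
        self._store = store  # chosen by whoever composes the system

    def save(self, note_id: str, text: str) -> None:
        self._store.put(f"notes/{note_id}", text.encode("utf-8"))

    def load(self, note_id: str) -> str:
        return self._store.get(f"notes/{note_id}").decode("utf-8")


store = InMemoryBlobStore()
service = NoteService(store)
service.save("1", "queues decouple")
```

The service never names a concrete store, so whoever assembles the system decides where the bytes live. What this sketch cannot show is the gap the talk points at: there is no equally lightweight way to declare "anything that implements `BlobStore`" between separately deployed services.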