Okay, this is Thinking Cloud Native: a CloudEvents Future. It's the CNCF Serverless Working Group update, mostly focused around CloudEvents, so if you're in the wrong spot, it's a long walk. I thought I'd give a quick TL;DR of how we got here, and then I'll move on to something completely different from what we've done in other working group updates. Super quick: we as a group, the community, wrote a white paper in 2018 (work started in 2017), and one result of that was things like this CNCF serverless landscape box, which has been updated since. In 2019, one of the conclusions of that paper was that it's really difficult to use serverless because there's no standard definition of how events get to functions. To try to address that, the CloudEvents spec was created, and we'll go into more detail right now. So this is the serverless landscape that was originally created. It came out of a bunch of research by a lot of people in the CNCF community and Redpoint, and of course it's evolved a lot in the four or five years since then. In fact, there aren't many of the same projects left on this picture, which is crazy and really great. And now we have CloudEvents here, and that's what we're here talking about. Of course, we're doing all of this cloud native compute technology because we want to be able to take these projects and assemble them into some very complex system, and one of the things the CNCF is trying to promise is that these things come off the shelf, they wire together, and you can get up and going with very, very little friction. So we started looking around, and I'm going to show you a couple of examples of excellent projects, because I noticed a little something that keeps recurring: almost every project has either the ability to consume an event or the ability to produce an event.
So Prometheus has this Alertmanager, and I'm not totally sure what's in that box, but if we zoom in, presumably it's pushing events to some sort of persistence layer, and then there's a bunch of custom code that sends to PagerDuty and email and probably Slack and all these other tight integrations that Prometheus now has to manage. We can move on to KubeEdge; they have another eventing bus here, and it could be that MQTT is the absolutely correct solution for them, but it raises questions: do they have to manage that? Can I bring my own? Can I bring a different version? How tightly coupled are those events? Who defines the schema? And one final example: Falco, another great project, part of the CNCF. I happen to know that the core of Falco is written in C++ and eBPF, and they support various outputs in that structure, but they wanted to support more things, like Slack and S3 and, in fact, CloudEvents, so they extended it with this thing called Falcosidekick. It streams data from the gRPC server and then does some munging in a more friendly language (this one's written in Go, hence the gopher), but they have the same integrations, right? Slack and S3 and email and who knows what. So, eventing is everywhere, and many projects either produce or consume events, but a lot of them have very, very similar outputs: they're all sending to email, or they're all sending to Slack. And I trust Falco to scrape the systemd messages off of my nodes, but do I trust them to, you know, rotate my Slack credentials? It's not their job, right? So these custom integrations are also everywhere, and my assertion is that they're very fragile.
And in fact, in your own applications, as you're assembling these things together, you probably also want to send to Slack and email and all these other things, so you again have to duplicate all that code. If you are making these choices, will they scale to what you need tomorrow? So I'm going to coin a term; I don't know if it's a real term or not, but it felt good when I wrote this slide: I'm calling them island architectures. I'll define an island architecture as a piece of technology that brings a very strong opinion about all of the underlying technologies that it wants to use. And if we're à la carte picking projects off the CNCF board and using them to orchestrate and architect our applications, we sometimes end up with these little islands of architecture that don't talk to each other. This is CloudEvents, so the scope isn't gigantic, right? We're talking about eventing. Is there something little we, as a community, can do in this space that would start to fix this problem just a bit? So we have a solution, maybe, or at least the North Star of a solution. I'm going to walk through another example explaining similar architectures, but much more generically. So let's go again: what's the actual problem?
The real problem is that we have these event producers, and because they come from different selections off of that CNCF map, the schemas of their events are in different shapes, potentially even in different formats: this one produces XML events and this one produces JSON events. So there's no way for me to generically interrogate an event in a component like this event mediator. The mediator ingresses the event, and then I write some custom code that inspects it, maybe decodes it halfway, and then decides, based on my custom logic, to send it to some individual message bus or channel or queue, so that an event consumer can process it. It becomes a problem if I add a new producer, or a producer gets upgraded and adds a new event type, and I haven't updated the event mediator but I did update the consumer, and now I have to do this very complicated coordination dance. So we have these event producers (you probably have many of them in your systems), and they're glued together with what I'm calling the event mediator: a little custom piece of logic that tries to inspect events, is maybe different for every kind of producer, and is a custom necessary evil to be able to route things the right way. But what you really want to focus on is those event consumers, because that's where the work is actually happening. If you can't rely on some sort of common schema across all of your events, you have to write these routing configs, and they become part of your application platform, and they have to be developed and maintained. So what if we could define the event independent of the protocol, and independent of the shape of the actual payload? If we did that, we would be free to choose the particular technology we use to route the event, whether that's the protocol or the persistence layer.
We want to save the event so we make sure it gets delivered, or it could be the language that we're processing the event with. But often what happens is, when you start a new project or add a new piece, you have to make a hard choice that sticks with the project forever unless you do a big refactor: which persistence technology are we going to use, which language, what's the custom format of the internal event, what's my event schema? And if you're a big organization, you'd better hope that it's at least a little consistent, so that the router logic isn't extremely difficult. So what CloudEvents is trying to do is accept that you have the event payload, the occurrence, the weird-shaped object that you're trying to send around, and CloudEvents defines an envelope to put it in. Just like with the mail: there's a spot for the address, you adhere to the postal conventions, and you shove your letter in. To deliver that letter, the mail carrier doesn't open the envelope, read it, and say, "oh, this one is for grandma, I'm going to put it in grandma's box." They just look at the outside of the envelope. So if we can wrap the event, we can treat everything as this CloudEvent object, and a lot of those event-mediator-type components become generic, off-the-shelf, and replaceable. This is what we had, and if we introduce CloudEvents to the system for the inputs and the outputs, those custom routing components become configuration, because we know how to ask the event, "what are the features of you that I can make routing decisions on?" And as we add things, we don't have to change the mediator, except maybe to update some configuration for where we want an event to flow, right?
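The idea of routing on the envelope alone can be sketched in a few lines. This is a hypothetical illustration, not a real SDK: the attribute names (specversion, type, source, id) come from the CloudEvents spec, but the routing table, event types, and destination names are made up. The point is that the router never opens the payload.

```python
# Hypothetical sketch: a protocol-agnostic router that makes routing
# decisions purely on CloudEvents envelope attributes, never opening
# the payload. Routing table entries and sink names are invented.

ROUTES = {
    # type prefix -> destination channel; this is configuration, not code
    "com.example.billing.": "billing-queue",
    "com.example.audit.": "audit-log",
}

def route(event: dict) -> str:
    """Pick a destination from envelope metadata alone."""
    if event.get("specversion") != "1.0":
        raise ValueError("not a CloudEvent version I understand")
    for type_prefix, destination in ROUTES.items():
        if event["type"].startswith(type_prefix):
            return destination
    return "dead-letter"  # unknown events go somewhere safe

event = {
    "specversion": "1.0",
    "type": "com.example.billing.invoice.created",
    "source": "/billing/service-a",
    "id": "abc-123",
    "data": {"anything": "the router never looks in here"},
}
print(route(event))  # -> billing-queue
```

Adding a new producer then means adding a line to `ROUTES`, not writing new inspection code in the mediator.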
So I'm going to take a pause. That's framing the technical point of where we are today with CloudEvents: we've defined this envelope, and we can find other CloudEvents-compatible components and just kind of drop them in. But there's more we could do, so I have a call to action for everyone who's here today and watching this presentation. If you produce any kind of event, make it a CloudEvent, because if you're producing webhooks, we can take that, convert it to any other protocol, and route it wherever it wants to go with almost no code change on your side; if it's HTTP, it's adding some headers. If you're at the point where you have to make a choice, adopt CloudEvents internally, because eventually you're going to want to make webhooks too, to connect to another organization or another cloud, and it'd be a whole lot easier if the external format and the internal format are the same CloudEvent thing. And if you're integrating with things, ask that project to produce CloudEvents, because if you're not asking for CloudEvents support, no one's going to listen, and we're not going to get the nice world where we can do fancy things: maybe Prometheus and the Falcos and my projects all produce events, but they ingress them into some off-the-shelf common components where I can provide my own persistence, and then, if Prometheus wants to provide me an excellent Slack integration listening for those events, they wire it up to that event bus, which is an off-the-shelf component. This stuff doesn't exist; I think we have to make it. But it starts by agreeing that the CloudEvents format is the right direction and something worthwhile, so that we can have nice things and we can delegate some of this responsibility: a new CNCF project comes up and says, "we are the Slack and email and alert manager and so on; we know how to manage and rotate credentials, keep that stuff safe, and push events, and all you need to do is tell us the kinds of events that
you would like to signal on, so that we can produce those results." And that frees Prometheus and the Falcos and your project and maybe 80% of the CNCF landscape: they don't have to build those integrations into their projects, and they don't have to do that maintenance, because they can decouple themselves from the integrations. So that's my wild pitch: we're going to move to a bridged island architecture, because at scale this makes sense. Yes, it is very simple to make some of these integrations locally and test them, but it doesn't scale when you go to production, because there are other concerns beyond the ease of doing the integration: you have to maintain it, you have to do the cert rotation, you have to keep up with the patches from that component, and you have to follow that upstream API if you're not the Slack organization producing the component. So there's overhead there that all of these projects could avoid. Okay, so we're going to go back into some of the more boring mechanics of CloudEvents. We're going to zoom way in, and I'm going to try to answer "how does this help me? It sounds like you're asking me to do more work." Well, let's say you're picking Kafka as the delivery mechanism for your eventing systems. You have your business logic that does business work and creates some sort of object, and that object is the occurrence of a thing happening, like somebody logged in or a Visa card was charged. Then you have custom glue that takes that object and converts it into the format Kafka expects, and you delegate the object down to the Kafka library, which puts it on the Kafka queue, over the wire, and then it goes in reverse on the consumer side. But you have to maintain that custom glue, because it's part of your application. So what the CloudEvents community has been doing is developing a bunch
of SDKs in a bunch of different languages to replace that custom shim with an SDK you can leverage to adapt your event to other protocols, so you don't have to write that shim anymore (you can if you want to, though). The first artifact of this working group has been specifications that define what we call bindings: the bindings between your event and the weird shape that Kafka expects, which is different from what NATS expects, which is different from what HTTP wants. The point of the CloudEvents libraries is: we've got you. Whatever protocol you configure, you pass around your actual object, and we will help you create that protocol-specific shape and, just as importantly, convert it back out. Your business logic stops depending on the custom code. And yeah, okay, fine, it is the same picture, but the difference is: today you have to maintain your glue and I have to maintain my glue, and we can't help each other. Whereas if I work on shared code, and our colleagues work on that code, it helps everybody; there's a scale factor, and if you find a bug, you come and help me fix it, because I need the help. So you get to really focus on the business logic of producing and consuming. And this applies to selecting those CNCF projects too: in this case, if the event consumer leveraged a CloudEvents SDK, it would have the ability to select different protocol libraries at runtime, and how neat would that be? Let's say Prometheus uses Kafka internally (I don't know, I'm just guessing), but I don't have any Kafka instances in my cluster and I want to use Prometheus. If they had an integration like this, I could say in their configuration: no, not Kafka today, I want to use NATS. It would just be a configuration option at installation, and I get to wire in and bring my own persistence layer. It
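What the SDK shim does can be illustrated with a toy version. This is not the real CloudEvents SDK API (the real SDKs implement the binding specs in Go, Python, JavaScript, and other languages); the function names here are invented. The key idea is that one in-memory event can be rendered into different protocol shapes: the HTTP binding uses `ce-` prefixed headers, and the Kafka binding uses `ce_` prefixed record headers.

```python
# Illustrative sketch (not the real SDK API): one in-memory CloudEvent,
# rendered into two different protocol-specific shapes. Function names
# are invented; the ce- / ce_ header prefixes follow the binding specs.
import json

event = {
    "specversion": "1.0",
    "type": "com.example.user.loggedin",
    "source": "/auth",
    "id": "42",
    "data": {"user": "alice"},
}

def to_http_binary(e: dict) -> tuple:
    """HTTP binary mode: attributes become ce-* headers, data is the body."""
    headers = {f"ce-{k}": v for k, v in e.items() if k != "data"}
    headers["content-type"] = "application/json"
    return headers, json.dumps(e["data"]).encode()

def to_kafka(e: dict) -> dict:
    """Kafka binding: attributes become ce_* record headers, data the value."""
    return {
        "headers": {f"ce_{k}": str(v).encode() for k, v in e.items() if k != "data"},
        "value": json.dumps(e["data"]).encode(),
    }
```

The business logic only ever touches `event`; swapping Kafka for NATS means calling a different adapter, not rewriting the glue.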
is slightly more work for them to integrate, but it's way less work for me to operate. So what does a CloudEvent literally look like? I'm going to show you HTTP; there are other protocols and they look sort of similar, but this is HTTP structured mode. There are two forms of the HTTP binding, and that's really because in some languages it's much easier to deal with the entire event payload as a JSON object; maybe it's not as easy to inspect headers in JavaScript as it is in Go or Python or whatever language you pick, and you really just want to look at the JSON part to implement your router. There are other protocols, like NATS, that just stream the JSON object, so it helps to have a JSON format, and this is what it looks like. At the top level there are some required attributes, there can be additional extensions with some rules, and then there's the data piece; there are some nuances there, but find me later and we'll talk about the nuances. Your weird-shaped object gets shoved into the data piece. And I just heard a question: yes, if it's an image, it works too, but because an image doesn't normally fit into JSON, it gets base64-encoded. Any kind of data can get pushed into this data field, with some rules; you pay for a bigger size, but in HTTP binary mode you get to send the binary data as-is, with all of the routing metadata in headers. So let's go back to why we're doing this. The reason is that we want a consistent way to make routing decisions on events, in off-the-shelf components that connect producers and consumers, because that was the friction point keeping us from adopting serverless easily. The weird-shaped thing here is the green part; it's the data, the stuff that you made, and it's weird. The blue stuff is what the Serverless Working Group has created, and it's totally not weird. We need a little bit of metadata
to figure out if it's actually a CloudEvent or not, and we need some hints about which version of the CloudEvents spec you're following. So you have to have the spec version, type, source, and ID, and there are some other fields I can show in a second. There are required fields, like I said: spec version, type, source, and ID. Then there are some optional ones that are standardized and specified by the spec, but you can also add anything you want; if you need custom logic and extra attributes because that's what your application needs, go for it. There are some rules, but extensions are intended. So, for example, here's what the GitHub binding would look like, and if there's anyone from GitHub in the room, we should talk, because how cool would this be? Okay, I'm going to pause; this isn't in the script, sorry. When you're doing GitHub webhook interactions, it's kind of a two-phase thing: you get the payload, you have to unmarshal it halfway, look at some type metadata, consult a big lookup table to get the real type, and then unmarshal it again. The CloudEvents spec might help with that, because you start pushing some of that metadata up into the headers and you get to route the webhooks there (I think GitHub does some of this in headers too). We have a binding that specifies what a GitHub webhook looks like as a CloudEvent, and there are a couple of others in our documentation. Okay, so you're sold, awesome, and, oh, I just heard another question: yes, your language is likely supported, and if it's not, talk to us; maybe you could help us make the new binding. All it takes is a bit of dedication and reading some of the binding specs to figure out how to take the natural object in that language and convert it to the protocol-specific integration. And there are a bunch of protocol bindings too; I don't think this list
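A concrete structured-mode event helps make the required/optional/extension split visible. The attribute names below (specversion, type, source, id, time, datacontenttype, data_base64) are from the CloudEvents JSON format; the event type, source, extension attribute, and payload bytes are invented for illustration. Note that binary payloads go in `data_base64` rather than `data`, which is the base64 encoding the talk just mentioned.

```python
# A structured-mode CloudEvent as a single JSON document.
# Required attributes: specversion, type, source, id. Binary payloads
# are base64-encoded into "data_base64" per the JSON format spec.
# Event type, source, extension, and payload here are made up.
import base64
import json

png_bytes = b"\x89PNG\r\n\x1a\n"  # stand-in for a real image

event = {
    "specversion": "1.0",                       # required
    "type": "com.example.image.uploaded",       # required
    "source": "/uploads",                       # required
    "id": "1234-5678",                          # required
    "time": "2023-04-20T12:00:00Z",             # optional, standardized
    "datacontenttype": "image/png",             # optional, standardized
    "data_base64": base64.b64encode(png_bytes).decode("ascii"),
    "myextension": "custom-routing-hint",       # extension attribute
}

print(json.dumps(event, indent=2))
```

A JSON payload would go in a plain `data` attribute instead; `data` and `data_base64` are mutually exclusive.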
is complete, because it's slightly out of date, but the following slide has all of them. If it wasn't clear, I'm going to say it again: what a protocol binding does is take the concept of a CloudEvent, with all of its routing metadata, and convert it into, say, the AMQP version of that thing. It also specifies how to take an instance of that AMQP event, inspect it, and convert it back to the protocol-less version of the event. So it's a little bit of payment, but you decouple yourself from all of these protocols, because an event can come in over AMQP, then go from AMQP to NATS, then NATS to Kafka, then out of Kafka over HTTP, and you should have the same event, because all of those hops should be lossless, with a caveat. Here's the full list. It's a lot, it's growing every day, and I'm absolutely certain it's not actually complete, so if you're producing and consuming CloudEvents, let me know and we'll get you on this list. Okay, so future updates. We have been working for a very long time on three new specs: the Discovery API, the Subscription API, and the Schema Registry. The Discovery API says: if you produce events, let's have a talk; what are the kinds of things you produce, what are the sources, what are the attributes I can filter on? Once you know what you can filter on, you can talk to the Subscription API on that producer and say, "I would like to subscribe to this subset of things." Because here's the thing: if you're not producing events in your application, you should be, but you also should try to push those filters as high up the chain as possible, because if I don't need the event, maybe you don't need to send it through all of the persistence layers, right?
So the idea is that if we could standardize the Subscription API, we could standardize the mechanism for propagating those filters through big, elaborate chains; then, as soon as someone down the chain is interested in an event, that interest propagates up, the event starts flowing, and we get interesting application scenarios. And then, finally, there's the Schema Registry. At the end of the day, this is all about your weird-shaped object on the inside, so the Schema Registry provides a mechanism to describe the schema of what's inside the actual event, because that's honestly what's important, right? The Discovery and Subscription APIs are about orchestrating flows of events between producers and consumers, but the Schema Registry is about: I know you produce this event and it has this kind of content, so maybe I could produce a function that just handles it. We've had these for a while, and you're probably saying, "I saw this talk last year," and you're right; it turns out these three things are pretty tricky. One of the things we've been doing is having a bunch of people independently implement the specs, to read through them and make sure they're clear, and one thing we decided was that there needs to be an offline mode: I would like to be able to point at some repo that implements the generic Discovery service, where there's a set of documents describing exactly the ins and outs of the events, maybe including the schema registry documents offline, and do a bunch of development without having to interact with a running server somewhere. So that was the feedback; we're rolling it into an updated spec, but we're still very close to doing an RC of these three specs. I bet in the next couple of months we'll have an RC. So if you're interested, we are looking for more people to do that exercise of reading the spec and implementing it.
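To make the Subscription API idea concrete, here is a hedged sketch of what a subscription with an attribute filter might look like, loosely modeled on the draft spec's exact-match filter dialect. The sink URL, subscription ID, and event types are invented, and the field names may differ from whatever the final RC specifies.

```python
# Hedged sketch of a Subscription API request body, loosely modeled on
# the draft spec's "exact" filter dialect. Field names follow the draft
# as I understand it; the sink URL and event types are invented.
subscription = {
    "id": "sub-001",
    "sink": "https://consumer.example.com/events",  # where matched events go
    "protocol": "HTTP",
    "filters": [
        {"exact": {"type": "com.example.billing.invoice.created"}},
    ],
}

def matches(event: dict, filters: list) -> bool:
    """Apply 'exact' filters: every listed attribute must match exactly."""
    return all(
        all(event.get(attr) == want for attr, want in f["exact"].items())
        for f in filters
    )
```

The point of standardizing this shape is that a producer (or any hop in the chain) can evaluate the filter and drop uninteresting events as early as possible, which is the filter-propagation idea described above.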
They're not that complicated. We just want to make sure the spec is concise and clear, because as a working group that's based on specs, if the spec doesn't translate into interoperable systems, we aren't doing a good job. So we need to get that straightened out, and we need your help. And I have a few minutes for questions.

Thanks for a great presentation. I was wondering, when I was working with CloudEvents, why you made the design decision that sometimes you put the envelope in the payload and sometimes it's part of the headers. For me, it was the more obvious choice to have it in the headers, because it gives you the flexibility, based on infrastructure or whatever, to make routing decisions, and also to apply request filters without unmarshaling the whole object.

Yes. So there are other eventing systems that natively speak JSON internally, and it would be expensive to try to figure out how to convert a JSON thing into a non-JSON thing and back. The answer is that there are a lot of systems that do just look at JSON: they make routing decisions based on a very simple scan of the schema, doing that one-level unmarshal, and they've optimized for it. There are other systems that don't want to modify the event payload at all. So we have structured mode to support the natively-JSON systems. I think all of Amazon's eventing is just JSON, right? They don't really use headers for this kind of thing. But there are other systems, like GitHub webhooks, where we don't really want to wrap the payload to convert it to a CloudEvent, and adding headers is a very low barrier to entry to get CloudEvent metadata for routing. And the cool thing is that you can convert between structured and binary mode with a little bit of processing but no loss. But yeah, I also prefer binary mode, because it's a bit cheaper. Thanks.
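The lossless conversion mentioned in that answer can be sketched directly. This is an illustrative toy, not the real SDK: it assumes a JSON body and ignores the spec's content-type and extension-attribute nuances, but it shows why the two HTTP modes carry the same information.

```python
# Toy sketch of the lossless structured <-> binary conversion described
# above. Assumes a JSON payload; real bindings also handle content types
# and non-JSON bodies. Not the real SDK API.
import json

def binary_to_structured(headers: dict, body: bytes) -> str:
    """Fold ce-* headers and the raw body into one JSON document."""
    event = {k[3:]: v for k, v in headers.items() if k.startswith("ce-")}
    event["data"] = json.loads(body)
    return json.dumps(event)

def structured_to_binary(doc: str):
    """Split one JSON document back into ce-* headers plus a raw body."""
    event = json.loads(doc)
    body = json.dumps(event.pop("data")).encode()
    headers = {f"ce-{k}": v for k, v in event.items()}
    return headers, body

headers = {"ce-specversion": "1.0", "ce-type": "t", "ce-source": "/s", "ce-id": "1"}
doc = binary_to_structured(headers, b'{"n": 1}')
round_trip_headers, round_trip_body = structured_to_binary(doc)
```

Because no attribute is dropped in either direction, an event can hop between modes (and, by the same principle, between protocols) and come out the other side intact.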
When I produce a message to Kafka, for example, and I want to serialize the message with protobuf or Avro or something like that, it would complicate things if I use CloudEvents. Do I serialize the message before putting it in the CloudEvent, or in the CloudEvent itself?

Yeah, we do have an Avro binding that helps address some of that, and we also have a protobuf binding, but if the message has already been serialized, it complicates things, if that matters. That's what I'm hoping to get across: it'd be easier if, upstream, the event were produced as a CloudEvent, rather than needing to figure it out later.

But also, if the event is produced as a CloudEvent, I want structure in my event. The event is, I don't know, a Human class or something like that; protobuf gives me the class itself, but with a CloudEvent I don't have the fields that I need. Do you understand what I mean?

Yeah. Well, if it's not working, let's talk and let's get it fixed, because I think that's the point: that it's a useful thing that people actually use. Thanks.

Hi, I would like to ask about that glue between the protocol and the CloudEvents library. If the protocol somehow changes, is there some collaboration between those protocol-specific projects and the CloudEvents team, where they notify you or something like that?

There hasn't been direct communication with the libraries we integrate with, although there's an exception. We are just another integrator with those libraries, and the hope is that we open source that work and accept contributions from people who are working with it and finding bugs. If the upstream thing changes or needs an update, it comes down, we upgrade it in the SDKs, and that trickles out to everybody depending on them, with the SDKs bearing the load of the integrations.
In a slightly different direction, we did chat with the OpenTelemetry folks a bunch, and the problem was, if you're sending a CloudEvent from a producer to a consumer, you kind of want to see the whole chain. The perspective of OpenTelemetry was, "well, if the protocol switches, it's a different trace," but what I really want to see is HTTP to NATS to HTTP, producer to consumer. Getting that whole trace wired up required a new field that got added to OpenTelemetry (or OpenTracing), and now you can get traces that span protocols, which is really neat.

Coming from the perspective of a developer who builds applications that produce or consume events, I'm trying to understand the value, because, as you touched on earlier, if the thing inside the CloudEvent changes, I have to redeploy both sides anyway. And in an architecture like that, for all the routing stuff I'm relying on some sort of messaging system, right? So I don't play in that space. I'm trying to understand where this would fit in. Is it really the platform, the folks running the platform, who have the responsibility of making sure the event gets where it needs to go? That's what I'm thinking, but I wanted to hear from you.

Yeah, the way I like to answer this question: as a developer, I want to write that tight Slack integration, where I know I'm listening to this thing, and there are several events I'm interested in, and they cause other things. But in production it's Kafka, and it's very expensive, and I want to be able to test in QA. If my function or my service is tightly bound to Kafka, the way I have to run even a unit test is by standing up fake Kafka servers and pushing data through, and if I want to do an integration test, I have to have a real Kafka cluster and figure out how to spin that up and tear it down and run my tests through it.
But if I support this idea that I can shim it, and I operate on the CloudEvent, not the Kafka event, then I can switch the protocol in my QA or CI environments to something cheap and cheerful, maybe without persistence, while in production it's rock solid and keeps events persisted for all of time, or whatever. Not having to think about that as a developer is the hope. It's also beyond the scope of what CloudEvents is, but because CloudEvents is trying to define this common shape, we enable other groups that are building solutions like that, where you can kind of replace the middle piece.

One other question: it does feel like SOAP.

Well, the CloudEvents group has been very specific about not defining the runtime yet. If you look at the repo, it's a specification that describes what should be available to route on, plus a bunch of specifications that define how to go from the generic thing to the protocol-specific thing and back, and that's kind of the extent right now, versus runtime characteristics that others, like Knative Eventing or Argo, have expanded on because they needed a runtime contract. CloudEvents is not providing that; the role of CloudEvents is to take the thing that was protocol-specific and turn it into something generic and useful, so that you don't have to worry about the protocol piece.

Hi, yeah, quick one. You mentioned the Discovery and Subscription APIs, the specs that you're working on, and you mentioned that some groups are trying to implement them. Can you give some examples of the projects happening in that space?

Oh, internally to the working group, we're looking for volunteers to read the specs and produce an implementation. Before we go and provide an SDK in a bunch of languages, we just want to make sure that the spec reads well and produces results compatible with other implementations of the same spec.
So there's no commercially available thing, although for the Schema Registry you can interact with a beta version on Azure, I think.

And for the other stuff, I'm just looking for some concrete examples. It doesn't matter if it's ready or not, right?

Ping me in the CloudEvents channel in the CNCF Slack and I'll send you some repos.

Cool, awesome.

We've very specifically been trying to make several implementations in several languages, to make sure that the spec translates into working, interoperable code, because if it doesn't, there's no point. Cool. And that's all, thank you very much. Thank you.