Hello everyone. Thank you for coming to the talk. My name is Ryan Horn. I am a senior architect for customer data infrastructure at Twilio, but today I am here representing the serverless CloudEvents working group to talk about CloudEvents, which will include an introduction to the project for those who aren't familiar, the current status of the project, and some new things we're working on as part of the group. So to explain CloudEvents, let's first look at a situation a lot of event pipelines find themselves in. Here's a hypothetical, but realistic, pipeline through which an event is produced by some system and eventually consumed by some consumer application. For an event to get to the consumers, it needs to go through a series of intermediate hops that physically get the event to the right place. In this picture, you have an event producer that sends to a queuing mechanism to buffer the events, which goes through a mediator to route the events to the appropriate channels or topics, over which they're consumed by the consumer application. In a typical deployment, these hops are strung together without standard interfaces and without common message formats. They need to be specially coded against every event structure and encoding they know about, which is problematic as it introduces a coupling between them: if the event structure changes, the software spanning all of these hops in the pipeline must be carefully updated in a coordinated manner to avoid errors and data loss. Additionally, this mixes protocol and business logic, where the event contents, which are specific to an event type, are co-mingled with event metadata, which applies across all events, forcing pieces that might just care about the metadata to understand the event contents as well.
Typically, what a lot of that stuff in the middle solves for is routing, a kind of necessary evil to get important events from where they originate to where their value is actually realized in the consumer. And so a goal for CloudEvents is to solve for that necessary evil, so your software can focus on the business value that lies in the consumption of those events. So what does CloudEvents do about this? The core of the solution introduces an envelope definition, which we'll go into shortly, but first an important point I want to make up front is that CloudEvents separates the logical model for an event from the protocol-specific representations of that event. The core spec focuses on this logical data model for events first, with specs for the various protocols as implementations, or bindings as we call them, of that model. So why have an envelope? Again, it's to standardize the information that is common across events and to extract the routing and more general concerns out of the application. If we don't, and let's say the format changes down the line, this becomes a potentially expensive change to make across all of your pipeline infrastructure. The focus on the logical model also allows for an ecosystem of tools and implementations to emerge around the common format, independent of a specific tool. This is about making different tools, systems, clouds, vendors, products, etc. easier to use together rather than trying to pick a winner. So what does this look like? We think of the occurrence of an event as capturing the specifics of what happened, and the envelope as something that wraps that occurrence and contains general metadata such as timestamps, event types, unique IDs, schema references, etc. This way the event can be routed and handled appropriately and independently, without understanding the whole event. There are a variety of reasons that this is important.
For example, the contents of the event might be application-specific, both syntactically and semantically, in a way that's unknown to the intermediaries. Or you can imagine there might be privacy concerns. To use a physical mail analogy, you might not want the person delivering your mail, or your neighbors who are adjacent to you, to have access to the contents inside of the envelope, but there needs to be some information visible on the outside to get it to the right place. The standard way of structuring these together is a cloud event. So how does this help with the coupling challenges described earlier? In the world where we don't have standardization on how the event is structured, and we don't separate event content from metadata, if we want to make a change to the structure of the event, it's a complex and painful coordination with a high chance of error. You might, for example, update the consumer, but what if you forget to update something in between, such as the mediator? Or worse, what if we don't even have control over the event mediator, such as when an event needs to cross clouds, vendors, products, etc.? By standardizing on a common metadata format, that shape can be commonly understood, and we can decouple these components in terms of how they interpret and process event metadata and where they look to access the actual contents. So this is a high-level view. Let's take a look at where CloudEvents fits into your code. Before CloudEvents, in your event producer, you had your business logic, and then you had your custom glue code that mapped your application-specific representation of what will go into an event to the protocol and delivery mechanism that you're using to send it, such as AMQP or Kafka. And then the actual delivery mechanism or service for sending the event itself in the middle, and something similar to the producer on the consuming side.
CloudEvents picks up where your business logic leaves off, standardizing and taking care of mapping from your application-specific representation to a common event format that is expressed over a number of common protocols, with libraries that implement the CloudEvents spec. In effect, what does this enable? First, this allows you to focus on your business logic instead of your protocol integration logic. Additionally, because CloudEvents defines the format in a protocol-agnostic specification, with separate bindings that implement the specification for a particular protocol, you can switch protocols without switching formats. So if, for example, you are going from one queuing technology to another, this should just mean swapping out the library implementations for the selected protocol. So enough of the diagrams, what does a cloud event actually look like? We've got two examples here. On the left, you have an example using JSON over HTTP to represent the cloud event. You can see at the root of the JSON object, we have the envelope metadata, such as specversion, type, source, id, and datacontenttype, which I'll come back to in a second. And then you have this data key, which includes the actual application-specific event contents. You might look at this and say, it's just a JSON object with some common structure. So what? But this example on the left is just one encoding. In CloudEvents, protocol implementations can leverage whichever encoding is appropriate, but they largely fall into two buckets, which we call modes: structured mode, such as the JSON you see on the left, or binary mode, which you see on the right. Both of these are the same event, but what's interesting with binary mode on the right is it encloses the metadata separately from the event contents. The event metadata is sent via whichever mechanism your particular selected protocol uses to enclose message metadata. So for HTTP, that's headers; for Kafka, that's Kafka headers; etc.
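To make the two modes concrete, here is a minimal sketch (plain Python, not the official SDK) showing how the same event maps to each HTTP representation. The attribute names follow the CloudEvents 1.0 spec, and the binary-mode `ce-` header prefix comes from the HTTP protocol binding; the event payload, type, and source values are made up for illustration.

```python
import json

# Envelope attributes, per the CloudEvents 1.0 core spec.
attributes = {
    "specversion": "1.0",
    "type": "com.example.order.created",
    "source": "/orders/service",
    "id": "A234-1234-1234",
    "datacontenttype": "application/json",
}
# Application-specific event contents (illustrative).
data = {"orderId": 42, "status": "created"}

# Structured mode: envelope and contents travel together in one JSON body.
structured_headers = {"content-type": "application/cloudevents+json"}
structured_body = json.dumps({**attributes, "data": data})

# Binary mode: envelope attributes map to ce-* HTTP headers, the
# datacontenttype maps to the Content-Type header, and the body
# carries only the application payload.
binary_headers = {
    "content-type": attributes["datacontenttype"],
    **{f"ce-{k}": v for k, v in attributes.items() if k != "datacontenttype"},
}
binary_body = json.dumps(data)
```

Notice that in binary mode an intermediary can route on `ce-type` without ever parsing the body, which is exactly the separation the talk describes.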
And with that separation, intermediaries such as routers, which just need to understand the metadata, or even applications, which might want to inspect the metadata before attempting to parse or understand the event, can do so without accessing or understanding the event contents. This separation is also interesting because it makes it easy to start adopting CloudEvents through extension rather than replacement of existing implementations. For example, let's say that you have a microservice architecture that is standardized on JSON over HTTP and REST as the communication protocol. You can start leveraging CloudEvents using the binary mode by extending your protocol using HTTP headers, rather than changing your services' understanding of HTTP bodies, etc., which is pretty cool. I believe this slide is just highlighting the metadata that is part of the envelope portion of the cloud event, so I'm not going to spend too much time on this. So you might be wondering, hey, this looks great, but how do I get started? The community has been hard at work building out SDKs for the various protocol bindings in a number of languages and platforms. And I believe this is an up-to-date list of what we support today. So you can see the SDK languages that we support and then the protocols that we have bindings for. And this is an open source project, so if you don't see your favorite language or protocol represented here, come and take a look and contribute. So that was an overview of what CloudEvents is. What is the status of the project? We're currently at 1.0 today, which we got to, I believe, a little over two years ago. And that was largely focused on getting that common protocol and format defined and usable. And we've had that in a stable place for a while now, which you can go and use right now in your software with the SDKs that I just mentioned. So what's next? What other problems are worth solving around this space?
What other solutions does having a standard event format make possible? The group over the last few years has been discussing this, and we've come up with three new projects and specs that are being worked on as we speak. And those are what you see here. We have discovery, subscriptions, and schema registration. And these are not all independent specs. Well, the specs are independent, but they do work together. And together, we believe that these will ease the end-to-end lifecycle management of events: how they're discovered, how they're received, and how they're used. So to go through each of them individually, discovery attempts to answer the question of, okay, we've got this great way to integrate and interoperate our event systems, but as a consumer, how do I know what to consume and how to consume it? And where do I go to consume it? How do I discover all of the events emitted by a producer that I know about? And what is the shape of those emitted events? The closest comparison here is something like an OpenAPI document. With OpenAPI, you can ask the service, what does your REST interface look like, and what does it return back? We don't really have this for evented systems today. And so we're working on a discovery API to try and bring this to life as part of CloudEvents. Next, if we take this a step further, you might ask the question, okay, great, I'm able to discover what events are produced, but how does my application actually receive them? How do I subscribe to them? For this, we're also working on a subscription API that provides this mechanism, along with some handy features such as being able to filter on metadata to receive only the events that are relevant to your application. And finally, you might ask the question, this is awesome, I can discover what events are produced and I can subscribe to them, but how does my application, the consumer, interpret the contents of the events if the contents are determined by the producer?
How does the consumer know the shape, the schema, of those contents? For this, we are working on a schema registration API, which allows a producer to register, and consumers to access, schemas via a schema registry, where those schemas syntactically describe the shape of the events and their contents. So that's what we're trying to solve for at a high level. What could this look like in practice? First, a consumer needs to know which events are available for consumption. They discover this through the service which produces those events, which exposes a discovery endpoint. This acts as a sort of catalog of services or producers that can be queried to find events of interest to the application. In this example, we fetch a particular service, which returns a service object, an example of which is shown here. It includes a bunch of metadata. Importantly, it includes an array of events this producer emits and information about the subscription endpoint, such as the URL and supported filtering dialects, which I will go into in a bit. So a subscription can be created by a consumer, or by someone on their behalf. Once it knows what's available, the consumer can create a subscription via this subscription API, using the subscription URL returned by the discovery endpoint, and includes the events the consumer selects to subscribe to in the request. It also includes a sink address and protocol. I don't think I mentioned sinks yet; think of a sink as a general, abstract notion of the destination to which the events selected by a subscription are to be delivered, using the specified address and protocol. You'll note that in this diagram, I separated the subscription API itself from the event producer. The specification as it exists today provides for flexibility as to whether these APIs are part of the same service, whether they're separate, et cetera.
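As a rough sketch of the discovery-then-subscribe flow just described: below is what a discovery response and a subscription request could look like. Both of these specs were still in progress at the time of this talk, so the field names here are illustrative approximations (not the final spec), and all URLs are hypothetical.

```python
import json

# What a discovery endpoint might return for one service: metadata,
# the events it emits, and where/how to subscribe (illustrative shape).
service = {
    "name": "orders",
    "subscriptionurl": "https://example.com/subscriptions",
    "subscriptiondialects": ["basic"],
    "events": [
        {"type": "com.example.order.created"},
        {"type": "com.example.order.shipped"},
    ],
}

# A subscription request the consumer could then POST to the
# subscriptionurl above: the delivery protocol, the sink (destination)
# for matching events, and a filter in a supported dialect.
subscription_request = {
    "protocol": "HTTP",
    "sink": "https://consumer.example.com/events",
    "filter": {"prefix": {"type": "com.example."}},
}
request_body = json.dumps(subscription_request)
```

The point is the shape of the handshake, not these exact field names: discovery tells you what exists and where to subscribe, and the subscription carries the sink plus a filter.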
But in this example, they are separate, and the subscription API manages the subscriptions themselves and notifies the producer, which in turn might go and configure the mechanism that handles event routing and delivery via the mediator. And so here's an example of what the subscription object looks like. In this example, we've configured a prefix filter. So if you remember two slides back, let me actually go back and show this again. The service object returned this field called subscription dialects, which lists basic as the value there. And the spec is written in a way where different dialects can be added and implemented, depending on what kinds of expressions implementations want to support in terms of how subscribers can specify event filters. The core of the spec defines a basic filter, which defines a very finite subset of filters for subscriptions. And so this example uses a basic filter; in this case, we're using a prefix filter where we are specifying com.example. as the prefix. And this means that only event types with a prefix of com.example. will be delivered to the configured sink. Bringing back a modified version of the diagram from before, now that the consumer has discovered the events it wants to consume and has set up a subscription, events will start flowing as normal according to the rules in that subscription. To tie this all together, let's lastly look at how the schema API fits in. The events received at the sink, in this case the event consumer, will, of course, use the CloudEvents format, and the consumer will receive an event like the following. Here we have a cloud event that includes the dataschema property, which points to the schema that the consumer can use to understand and parse the enclosed event contents. In our case, the URL points to a schema registry, which when called will return the JSON schema for the enclosed event.
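The prefix-filter behavior described above is simple enough to sketch directly. This is a minimal illustration of how a subscription manager could evaluate a basic prefix filter against incoming events; the event shapes and the helper function are made up for this example.

```python
def matches_prefix_filter(event: dict, attribute: str, prefix: str) -> bool:
    """Return True if the event's attribute value starts with the prefix."""
    value = event.get(attribute, "")
    return isinstance(value, str) and value.startswith(prefix)

# Two incoming events; only the first matches the com.example. prefix.
events = [
    {"type": "com.example.order.created", "id": "1"},
    {"type": "org.other.ping", "id": "2"},
]

# Events that pass the filter would be delivered to the configured sink.
delivered = [e for e in events
             if matches_prefix_filter(e, "type", "com.example.")]
```

Because the filter only reads envelope attributes such as `type`, the subscription manager never needs to look at the event contents, which is the decoupling the envelope exists to provide.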
And, you know, following the pattern of being agnostic to implementations, schema registries and event producers are free to choose whichever schema format is appropriate for their use case, whether that's JSON Schema, Avro, etc. So here you see the event consumer reaching out to the schema registry to fetch the schema, so that it can use that to understand what's enclosed in the event that it just received. And here's an example of what we might get back from the schema registry. As I mentioned before, in this example we're using JSON Schema. And just like the envelope format, there is a separation of schema metadata and the schema document itself. Metadata such as the schema ID, what kind of schema it is, and the description are returned as HTTP headers, while the schema document itself is returned in the HTTP body. So that's what we have in progress right now. What's beyond these initiatives? Well, this is largely TBD right now, and this is an area where we would love more folks to come with ideas and collaborate with us. We do imagine that there's some work to be done on the security side, which right now is left to be solved out of band of CloudEvents, as either an application or protocol concern. You can also imagine domain- or technology-specific standards that haven't been broad enough for us to focus on thus far, such as standardizing the shape of certain data that might be ubiquitous across certain platforms. In general, there's a lot of opportunity here, we think, and a lot of excitement around the project. So if you'd like to join, we meet weekly. We have a website and a GitHub repository, which are listed here, where all of the specs and SDKs live.
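To illustrate the metadata/document split described above, here is a sketch of how a consumer might handle a schema-registry response, with the metadata as HTTP headers and the JSON Schema document as the body. The schema registration spec was still in progress at the time of this talk, so the header names and schema content here are purely illustrative assumptions.

```python
import json

# Hypothetical response headers carrying schema metadata (names assumed).
response_headers = {
    "schemaid": "com.example.order.created",
    "schemaformat": "jsonschema",
    "description": "Schema for order-created events",
}

# The schema document itself arrives in the HTTP body (illustrative).
response_body = json.dumps({
    "type": "object",
    "properties": {"orderId": {"type": "integer"},
                   "status": {"type": "string"}},
    "required": ["orderId"],
})

# The consumer reads the format from the headers, then parses the body
# accordingly; a real consumer would dispatch on schemaformat here.
schema_format = response_headers["schemaformat"]
schema_doc = json.loads(response_body)
```

Mirroring the envelope design, a consumer can decide from the headers alone whether it understands the schema format before parsing the document.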
We welcome everyone and anyone who would like to contribute in any way, whether you want to help contribute code to the SDKs, whether you want to brainstorm on some of the newer specs, or maybe you have an idea that you think could make this overall ecosystem stronger and want to contribute that. Please come join us and work with us. We'd love to have you. We'd also love to see more examples and use cases, whether they're ideas, POCs, or production applications. This really helps us see what's working, what's not, and how we can make things better. And that is it for the presentation. I hope you come away with an understanding of where CloudEvents is today and the direction that we're heading in. Thank you all for watching.