Hello everybody, welcome to the lightning talks. So this will be five lightning talks back to back. We allow one question per lightning talk, while the next speaker is, you know, plugging their laptop into the projector. So we'll try to stay on time and be quick. And with that, I'd like to introduce our first speaker, Lionel Montreux, from Zalando. He'll be talking about the streaming pipelines and the ecosystem that they've built and are using at the company. Thank you.
So my name is Lionel, I'm a software engineer at Zalando, and I work in the team that takes care of our event streaming platform, which is called Nakadi. I bet not that many of you have heard of Nakadi, even though it's open source, but you've heard of Kafka probably. Who knows Kafka? Yeah, quite a few. For the few who don't know, Kafka in 12 seconds is a pub/sub system. You create a topic, which is essentially an append-only log, and producers write in sequence and consumers read in sequence as well. And it's used for service-to-service communication, for machine learning use cases, and a bunch of things like that. So what's Nakadi? Nakadi started as just a proxy, an HTTP proxy, providing a REST API with JSON on top of Kafka. And then we added a bunch of things. We added schema validation and schema evolution, and a subscription API, a little bit like the Kafka one. Then we added Nakadi UI, which is a web UI: everything you can do with the Nakadi API, you can do with the UI. It's written in Elm. It's also open source. As far as I know, it's one of the biggest open source projects in the Elm language. Then we got what? A connection to our data lake. So people who want to consume the data through Spark, Hive, Presto, S3, whatever, can also do that. And the latest thing is Nakadi SQL, not open source yet, but hopefully it will come soon. For those who know KSQL, it's very similar. So it's kind of a SQL-over-streams engine.
You start a query, and it gets continuously updated, and the result arrives in the topic of your choice. So Nakadi is open source, but Nakadi is also the central part of our streaming system at Zalando. So how big is it exactly? We've got over 100 teams using Nakadi every day inside Zalando. They produce over 100,000 events per second, and it sometimes goes up to 200,000. We've got two to three gigabytes per second of consumption traffic at peak time. There are two and a half thousand event types. An event type is kind of like a Kafka topic. And I checked last week: there's about 100 terabytes of data transferred every day in and out of Nakadi. It's been in production for over three years. People seem to be quite happy with it. It's very stable. And so what I want to tell you about is how a small team like the one I'm part of (well, the team isn't that small, but there aren't that many of us) handles over 100 teams. We have around 800 users every single day. This is us. We are eight engineers, one producer, one manager. And we do everything about Nakadi. There is no SRE team. We do that ourselves. We do, of course, the development of Nakadi itself. We do operations of Nakadi and Kafka. We do monitoring. We do incident response 24/7. But we also do user support over the company chat, email, issues, pull requests and everything. And none of us works crazy hours. So we must be doing something right, because we still find some time to do what we really like to do, which is to write code and bring new features to the Nakadi ecosystem. So this talk is really about how we do this. And I think there are really three principles that we try to follow constantly that allow us to safeguard enough time to write code. I'd like to have more time to write code, but we're not in a bad situation, and things could always improve. There's one thing we need to do on top, which is project management. We don't have a project manager quite yet, but we're hiring.
So if you're interested, I would like to write more code and do less project management.
Okay, the first principle is operational excellence. It's a principle that's kind of everywhere in Zalando. And there are a lot of different things that go into operational excellence. One of them that is specific to our team is that we have the Penguin. Every week, a different engineer is the Penguin. And the Penguin gets to have that beautiful Penguin T-timer on his desk. And he's responsible for incident response and dealing with all user queries for the entire week, which means that the seven other engineers get to actually do the things they like, which is to write code. Further into operational excellence, we also invest some time in every sprint to improve our operations: deployments that are automated, more monitoring, more... So it's easier to react to issues and problems that users find, and that always leaves more time for writing code.
Second principle: ease of use. The API itself is quite easy to use, I would like to say. We follow Zalando's REST API principles, which are well known in the company, but they're also open source, so you can check them out. There's a link at the end of the talk. And then we built Nakadi UI, where you can do everything. It's that Elm project I was mentioning. You can go create event types and resources and queries and subscriptions and change authorization and even look at which events are currently in Kafka. You can do all of that. And we found that, first, we got a lot of business users since we put this up, because they don't like to use an API, but they very much like to use a web UI. And second, engineers really like to use the UI instead of having to remember which parameters to use for the API. That saves us a lot of time, because if things are easy to use, then people will be more comfortable using them, and they will ask fewer questions to the Penguin, who will then have time to write code, as we all like to do.
And the last one is that Nakadi is almost entirely self-service. We try to make it as self-service as possible. What does that mean? There's like 200 teams, and maybe you have a new team and you want an event type to push some order-related data. You know better than me what the schema of that event type should be, who should get to write to this event type, who should get to read from that event type, and when you need to change the schema or make any change at all. So we let you do it. What does that mean? It means that any resource you create in Nakadi, you create by yourself, but you own it and it's your responsibility. You can set authorization per resource and you can change it by yourself. So you're sure that someone else doesn't do something silly with your resource. But it's up to you to do what you should be doing and work as a responsible professional. Now, you could say that maybe it's not too wise to let people do whatever they like, but we have a lot of safeguards. There are audit logs. There's the authorization section, so someone else cannot do something to your stuff. And there are people reviewing things and looking at it. So it's working really well. Every time we have to perform an operation ourselves, it becomes a big bottleneck. People are dissatisfied, and we are not very satisfied either, because we don't get to write code. So as a conclusion, our three principles are operational excellence, ease of use and self-service. That makes for very happy teams. I would add a fourth one: we're looking for a project manager. If you're a good one, or if you know a good one who would like to move to Berlin or lives in Berlin already, that would be really great. There are a few other things that Nakadi has that make the life of users easy. We've got great documentation and we keep improving it constantly. That also reduces the number of questions we get.
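The self-service model described above (whoever creates a resource owns it, sets authorization on it, and every change leaves an audit trail) can be illustrated with a small in-memory sketch. This is not Nakadi's actual API or data model: the `Registry` and `Resource` classes, the role names and the team names are all hypothetical and exist only to show the idea.

```python
from __future__ import annotations

from dataclasses import dataclass, field

ROLES = ("admins", "writers", "readers")  # hypothetical role names


@dataclass
class Resource:
    """A hypothetical self-service resource, for example an event type."""
    name: str
    admins: set[str]
    writers: set[str] = field(default_factory=set)
    readers: set[str] = field(default_factory=set)


class Registry:
    """Toy registry: the creator owns the resource, every action is audited."""

    def __init__(self) -> None:
        self.resources: dict[str, Resource] = {}
        self.audit_log: list[tuple[str, str, str]] = []  # (user, action, resource)

    def create(self, user: str, name: str) -> Resource:
        if name in self.resources:
            raise ValueError(f"resource {name!r} already exists")
        # The creator becomes the first admin; no platform team is involved.
        resource = Resource(name=name, admins={user})
        self.resources[name] = resource
        self.audit_log.append((user, "create", name))
        return resource

    def grant(self, user: str, name: str, role: str, grantee: str) -> None:
        if role not in ROLES:
            raise ValueError(f"unknown role {role!r}")
        resource = self.resources[name]
        if user not in resource.admins:
            # Denied attempts are audited too, so reviewers can see them.
            self.audit_log.append((user, f"denied grant {role}", name))
            raise PermissionError(f"{user} is not an admin of {name}")
        getattr(resource, role).add(grantee)
        self.audit_log.append((user, f"grant {role} to {grantee}", name))

    def can_read(self, user: str, name: str) -> bool:
        r = self.resources[name]
        return user in r.readers or user in r.admins

    def can_write(self, user: str, name: str) -> bool:
        r = self.resources[name]
        return user in r.writers or user in r.admins


registry = Registry()
registry.create("team-orders", "order-placed")
registry.grant("team-orders", "order-placed", "readers", "team-analytics")
```

In this sketch `team-analytics` can read `order-placed` but cannot write to it or change its authorization, and both the creation and the grant are recorded in `audit_log`.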
And everything in the API is documented, down to every single status code you could get as a response to any query to any endpoint. And when you get an error response, you also get, in the body of the response, the details of why there was this particular error. So say there was a validation problem for the event that you were trying to publish. You will get an error that says "I was expecting this field and it's not there", or "that field is not there and it should be". You've got a few links: Nakadi.io for Nakadi itself. You can find it on GitHub: Nakadi, Nakadi UI, a bunch of libraries contributed by other people at Zalando, and the RESTful API guidelines, which are also open source. And I think we have time for one question. Thank you.
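As an illustration of the detailed validation errors mentioned in the talk, here is a minimal sketch of a publish step that checks an event against a schema and explains exactly what is wrong. It is not Nakadi's implementation: the schema format (field name mapped to a Python type), the `publish` function, the 422 status code and the shape of the error body are all assumptions made for the example.

```python
from __future__ import annotations

from typing import Any

# Hypothetical schema: field name -> expected Python type.
ORDER_SCHEMA: dict[str, type] = {"order_id": str, "amount_cents": int}


def validate_event(event: dict[str, Any], schema: dict[str, type]) -> list[str]:
    """Return a human-readable description of every problem found."""
    problems = []
    for name, expected in schema.items():
        if name not in event:
            problems.append(f"missing required field '{name}'")
        elif not isinstance(event[name], expected):
            actual = type(event[name]).__name__
            problems.append(
                f"field '{name}' should be {expected.__name__}, got {actual}"
            )
    for name in event:
        if name not in schema:
            problems.append(f"unexpected field '{name}'")
    return problems


def publish(event: dict[str, Any], schema: dict[str, type]) -> tuple[int, dict]:
    """Pretend to publish; on failure, return a status and an explanatory body."""
    problems = validate_event(event, schema)
    if problems:
        # Assumed error shape: a title, a status and a detail string.
        return 422, {
            "title": "Unprocessable Entity",
            "status": 422,
            "detail": "; ".join(problems),
        }
    return 200, {}


status, body = publish({"order_id": "A-1"}, ORDER_SCHEMA)
print(status, body["detail"])
```

The point of the design is that the caller never has to ask the platform team why an event was rejected: the response body names the offending field.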