Thanks for joining us, everyone. As people continue to join, we'll go ahead and get started. I'd like to thank everyone for joining us. Welcome to today's CNCF live webinar, Introduction to APIClarity, a Wireshark for APIs. I'm Libby Schultz and I'll be moderating today's webinar. I'd like to introduce our speakers today: Zohar Kaufman, Principal Engineer at Cisco, and Alexei Kravtsov, Technical Lead at Cisco as well. And please introduce yourself and say your name the right way, Alexei. I know I probably did not.

A few housekeeping items before we get started. During the webinar, you are not able to talk as an attendee, but there is a chat box on the right side of your screen. So please drop your questions there, say hello, and we'll get to as many as we can at the end. In addition, please join our CNCF public Slack channel, #cncf-online-programs, to continue the conversation later and address any questions you have that we didn't get to today. This is an official webinar of the CNCF and as such is subject to the CNCF Code of Conduct. Please do not add anything to the chat or questions that would be in violation of the Code of Conduct, and please be respectful of all of your fellow participants and presenters. Please also note that recordings and slides will be posted later today to the CNCF online programs page, accessible via your registration link or on our online programs YouTube playlist. With that, I will hand it over to Zohar and Alexei to kick off today's presentation. I know everyone is eagerly awaiting, so we'll get started.

Thank you very much, Libby. Hi, I'm Zohar Kaufman and together with me today is Alexei Kravtsov. A few words about myself. I was the co-founder of CTERA Networks, which is active in the cloud storage and enterprise file services area. Later I co-founded Portshift together with Ran Ilany, a startup focused on Kubernetes security that was acquired a year ago by Cisco.
Alexei, do you want to introduce yourself?

Hey, my name is Alexei Kravtsov. I started in data path acceleration of security solutions at Check Point, led a data path acceleration team, then joined Portshift as one of the first team members there until the acquisition last year by Cisco, and since then we have continued working on our cloud security products.

Thank you, Alexei. So maybe you can mute in order to kill the echo. At Cisco we joined an effort around API security in Kubernetes and stumbled upon the problem space that we will present today. This work was inspired by Peter Bosch and Alessandro Domenico, so I would like to thank them both. And let's go to the next slide.

So what is on our agenda today? Why do we need API specification reconstruction? Then a survey of possible open source packages that we did in order to solve this. We didn't find anything, or at least nothing that would answer all our needs. So we will introduce APIClarity, a new open source project that we've developed. Then we will do a live demo. Hopefully the god of live demos will be with us today. And we'll close by talking about our roadmap, a few comments, and of course answer any questions that you may have during this webinar. So feel free to write any question you have in the chat window.

So what is the challenge that we stumbled upon? Cloud services are becoming more and more popular. Many of them use an OpenAPI specification to define a standard, language-agnostic interface which allows both humans and computers to discover and understand the capabilities of a service without access to source code or documentation. Not all applications have their OpenAPI specification available. They can be either legacy ones or external applications. And we would like to get the OpenAPI spec of the applications without code instrumentation or modifying existing workloads.
We would also like to detect drifts between implementation and specification: applications that still use deprecated APIs, also called zombie APIs, and undocumented ones, also called shadow APIs. Gartner recently published the Hype Cycle for APIs, where they state that every connected mobile, modern web, or cloud-hosted application uses and exposes APIs. These APIs are used to access data and to call application functionality. APIs are easy to expose but difficult to defend. This creates a large and growing attack surface, leading to a growing number of publicized API attacks and breaches. We looked for a cloud native open source project that would allow us to do all this, but didn't find a solution that would answer all our needs.

In the next few slides we will highlight some useful sites and solutions we found during our survey, and then we will of course describe APIClarity. So the first one is OpenAPI.Tools. It's a great aggregation of OpenAPI specification tools and knowledge by categories. It also lists all the major companies in the field. The next site is API Specification Toolbox. It lists all the services around API specification. For example, CNCF has a project called Microcks (microcks.io). It is listed here under the category of mock servers.

One of the more interesting solutions we saw is from Optic. Optic is an open source tool that helps developers to document, review, and approve API changes prior to deploying them. It is language agnostic, works with any REST API, observes development traffic and learns your API behavior, and has a great mechanism to manually review and update the specification. However, Optic is primarily about gating API changes. It is not designed to monitor live multi-service API traffic in deployed clusters and seems better suited to sit on the developer's personal computer. SwaggerHub also has a great solution.
You may generate API traffic from the web UI using their Inspector tool, record it, and create the OpenAPI spec using SwaggerHub. However, they lack integration with runtime environments and they don't have an open source version available. CloudVector also has a nice solution called API Shark. It has live monitoring of multi-service environments. It can automatically detect parameters and create the OpenAPI spec from the runtime traffic. However, it is not open sourced and it lacks the option to review the spec and detect deviations from it. Imvision is another good solution. They do live monitoring of multi-service environments, create OpenAPI specs from runtime traffic, support detecting deviations from the spec, and have a mechanism to manually review and update the generated spec. So all this is great, but they don't have an open source version that we may utilize.

So we decided to produce a new open source project called APIClarity. No code changes are needed for any of your apps. When deployed in a Kubernetes cluster, it observes all the API traffic and reconstructs all the relevant API specs. The user can then review the reconstructed specs and declare them as a baseline, or alternatively provide official specifications to be compared to the reconstructed ones. Afterwards, we can note all the API events that are different from the approved spec and highlight zombie and shadow APIs. We also provide a UI dashboard where we can audit and monitor the API findings. You'll see that in a moment in the live demo.

So as a first integration, we achieved this by utilizing a service mesh running inside the cluster. We used Istio and Envoy sidecar proxies, and they both help us observe all the API traffic. And as a result, we can reconstruct specs, highlight API differences and other abnormalities, while the user can review the spec and make changes as needed. So this was, you know, a first implementation. We will see in the roadmap how we plan to evolve from here.
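The zombie and shadow API idea described above can be sketched in a few lines of Python. This is only an illustration of the classification rule as the talk states it, not APIClarity's actual implementation: the `Operation` type, the `classify` function, and the sample paths in `APPROVED_SPEC` are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    """One documented operation in an approved spec (hypothetical model)."""
    method: str
    path: str
    deprecated: bool = False


def classify(method, path, spec):
    """Classify one observed call against an approved spec.

    spec maps (METHOD, path) -> Operation.
    - not in the spec at all      -> "shadow" (undocumented API)
    - in the spec, but deprecated -> "zombie" (deprecated API still in use)
    - otherwise                   -> "documented"
    """
    op = spec.get((method.upper(), path))
    if op is None:
        return "shadow"
    if op.deprecated:
        return "zombie"
    return "documented"


# Hypothetical approved spec, invented for this example.
APPROVED_SPEC = {
    ("GET", "/catalogue"): Operation("GET", "/catalogue"),
    ("POST", "/orders"): Operation("POST", "/orders"),
    ("GET", "/v1/orders"): Operation("GET", "/v1/orders", deprecated=True),
}

if __name__ == "__main__":
    for call in [("get", "/catalogue"), ("GET", "/v1/orders"), ("DELETE", "/admin")]:
        print(call, "->", classify(*call, APPROVED_SPEC))
```

A real spec would use templated paths and per-operation `deprecated` flags as defined by OpenAPI; the exact-match lookup here is kept deliberately simple.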
So there are a few spec reconstruction features that are important to highlight. We detect spec parameters, whether in the path, in the query parameters, in the headers, or in cookies. We understand object references that might be included in the spec. We support file transfer APIs, and for security definitions we digest basic auth and OAuth 2. So now I'll switch to Alexei to talk about the demo environment.

Yes, thanks, Zohar. So in short, I will show you a demo of APIClarity. My setup for the demo is a Kubernetes cluster with an Istio service mesh that is already deployed. I have APIClarity installed, and I will show you in a second how you can do it too. And in order to generate traffic and try to learn the APIs, I have the Sock Shop demo app by Weave, just to help me generate some API traffic so that I can show you how it looks and is reflected in APIClarity.

Okay, so in the demo flow, I'll show you, like I said, the deployment. You can just clone and build it on your own if you want, or deploy the YAMLs with the pre-built images and binaries that we already provide. Then we'll see at runtime all the API events, and even non-API events, captured by APIClarity. And we'll also show you how to see the trends using the hit count graphs that are also provided. We'll show you how we do the OpenAPI spec learning from the generated traffic of the Sock Shop demo. And I will show you the review process, how you can give that human flavor to the automatically generated API. And once the specs are approved, we can try to see diffs against the actual traffic. And that can also be done using a provided spec, not only the generated spec.

So, Zohar, can you please unshare? Thank you. So, are you able to see my screen now?

Yes, we can.

Great, so that's the GitHub of APIClarity that we released just recently. It's apiclarity/apiclarity. Actually, it consists of three repos.
We have the speculator engine repo and the WebAssembly filters that we created for you in order to integrate with the Istio service mesh. But that's definitely the main repo. The instructions are very, very simple. Of course, you can try and build it yourself. But like I said, we have our pre-compiled images. So, all you need to do is just apply the YAML here and make sure that everything is up and running. And then, for the WebAssembly filters, you just need to init this Git submodule. And you have a script where you can choose which namespaces you want to monitor and see the traffic from. So, you just run deploy with the namespace whose traffic you are interested in seeing. And that's it.

Maybe a few words about the design here and the components that are running in the cluster. So, here you can see the applications and the pods in your Kubernetes cluster. Once the deploy is done and the WebAssembly filters are set in your deployments, all the traffic is mirrored using Istio to APIClarity, to its reconstruction engine. We also have a nice web UI in order to see it all and manage your APIs. So, that's all you need to do. Of course, there are more instructions on how you can build it and even run it locally if you don't have a Kubernetes cluster or other applications. So, you can just run it with demo data to get the feel of it.

So, I'll switch to the UI. The first page that you can see is the dashboard, which shows you the trends over a time range of your choosing. I didn't run any traffic in the past five minutes, but as you can see I tried it a bit before. We show you the new APIs that the system never saw before. We have hit counts for existing APIs that the system already learned. And you can also select only APIs that have some diff against the defined API specification. So, as soon as you run some traffic that is captured by APIClarity, you can immediately see it in the API events screen.
As you can see here, we have the time and the HTTP method, the path, and some other attributes that you can also find in the drill-down of the event. We also set it up with some filters in order to search all the events efficiently. So, for example, you can check that the host is carts. That's one of the hosts and the microservices in the Weave microservices demo. You can also do some filtering on the path, for example. As you can see, the filters are combined, and you can delete any one of them, so it's really easy and convenient.

We also have the filter here for non-API events, meaning all the images and the media of your web files that you are not interested in. So, we hide them by default. But, for example, you can see that we detect images as non-APIs. So, if in the path we see, for example, a JPEG image, you see we filter that out. So, if I hide these, none of these events will be shown. So, that's convenient in order not to be distracted by traffic that you are not interested in, and to see only the traffic that we learn the APIs of your application from.

Also an important part: we monitor internal traffic in your cluster and external traffic towards destinations like httpbin here, for example. And for each API, we create an automatic entry in our API inventory section. So, these are the APIs of the microservices. These are actually the microservices of the Weave Sock Shop demo application. And here you can see that I already provided some OpenAPI specifications and I also reviewed some reconstructed specs, but let me show you how it is done in real time. So, the last thing that happened here, I think, is the call to the delay API of httpbin. So, now let's generate some traffic in the Sock Shop demo. So, yeah, it's better to generate as much traffic as possible to hit all these APIs, because we generate the specification and learn only from the traffic. And the user, of course, can also provide a spec of their own.
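The non-API filter described above (hiding images and other static media) can be approximated with a simple extension check. The sketch below is an assumption about one way to do it, not how APIClarity decides; the `STATIC_EXTENSIONS` set and the `is_non_api` helper are hypothetical and chosen only for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Hypothetical list of static-asset extensions; a real tool would likely
# also look at Content-Type headers, which this sketch ignores.
STATIC_EXTENSIONS = {
    ".jpg", ".jpeg", ".png", ".gif", ".svg", ".ico",
    ".css", ".js", ".woff", ".woff2",
}


def is_non_api(url_or_path):
    """Return True when the request path looks like a static asset."""
    path = urlsplit(url_or_path).path  # drops any query string or fragment
    return PurePosixPath(path).suffix.lower() in STATIC_EXTENSIONS


if __name__ == "__main__":
    for p in ["/catalogue/images/sock.jpeg", "/carts/123/items", "/img/logo.PNG?v=2"]:
        print(p, "->", "non-API" if is_non_api(p) else "API")
```

Filtering on the path alone is cheap and needs no response data, which is why it is used here; it will misclassify an API that happens to end in one of these suffixes.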
So, we can see what has been called, whether some APIs are not called as you expected, and whether some undocumented APIs, shadow APIs, have been disclosed. So, let's try and generate some traffic after I register to the Sock Shop. Let's try to buy some socks. Of course, all this should be reflected soon in our UI. So, I think that this should immediately generate new events. As you can see, the catalogue, the carts. So, let's just complete this purchase to have as much traffic as possible. Let's try to update the shipping, to hit the shipping microservice. Let's also try to edit some payment information to hit the payment API. Let's complete it and make sure that we generated enough traffic here.

So, now you can see that in the last five minutes a good number of API events have been created. All that we saw can be visualized in the graph views. The filters that I mentioned apply here too: if you search, for example, for the user API, that also affects the graph view that you see. And once you have generated enough traffic and you want to check what API was created from all that, you can just go to the API that you are interested in. You can also do it right here from the events section if you click on the specification. So, you go to the API inventory, to the orders API right here.

So, let's try to review some APIs that I didn't review and create reconstructed specs for. So, let's pick carts, for example. Also an important part: here I can provide the actual API spec for it, and I can drag and drop the files here. And the reconstructed spec will wait for you here once you review the specification that we learned. So, you see here that these are the APIs, or the paths with the methods, that have been called during the learning phase. Here we detected the parameters, but we couldn't guess the name, because it's up to the user to decide what the name of the parameter is.
So, we can select it and say that this one is the user ID, for example, and we will show you what it was merged from. So, these are the actual calls that happened and that we learned from. So, if something is not right here, you can just unmerge it and say that's not part of this API. Maybe I can show you an example of this on one of these, but here it looks like it was pretty accurate. All these UUIDs are actually in the ID. So, I think that no misses happened here. So, here we just have a parameter with a different name. This is the item ID, and we can see what it is combined from. So, it looks good as well.

Here, if for some reason you say that this part of the carts path, which is undefined here, should be a parameter, you can also click on it and say, hey, this is my cart ID. That's not undefined. And as you set it up, we'll say that this will be merged. You can also merge these two because they're from the same structure, but let's do it for only this one. And you see that it immediately changed to cart ID. Again, you can say, no, sorry, that's not correct. It's undefined. That's actually how it should be. And we restore all the merged entries. So, here I can do this one. Once you're good with the review, you just select what you want to be part of your API and you click on Approve Review. And we say what will be included in your API. So, after a few seconds, you should get your API, which you can immediately see in Swagger Editor.

So, a very important thing to mention here is that no information leaves your cluster. What I did was just port-forward to the APIClarity service and also to the Sock Shop. Nothing is exposed to the outside world and no information leaks from your cluster. Everything is done locally and is not uploaded anywhere. And so, as I said, let's see what was created. In the review process we reviewed only the paths, but we actually learned a bit more than that. So, as you can see here, like Zohar said, we detect the structures here.
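The merge-and-rename review step shown in the demo can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the speculator engine's algorithm: it treats only UUIDs and plain integers as parameter candidates, and `templatize`, `merge_paths`, and `rename_param` are hypothetical helper names. The sample UUIDs are made up.

```python
import re
from collections import defaultdict

UUID_RE = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)
NUMBER_RE = re.compile(r"^\d+$")


def templatize(path):
    """Replace UUID or integer segments with placeholders param1, param2, ..."""
    segments, count = [], 0
    for seg in path.strip("/").split("/"):
        if UUID_RE.match(seg) or NUMBER_RE.match(seg):
            count += 1
            segments.append("{param%d}" % count)
        else:
            segments.append(seg)
    return "/" + "/".join(segments)


def merge_paths(observed):
    """Group observed concrete paths under the template they collapse to."""
    merged = defaultdict(list)
    for path in observed:
        merged[templatize(path)].append(path)
    return dict(merged)


def rename_param(template, old, new):
    """Apply the human-chosen name from the review step."""
    return template.replace("{%s}" % old, "{%s}" % new)


if __name__ == "__main__":
    observed = [
        "/carts/57a98d98-e4bb-4a2d-8d3a-1f0c2b3d4e5f/items",
        "/carts/0b1c2d3e-4f5a-6b7c-8d9e-0a1b2c3d4e5f/items",
        "/delay/1",
        "/delay/2",
    ]
    for template, sources in merge_paths(observed).items():
        print(template, "<-", sources)
    print(rename_param("/delay/{param1}", "param1", "seconds"))
```

Keeping the list of source paths next to each template mirrors what the UI shows ("what this was merged from"), so a reviewer can unmerge an entry that was collapsed by mistake.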
So, we detect them and use them in the APIs. Yes, so that's pretty much it. It looks the same as if you had uploaded the Swagger yourself. We keep improving to remove generated fields or headers that are not interesting, like tracing headers maybe. So, that's the work we are doing currently to minimize the diffs that you will see. And once the spec is in place, I can go back and you can see that the reviewed carts API now also has a reconstructed spec.

And it's basically the same with external APIs. So, here I have the httpbin API. So, yeah, these are the last events that we saw. I can maybe generate some traffic using httpbin. So, we're just calling the delay API for two seconds and just delaying my call. I executed it from a client pod with a simple curl. Maybe I can delay it for one second, to not waste time. And, yeah, this should create several entries that we should see immediately in the runtime events. So, you see that these were detected, and if I click on them, I can go to the httpbin spec right away.

So, let's try to reconstruct this spec as well. So, you can see that we detected the parameter here. Let's give it a better name. That's seconds. And I can also see from which events it was created. So, the one and two seconds that I ran. And these are the correct values. So, no need to modify anything here. Yeah, so, looks good. I can approve this spec. And again, as for the in-cluster traffic, I also have my Swagger and OpenAPI specification for httpbin, the external service. And this is actually not going out to the Swagger Editor. We generate these files in the APIClarity service in your cluster, and then we access them. So, again, no need to worry that traffic leaves your environment. So, yeah, as you can see, we detect all the models and schemas. What else might be interesting for you? Yeah, so, once the specs are in place, I can see any diffs between the spec that I set and the actual API calls.
So, let's try and create some traffic again in the Sock Shop and see these diffs. Let's try to buy something again. Yeah, so it's better to invoke all the calls that you think you have, to try to delete and add items, to get as rich a specification as possible. I think this should do. And now if we go to APIClarity, you can see that some of the APIs have diffs from the spec that we constructed. So, it can be something like in this case, where we actually see tracing headers of Envoy that shouldn't be here. So, we actually still need to ignore these. And you can visualize all this from the dashboard again. So, you get the numbers of API calls and the volume of calls in your cluster. And you can click on the latest diffs in your cluster. So, I hope that we have something more interesting here. Okay, that's pretty interesting. I think that it was defined as int64 and all of a sudden it's a double. It's not actually that big of a deal. But as you see, we still catch the drift.

And this is basically the method to detect shadow APIs. So, if the API doesn't exist at all, for example, this call doesn't exist in the reconstructed spec or in the provided spec, and now we see it at runtime, that's a shadow API that wasn't documented. Or we detect a deprecated API that is marked here as deprecated and is still being called. So, we classify it as a zombie API. We should ship these updates in the upcoming days. So, worth waiting for. And we give nice icons to each type of diff.

And yeah, I think that's a complete overview of the current capabilities in the UI. And I think it gives a general idea of how we capture traffic, how we learn from it, and how we convert it to an OpenAPI specification using a user review that gives it that human flavor. It's not all auto-generated. You have the chance to give it some human names.
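The int64-to-double drift caught in the demo comes down to comparing the type observed in a payload with the type recorded in the spec. Here is a rough sketch of that comparison, assuming a flat, hypothetical schema that maps field names to OpenAPI type names; `json_type` and `diff_against_schema` are illustrative names, and a real implementation would walk nested schemas and formats.

```python
import json


def json_type(value):
    """Map a decoded JSON value to its OpenAPI type name."""
    if isinstance(value, bool):  # must come before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return "null"


def diff_against_schema(body, schema):
    """List the fields whose observed type differs from the approved schema.

    schema is a flat dict: field name -> expected OpenAPI type (hypothetical).
    """
    diffs = []
    for field, value in body.items():
        expected = schema.get(field)
        actual = json_type(value)
        if expected is None:
            diffs.append(f"{field}: not in spec (observed {actual})")
        elif expected != actual:
            diffs.append(f"{field}: spec says {expected}, observed {actual}")
    return diffs


if __name__ == "__main__":
    approved = {"quantity": "integer", "itemId": "string"}
    observed = json.loads('{"quantity": 1.5, "itemId": "abc"}')
    print(diff_against_schema(observed, approved))
```

In OpenAPI terms, `int64` is `type: integer` with `format: int64` and `double` is `type: number` with `format: double`, so the drift in the demo shows up here as an `integer` versus `number` mismatch.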
And as you can see, once the specifications are in place, you can generate internal or external traffic and see whether it behaves as you expected in the OpenAPI specification. Okay. I think that's it, Zohar.

Thank you very much, Alexei, for the great live demo. Indeed, the god of demos was with us today, so we're fortunate for that. So, let me share my screen again. Hopefully you can see my screen now. And let's go now to a few more comments. So, a new report from IBM Security X-Force has found that two thirds of cloud breaches can be traced to misconfigured APIs. In the report, APIs are fast becoming the technical basis for both B2B and B2C business models, and when APIs are developed and deployed, there is no way to estimate all the possible places the APIs are going to get used. APIs are silently but rapidly becoming one of the most critical pieces of the software supply chain. Organizations are now one vulnerable API call away from a potential major breach. So, two thirds of the breaches, according to IBM, are related to APIs. After that, it should come as no surprise that Gartner predicts that within the next few years, API abuse will move from an infrequent to the most frequent attack vector, resulting in data breaches for enterprise web applications.

So, we started all this in order to be able to use it in our SecureCN offering for Kubernetes security that we are working on. And as it says here in the slide, knowing the API spec is the first step to identifying your API risk. Given the API spec, one can also run fuzzing tests and automatically generate the client and server code. The spec can also serve as good documentation for future usage. So, it's like a basic building block that can serve many purposes. We will utilize it for security reasons, but others can do other things with it.

So, going now into the roadmap. The two most important roadmap items, I would say, are listed here on the left. Today we support OpenAPI spec version two.
We can broaden the scope to OpenAPI spec version three, GraphQL, and gRPC. Currently, we are integrated with Istio and Envoy. We should add more integration points like browsers, Postman, API gateways, and others. I want to use this opportunity to call on the community to cooperate with us. 42Crunch, who are also behind apisecurity.io, and APImetrics, who are doing intelligent API monitoring, have already accepted this challenge and will join us in maintaining APIClarity. We will be happy to see many more of you collaborating with us on GitHub. So, github.com/apiclarity, it's easy. And now I think that we are open for questions.

So, I already saw a few questions in the chat window. The first question was about HTTPS support. So, you know, HTTPS means that it's encrypted. So, it sounds like an easy question, and you would expect us to say that we do not support it, but we do support it in our SecureCN offering. There we can actually inspect HTTPS traffic and see all the APIs inside HTTPS as well. So, this is not in the open source, but it is in our product that we are developing. Another question was, do we support only Envoy? So, the answer here was given in the roadmap. We do plan to support many other integration points, and we will be happy for the community to cooperate with us on that.

Can I mention a few things? So, the first thing is about the encryption part. If you're running with Istio, you can encrypt everything using Istio. And if everything is encrypted using Istio, you will still see it, because Envoy sends the decrypted traffic to APIClarity. So, you can also do it with Istio. If the application encrypts the traffic and not Envoy, that's a bit harder. So, we are thinking about solutions that involve eBPF, hooking on receive and send, maybe with kprobes, but yeah, it's much more complicated.
And regarding the integrations, I think that as long as we receive the HTTP trace in a certain format, it doesn't have to be only Envoy: anything that sends the trace in this format to us, to the backend of APIClarity, we should be able to integrate with. We just need to create the specific converter, whether it's an API gateway or some plug-in for your browser or Postman or whatever. It's just a simple converter to get this into the format of APIClarity. It should be no problem, and we will add them down the road.

Thank you, Alexei, for that. Another question that you may take is, will APIClarity work with HTTP/2?

Yes. So, Envoy treats them both with the same parser. So, it depends on which part of HTTP/2. But basically, yeah, we still get the attributes that are relevant for the REST part and the JSON. So, it's maybe a somewhat different way of transfer, but all the data that we need is still there and we can do it.

Okay, another question I see in the chat is, can we access the data of APIClarity via an API and not via the UI? What do you say, Alexei?

Of course. So, it's also part of the open source: we have the Swagger file that we built the UI around. So, for everything that I clicked in the UI, you can get the API specification for it and create any tool that you want to query the information.

Okay. Another question I see here is regarding the Kubernetes API server, whether we also support seeing this traffic. So, again, it was not our focus to look at the Kubernetes API server; we were more after the APIs of applications. We do have other solutions in SecureCN for monitoring the Kubernetes API, but this is maybe a subject for another webinar. Alexei, do you see more questions in the chat?

I think not, maybe I missed something. I think we went through them.

Very nice. Does anyone else have questions? We'll give it another minute and just see if that drums anything up.
And then if there's anything else y'all want to cover before we wrap, you're welcome to.

Yeah, maybe something that I can add. So, in the review, we saw that we give the user the ability to modify the parameters of the path. That's what we support in the UI. But actually, if you're interested in integrating through the API described in the Swagger file, we suggest a review that has the full content, and you can return to us whatever you want. So, we plan to extend the UI to also be able to review the inner schemas, the objects, and the types. So, not only the path, in case we missed anything. Or, like you saw, there were some headers, for example, generated tracing headers, the B3 headers for tracing. Those, I think, are not relevant to the OpenAPI specification. So, we will ignore them in the first place, but we will also give the user the option to ignore some more information if it was not intended to be part of the path.

Excellent. I think the slides will be shared post event. If y'all wouldn't mind resending them to me via email, Zohar, that way I have the most current version. And they will be posted with the recording. Is there anything else? All right, if not, then thank you both so much. This was a great attendance. I think there were some great questions, and feel free to hit anyone up on the Slack channel later if you have any other questions. All of this will be up on the website later on today. Thank you so much, Zohar and Alexei.

Sure, thank you. And happy to come to us all.

Yes, exactly. We'll see you all soon. Bye bye.