Hello, this is Zeyno Dodd. I'm here representing a small team of data scientists and software engineers. I'm a software engineer myself, though I do a lot of architecture for cloud deployments. This is one of our more open-source experiments in R&D: basically, we're trying to see what kind of graph-based inference we can do within a Kubernetes environment, across several different cybersecurity scenarios. So this is an interesting landscape. As you see on the left, it all looks cheerful with the infographics, but those are actually fairly grim facts displayed in a caricatured fashion. They tell us there are two developments running hand in hand. On the cybersecurity side, a lot of digital assets are now available in the cloud, and a lot of new types of attacks are coming up, especially after COVID: ransomware and a whole variety of other techniques being developed. On the other hand, we have a very promising rate of cloud-native technology adoption. In that landscape you can see big growth in the security and compliance area, together with observability, because they go hand in hand. As new things become available, people are eager to adopt them as part of their modernization and digital transformation processes. But maturing takes a little time, and in the meantime that's a great opportunity for hackers and attackers. Just as people come up with creative ways of improving performance, adding capabilities, and easing certain processes, attackers have creative ways of finding new vulnerabilities, new exposures, new weaknesses, and exploiting them. This is obviously no mystery; everybody is seeing it. That's why you have cooperation at the global industry, government, agency, and institutional level.
Several agencies and institutions are, first of all, publishing standards for securing cloud-native technologies. There was an executive order, I think in 2021, that kicked off a whole slew of Kubernetes hardening guidelines, and a lot of benchmarks and best practices were published. CNCF also, I think at the beginning of June, started publishing machine-readable compliance content, at least translatable to the NIST 800-218 standard, for compliance assessment. So there's a lot of work in that area. On the other hand, you also have security experts, individually or on smaller platforms, providing OSINT, open-source intelligence: attacks are reported on The Hacker News, vulnerabilities are collected by, I think, HackerOne, and all these individuals and SMEs make their findings available as RSS feeds. So this is actually a nice place to monitor the activity.

Oh, and why GNNs, why graph-based? This is obviously a complex landscape with many moving parts. Even though cloud-native deployments are exciting, there's a lot of complexity in maintaining their lifecycle, and observability is a huge issue. So it is crucial to be able to represent the problem in the most realistic and dynamic way, and not be stuck in hardened frameworks. And the inference is actually a lot more natural when you think in terms of graph models. Graph models are not new: back in my day, knowledge graphs and graph-based inference were big, then they were forgotten. Now, together with deep learning, they're making a comeback, with very impressive success stories in drug discovery and other kinds of complex pattern discovery.
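As a minimal sketch of what monitoring one of those OSINT RSS feeds might look like: the feed content and item titles below are invented, and the sample document is parsed inline rather than fetched over the network, using only the standard library.

```python
# Minimal sketch of an OSINT RSS monitor. The feed content and item
# titles are hypothetical; we parse an inline sample document instead of
# fetching over the network, using only the standard library.
import xml.etree.ElementTree as ET

SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0"><channel>
  <title>Example vulnerability feed (hypothetical)</title>
  <item><title>New container-escape technique reported</title>
        <pubDate>Mon, 13 Jun 2022 09:00:00 GMT</pubDate></item>
  <item><title>Critical flaw in a popular ingress controller</title>
        <pubDate>Tue, 14 Jun 2022 10:30:00 GMT</pubDate></item>
</channel></rss>"""

def feed_items(xml_text):
    """Yield (title, pubDate) pairs from an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    for item in root.iter("item"):
        yield item.findtext("title"), item.findtext("pubDate")

items = list(feed_items(SAMPLE_FEED))
for title, date in items:
    print(f"{date}  {title}")
```

In practice the parsed items would be timestamped and queued for ingestion alongside the CVE/NVD database feeds mentioned later.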
Within this landscape, people are also trying to coordinate. They're trying to wire these SME-driven, vetted, curated frameworks, released maybe once or twice a year, into more dynamic databases like CVE, Common Vulnerabilities and Exposures, and the National Vulnerability Database; they're all trying to establish some kind of mapping. But you can see it from the release frequencies: you might have maybe five updates a year to your ATT&CK framework, or to any framework, any graph model that tries to capture the evolving techniques. At the same time, when you look at vulnerabilities, roughly one is published every day now. It slowed down in 2020, I think because COVID affected everyone and the approval process slowed, but I'm pretty sure it's going to come back, and you're going to have more and more vulnerabilities hitting. So you cannot have these vetted manually, and you cannot have these mappings done manually. There has to be some AI initiative to dynamically associate these vulnerabilities and weaknesses with the existing frameworks. MITRE ATT&CK is one of the most promising ones, everybody's talking about it, and people will try to capture different aspects of it.

So for graph inference, I started to think: okay, then we can have several scenarios. There are knowledge-graph-driven scenarios. We can have threat exploration, given MITRE's capacity for identifying intruder profiles and the agents that carry out the attacks, like malware. And on the other side, if you can ingest the OSINT feeds in a reasonable way, if you can actually weed out the bad, useless, or misleading information, you can capture the rest together with your data feeds from the databases.
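The dynamic association that paragraph calls for can be pictured as a toy knowledge graph. All node identifiers below are made up for illustration; a real graph would use CVE, CWE, and ATT&CK IDs, and the edges would come from learned link prediction rather than a hand-written dictionary.

```python
# Toy knowledge graph associating vulnerabilities with weaknesses and
# attack techniques. All node identifiers are invented for illustration;
# a real graph would use CVE, CWE, and ATT&CK IDs.
edges = {
    ("vuln_A", "exhibits"): ["weakness_1"],
    ("vuln_B", "exhibits"): ["weakness_1", "weakness_2"],
    ("weakness_1", "enables"): ["technique_X"],
    ("weakness_2", "enables"): ["technique_Y"],
}

def techniques_for(vuln):
    """Follow vuln -exhibits-> weakness -enables-> technique paths."""
    found = set()
    for weakness in edges.get((vuln, "exhibits"), []):
        found.update(edges.get((weakness, "enables"), []))
    return sorted(found)

print(techniques_for("vuln_B"))  # both techniques are reachable
```

The point of the graph representation is exactly this kind of multi-hop query: a newly published vulnerability immediately inherits the techniques reachable through its weaknesses.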
From there, you can actually do impact analysis on variations of vulnerabilities and how they may enable certain variations of techniques that you already know. So that's impact analysis. Another obvious use of the knowledge graph, the recommender or course-of-action type, is already being explored by MITRE. In addition to the attack model, they're building a defense model. It used to be that, based on a detected technique, you would have a single course of action to mitigate it; now they have a full-blown set of techniques you can use in combination to address the attack you recognize. So none of these things are new, and there's a lot of activity in many respects.

But we wanted to start with threat detection, and we wanted to start with Kubernetes, because we thought of the Kubernetes cluster as an organism: let's get a lot of information from it, as frequently as we can, and not only look at the things we can easily measure, but try to capture as much information as possible. This cannot be done with thresholds. Your Kubernetes environment will have a different diagnostic state depending on whether it's a production, development, or test environment. Even the behavior will change from day to day, week to week, or with certain seasons. So everybody will have their own baseline. We want to be able to capture that first, establish a baseline, and then express it in OpenMetrics, so that we can issue queries and look at the rates of change that are critical in our environment, in the sense that they point to some technique in action. So on the Kubernetes side, we are trying to merge all of that into the knowledge graph. We have MITRE ATT&CK on one side, but as you will see, the MITRE techniques are fairly generic: they don't really cover anything Kubernetes-specific, they're very high level.
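The baselining idea above can be sketched in a few lines: learn a baseline from a window of business-as-usual samples, then flag rates of change that deviate from it. The metric (pod restart counts) and the numbers are invented for illustration; the real system would query these rates from collected OpenMetrics data.

```python
# Sketch of per-environment baselining: learn a baseline from a window
# of "business as usual" deltas, then flag rates of change that deviate
# from it. The metric and the sample values are invented.
import statistics

def flag_anomalies(samples, baseline_window=5, threshold=3.0):
    """Return indices of samples whose delta from the previous sample
    deviates more than `threshold` sigmas from the baseline learned
    over the first `baseline_window` deltas."""
    deltas = [b - a for a, b in zip(samples, samples[1:])]
    base = deltas[:baseline_window]
    mu = statistics.mean(base)
    sigma = statistics.pstdev(base) or 1e-9  # avoid division by zero
    return [i + 1 for i, d in enumerate(deltas)
            if abs(d - mu) / sigma > threshold]

# e.g. cumulative pod restarts sampled over time; the jump at the end
# stands out against this environment's own baseline
restarts = [0, 1, 1, 2, 2, 3, 3, 12]
print(flag_anomalies(restarts))  # → [7]
```

The same series would be perfectly normal in an environment whose baseline already includes large swings, which is the point: the threshold is relative to each environment's own learned behavior, not fixed.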
Then we looked at the Microsoft threat matrix for Kubernetes, which is far more Kubernetes-specific, to the extent that we can actually identify what to measure, which Kubernetes resource to go after, and what to watch for. We also want to make sure we don't have a fixed model, because we can introduce different kinds of measures and different kinds of resources, so we want to keep it as dynamic as possible: possibly employ online training on small batches as the data streams in, while making sure the model doesn't forget what happened before. So it will be training on small batches of streaming data, and it will be making inferences as it learns from various patterns of activity.

For this we use DGL, the Deep Graph Library, an excellent library with a lot of examples, though some are more mature than others. For instance, for the heterogeneous graphs we're dealing with, the link prediction example wasn't very strong, to be honest; it just illustrates one particular scenario. Our experimentation is available on that website; feel free to go look, it's under heavy construction. The full vision is: first, some scenarios identifying our Kubernetes agents and the types of inferences we want to tackle; then publishing not only the models we develop for them, but also the full pipelines, an end-to-end training implementation as an example, and probably contributing it back into open-source libraries like DGL, because without other people's contributions I don't think we would have made any progress. It took a lot of people implementing these approaches and algorithms, which can get really complex, and that helped us put together our own examples.
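To illustrate the idea behind the GNN inference without pulling in DGL itself, here is one round of mean-aggregation message passing on a tiny graph, followed by a dot-product link score. This is only a stdlib sketch of the bare mechanism; the node names and feature values are invented, and DGL does this at scale with learned weight matrices instead of a plain average.

```python
# One round of mean-aggregation message passing on a tiny graph, then a
# dot-product link score -- the bare idea behind GNN link prediction.
# Node names and feature values are invented; DGL does this at scale
# with learned weights rather than a plain average.
neighbors = {"pod": ["node", "svc"], "node": ["pod"], "svc": ["pod"]}
feat = {"pod": [1.0, 0.0], "node": [0.5, 0.5], "svc": [0.0, 1.0]}

def message_pass(feat, neighbors):
    """New feature = average of a node's own and its neighbors' features."""
    out = {}
    for v, f in feat.items():
        msgs = [feat[u] for u in neighbors.get(v, [])] + [f]
        out[v] = [sum(col) / len(msgs) for col in zip(*msgs)]
    return out

def score(h, u, v):
    """Dot-product link score between two embedded nodes."""
    return sum(a * b for a, b in zip(h[u], h[v]))

h = message_pass(feat, neighbors)
print(round(score(h, "pod", "svc"), 3))  # → 0.5
```

In the online-training setting described above, the aggregation weights would be updated on each small batch of streaming observations, and the link scores would drive the inference, for instance scoring how likely an observed activity pattern is to connect to a known technique node.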
The other very helpful thing would be sharing not only the graph models but also the datasets, specific to whichever library we want to use: PyTorch, DGL, or TensorFlow for that matter. What we want to do is capture some business-as-usual, healthy diagnostic states, with measurements in OpenMetrics format, and also capture some data from simulated attacks. And if you're, maybe not lucky, but if you actually do get attacked, we may capture that information too, and we may discover measures that we never thought would be related. That would be a much richer way to contribute data for people to take over: to benchmark their models, to play with different variations in their approaches to convolution, however they want to proceed, and it would also be extremely helpful for validation and testing.

So that is what we had in mind for this experimentation, and that pretty much completes it. I hope, I didn't even see the sign, but I hope I'm on time. Okay. And I hope there weren't too many images; I hope nobody gets an upset stomach right before dinner, but I just wanted to finish with some more images from OpenAI, just playing with it and getting some randomly generated art using the keywords from my presentation. Anyway, that's probably slightly more interesting than our experiments. So I hope this is inspiring, or at least interesting, and I'm happy to take any suggestions, so that we can explore anything you think would be relevant. I'm open to any feedback or questions; if not, I think it's dinner time. Yeah, actually, the slides should be in the schedule, so you can pull them up from there; I put up the latest and greatest version, so do update. Thank you, thank you for giving me the opportunity.
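For the measurements mentioned above, here is a small sketch of emitting samples in the OpenMetrics text exposition format. The metric name and labels are hypothetical; only the line layout (HELP/TYPE metadata, label pairs, value, trailing EOF marker) follows the format.

```python
# Emit samples in the OpenMetrics text exposition format. The metric
# name and labels are hypothetical; only the line layout follows the
# format (HELP/TYPE metadata, label pairs, value, trailing EOF marker).
def to_openmetrics(name, help_text, samples):
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    lines.append("# EOF")
    return "\n".join(lines)

text = to_openmetrics(
    "k8s_pod_restarts",  # hypothetical metric name
    "Pod restarts observed per namespace.",
    [({"namespace": "dev"}, 3), ({"namespace": "prod"}, 0)],
)
print(text)
```

Shipping captured healthy and attack states in this plain-text form keeps the datasets framework-neutral: they can be loaded into PyTorch, DGL, or TensorFlow pipelines alike for benchmarking, validation, and testing.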