All right, it looks like I'm up next, so hopefully people are seeing my video here. I'm going to be talking about Pixie from this beginner's lens. Pixie just came out, so I decided to take a deep dive into it. I think earlier you saw a little bit about the high level of what Pixie does, and then what we call the T-shape: how a developer would approach the platform and then dive deep into debugging their applications. But I want to take a slightly different look at Pixie. I think it's going to do amazing things to help people debug. It's going to give you lots of metrics and dashboards, basically giving you a way to visualize and add a workflow to your troubleshooting. But one thing that really piqued my interest when I took a look at Pixie for the first time was the ability to start building these multi-cluster Kubernetes system tools and utilities. I'm going to share my screen and try to walk you through, as a new user, figuring out what I would be using Pixie for. Then I'm going to spend a few minutes and just talk through what I think Pixie is going to bring to the table in terms of giving people the opportunity to leverage all the data that's inside of their cluster. So the way I like to think of this is: whenever I approach a new technology, I try to figure out, can I recreate something that I already know? For example, I'm on my local laptop here, and you can see on my screen it says Pixie Day. I remember learning Linux for the first time. You have this server, you drop in, and now you want to ask the system some questions. One of the first commands I remember learning back then was the ps command. This is a classic command that tells you how to get all the process information, maybe some command line flags, and you can see things like which user it's running as, what PID it is, how much CPU it's using, and some of the memory stats.
This is a really powerful way to administer a system, especially when that system is running multiple things. So here's the new challenge: what happens when our system starts to look like this? This is Google Cloud's Kubernetes Engine, often called GKE, and now we're moving to this container world where I'm not dealing with a single system anymore, and I'm also running a series of mixed workloads. When I think about Pixie, it's like this big data platform. So what happens when you start building tools that are built on top of structured data? In the Linux world, most people are used to semi-structured data. For example, if you wanted to rebuild some of these high-level commands, you could parse these files, understand the format, and of course glue together your own view of your system. But again, this is kind of convoluted. Most of this is not really considered structured data the way we think of it these days in 2020; we're thinking something like YAML or JSON, or even XML if you have it. So we want to up-level this quite a bit. How would you go about that? Well, on the surface of it, Kubernetes has a ton of metadata. What do we mean by the metadata? Here's a quick reminder of what you get out of the box when you're dealing with Kubernetes. We saw earlier I had three clusters, kind of spread across the globe: Oregon, Montreal, and Frankfurt, Germany. If I click on one of these workloads, this is what I mean by metadata, right? This thing has an ID, it's represented in this kind of YAML structure, which is how most people view it, and what we get from that metadata is things like the IP address, and so forth. Now, what if we want other metrics? Things like how much memory it's using. How do I recreate that ps command, not just for a single cluster, but across multiple clusters?
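To make that "semi-structured" point concrete, here's a small sketch of the kind of by-hand parsing you'd do on the single-machine side. I'm assuming files shaped like /proc/[pid]/status, where each line is a "Key: value" pair; the sample text is made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseProcStatus splits "Key:\tvalue" lines, the semi-structured
// format of a /proc/[pid]/status-style file, into a map.
func parseProcStatus(s string) map[string]string {
	fields := map[string]string{}
	sc := bufio.NewScanner(strings.NewReader(s))
	for sc.Scan() {
		parts := strings.SplitN(sc.Text(), ":", 2)
		if len(parts) == 2 {
			fields[parts[0]] = strings.TrimSpace(parts[1])
		}
	}
	return fields
}

func main() {
	// A shortened, made-up sample of what such a file contains.
	sample := "Name:\tenvoy\nPid:\t1234\nVmRSS:\t53248 kB\n"
	f := parseProcStatus(sample)
	fmt.Printf("%s pid=%s rss=%s\n", f["Name"], f["Pid"], f["VmRSS"])
}
```

Multiply this by every field and every process, and you can see why gluing together your own ps is convoluted compared to querying structured data.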
So if you've ever installed Pixie, or maybe you kicked the tires on that free trial, when you pull up the console you'll notice something here: you'll have a dropdown of all your clusters. Again, this is Kubernetes native; you can pick different things like namespaces. But you'll notice this thing over here, this concept of a Pixie script. When you run these scripts, it's kind of a way of using the Pixie language to get a little bit of insight about what's going on inside of your cluster. And the nice thing is that you can actually run these from the command line. So for example, if you want to know what scripts are available — and again, I'm a rookie here, so if I'm doing something wrong, it's because I actually don't know what I'm doing, okay? We're just going to be learning in public today. If I run this, you'll notice most of those scripts that are available in the UI are also available on the command line. So one thing you might say is, oh great, I have a bunch of these pre-built scripts. But the thing is, how do you create your own? That's what I'm most interested in. It comes with a lot of great stuff out of the box, but being a system administrator, I'm thinking: what kind of tools would I build if I want those custom views? Again, I'm in learning mode, and my whole goal is to recreate that ps command, but with a little twist. I want an output like this across multiple clusters, showing all the containers in those clusters. In this way, I want to treat all of my clusters like that concept of the data center as the computer. So the first thing we have to think about is: how do these things work? What I'm going to do is show you what the output of some of these scripts looks like. There are a couple of things to think about here. One is, I need to know what data is available to me while kicking the tires on the Pixie platform.
I found out that most of the data can be had by running a px command that gives me the schemas. When you start to really look at what Pixie's doing, it's taking data from the kernel using eBPF, taking data from Kubernetes, including the metadata, and starting to aggregate and link these things in a way that's pretty unique. So what kind of schema underlies that? Let's just run this little command really quick. If I run this Pixie schema command against whatever cluster I'm pointing at, you'll see I have things like network stats — I get the pod ID, how many bytes it sent and received — and for the task that I'm working on, which is recreating the ps command, you'll also see that a lot of the data I want is available by default in this thing called the process stats table. They even refer to these things as tables of data, and they do a really smooth thing where they're collecting this data in a very efficient way. There are going to be way more advanced workflows, but for this particular use case I want to keep it super simple: I just want to know what data I have available to me so I can actually start creating my own commands. So what's behind these commands? If you come from the scripting world, you'll be like, hey, I want to get at that data, maybe format it, aggregate it, maybe take a subset of fields. I'm going to show you what a Pixie script looks like. And I think going forward, people will start to write and share these Pixie scripts, like little buckets of knowledge on how to do things, and people will start to use those as the foundation of their troubleshooting. You probably saw some of that stuff earlier, but I think the ecosystem will unlock here; maybe we start treating these things like packages with versions and metadata. I'm going to show you one of the ones that I've been working on. Let's look at the source code.
So I have this little directory called scripts; I'll make it slightly bigger. Inside this directory we're going to look at a Pixie script from scratch, and I'm going to walk you through it just a little bit. We import the base layer of Pixie here, and then we set up some data, referencing that process stats table. I'm going to go back about one minute in time to actually start to get some of that data. Now, this piece here is a little bit tricky: what I can do is have an alias to get metadata about those containers and pods running inside of Kubernetes. And if you notice what's going on here, this is more of a declarative syntax. Even though it's not described this way, I kind of look at this as like a SQL tier for my infrastructure. I look at it as the declarative way of saying: go get all of these data points. I can format this data a little bit, and then I can organize that data, or group by or aggregate it however I want. So the way I started looking at these Pixie scripts is: ah, okay, I can treat them like SQL — I can model and retrieve my data. And then we'll see what happens when I want a presentation layer. You can imagine in Bash, we call a bunch of utilities and then pipe the output to something else to format it. For example, I'm going to run this particular Pixie script and then format it on the command line using jq to see what we can do in terms of our next step. All right, let's see what kind of data we get from this. The next thing I'm going to do is show you how to run it. On the command line, using the Pixie CLI, I'm going to run with output equals JSON, and I'm going to run that script we just looked at in the scripts directory.
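To recap the shape of the script we just walked through, here's a rough PxL sketch written from memory. The exact table and column names, and the `ctx` metadata aliasing, are assumptions on my part rather than verified API, so treat this as illustration only:

```python
# PxL — Pixie's Python-like DSL (sketch; names are assumptions).
import px

# Pull the last minute of data from the process stats table.
df = px.DataFrame(table='process_stats', start_time='-1m')

# Alias in Kubernetes metadata for each row.
df.pod = df.ctx['pod']
df.container = df.ctx['container']

# Group and aggregate, SQL-style.
df = df.groupby(['pod', 'container']).agg(
    rss=('rss_bytes', px.mean),
)

px.display(df)
```

That declarative "select, join metadata, group by" flow is why treating it like a SQL layer works so well.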
And since it's pretty Kubernetes native, I can give it a particular namespace inside of Kubernetes that I'd like to get information for. What this is going to do is effectively run this over my clusters. So what clusters do we have available? I'll show you that really quickly. As I was developing this thing, I was also creating little utilities like "list my clusters". Again, this is just using some of the native tooling that's built into Pixie. We'll run it really quickly so you can see what clusters I have registered in the system. I'm using that JSON output and formatting it with jq, and you'll see all of my clusters being printed out — the same cluster names that I see inside the GKE console. It gives me the cluster ID and a little bit of metadata about the version. What I can then do is take this data and start building tools that are multi-cluster aware. All right, let's run that custom script that I built. We're going back to this ps command. We hit enter here, and what's going to happen now is Pixie is going to go out to those particular clusters — in this case, Frankfurt — and start pulling back things like the pod, the container that's running in there, the PID, some of the stuff we can get from the kernel, the namespace, and so forth. And I have a ton of data here. Again, I started looking at this like: well, I don't know if I would want to build an end-to-end command line utility with flags and formatting and tables. I'm pretty sure the Pixie DSL could get better over time to support all those cases, but I'm going to take the view that this is more like my SQL layer: I should just go in and retrieve the data, do lightweight formatting, and then maybe use another tool to wrap it. So one thing I did really quickly was take the Pixie base and create my own library.
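That cluster-listing step — JSON from the CLI piped through jq — maps naturally onto a small Go decode once you wrap it in a library. Here's a minimal sketch with sample data standing in for the real CLI call; the JSON key names are my guess at the output shape, not Pixie's documented schema:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Cluster holds the handful of fields read off the cluster list.
// The JSON keys here are assumptions about the CLI's JSON output.
type Cluster struct {
	Name    string `json:"clusterName"`
	ID      string `json:"id"`
	Version string `json:"clusterVersion"`
}

// parseClusters decodes a JSON array of clusters.
func parseClusters(raw string) ([]Cluster, error) {
	var clusters []Cluster
	err := json.Unmarshal([]byte(raw), &clusters)
	return clusters, err
}

func main() {
	// Sample data standing in for the real CLI output.
	raw := `[
	  {"clusterName":"gke-oregon","id":"c-1","clusterVersion":"1.17"},
	  {"clusterName":"gke-frankfurt","id":"c-2","clusterVersion":"1.17"}
	]`
	clusters, err := parseClusters(raw)
	if err != nil {
		panic(err)
	}
	for _, c := range clusters {
		fmt.Println(c.Name, c.ID, c.Version)
	}
}
```

Once the list is native Go data, looping over clusters to build multi-cluster tools is just a range statement.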
Now, this is super hacky. All I'm really doing is wrapping the command line for now. But I can imagine in the future there might be a Go-native library that allows me to query this stuff using my favorite language of choice. Until then, all I'm doing here is taking that same Pixie script I wrote before, which I had in a text file, storing it in my program, and treating it like a library. We've seen people do this for SQL statements, right? You get a SQL statement that you want, and then you give it a high-level name like "pod stats query". So now I'm going to treat this like a query going forward. It has its own unique syntax and a lot of power in it, but we're just going to abstract that away. What I end up with is my own little library where I can just talk about things like clusters — you know, the command I ran before that gave me the name, the ID, and the status of the clusters — and also things like those PID stats. Now I have a native way in Go of working with this. And of course I'll show you my little hacks. All I'm really doing now is calling out to the Pixie command line tool, making sure the data comes back as JSON, parsing that data, decoding it into native data structures, and returning them back as a library. Things get a bit tricky when you start to have input. For example, when I want to pass in my own custom Pixie script instead of running an existing one, I need to do something interesting here. I can say: hey, the script isn't going to come from a file, it's going to come from standard in. The nice thing there is I can take that Pixie script, turn it into a buffer, and just pipe it to the command line when it runs. I'm doing the same thing as before: I'm getting a JSON data structure back, and from there I can easily parse it and turn it into native objects.
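A minimal sketch of that wrapper idea is below. To keep the example self-contained and runnable, `cat` stands in for the px binary (it echoes stdin straight back, playing the role of "run this script, return JSON"), and the JSON field names are made up for illustration, since I haven't verified Pixie's actual output keys:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os/exec"
)

// PodStat mirrors a few of the fields such a script might return.
// The JSON keys are illustrative assumptions, not Pixie's schema.
type PodStat struct {
	Pod string `json:"pod"`
	PID int    `json:"pid"`
	RSS int64  `json:"rss_bytes"`
}

// runScript pipes a script to a CLI over stdin and decodes the
// JSON objects it prints. In real use, cli would be the px binary;
// here any command that emits JSON works.
func runScript(cli string, args []string, script string) ([]PodStat, error) {
	cmd := exec.Command(cli, args...)
	cmd.Stdin = bytes.NewBufferString(script)
	out, err := cmd.Output()
	if err != nil {
		return nil, err
	}
	var stats []PodStat
	dec := json.NewDecoder(bytes.NewReader(out))
	for dec.More() {
		var s PodStat
		if err := dec.Decode(&s); err != nil {
			return nil, err
		}
		stats = append(stats, s)
	}
	return stats, nil
}

func main() {
	// Demo with `cat` as the stand-in CLI: our "script" is echoed
	// back as the "query result" and decoded into native structs.
	stats, err := runScript("cat", nil, `{"pod":"envoy-abc","pid":42,"rss_bytes":1048576}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s pid=%d rss=%dMiB\n", stats[0].Pod, stats[0].PID, stats[0].RSS>>20)
}
```

The stdin trick is the key move: the Pixie script lives as a string constant in the program, gets piped to the CLI as a buffer, and the caller only ever sees native Go types.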
In this case, I'll be returning an array of pod stats. I really think for the next level of Pixie developers out there who are going to be writing Pixie scripts: in addition to the command line, I kind of treat the command line like a REPL where I can prototype out the things I want. Once I get the perfect script, there's this really nice integration with the web UI for the scripts you run on your local laptop — so maybe this data is meant to feed the web UI, or it's meant to feed the command line. All right, let's tie this all together. As a reminder, our whole goal here, as I'm learning Pixie and Pixie scripts, is to recreate something I'm familiar with. We've already figured out a way to get the data we want, and a way to get it across multiple clusters. Now that we have the raw data, I'll show you what that looks like again so we can all follow along. I'm just wrapping it in a shell script called ps, and all it's doing is calling out to that command line tool and printing the raw data. I've converted this raw command into a library. So now let's pull it together. I have a folder called ps: this is going to be my Go version of a ps command that treats multiple Kubernetes clusters like a single machine running in a data center. We'll take a look at that code really quickly. I can imagine people doing this in Bash, Ruby, Python, you name it, but I really think this is going to be possibly one way forward: give me a Pixie library that turns my infrastructure into something I can query, and then I can start to do things like get a list of clusters, loop through that list, filter out the namespaces I'm interested in — in this case, the Kubernetes system namespace — and then format that data.
I'm just using the standard library in Go to give me some tab output, formatting some of those fields. We saw the data structure we were getting back, like pod stats, and in there I can get the cluster name, namespace, the name of the pod, the PID, and some of those things that look very familiar from the ps command on my Linux machine. All right, let's put this together, compile it, and see if it runs. If this were to work, we should just call go build, pulling in that Pixie library, and what I should end up with is this kind of new — cloud native, if you will — multi-cluster-aware command line tool that mimics some of the behavior I was doing before on a single system. So if I run ps now — I kind of hard-coded the flags inside of this command, but I'm just learning, so we'll use this as a prototype — I spit that command out and I'm printing each of the clusters that I'm looping through. Now I'm going through all three of my clusters, and it's pretty noisy here, so let me try to make this a little smaller. I'll run it again, and then maybe we'll grep out something like Envoy and see where Envoy is running across my machines. It's going to be a little bit of a noisy output, but I'll try to step through it to see what we have. As you can see here, I have four instances of Envoy running across my clusters. I have three clusters: Oregon, Frankfurt, Montreal. And even though I'm getting all the namespaces here, here's the pl namespace, which is the Pixie Labs namespace where I installed it — and it looks like the Pixie team is using Envoy. If you look here, I'm actually printing out the command that the process is running, and I'm able to get some data a little bit lower-level than what Kubernetes provides: the PID, how much memory is being used, et cetera. So this is a high-level view of what you could start to do with Pixie scripts.
Once you have Pixie installed in your cluster, it's going to do a good job of grabbing data from your processes and from your kernel, and aggregating everything. And once you have it, of course, you can use it for your troubleshooting and debugging workflows, or you can start to build some of these new tools. If I were debugging something running inside of Kubernetes — I've built a lot of pods that I've deployed, and I've always wondered what flags are being passed in that container. Instead of running kubectl get pods and then parsing out the arg line, I now have something that feels super native to the way I've always worked inside of the Unix world. So I think this is kind of a preview of how people will be building these Pixie scripts. If you're interested in this, you should consider contributing to Pixie and kicking the tires on it. I think there's going to be a GitHub repo where we, as the community, can start pushing those scripts up and sharing them with other people. Hopefully that makes sense to you all. I'd love to see what kind of scripts you all build in the future. And I guess we'll get ready for the next segment.