Hey there, my name is Natalie, and I'm here to talk to you about PxL scripts, which are our API for working with data in Pixie. For this talk, I'll first explain what a PxL script is, then talk a little bit about how they work, and finally cover what you can do with them.

Just a little bit of background on me: I'm a founding engineer at Pixie. I work all over the stack, but as a lot of full-stack developers go, I focus a little more on the back end than the front end, so I put that little joke on there. At Pixie I tend to focus on PxL, our compiler, and our execution engine.

As you've seen in some of the demos in the other content, Pixie ships with a lot of different views of your system. You can look at the state of your cluster or dive down into one of your services. Here I just want to answer: how is it that we can create these views for you? Everything in Pixie (graphs, charts, and tables) is produced by an API that we call PxL.

Let's take a simple example, the table on the left. PxL is a 100% scriptable interface, so in the example there we can see the script that generated that table. We can go into the syntax in more detail later, but at a high level you can think of the script as a "select star" that loads the HTTP events data set for the past 30 seconds. We're grabbing the pod that each of those requests came from, and we're returning a subset of the columns as the result table.

So, backing up a little bit: we have the scriptable interface, but what did we design it to do specifically?
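
As a rough picture of the three steps just described for the example script (load the last 30 seconds of HTTP events, attach the pod each request came from, return a subset of the columns), here is a plain-Python analogy. It is not PxL and does not use Pixie; the sample records, the field names, and the `POD_BY_PID` lookup are all made up for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical stand-in data; none of these names come from Pixie.
NOW = datetime(2024, 1, 1, 12, 0, 0)
POD_BY_PID = {101: "default/cart-1", 202: "default/auth-7"}
HTTP_EVENTS = [
    {"time": NOW - timedelta(seconds=5), "pid": 101, "path": "/checkout",
     "status": 200, "latency_ms": 12.5, "body": "..."},
    {"time": NOW - timedelta(seconds=20), "pid": 202, "path": "/login",
     "status": 401, "latency_ms": 3.0, "body": "..."},
    {"time": NOW - timedelta(seconds=90), "pid": 101, "path": "/cart",
     "status": 200, "latency_ms": 8.0, "body": "..."},
]


def recent_http_requests(events, now, window=timedelta(seconds=30)):
    """Mimic the described script: load, add a pod column, project."""
    keep = ("time", "pod", "path", "status", "latency_ms")
    # Step 1: load only the events inside the time window.
    recent = [e for e in events if now - e["time"] <= window]
    # Step 2: work out which pod each request came from.
    with_pod = [{**e, "pod": POD_BY_PID[e["pid"]]} for e in recent]
    # Step 3: return a subset of the columns as the result table.
    return [{col: e[col] for col in keep} for e in with_pod]


table = recent_http_requests(HTTP_EVENTS, NOW)
for row in table:
    print(row["pod"], row["path"], row["status"], row["latency_ms"])
```

In this toy version the 90-second-old request is dropped by the time window, and the `body` and `pid` fields are left out of the result because they are not in the projected column list.
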
The first task is that it needs to be able to query the data that we auto-collect in our system. The second is that it needs to be able to collect new types of data sources. And finally, we really didn't want to invent another language. We think the world already has a lot of those, so we didn't want to reinvent the wheel there.

So let's go into how we avoided building yet another query language. We needed a flexible API to work with data, and anyone who's familiar with Python may recognize some of the syntax on the right. That's because all PxL code is valid Python. Now, we don't actually execute any Python under the hood, but we'll get to that in some upcoming content. The hope here is that users who are already familiar with Python don't have to learn a new syntax.

However, narrowing it down to just Python syntax isn't enough. We need to make it easy to perform data analysis and machine learning in Pixie. One thing we noticed is that pandas is a popular tool for data analysis and ML in Python, and it actually matched a lot of what we needed to do in PxL.
It supports easily expressing operations like filter, join, and aggregate, as well as running inference on data. It has an established community and lots of existing docs. So what we decided to do is make all PxL valid pandas as well. PxL, like pandas, can be thought of as an embedded domain-specific language in Python. To recap: we wanted to avoid reinventing the wheel and make PxL more accessible, so we made it follow the pandas APIs and use Python syntax.

Like SQL, pandas, and other languages, PxL is a dataflow language. What that means is that queries are expressed as a declarative series of operations on data. You can think of operators as nodes that the data flows through. Because it's a declarative language, our execution engine can plan and optimize the query, so the user doesn't have to worry about exactly how the computation happens. They just tell the system what they want it to do.

Now let's talk a little bit about how we represent data in PxL. It's a concept that we and some other systems refer to as data frames. You can see in the example we've been working with that we initialize a data frame at that arrow, and that is basically the PxL version of a table. A data frame can be thought of as a set of rows and columns, and specifically, the columns are typed. In PxL we have a raw data type for the column, like string or int, but we also have what we call a semantic type. Semantic types tell you a little bit about what the column's meaning is. While the raw data type is used to make query execution more efficient, the semantic type helps you understand the data better. These types are propagated through your data frames throughout the entire lifetime of your query. So for this one pod column, we know that while its raw data type is a string, its semantic type is actually a pod.

Let's talk a little bit about what this buys us and go back to the screenshot of the result table that this query produces. You can notice on the right that we've inferred that the latency column is a latency duration, and we've added units and highlighting for that in the UI. On the left, the UI knows from the semantic type that that is a pod column, so we've added a deep link to a view for each of those pods. When you're interacting with these tables, you can click that link and find out more information about that pod. In the query, we didn't have to do anything to make this happen. It's automatically inferred based on the types that we track in PxL data frames.

One more point on the stuff that we track in PxL data frames: for every record in our system, we store a per-row context that's accessible throughout the lifetime of the query. In that context we track things like the service, the node, and the pod that the record came from. So even as the data is transformed, you can still access that information. More about how we track this context and store it in our system will be covered in an upcoming talk, so please check that out when it comes.

So now you know a little bit about how data is represented in PxL, but how do you actually query the data?
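
To make the raw type versus semantic type idea concrete, here is a small toy model in plain Python. It is only a sketch under stated assumptions: the `ColumnType` class, the semantic type names (`"pod"`, `"duration_ns"`, `"none"`), and the `render_cell` function are hypothetical and are not Pixie's actual types or UI code. It shows the two points made above: types travel with the columns through a transform, and a renderer can pick its formatting from the semantic type alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnType:
    data_type: str      # raw type, used for efficient execution
    semantic_type: str  # what the column means


# Hypothetical schema for the example table.
SCHEMA = {
    "pod": ColumnType("string", "pod"),
    "latency": ColumnType("int64", "duration_ns"),
    "req_path": ColumnType("string", "none"),
}


def project(schema, columns):
    """A transform that keeps some columns; their types come along."""
    return {name: schema[name] for name in columns}


def render_cell(value, col_type):
    """Choose a presentation based only on the semantic type."""
    if col_type.semantic_type == "pod":
        return f"<link to pod view: {value}>"
    if col_type.semantic_type == "duration_ns":
        return f"{value / 1_000_000:.1f} ms"
    return str(value)


result_schema = project(SCHEMA, ["pod", "latency"])
print(render_cell("default/cart-1", result_schema["pod"]))
print(render_cell(12_500_000, result_schema["latency"]))
```

The query author never asks for units or links here; the renderer derives both from the type information attached to each column, which is the behavior described for the result table.
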
In PxL, we use transforms to do the various steps of analysis on your data set: things like aggregate, join, and filter. The query on the right is pretty simple, but we support lots of different transforms in PxL data frames. These are expressed as methods on the data frame itself.

All PxL data frames are immutable, and all PxL transforms produce a new data frame. What that means is that the common logic you use to build your data pipelines can be expressed as functions and used by multiple result tables. For example, I could add a new function that uses the HTTP data function we've defined above to produce a new table that, let's say, lists all the pods that have received HTTP requests. Both of these output tables use the same logic without affecting each other. We think this is a really powerful feature of the PxL language: you can make these composable data pipelines by refactoring common logic out into functions.

If you want to find out about the functions that we ship with Pixie out of the box, you can check out our docs for the px module, which is the main place we store functions right now. We'll have more coming soon, so please let us know if there are some you would like to see. Also coming soon: we are built for developers, by developers, and we want people to be able to make their own modules for stuff that's useful to them. So in the future you will not just have px; you can also have your own modules that you've defined with your own functions.

So a lot of that was about how you query data and how it's represented, but it was all based on data that Pixie ships with out of the box. We said in the beginning that PxL also allows you to express the collection of new data. We can already do things like HTTP requests.
We can do things like network statistics. But how do I get a custom source that Pixie doesn't know about yet, like collecting the arguments passed to a Go function I'm running? We use a concept called mutations to create new data sources in PxL. There is other content available discussing Go probes in more detail, but I'll give you a high-level description of this example mutation for the purposes of discussing PxL.

On the right I have a PxL script that basically says: I have a function called sum, it takes in two integers, and I want Pixie to write down the values that function is called with during the execution of my program. I don't have to modify any code to do this. All I have to do is run a PxL script. What the code on the right is saying is: please define a probe that listens to the arguments to my sum function, and store the results in a table called sum table. Once I've done that, I can query sum table like any other Pixie table. I just treat it like any other data frame, as you can see on that last line down there.

So how can you get started with PxL scripts? The easiest way is to start by running the scripts that we ship Pixie with. You can do things like look at your JVM data, look at your various database events, and trace network requests. But a lot of users would like to put in their own scripts, and we want to make that possible as well. The best way to do that is to check out the open source scripts in our PxL GitHub repo and use those as a jumping-off point to write your own.

We really think that we've just scratched the surface of what's possible to express in PxL, and we would love your ideas on what we should do with it, or scripts that we should add to make it even more powerful. Thanks a lot for checking out this video, and please check out our other content and videos coming soon. Thanks!