My name is Chris Henn. I'm a student here in Portlandia. I go to Reed College. I also work for a company that does analytics, and a big part of that job is building data visualizations. We've been using Ember for about a year, and I think that Ember is a really good framework for building data visualizations. But I also think that most people don't take advantage of it as much as they could. Data visualization is kind of interesting in that there are so many examples out there on the web for how to do it. It's very easy to go from the vaguest, slightest idea that you need to visualize some sort of data to something that works. Maybe you need a bar plot, or you think you need something like a bar plot. And then you Google it, you know, maybe something along the lines of "D3 Ember bar chart." And that's all there is to it, because you have a complete working example in almost every single case. And so this is really great. But I also think there's somewhat of an issue with learning exclusively through examples. You can imagine if you were learning some other subject like MVC: if the only way you learned MVC was through to-do list examples, you wouldn't learn it very well. And I think data visualization is similar, in that there are general principles that underlie it as a whole. Yet it seems like no one ever talks about those. They're always focused on very specific types of visualizations. So what I want to do today is think about data visualization in a more general sense. Specifically, I want to ask this question: how do we split a statistical graphic into parts? If you think about Ember, especially as the framework has progressed over the years, most of what we build now in our applications takes the form of components. In Ember 2.0, this will only continue to be the case. You thought you had controllers and views. It's sort of like, just kidding. Everything is a component.
So it's somewhat important to really step back and think, what does this mean for how we're building our applications? From an architectural standpoint, it means that any problem you have in your application, you're attacking with composition. And so for any problem, you can ask the question, how do I break this problem down into parts? I want to do the same thing with data visualization, and I think that's going to be a good approach in Ember. If we do this, we'll get a couple of benefits. First and foremost, if we have some problem and we split it into a bunch of parts, we're going to be able to think about each part in isolation easily. So if we have some complicated data visualization, maybe it's a scatter plot or histogram or something like that, if we break that down into separate parts, we don't have to think about it all at once. No matter how smart you are, you can only keep so many moving parts of the application in your head. So a better approach is to split something into parts, and then you don't have to think as hard. If we do this, we're also going to get some value out of knowing what the parts of a graphic are. If you know what the different parts of a graphic are, you also know what you can change. There are named graphics out there, right? The scatter plot is a type of graphic. A bar chart is a type of graphic. But you tweak these in every situation where you have data. And so knowing how you can tweak them is going to enhance your creative process. And this leads to the third point: if you have a graphic broken into a bunch of different parts and you know which parts you can change and swap out, it's going to be really easy to build new visualizations based on what you already have. And so you're going to be conveying data better in every situation that you need to. So I want to have a demo to give an example of this. I've got some data up here. Let's pretend it matters.
It's about a bunch of cars. Each row is a car, and the columns are the attributes of that car. And I'm interested in exploring the question, do heavier cars get worse mileage? So I've got the weights of a bunch of cars and also their mileage. In JavaScript, we would represent this like this. It's an array with a bunch of objects in it. Instead of columns, we have properties on those objects. So here's a scatter plot. I thought this would be the best way to visualize this data. I haven't added any labels to it yet, so I can just tell you that the vertical axis is mileage and the horizontal axis is the weight of the car. I've also encoded one more piece of information in this graphic, and that's the color of each point. The color of each point represents how many cylinders the car has. So you can see the really light cars have four cylinders, the medium cars have six, and the heavy cars have eight. So that's cool. We can instantly see, oh, we've solved our data visualization problem, and we can see what the correlation is. Heavy cars get worse mileage. But how is this implemented? So here is some code. This is sort of the top level for this scatter plot. It doesn't tell us too much, but we can see that we have sort of a general my scatter plot component. We give it some data and we tell it what attributes of that data we want to use for the various aspects of our plot, for example, mileage on the y-axis. And this is cool. You can imagine you would use this in your application in a couple of different ways. Maybe there are different scenarios in which you need a scatter plot. Let's pretend that in this particular instance, I also want to add a regression line to the scatter plot, a line of best fit to the data. And so we've got the scatter plot used in a couple of places, but in this one particular instance we also want to add something more to it.
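For readers without the slides, the array-of-objects representation described above might look something like the following. This is only a sketch: the property names (`wt`, `mpg`, `cyl`), the helper names, and every value are invented for illustration and are not taken from the talk's data set.

```javascript
// Hypothetical rows: one object per car, one property per "column".
// Names and numbers are made up for illustration.
const cars = [
  { name: 'car A', wt: 2.2, mpg: 30, cyl: 4 },
  { name: 'car B', wt: 3.2, mpg: 21, cyl: 6 },
  { name: 'car C', wt: 5.3, mpg: 12, cyl: 8 },
];

// Read one "column" back out of the array of objects.
function column(rows, property) {
  return rows.map((row) => row[property]);
}

// Smallest and largest value of a numeric column.
function extent(values) {
  return [Math.min(...values), Math.max(...values)];
}

console.log(column(cars, 'cyl'));        // [ 4, 6, 8 ]
console.log(extent(column(cars, 'wt'))); // [ 2.2, 5.3 ]
```

The extent of a column is the kind of thing a plot needs later on, when it has to decide how far along an axis a given value should land.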
So I've done that here. If you remember in high school entering a bunch of calculations for doing simple linear regression, totally terrible. I put that in a component and then displayed it on top of the plot. To do that, it looks like this. This morning you heard Yehuda and Tom talk about something called block parameters that's new in Ember 1.10. So I've used block parameters in this scatter plot to add a regression line. And so you can see that instead of just using a simple my scatter plot component, I'm using the block form of it. And then I'm passing some attributes to the separate regression layer in that visualization. Prior to Ember 1.10, this wouldn't be possible, because for that regression line to make any sense, it still has to use the same scaling functions as the original scatter plot. And so if you can see this, at the end of line three, this "as plot," that's what's allowing me to use the same scaling functions, so the visualization makes sense. And then as a third step of this, I thought it would be interesting to show how easy it is to now add multiple regression lines to this plot. So that's what I've done here. I've done it as a factor of the number of cylinders of each car. So I'm operating on subsets of the data and no longer just the original dataset. And you can see that they're colored. There's one for each set of scatter points. And this was easily implemented as well. So you can see that instead of one regression component inside the scatter plot block content, I'm iterating over subsets and adding a regression for each one, right? So now the plot regression component has different data, but it's still using the same attributes of the plot. It might also be useful to look at how the scatter plot component is implemented. So here's the template for that. There's still no markup in here. It's just more components. The first thing we see is some axes.
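The regression calculation mentioned above is not shown in the transcript. A minimal sketch of what such a component could compute, assuming ordinary least squares and assuming the points arrive as `{ x, y }` objects, is below. The function names `linearRegression` and `groupBy` are hypothetical, and the grouping helper stands in for "one regression per cylinder count."

```javascript
// Ordinary least-squares fit of y = intercept + slope * x.
// Assumes at least two points with distinct x values.
function linearRegression(points) {
  const n = points.length;
  const meanX = points.reduce((sum, p) => sum + p.x, 0) / n;
  const meanY = points.reduce((sum, p) => sum + p.y, 0) / n;
  let sxy = 0; // sum of (x - meanX) * (y - meanY)
  let sxx = 0; // sum of (x - meanX)^2
  for (const p of points) {
    sxy += (p.x - meanX) * (p.y - meanY);
    sxx += (p.x - meanX) ** 2;
  }
  const slope = sxy / sxx;
  return { slope, intercept: meanY - slope * meanX };
}

// Split rows into subsets keyed by one property (e.g. cylinder count),
// so that each subset can get its own regression line.
function groupBy(rows, property) {
  const groups = new Map();
  for (const row of rows) {
    const key = row[property];
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(row);
  }
  return groups;
}

// Points that lie exactly on y = 1 + 2x.
const fit = linearRegression([{ x: 1, y: 3 }, { x: 2, y: 5 }, { x: 3, y: 7 }]);
console.log(fit); // { slope: 2, intercept: 1 }
```

The fit comes back in data units, which is why the regression layer still needs the plot's scaling functions before anything can be drawn.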
If you've ever had to build axes in a data visualization, you know it's a huge pain. And so I certainly only wanna have to do that once. It's a lot of fiddling around with margins and simple math that somehow gets complex. So I made that a component, so I never have to do it again. And then inside that component, in the same way that I added a regression line to my plot as its own component, I wanna represent the layer of points in the graphic as its own component as well. Because you can imagine that in another situation, I would have something that isn't a scatter plot, but I might wanna add a layer of points to it too. If we dive one layer deeper, this is the template for the points. And we finally see what's going on here. There are just some SVG circles. So you can see I've broken down this data visualization into a lot of different parts. The reason I've done this is because I wanna have a flexible system for putting together data visualizations in different ways. And it gives a lot of value, because you're able to easily convey data without doing a lot of work, but you can do it in new, flexible ways. But there's still the question of, why did I do it this way? Why did I choose to represent the points as a different layer and the regression as a different layer? And so going back to the original question, how do we represent a statistical graphic in many parts? It's not an original question. It's actually been studied many times before. One possible answer to it is called the grammar of graphics. It's from a guy, Hadley Wickham. As the name suggests, it's a way to describe the components of a data visualization and also how to put them together. It's just like grammar in the English language, which tells us how to put together sentences. This grammar of graphics tells us how to put together data visualizations. It takes the form of a paper. I think it was published in 2008. Super easy to read. I highly recommend it.
There's a lot of value there. These are the guidelines I followed when building this example. I had a scatter plot. One thing we might try to do with it is pick out a couple of pieces of the grammar, and then I'm gonna describe the whole grammar, sort of just give a quick introduction to it. One thing we notice about the scatter plot is that we're representing aspects of our data sort of spatially in the plot. This is like the hallmark of data visualization, right? We take some hard-to-understand data table and then we do something visual with it. So for example here, I'm representing the weight of the cars on the x-axis. The heavier the car, the further along the x-axis it's gonna be. And we're doing something similar with mileage on the y-axis. And then there's a third sort of mapping, where I'm taking the number of cylinders that each car has and representing it as the color of each point. So we could say that there are three mappings in this graphic, mappings from variables in the dataset to aspects of the plot. The grammar formally defines this. It's called data-to-aesthetic mappings. So you can think of those as aesthetics of the plot. There's the y-aesthetic and the x-aesthetic. And this isn't just something you would find in a scatter plot. You would find this in many types of visualizations. So maybe you can imagine for a histogram, what are the data-to-aesthetic mappings for a histogram or a box plot or something like that? Every graphic is gonna have at least one data-to-aesthetic mapping, and many times more. This graphic has three. Another thing, now that we've defined that every graphic has data-to-aesthetic mappings, is that there has to be some mapping function, right? If you've got a car and it weighs five tons, how do you get from five tons to maybe halfway along the x-axis, right? 40 pixels along the x-axis or something like that. You need a mapping function. The grammar defines these as scales.
So for every data-to-aesthetic mapping you have in some general data visualization, you also have a scale for that data-to-aesthetic mapping. There are different types of scales. You can imagine that the scale that maps spatially along the x-axis and the scale that maps from the number of cylinders to the color of a point are gonna be different types of functions. So we have many different types of scales, but it's one scale for every data-to-aesthetic mapping. And then something else obvious about this is that we've chosen to use points, right? And this may seem like no duh, but we could have done other things as well. We could have connected these points with lines, maybe drawn some sort of area. We saw the regression line; that would be sort of a different shape representation of each of these data points. So the grammar formally defines that as a geometry. So just looking at the simple scatter plot, we've already picked some things out of it that we think can apply to every sort of data visualization you would have. One of them is the data-to-aesthetic mappings. Another is the scales for each data-to-aesthetic mapping. There are different types: linear scales, categorical scales. And then a third thing that the grammar of graphics defines is something called layers. So you noticed in the plot I had before, I had a point layer, and then I was able to add another layer that was for a regression line. Every single layer you have in your plot has a geometry, and then two other things. One of which is called a stat. You can imagine in certain data visualizations you need to compute additional information about the data. Like for a box plot, you need to compute the upper and lower quantiles of the data. So that's just, in general, some statistical transform you have to do on the data. It adds data to the data set. So the grammar of graphics calls that a stat, and every layer has a stat.
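To make the idea of a stat concrete, here is a rough sketch of one for a box plot, written as a plain function from rows to derived data. It assumes quartiles computed by linear interpolation between ranks, which is one common convention among several, and the names `quantile`, `boxplotStat`, and `identityStat` are hypothetical.

```javascript
// Quantile by linear interpolation between the two closest ranks.
// Expects values already sorted in ascending order and 0 <= p <= 1.
function quantile(sortedValues, p) {
  const position = (sortedValues.length - 1) * p;
  const lower = Math.floor(position);
  const upper = Math.ceil(position);
  const fraction = position - lower;
  return sortedValues[lower] + (sortedValues[upper] - sortedValues[lower]) * fraction;
}

// A stat: takes the rows of a layer and returns derived data to draw.
function boxplotStat(rows, property) {
  const values = rows.map((row) => row[property]).sort((a, b) => a - b);
  return {
    lower: quantile(values, 0.25),
    median: quantile(values, 0.5),
    upper: quantile(values, 0.75),
  };
}

// The trivial stat: pass the rows through unchanged.
const identityStat = (rows) => rows;

const summary = boxplotStat([{ v: 5 }, { v: 1 }, { v: 3 }, { v: 2 }, { v: 4 }], 'v');
console.log(summary); // { lower: 2, median: 3, upper: 4 }
```

Because a stat is just a function of the data, a layer can swap one for another without touching its geometry or its scales.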
You could think of the point layer as having a stat, but it was simply the identity function. It did nothing. So that's one more formalism we can sort of define. And then for each layer you also have an optional data-to-aesthetic mapping. In the same way that I operated on slightly different data sets in the three regression line example, you don't necessarily have to use the same data for every layer of your plot. And in many cases it makes sense not to. The grammar of graphics also defines one more item, or actually a few more. One of which is a coordinate system. This may seem pretty obvious. In almost all graphics we use the Cartesian plane, or x, y points, as the coordinate system. But there are other ones out there. If you have a very circular plot, it's likely gonna use polar coordinates, because that's gonna be a little more convenient. If you have something like a mosaic plot, you actually get funky, higher-dimensional coordinate systems. And if you have geo data, like you have a map and you wanna display some information over it, you have an implicit choice of a map projection. The Mercator projection is the most common one. The grammar of graphics also defines one more thing called faceting. Just in the interest of time, I'm not gonna talk about it today. But again, it's a really good paper and I highly recommend you read it. It's gonna help you a lot with knowing how to compose data visualizations. So that's super formal, right? It's an academic paper, and it defines all these different aspects of a plot. But it's helping us answer the question, how do we split a statistical graphic into parts? So now the thing that remains to do is translate each of those parts: what do they look like in Ember, right? Data-to-aesthetic mappings. This one's pretty easy, we've already seen it in the example. This is just on the outer layer of your component. You give it the data and then you also map from aesthetics to properties of that data.
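Stripped of any Ember specifics, the data-to-aesthetic mapping just described can be sketched as a plain object plus a function that resolves it for each row. This is an illustration, not the demo's code: the `aesthetics` object, the property names `wt`, `mpg`, and `cyl`, and the function `applyAesthetics` are all assumptions.

```javascript
// Hypothetical mapping: aesthetic name -> property name on each row.
const aesthetics = { x: 'wt', y: 'mpg', color: 'cyl' };

// Resolve the mapping for every row, producing one record per mark.
// Values are still in data units here; scales are applied afterwards.
function applyAesthetics(rows, mapping) {
  return rows.map((row) => {
    const mark = {};
    for (const [aesthetic, property] of Object.entries(mapping)) {
      mark[aesthetic] = row[property];
    }
    return mark;
  });
}

const marks = applyAesthetics([{ wt: 2.2, mpg: 30, cyl: 4 }], aesthetics);
console.log(marks); // [ { x: 2.2, y: 30, color: 4 } ]
```

A layer with its own optional mapping would simply be handed a different `mapping` object, or different rows, than the rest of the plot.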
Scales, what do scales look like in Ember? Well, in JavaScript in general you usually have some sort of constructor function which will create a scale. So a linear scale, this is just some pretend function that creates one. I'm giving it a domain and a range; those are sort of specifics of scales. Scales are pure functions. If that excites you at all, there is excitement to be had. There are a lot of cool things you can do with them. I don't wanna go down that path too much. Basically all you need to know is that a linear scale will take a number like five and transform it into something like 150, and then if you give it 150 it can also invert it back into five. So a scale is just a function and its inverse. And there are linear scales out there. There are scales that will map from data sets to colors. In Ember, I like to represent those as computed properties on a component. We've already seen macros in earlier talks. If we have a computed property that we're gonna reuse over and over again in different data visualizations, it's gonna help us to define a computed property macro, and then very easily we can just define scales as computed properties on components. So this is a computed property that returns a function. Layers, what do layers look like in Ember? Well, remember the three parts of a layer are a geometry, a stat, and then you might have a data-to-aesthetic mapping specific to a layer. I like to represent each layer in my data visualizations as a separate component. I think one issue there is that if you just go out on the web and look for examples, they're very self-contained, so they're gonna have everything in one place. So no one is gonna suggest you split up your graphics like this, but if you split out each layer into its own component, layers are generalizable across different types of data visualizations, and so you can plug them together in new ways. It becomes really easy. It's kind of like playing with Legos, always fun.
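The "pretend" scale constructor described above is not reproduced in the transcript, so here is a minimal stand-in written in plain JavaScript. The names `linearScale` and `ordinalScale`, the options object, and the `invert` method are assumptions made for this sketch. The domain and range are chosen so that five maps to 150, as in the talk.

```javascript
// Hypothetical constructor: returns a pure function from data units to
// output units, with an `invert` method that goes the other way.
function linearScale({ domain, range }) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  const scale = (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
  scale.invert = (output) => d0 + ((output - r0) / (r1 - r0)) * (d1 - d0);
  return scale;
}

// A categorical scale, e.g. cylinder count -> color.
// Values that are not in the domain come back as undefined.
function ordinalScale({ domain, range }) {
  return (value) => range[domain.indexOf(value) % range.length];
}

const xScale = linearScale({ domain: [0, 10], range: [0, 300] });
console.log(xScale(5));          // 150
console.log(xScale.invert(150)); // 5

const colorScale = ordinalScale({ domain: [4, 6, 8], range: ['green', 'orange', 'red'] });
console.log(colorScale(6));      // orange
```

In a component, a computed property would return a function like `xScale`, recomputed whenever the data or the plot dimensions change.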
So for each layer, a component. It looks very much like the top-level data visualization. Pass it some data, give it some aesthetics. You're also gonna have to pass scales to it, because for a layer to make any sense, it needs to use the same scales consistently throughout a graphic. Geometries: most people use SVG to represent the sort of very graphical component of their data visualizations. This is great. Here's one for a line. Some other geometries out there might be like a point. Here's an SVG that represents a point. Pretty straightforward, we bind the data in the template to data on the components. So that's the formal grammar, right? And it tells us how to break a graphic into parts. It doesn't tell us a few things, though, when we go to build data visualizations in Ember. One of those things is interactivity. The unique thing about the web is that when you visualize data on the web, you can have users interact with the data, and that's gonna convey it to them much, much better. The grammar specifies nothing about interactivity, because it was built for something that isn't interactive. It's built to produce graphics that are printed out on pieces of paper. So that's one thing that isn't really considered by this grammar. Another thing is animations and transitions. These are hugely important for conveying data in data visualizations. There are all these studies out there that show that if you can have transitions in your plot, it's not just a pretty animation, like, oh, that looks nice. It actually really helps you understand the data, and that's the whole point of a data visualization, right? It's to convey data. So it's important that you have transitions in your graphics. I think both of these items sort of follow naturally from breaking down your problem into lots of parts, right? Because the way that transitions take place in most data visualizations is that they're this awful, long chain of imperative calls to some sort of transition library.
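The SVG templates for the line and point geometries are not included in the transcript. As a rough idea of the data such templates would bind to, the sketch below computes the `d` attribute for an SVG `<path>` and the `cx`, `cy`, and `fill` attributes for SVG `<circle>` elements. The function names and the shape of the `scales` and `aesthetics` arguments are assumptions made for illustration.

```javascript
// Line geometry: build the `d` attribute of an SVG <path> from points
// that have already been passed through the plot's scales.
function linePath(points) {
  return points
    .map((p, i) => `${i === 0 ? 'M' : 'L'}${p.x},${p.y}`)
    .join(' ');
}

// Point geometry: one set of <circle> attributes per row, using the
// scales handed down by the enclosing plot so every layer agrees.
function circleAttributes(rows, aesthetics, scales) {
  return rows.map((row) => ({
    cx: scales.x(row[aesthetics.x]),
    cy: scales.y(row[aesthetics.y]),
    fill: scales.color(row[aesthetics.color]),
  }));
}

console.log(linePath([{ x: 0, y: 10 }, { x: 50, y: 5 }])); // M0,10 L50,5
```

Keeping these as small functions of data and scales is what lets the same point layer show up in a scatter plot today and in some other graphic tomorrow.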
D3 is the most common one. If you break your graphic into many parts, you only have to do a little bit of that at a time. Transitions are sort of inherently complex. They operate over time, and that's not really a dimension we're used to working with in a declarative object model like Ember's. But if you break your graphic into many parts, then it's very easy to add these complex snippets to each individual part only when you need it. Same goes for interactivity: you have to do a lot of sort of manual handling of browser events. Like, drag events can be really complicated. Same thing. If you break down your graphic into many different parts, it's very easy to add just little snippets of interactivity here and there. And you're not gonna complect the interactivity with the overall complexion of the entire graphic. Then one more thing to think about is that if you add transitions to your graphic, you now have a performance issue, because you're gonna be updating the graphic many times per second when you do a transition. So it's just one more thing to think about. So I've presented sort of a general way of how you split a data visualization into parts. And I've tried to map each of those parts to what they mean in Ember. And hopefully by doing this, you can convey data more effectively. One question to ask then is, when is this appropriate, right? It's a super formal abstraction. And it would make sense if you had like the most general situation, where you might have to implement any sort of data visualization. I like to think of it as a spectrum of sorts. On one hand, you could be building a general-purpose plotting library, like matplotlib or ggplot or something like that. And then on the other hand, you could have an application that just has one data visualization in it. If you have an application that just has one data visualization in it, it doesn't make sense to apply this full-on formal grammar to it.
But if you are building a plotting library, it would make a lot of sense. And so I think everyone today who needs to convey data to users in their Ember application exists somewhere along this axis. It's helpful to know what the parts are in order to decide how involved with this you wanna get, right? In this demo, I tried to demonstrate sort of how flexible it can be to think of a graphic in terms of its different parts. But my goal isn't to build a general-purpose plotting library. And so it's up to you to decide, in your application, how much of this abstraction you wanna apply. So that went pretty quick. There's an example on GitHub, if you're interested. It's got all the code for that demo. Lots of cool things in it. I put in little transitions and such that also demonstrate that aspect of data visualization in Ember. Does anyone have any questions? Cool, thank you.