Great, great. Hey, OpenVis Conf. We are a team from the Interactive Data Lab at the University of Washington, and I'm happy to present to you Vega-Lite, a grammar of interactive graphics. So we already have so many languages for creating visualizations, and we actually like them for many reasons. Why do we need another language? Let's first walk through what we like about these languages and see how Vega-Lite goes beyond them. To support an expressive range of graphics, many popular visualization tools adopt ideas from The Grammar of Graphics by Lee Wilkinson. Just like how grammar tells us how to compose words into sentences in English, a grammar of graphics provides primitive building blocks for composing an expressive range of visualizations. For visual encodings, these building blocks include the data that we want to visualize; transforms like filter, aggregation, and binning; graphical marks such as bar, point, and line; the encoding mapping between the data and properties of these marks; scale functions that map data values to visual values; and finally, guides, including axes and legends, that visualize the scale functions. With these building blocks, tools for custom visualization design such as D3 and Vega offer fine-grained control for composing interactive graphics. However, these tools, as you may know, require verbose specification and some technical expertise. For example, creating a simple bar chart in D3 requires at least a few dozen lines of code and some level of expertise in JavaScript, SVG, Canvas, and so on. In contrast, to support rapid exploration, both in terms of exploring data and exploring designs, tools such as ggplot2 and Tableau allow more concise specification by omitting low-level details such as scales and guides. These tools then infer sensible defaults for the omitted properties. They also allow customization by letting users override these default values. However, they provide limited support for specifying interaction techniques.
In designing Vega-Lite, our mission is to facilitate exploratory data analysis by providing an expressive yet concise language to specify interactive multi-view graphics. Inspired by ggplot2 and Tableau, Vega-Lite provides a concise language to express a broad range of visual encodings. In addition, Vega-Lite goes beyond this traditional grammar of graphics by providing operators for composing multi-view graphics. And finally, we also present building blocks for composing interactions using selections. With Vega-Lite, all of these building blocks are available in a single unified language. In this talk, we first show you the design of Vega-Lite. We will then show you how you can use Vega-Lite as a programming tool and how it can enable higher-level applications and recommendations. Next, Dominik is going to show you single-view specification in Vega-Lite.
When crafting a visualization, we create a visual representation of abstract data to amplify cognition. The main task of a designer is to define how abstract data should be encoded visually without having to worry about low-level details, such as axes, scales, legends, and how to draw these. Take, for example, this abstract data table with weather records for Seattle. The raw data has no inherent visual representation. To see the spread of temperature in Seattle using the data on the left, the analyst can decide to create a strip plot. In Vega-Lite, a single view is specified by encoding data as properties of marks. A strip plot is a visualization where we encode the data value as the x position of a tick mark. Temperature is a quantitative variable that shows a magnitude. In our syntax, you need to specify the data source, then the mark as the data-representative mark, and, most importantly, the mapping of data values to visual properties of this mark. Our language to express this is declarative JSON, which is easy to process with common programming languages and is also native to the web.
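As a rough illustration of the three parts just described (data source, mark, encoding), the strip plot can be written as a Python dictionary and serialized to the JSON that Vega-Lite consumes. The data URL and the field name "temperature" are placeholders chosen for this sketch; they are not taken from the talk's slides.

```python
import json

# Placeholder data URL and field name; substitute your own dataset.
strip_plot = {
    "data": {"url": "data/seattle-weather.csv"},  # the data source
    "mark": "tick",                               # the data-representative mark
    "encoding": {                                 # data values -> mark properties
        "x": {"field": "temperature", "type": "quantitative"},
    },
}

# Serialize to the declarative JSON form that the Vega-Lite compiler reads.
spec_json = json.dumps(strip_plot, indent=2)
print(spec_json)
```

Note that the specification says nothing about scales or axes; those are left to the compiler.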
Behind the scenes, the Vega-Lite compiler resolves ambiguity about low-level details, such as scales and axes, and does so with reasonable defaults. Omitting these details allows for a more concise specification. The defaults in Vega-Lite are designed to follow best practices in visualization and potentially a style guide. However, users can override these decisions made by the compiler to customize a chart. The strip plot that we created shows the spread of temperature in Seattle, which ranges from a few degrees below zero Celsius to around 36 degrees. OK, but looking at the strip plot, we can see the spread, but not how many days have a particular temperature. A histogram shows the number of records that fall into particular ranges. In other words, histograms visualize a distribution. A histogram is just a bar chart, with a bar as the underlying mark. The x position of the bars is determined by a binned field, and the aggregated count maps to the vertical position, or the length, of the bar. How might we transition from the strip plot to the histogram? First, let's bin the temperature data. Now the temperature is discretized, and we can aggregate it. We can now encode the count aggregate as the y position of the tick mark. And then finally, we switch from a tick mark to a bar, which creates the histogram. And so now you know how to create a histogram for quantitative data in Vega-Lite. To help you make sense of histograms, Vega-Lite uses a binning algorithm to discretize the quantitative data in a human-understandable way. Beyond that, Vega-Lite also determines sensible defaults to parameterize the binning. In the histogram, we binned a field on the x-axis, which is a positional channel. Instead of showing an ordinal x-axis, Vega-Lite created a quantitative axis as the guide. We know from the visualization literature dating back to Cleveland and McGill that position is highly discriminable.
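The three steps from strip plot to histogram (bin, count, switch mark) can be sketched as a small transformation on the specification. This is only an illustration: the field name and data URL are placeholders, and the helper `to_histogram` is a hypothetical name for this sketch, not part of Vega-Lite. The `maxbins` option shows one way a user might override the default binning.

```python
import copy

# Strip plot as described in the talk; data URL and field name are placeholders.
strip_plot = {
    "data": {"url": "data/seattle-weather.csv"},
    "mark": "tick",
    "encoding": {"x": {"field": "temperature", "type": "quantitative"}},
}

def to_histogram(spec, maxbins=None):
    """Apply the three steps from the talk: bin x, count on y, switch to bars."""
    hist = copy.deepcopy(spec)  # leave the original specification untouched
    # Step 1: discretize the quantitative field. True lets the compiler pick
    # the bin parameters; a dict such as {"maxbins": 20} overrides the default.
    hist["encoding"]["x"]["bin"] = {"maxbins": maxbins} if maxbins else True
    # Step 2: encode the count aggregate as the y position.
    hist["encoding"]["y"] = {"aggregate": "count", "type": "quantitative"}
    # Step 3: switch from tick marks to bars.
    hist["mark"] = "bar"
    return hist

histogram = to_histogram(strip_plot)             # compiler-chosen bins
custom = to_histogram(strip_plot, maxbins=20)    # user override of the default
```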
And because of that, we can use a large number of bins. If we instead use a non-positional channel, such as color, opacity, or shape, Vega-Lite generates a legend with range labels as the guide, and it uses fewer bins, as color in this case is less discriminable. Having a visualization grammar allows us to reason about the specification and optimize the visual output, as we've seen. With a grammar, we can also incrementally modify a chart, as I've shown, or add encodings. For instance, if we encode the dominant weather type with color, we get a histogram that shows us what the dominant weather type was on days with a particular temperature. Let's customize the colors so that they better correspond to the feelings that we might associate with some of these weather types, such as gray for fog. When we add the color encoding, the Vega-Lite compiler automatically stacks the bars instead of layering them on top of each other. Because of the stacking, the overall shape of the histogram is preserved, as you can see on the right. To do this kind of reasoning, the Vega-Lite compiler took into account the channel, which is color, and the mark, which is a bar, and automatically enabled stacking, which is a layout transform. So these three components interacted in a non-trivial way, and this way the compiler created a chart that is more expressive than a bar chart with overlap. To show you this overlap on the left side, I can overlay the bar chart with the line chart, and you can notice that the bar for snow was completely hidden behind the other bars. The stacked histogram shows the data more accurately than a naive version would. However, it may still be hard to compare bars for a particular weather type. For instance, it is hard to see how these two gray bars for fog compare, because they do not share a common baseline. We can change this by switching from a color encoding to a row encoding. And this way, we lay out the bars in space.
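The two variations just described, adding a color encoding with a customized palette and then switching from color to row, can be sketched as follows. The field name "weather", the category list, and the hex colors are assumptions made for this illustration, and the helper names are hypothetical. Note that nothing in the specification asks for stacking; per the talk, the compiler infers it from the bar mark plus the color channel.

```python
import copy

# Histogram from the earlier step; data URL and field names are placeholders.
histogram = {
    "data": {"url": "data/seattle-weather.csv"},
    "mark": "bar",
    "encoding": {
        "x": {"bin": True, "field": "temperature", "type": "quantitative"},
        "y": {"aggregate": "count", "type": "quantitative"},
    },
}

# Assumed weather categories and an illustrative palette (gray for fog).
WEATHER = ["sun", "fog", "drizzle", "rain", "snow"]
COLORS = ["#e7ba52", "#c7c7c7", "#aec7e8", "#1f77b4", "#9467bd"]

def color_by(spec, field):
    """Add a nominal color encoding with a custom scale (overrides the default palette)."""
    out = copy.deepcopy(spec)
    out["encoding"]["color"] = {
        "field": field,
        "type": "nominal",
        "scale": {"domain": WEATHER, "range": COLORS},
    }
    return out

def facet_rows_by(spec, field):
    """Replace the color encoding with a row encoding so bars share a baseline."""
    out = copy.deepcopy(spec)
    out["encoding"].pop("color", None)
    out["encoding"]["row"] = {"field": field, "type": "nominal"}
    return out

stacked = color_by(histogram, "weather")       # compiler stacks the bars
faceted = facet_rows_by(stacked, "weather")    # one small histogram per weather type
```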
And this is the first example of layered and multi-view composition in Vega-Lite. Vega-Lite has a grammar to compose views in new ways to fit new opportunities. The grammar we developed has four basic operators. You've already seen the facet operator. It partitions the data by an ordinal field and then creates a view for each partition. Facet is the only operator where each view gets a subset of the data. In the other operators, every sub-view gets access to the full data. The layering operator stacks multiple views on top of each other. With concatenation, you can combine arbitrary views to create complex multi-view displays. And lastly, repeat is a concise and data-driven way to concatenate many charts. As a running example to walk you through these four operators, I will use a bar chart that shows the monthly precipitation in Seattle. As you might know, Seattle is beautiful, but we try to scare tourists away by claiming that it rains all the time. In this chart, the x-axis encodes the date, which is grouped by month using the time unit keyword. The length of the bars shows the mean precipitation per month. And you can see that in July it's actually quite low. The layering operator allows us to draw multiple plots on top of each other. This is useful, for example, to add annotations to a chart. To use it, you nest the specification under the layer keyword. We can then add a second specification to the list of views, in this case one that shows the mean precipitation using a rule mark. Note that we only specify the y position of the rule, so that we get the overall average throughout the whole year. The two layers automatically share an x-axis and a y-axis, and the rule spans the width of the whole chart. Another operator is concat. With it, you can take a single view and visually concatenate it with other views. In Vega-Lite, you invoke concatenation by wrapping multiple single-view specifications in an array and nesting them under the concat keyword.
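A sketch of the layer and concat operators applied to the running precipitation example. Field names and the data URL are placeholders, and the second concatenated view is invented for illustration. One assumption to flag: the talk refers to a "concat" keyword, while the released Vega-Lite 2 syntax, as far as I recall, spells the directional variants `vconcat` and `hconcat`; the sketch uses `vconcat`.

```python
# Placeholder data source shared by all views.
data = {"url": "data/seattle-weather.csv"}

# Single view: mean precipitation per month as bars.
monthly_bars = {
    "mark": "bar",
    "encoding": {
        "x": {"timeUnit": "month", "field": "date", "type": "ordinal"},
        "y": {"aggregate": "mean", "field": "precipitation", "type": "quantitative"},
    },
}

# Annotation: only y is specified, so the rule shows the overall mean
# and spans the full width of the chart.
mean_rule = {
    "mark": "rule",
    "encoding": {
        "y": {"aggregate": "mean", "field": "precipitation", "type": "quantitative"},
    },
}

# Layer: views drawn on top of each other, sharing axes.
layered = {"data": data, "layer": [monthly_bars, mean_rule]}

# A second, hypothetical view to place next to the bar chart.
temperature_points = {
    "mark": "point",
    "encoding": {
        "x": {"field": "temperature", "type": "quantitative"},
        "y": {"field": "precipitation", "type": "quantitative"},
    },
}

# Concatenation: an array of single-view specifications placed side by side
# (here vertically).
concatenated = {"data": data, "vconcat": [monthly_bars, temperature_points]}
```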
A recurring design pattern that we identified is that concatenated views often share the same specification; they just use different fields. And because this is such a common operation, we promoted it to its own operator, and we call it repeat. To use it, we start with a specification of a single view. Then, we nest the specification and add some parameters. The first one is a list of fields that we want to create views for. Second, we update any field reference that we want to switch out to the repeated field. And now, the repeat operator creates a view for each value in the list. So we get one view that shows the precipitation, one for the temperature, and one for wind. With repeat, we can create multiple charts that are similar but use different fields, and we can do so in a data-driven way. You can also take this a step further and use the repeat operator to repeat over multiple lists. The operator then creates a view for each entry in the cross-product of the two lists. And the result of repeating a single scatterplot is a scatterplot matrix, or SPLOM for short. Of course, you can repeat arbitrary views, not just scatterplots. Scatterplot matrices, or SPLOMs, are used to investigate the relationship between multiple quantitative variables, for example, as part of a dashboard. You can, of course, use Vega-Lite or any other visualization library to create dashboards with many views. Some of them may be composed. However, as you put together these displays, you have to manually manage the data and lay out the views. This is where the grammar aspect of composition comes into play. To illustrate this, let's dissect the dashboard on the left. First, we have the single views, and we then use facet, repeat, and layer to lay them out in space, combine, or annotate them. In Vega-Lite, we can then recombine these composed views again using the four basic operators that you've already seen.
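The repeat operator described above can be sketched with two small helper functions: one that repeats a single view over a list of fields, and one that repeats over the cross-product of two lists to produce a SPLOM. The helper names, field names, and data URL are hypothetical; the `repeat`/`spec` nesting and the `{"repeat": ...}` field reference follow the Vega-Lite 2 syntax as I understand it.

```python
DATA = {"url": "data/seattle-weather.csv"}  # placeholder data source

def repeat_over_columns(fields):
    """One view per field: the y field reference is switched out per column."""
    return {
        "data": DATA,
        "repeat": {"column": fields},
        "spec": {
            "mark": "line",
            "encoding": {
                "x": {"timeUnit": "month", "field": "date", "type": "ordinal"},
                "y": {
                    "aggregate": "mean",
                    "field": {"repeat": "column"},  # replaced by each listed field
                    "type": "quantitative",
                },
            },
        },
    }

def splom(fields):
    """Repeat a single scatterplot over the cross-product of two field lists."""
    return {
        "data": DATA,
        "repeat": {"row": fields, "column": fields},
        "spec": {
            "mark": "point",
            "encoding": {
                "x": {"field": {"repeat": "column"}, "type": "quantitative"},
                "y": {"field": {"repeat": "row"}, "type": "quantitative"},
            },
        },
    }

fields = ["precipitation", "temperature", "wind"]
three_views = repeat_over_columns(fields)
matrix = splom(fields)

# Number of cells the SPLOM will contain: one per (row, column) pair.
cell_count = len(matrix["repeat"]["row"]) * len(matrix["repeat"]["column"])
```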
For instance, the three layered views on the right can be created in a data-driven way by repeating the layer. But this hierarchical composition can continue. We can take the output of the repeat operation and concatenate it with the SPLOM, and then again with the trellis display at the bottom. So now we have a complete dashboard described with a single specification, and because it is declarative, Vega-Lite can optimize the underlying data flow and the visualizations. The Vega-Lite JSON specification for describing hierarchical composition is as you would expect it. In the specification, you can use a nested specification in place of a single-view specification. And Vega-Lite automatically reasons about how data management, scales, and axes should be combined. Dashboards are a powerful way to look at multi-dimensional data. However, to interrogate the data further, you want to interact with linked views and uncover hidden relationships. And now Arvind is going to show you how to express these in Vega-Lite.
All right, so the core abstraction for interactions in Vega-Lite is this thing called a selection. Selections are very much analogous to how the encoding channels work, in that they define defaults for three components. The first is the event processing that happens, or what triggers the particular interaction technique: are they mouse moves, button presses, and so on and so forth. The second is what we are actually interacting with on our visualization: are they data tuples, or marks, or things like that? And the final one is this thing called a predicate function. It's just a Boolean function that allows us to select far more points than what we actually interacted with. So these are all a little abstract right now. Let's make it more concrete with an example. So here is a simple Vega-Lite specification for a scatter plot.
What we're showing is the horsepower along the x-axis, the miles per gallon along the y-axis, and all of these cars are colored by their country of manufacture. Now the most basic definition for a selection is a name, here "picked", and a type. And we can use some conditional logic to change the color of the points based on what is being clicked. So how is this interaction working? Well, it's the selection type that is driving a lot of the default values. In particular, the single selection is saying: on click, select a single point. Now I can swap out that single with an alternate type, like multi, to be able to select multiple points. And now what the defaults are saying is: on the first click, select a point, and on additional shift-clicks, just toggle those points in or out. Now just like the encoding channels, I can go ahead and override the defaults, so I can instead say, you know, drive the interaction technique using mouse hovers. And now I get this sort of interesting paintbrushing effect, which I could use to maybe indicate a region of interest in my data set. But if I always had to go in and override these defaults, well, interaction specification would be very tedious, right? So besides these selections, in Vega-Lite we also introduce selection transformations. Now this should sound analogous to data transformations, and they are. They manipulate some part of what it means to be a selection. So here is our single selection from earlier. And as a data analyst, we might have a question of, well, how does the number of cylinders a car has play into this trend that I'm seeing with miles per gallon and horsepower? I can answer that question by invoking something called a project transform. And what the project transform does is rewrite the predicate function a selection uses, so that when I click a particular point, all the other points that share the same number of cylinders are highlighted as well.
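The selection variants walked through above (single, multi, hover-driven, and projected onto a field) differ only in a few properties of the selection definition, which the following sketch tries to make visible. It uses the Vega-Lite 2 `selection` and `condition` syntax as I recall it; later Vega-Lite releases changed this part of the language. The data URL, the field names, and the helper `scatter_with_selection` are assumptions for this illustration.

```python
def scatter_with_selection(selection):
    """Scatter plot whose point color depends on a selection named 'picked'."""
    return {
        "data": {"url": "data/cars.json"},  # placeholder data source
        "mark": "circle",
        "selection": {"picked": selection},
        "encoding": {
            "x": {"field": "Horsepower", "type": "quantitative"},
            "y": {"field": "Miles_per_Gallon", "type": "quantitative"},
            "color": {
                # Selected points keep their Origin color ...
                "condition": {"selection": "picked", "field": "Origin", "type": "nominal"},
                # ... everything else falls back to grey.
                "value": "grey",
            },
        },
    }

# Default: on click, select a single point.
single = scatter_with_selection({"type": "single"})

# Click selects; shift-click toggles additional points in or out.
multi = scatter_with_selection({"type": "multi"})

# Override the default trigger event to get the paintbrushing effect.
paintbrush = scatter_with_selection({"type": "multi", "on": "mouseover"})

# Project transform: clicking a point selects every car that shares
# its value of the Cylinders field.
by_cylinders = scatter_with_selection({"type": "single", "fields": ["Cylinders"]})
```

Each variant is a one- or two-property change to the same specification, which is the point the talk makes about defaults and overrides.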
And so with this interaction, I'm able to see these sort of three groups of cylinders, right? The eight-cylinder cars, which are the most powerful, right at the bottom, the six-cylinder cars in the mid-range, and the four-cylinder cars up at the top. But just this sort of direct-manipulation clicking doesn't give me a sense of how exhaustively I am searching through my data set. Are there cars that have maybe three or five cylinders? I don't really know, and I probably don't have a good way of finding out. And so besides direct manipulation, we can also invoke a bind transformation to add some dynamic query widgets to our interaction. And now not only does the same clicking interaction work, but as I scrub that range slider back and forth, I can see, aha, there actually are some three- and five-cylinder cars, but there are only a handful. And so I was probably not likely to click on them directly. And because Vega-Lite is such a concise specification language, we can go ahead and make very small changes to rapidly generate and evaluate hypotheses. So here, for example, I've just added a couple of lines of additional code, and now I'm looking at the effect that year has on this trend. And perhaps luckily for us, we're seeing that as I scrub that year slider forward in time, my cars are getting more efficient, right? Those highlights are slowly moving up the scatter plot. But all of these selections and selection transformations are still part of a grammar of interaction. We're not talking about interaction templates whatsoever. And so they're reusable across a variety of selection types. So here, for example, is a continuous region selection called an interval. By default, it allows me to select a region in both the x and y dimensions. But I can go ahead and use the project transform to restrict it to a single dimension, here the x-axis.
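The bind and project transforms described above can be sketched as selection definitions. The slider ranges, field names, and variable names are assumptions for this example (a numeric "Year" field is assumed to exist or to be derived), and the syntax follows Vega-Lite 2 as I recall it.

```python
# Single selection projected onto Cylinders and bound to a range slider,
# so rarely clicked values (e.g. 3 or 5 cylinders) can still be reached.
cylinder_slider = {
    "type": "single",
    "fields": ["Cylinders"],
    "bind": {"input": "range", "min": 3, "max": 8, "step": 1},
}

# The "couple of lines of additional code": project onto Year as well
# and bind one slider per field.
cylinder_year_sliders = {
    "type": "single",
    "fields": ["Cylinders", "Year"],
    "bind": {
        "Cylinders": {"input": "range", "min": 3, "max": 8, "step": 1},
        "Year": {"input": "range", "min": 1969, "max": 1981, "step": 1},
    },
}

# Interval selection: by default a rectangular brush over x and y.
brush_xy = {"type": "interval"}

# Projected onto the x encoding only, the brush selects a horizontal range.
brush_x = {"type": "interval", "encodings": ["x"]}

# Any of these definitions can be dropped into a view under a name, e.g.:
view_selection = {"selection": {"picked": cylinder_year_sliders}}
```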
And I can use the bind transform to bind the selection to the scale functions. Any guesses what'll happen when I do that? This happens: I start panning and zooming my scatter plot. This can feel a little magical, but let's unpack what's happening. What's happening is that we're establishing a two-way binding between the selection and the scales. So the selection is being populated by the scale domains, and then the scale domains themselves are being driven by the selection. And all the interactions I've shown you so far have just been restricted to these single views. But the reason we designed those multi-view operators alongside this grammar of interaction is that these interaction techniques are most effective when we're coordinating multiple distinct visualizations. And so what I can do is just add a single repeat operator to go from a single scatter plot to a matrix of scatter plots where I'm brushing and linking between them. And I can also add another interval selection to pan and zoom all of these scatter plot cells in concert. And with these sorts of selections and transformations, a whole number of common and custom techniques fall out quite naturally. So here, for example, I can start setting up what an overview-plus-detail or focus-plus-context interaction might look like. I've got an area chart, and I concatenate it with another area chart. In that top chart, I add an interval selection to give me a brush mark. And then in the bottom chart, I use that brush to drive the detail interaction. And the final example with interaction that I wanted to walk through is how we might do a variant of cross filtering. So we'd start with a histogram. This is flights data, and we've got the binned hour that the flights took off along the x-axis. And we can repeat this for other dimensions in my data set, so the delay and the distance of these flights.
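Two of the techniques above, panning and zooming via scale binding and overview plus detail, can be sketched as follows. The data URL and field names are placeholders, and the syntax (including `"bind": "scales"` and a scale domain that references a selection) follows Vega-Lite 2 as I recall it.

```python
# Pan and zoom: an interval selection bound to the scales establishes the
# two-way binding between selection and scale domains described in the talk.
pan_zoom = {"type": "interval", "bind": "scales"}

def area_chart(x_extra=None, selection=None):
    """Area chart over time; optional extra x properties and a selection."""
    x = {"field": "date", "type": "temporal"}
    if x_extra:
        x.update(x_extra)
    view = {
        "mark": "area",
        "encoding": {"x": x, "y": {"field": "price", "type": "quantitative"}},
    }
    if selection:
        view["selection"] = selection
    return view

# Top chart: carries the brush (interval selection restricted to x).
top = area_chart(selection={"brush": {"type": "interval", "encodings": ["x"]}})

# Bottom chart: its x scale domain is driven by the brush, giving the detail view.
bottom = area_chart(x_extra={"scale": {"domain": {"selection": "brush"}}})

overview_detail = {
    "data": {"url": "data/stocks.csv"},  # placeholder data source
    "vconcat": [top, bottom],
}
```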
Now we want to go for a layered interaction, so we can add a second layer and color that layer in gold. And what our interaction technique is going to do is have these gold bars fill in the blue bars instead. So we've got our visual encoding set up. Let's collapse that down and focus on our interaction specification. Of course, to do cross filtering, we need an interval selection, and that gives us a brush in each of our histograms. And this is cross filtering, so the final step is to take that brush and filter our data. And that's really all we need to get this sort of cross-filtering interaction working. And what we're talking about is 35 lines of JSON, written in a very composable manner like I've shown here. So Ham is going to walk us through the last part, showing how we use Vega-Lite in a number of applications.
Cool. So let's first see how you can use Vega-Lite as a programming library. To use Vega-Lite, we provide a compiler that compiles a Vega-Lite specification into a lower-level Vega specification. You can then use Vega's runtime to create visualizations either on the web or on a server. You can also render output in either Canvas or SVG. For example, if you like Canvas, like Kai in the talk this morning, you can just change one parameter and render the Vega-Lite specification in Canvas. You can also leverage Vega's reactive architecture and render output for streaming data. Another point is that Vega also supports theming. So suppose your organization has a style guide: you can easily customize all the charts in your organization. By using a declarative JSON syntax, Vega-Lite can serve as a file format for authoring and sharing visualizations. In addition, it can support bindings for different languages. To perform interactive analysis in Python, you can use a library called Altair, led by Brian Granger and Jake VanderPlas.
With Altair, we can create the histogram that we have shown earlier using a native Python API that maps directly to Vega-Lite syntax. This can be convenient if you do data analysis in Python and Jupyter. We'd also like to highlight that Altair's API is automatically generated from Vega-Lite's JSON schema. This means that whenever we release a new version, the Altair team mostly just needs to push a button to get their API updated. This approach is also not limited to just Python. There are other projects that try to wrap Vega-Lite in other languages as well. To give you a sense of how well this library has been received: in a recent article that compared different Python data visualization libraries, Dan Saber concludes that the one-to-one-to-one mapping between thinking, code, and visualization is his favorite thing about Altair, and, of course, about the underlying Vega-Lite language design. The Altair team themselves are also very excited about our design and mentioned that they see Vega and Vega-Lite as perhaps the best existing candidates for a principled lingua franca of data visualization. So in addition to using Vega-Lite as a programming tool, Vega-Lite can also enable higher-level applications that automatically generate and recommend visualizations. For example, I would like to show Voyager, which is a visualization tool that augments manual specification with recommendation, with the goal of promoting breadth and reducing tedium in data exploration. To support this goal, Voyager uses Vega-Lite to enumerate and rank both data and visual encodings to make recommendations. Here's the Voyager interface, showing a dataset about cars. In Voyager, we provide multiple interaction methods for you to explore data. Before you have to manually create any view, Voyager first shows univariate summaries of all variables as the initial view. This helps encourage analysts to examine different variables before diving into data relationships.
For example, we can see the distribution of the number of cylinders of the cars in this dataset. We can also see the distributions of all the other variables. You can see here that many of the cars are from the 70s and 80s. After examining different variables, analysts also have the freedom to create any specific view using a drag-and-drop interface, similar to Tableau. For example, dragging horsepower to the x shelf creates a dot plot, like you have seen here. Under the hood, Voyager creates a Vega-Lite specification that maps horsepower to the x channel and uses Vega-Lite to render the visualization. Based on the specified view, Voyager also suggests different types of related views to help analysts explore relevant variables and alternative ways to summarize or encode the data. To recommend related summary views, Voyager first considers the Vega-Lite specification of the dot plot up here and determines whether it can calculate summary views by binning or computing a mean. When it applies binning, it also considers that binning alone only shows whether a particular bin range contains values. That's not so helpful, right? So Voyager also adds count, and you get something like this, but you can see that it may not be the best mark for this data. So as another part of this enumeration process, Voyager also uses Vega-Lite to enumerate different alternative encodings and chooses among them based on a perceptual effectiveness ranking. In this case, bar is the best mark for showing a histogram. So that's how Voyager uses Vega-Lite to consider design alternatives and recommend a related summary view. Voyager also recommends other types of related views, such as plots that show horsepower with other additional variables. This can help analysts consider relationships that they might otherwise overlook if they had to manually create views. To give users more control to create a collection of views manually, Voyager also provides wildcards to specify a varying property of the visualization collection.
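One way to picture a wildcard is as a placeholder that a program expands into one Vega-Lite specification per matching field. The sketch below is a simplified illustration of that idea only; it is not Voyager's actual enumeration or ranking code, and the schema, field names, data URL, and function name are all hypothetical.

```python
# Hypothetical dataset schema: field name -> Vega-Lite data type.
FIELDS = {
    "Horsepower": "quantitative",
    "Miles_per_Gallon": "quantitative",
    "Displacement": "quantitative",
    "Cylinders": "ordinal",
    "Origin": "nominal",
}

def enumerate_wildcard(x_field, wildcard_type, fields=FIELDS):
    """Expand a wildcard on the y channel into one scatter plot per matching field."""
    specs = []
    for name, field_type in fields.items():
        # Skip fields of the wrong type and the field already used on x.
        if field_type != wildcard_type or name == x_field:
            continue
        specs.append({
            "data": {"url": "data/cars.json"},  # placeholder data source
            "mark": "point",
            "encoding": {
                "x": {"field": x_field, "type": fields[x_field]},
                "y": {"field": name, "type": field_type},
            },
        })
    return specs

# Gallery of scatter plots: Horsepower against every other quantitative field.
gallery = enumerate_wildcard("Horsepower", "quantitative")
y_fields = [spec["encoding"]["y"]["field"] for spec in gallery]
```

Because each candidate is just a small JSON-like object, enumerating and comparing alternatives stays cheap, which is what makes a declarative grammar convenient as a basis for recommendation.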
For example, on the left we have the list of wildcard fields, and if we drag the quantitative field wildcard, then we create a gallery of scatter plots showing horsepower with all the other quantitative variables. Of course, Voyager uses Vega-Lite to enumerate the different quantitative fields on the y channel under the hood. With all these different interaction methods, analysts can rapidly explore different aspects of the data and also create specific views to address their goals. So we have seen that Vega-Lite can be used to support recommendation of data and visual encodings. One area of future work that we're excited about is applying a similar approach to recommend interactions. With Vega-Lite's concise syntax for composing interactions, we can make just one small, targeted change to a specification and generate a dramatically different interaction technique. With these small changes, we can easily write a program to enumerate this interaction design space for visualization and recommend interaction techniques as well. To summarize, in this talk we have shown you the design of Vega-Lite, which includes building blocks for specifying a broad range of single views and composing them into interactive multi-view graphics. We have also shown that Vega-Lite can be used directly as a programming tool, either on the web or via wrappers. And finally, Vega-Lite can enable graphical user interfaces and recommendation in a system like Voyager. Today, we are happy to announce that we are releasing Vega-Lite 2 Beta, which includes the multi-view composition and interaction support that you have seen in this talk. We have tutorials and documentation online at vega.github.io/vega-lite. And if you're interested, you can start playing with Vega-Lite in our online editor without the need to install anything.
And finally, we'd like to thank our contributors to Vega-Lite and related projects, including a number of students at UW, our research collaborators at UW and at Tableau, our colleagues at Bocoup, especially Jim and Kedem, who helped improve the UI of Voyager, and of course, Jake and Brian, who built Altair. We'd also like to thank our research sponsors. With that, we hope you are excited about Vega-Lite.