Hi, everyone. Welcome, and thank you for joining us today for the first call in our developerWorks Open Tech Talk series. developerWorks Open is IBM's open source incubator and showcase, and is designed to help connect IBM's open source innovators with potential contributors and anyone else who has an interest in their projects. Our goal is to shine a spotlight on these innovations in order to help grow their communities and ecosystems. We invite you to explore the developerWorks Open website, including project overviews, blog posts, developer stories, and community links. This Tech Talk series will take you deeper into specific projects to help you understand the technology, goals, challenges, and plans for these innovations. Please be sure to view the various resources here in our Tech Talk environment; we'll keep adding more as the series progresses. Our presentation is being recorded and will be made available on demand shortly after the conclusion of the call. Today's topic is Brunel Visualization. Our presenters are Graham Wills, data scientist and architect, IBM Analytics, and Dan Rope, STSM, data engineering, IBM Analytics Solutions, Office of the CTO. We have several demos for you today; if you experience issues with bandwidth, you can view them from the handout section. Also a note that we are not using chat; we're using moderated Q&A, so we will answer your questions as we can, and once we answer them, the question will become visible. OK, without further ado, I'd like to turn the call over to Graham Wills to take us into Brunel Visualization. Take it away, Graham. Thank you. So what we're going to talk about today is an open source language we designed for interactive data visualization. The goal of Brunel is to make it easier for people to use visualizations, specifically visualizations we want to present on the web.
And to do that, we wanted a language which was absolutely minimal, in the sense that we didn't want extra cruft and we didn't want extra facilities. We've designed a lot of languages at IBM, and some of them were more detailed, for purposes like document building, specific presentations, or conforming to particular styles. But for Brunel, what we wanted was simplicity and power. Basically, we wanted to let people produce the sort of displays you want when analyzing data and trying to get value out of it as rapidly as possible. So looking at the slide here, we have three examples running across the top, and for each of them the complete specification is given just below it. Running across, we have a tree map, a network chart, and a state map; below those are the three Brunel specifications that produce them. So the goals there are simplicity of usage and the ability to let data scientists and people who work with data create these without a lot of fuss. The way we do this is with a language, as we mentioned. The language is important because it helps us amplify the design thinking process. We think in terms of language as human beings; it's one of our big wins as a species. We can talk in language and impart information with it, and because we have a language, it allows us to talk to computers as well. Computers don't speak the same way we do, so a small domain-specific language allows that free communication, which is particularly important when we're talking about data science, data, computers, and people all working together. It amplifies our design thinking process. However, it's important that when we do that we don't lose power underneath it. So when we designed Brunel, we designed it as a system which lives on top of an existing, powerful low-level API. In this case, we chose D3 as that system.
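The slide's three one-line specifications aren't legible in this transcript, so here is a hedged sketch of what such minimal Brunel specs look like, written as Python strings. The field names (Region, Population, Source, Target, State, Value) are placeholders, not the slide's actual data, and the exact action spellings should be checked against the Brunel documentation.

```python
# Representative one-line Brunel specs in the spirit of the slide's examples.
# Field names are hypothetical; action names follow the Brunel language docs.
tree_map  = "treemap x(Region) color(Region) size(Population) label(Region)"
network   = "edge key(Source,Target) + network key(Node) label(Node)"
state_map = "map x(State) color(Value) label(State)"

# Each spec is a short chain of orthogonal "actions" chained together;
# a crude sanity check that only known action words appear:
ACTIONS = {"treemap", "network", "map", "edge",
           "x", "color", "size", "label", "key"}

def action_words(spec: str) -> set:
    """Extract the action name before each '(' (or the bare word)."""
    words = spec.replace("+", " ").split()
    return {w.split("(")[0] for w in words}
```

The point of the sketch is how little each spec contains: one layout action plus a few field bindings, with no boilerplate around them.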
It's also compatible with IBM's RAVE solution as well. What we do is the Brunel language describes a chart, and we take that information and compile it down to a D3 specification, so that if you want to get at that low-level power, you always have it. Our experience has been that people working with visualizations always have interesting, unique, and informative solutions, so we want to make sure we don't take away people's ability to add that kind of unique specification. But to get started, and for a lot of cases, you just want to spend less time fiddling with the details. One question we get asked a lot in these talks is, why did we choose Brunel as the name? Well, we were looking at famous engineers a while ago, and Isambard Kingdom Brunel came up as one of those people. He was an engineer, and his designs revolutionized modern engineering. We liked this name for a number of reasons. First of all, this is an engineering product. It's not a theoretical idea; it's not something designed to do something new, cool, and wonderful. It's designed to make life easier. Also, his emphasis on bridges, transportation, and tunnels fits with our goal to form a bridge between data science and visualization: between the data, the idea of what you want, and a quick solution. Going places easily, fast, and safely. Finally, I'm a lousy typist, and Brunel is way easier to type. Now let's look at the people who might use Brunel and how they might use it. We're going to focus on three areas here. The data scientist, which is our primary audience: someone who's interested in data, not necessarily as interested in the details of the presentation, but who wants to be able to understand their data. The data journalist: someone who is interested in those details and wants to disseminate information, or a point of view, to a very large audience.
And the application provider: those people whose customers work with data, who want to allow them to modify and understand data. So looking at data exploration, our first case is the data scientist, someone who uses Python, Spark, or R in notebooks daily, and who wants something which is easy to use. They're already learning a set of languages; we don't want to add something confusing, and we don't want to make them learn D3 to do some visualization. So we want to do something here for them. There are specific visualization solutions for each of those environments. For Python, there are some good solutions; Bokeh is one I'd recommend. And with R, there are a lot of grammar-of-graphics-based visualizations, like ggplot2 and some of the work Hadley Wickham has done. Really good stuff there. But if you want something which will work interactively on the web, works from all the data notebooks, and is easy to understand, there isn't a solution. That's why we built Brunel. We're going to hand over here to Dan Rope, my colleague and one of the primary engineers on this project, and he'll walk you through some of the visualizations and some of the demonstrations. Thanks, Graham. So as Graham said, I think the most popular use of Brunel right now is by data scientists. One of the goals we want to achieve is to make it as convenient as possible for data scientists to make use of Brunel and the visualizations that it can do. So where we're starting is integrating Brunel into these tools, and the first place we're starting is with notebooks, in particular Jupyter notebooks. Now, if you're not yet familiar with the notebook concept in general, these are essentially an interactive analysis environment where you have a language running that's typically used by data scientists, such as Python, R, Julia, and several others as well.
But this is entirely running in a web-based front end, which affords a lot of conveniences to data scientists, specifically with sharing and so forth. And the whole thing is open source as well. So what we're doing is integrating into Jupyter notebooks to start, but there are a lot of other places used by data scientists that we'd like to integrate with as well. Typically we provide something called a magic function, if you're familiar with the nomenclature; this allows us to add our capabilities to the languages you might currently be using within the notebook. So we've got three demonstrations here. These will be pre-recorded videos, and by the way, if you're having difficulty with the live stream, there are also links in the handout section to YouTube videos of these that you can watch either now or later. So we'll go ahead and get started with these three demos. Now, the first one shows some essential, basic graphs. It gives you a good sense of the syntax of Brunel and the kinds of things you can do with it. It also covers some of the interactivity capabilities, and then ends by showing how you can use Brunel along with some of the machine learning and predictive analytics features that are within Python. This first one is a Python notebook; let's go ahead and take a look. OK, we'll start by loading in the whisky data set, importing Brunel so we can work with it, and taking a look at the fields that are in the data. Now, the first graph I might want to try is a heat map using the categories of whiskies by the countries they're produced in. You can get this syntax from our website if you want to get started. So let's go ahead and execute that and take a look at it. We can see the counts for the individual areas. Next, what we might want to do is filter this by the alcoholic beverage content.
It's very simple to do: adding a filter statement, we can slide the slider and see high values of alcoholic beverage content, and low values. Next, instead of filtering that way, we might want to juxtapose this graph with a second graph. The way we do that is we add a second graph described by the Brunel. In this case, we'll look at a line showing the relationship between the ratings and the prices of the individual whiskies. Let's smooth that so we can see more of the trend, and while we're at it, let's go ahead and add in the underlying data points to that graph as well. The last thing we'll need is a little bit more space: let's make the whole width a little wider so we have room to have one on the left and one on the right. Go ahead and execute that. Notice Brunel has automatically scaled the data on the y-axis to make it easier to see the patterns, but we can change that to a linear scale if we want, just by adding linear onto the y-axis. What we'd like to do here is connect these graphs so that when I click on a cell in the heat map, I can see the resulting plot on the right for the data in that cell. To do that, we'll add an interaction onto the heat map saying that a selection can be made in that graph, and then we'll have the other graph respond by saying there's an interaction and we're going to filter the content of those elements, the points and the line, to what has been selected. Let's go ahead and execute that. And now you can see that when I click on a particular cell on the heat map, I see the data for just that cell on the right.
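The demo's exact notebook text isn't shown in the transcript, but the linked-chart specification it describes can be sketched like this. The field names (Category, Country, Rating, Price) are assumptions taken from the narration, and the action spellings (`interaction(select)`, `interaction(filter)`, `smooth`) are recalled from the Brunel language docs, so treat this as illustrative rather than verbatim.

```python
# Hedged sketch of the linked heat map + scatter from the demo.
# Chart 1: heat map that publishes a selection when a cell is clicked.
heatmap = "x(Category) y(Country) color(#count) interaction(select)"

# Chart 2: line + points that re-filter themselves to the selection.
scatter = ("line x(Price) y(Rating) smooth + "
           "point x(Price) y(Rating) interaction(filter)")

# In Brunel, '|' juxtaposes charts and '+' overlays elements in one chart.
linked = heatmap + " | " + scatter
```

The key idea is that the linking is declared with two small actions rather than event-handler code: one chart declares it is selectable, the other declares it filters on the current selection.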
Now, the last thing I might want to do is actually see some of those outliers and what their values are. What I can do is simply add on a tooltip so I can reveal those values when I mouse over them; now when I go back and click on a cell and move the mouse over a point, I can see the individual brands of the whiskies in this case. And lastly, we'll show how things get interesting when you combine what you can do with Brunel with the statistical and machine learning capabilities of Python. For example, using scikit-learn here, we can build a decision tree model to try to predict what the price would be given the age and other attributes of a whisky. Then we can plot the residuals of that and take a look at how well our predictions are doing, or how well our model's doing, and of course include a tooltip so we can see what the actual values were. So that's an example of some basic charts. We have another example here; again, this is using the Python language in notebooks, and in this case we're going to look at some different styles of charts. We'll cover things like bubble charts and even geographic maps. In this notebook, we'll demonstrate the kinds of non-standard charts that you can do using Brunel, using data from the US states. So first, again, we'll take a look at the data, the fields that are in it, and we'll execute some Brunel. Let's first start off with a bubble chart, since those are often fun, where each state is sized by its population and its color corresponds to who it voted for in the presidential election. And notice we can group this by region simply by adding another field, since states nest within regions here. Now these are divided by their regions across the United States.
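The demo builds the model with scikit-learn; as a self-contained sketch of the residual computation it then plots, here is the same idea with the model swapped for a simple least-squares line and invented toy data (the real notebook's fields and values are not reproduced in the transcript).

```python
# Toy data standing in for the whisky fields (age vs. price); invented values.
ages   = [8, 10, 12, 15, 18, 21]
prices = [30, 38, 45, 60, 75, 95]

# Ordinary least-squares fit by hand (the demo uses a decision tree instead).
n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(prices) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, prices))
         / sum((x - mean_x) ** 2 for x in ages))
intercept = mean_y - slope * mean_x

# Residual = actual - predicted; these are the values the demo plots,
# with a tooltip revealing each whisky's actual fields.
residuals = [y - (intercept + slope * x) for x, y in zip(ages, prices)]
```

Plotting residuals this way makes systematic model error visible: a good model leaves residuals scattered around zero with no pattern.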
Now notice that this is actually quite similar to the structure of a tree map, which illustrates an important aspect of Brunel: all of the actions are relatively orthogonal to each other, and simply by changing bubble to treemap, we can see how that data looks in a tree map. And finally, changing it to a cloud gives you a tag cloud, which we'll see in a second here: again, it's the state names sized by their population and still colored by who they voted for in the election. Next, we can use Brunel to draw a map of this data. This data had a field with the state name in it, so Brunel can recognize that with a map action and draw a map colored by presidential choice. And of course, we can use all the other actions as well; for example, we can apply opacity to the values of the populations in this map and see the results. And we have one more example. This next example is going to use a different kernel; it's going to use another open source project from IBM developerWorks Open, called Spark Kernel, now known as Toree. What that team has done is a notebook implementation that allows you to use Spark and Scala notebooks, and what we've done is an integration on top of that so you can draw Brunel graphs. So we'll take a look at use within that style of notebook, and this will also include how to use Brunel graphics on some of the data mining capabilities. The first step is to allow Toree to use the Brunel code by loading the jar using the magic command. Then we have some Scala code here, and what this is going to do is load data from the Titanic data set and form a set of association rules. First, it's going to extract a set of unique items, like male, female, crew, adult, first class, and so forth, and then it's going to calculate a set of rules between those.
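The bubble-to-treemap-to-cloud step above is the orthogonality point in miniature: only the one layout action changes while every field binding stays put. A hedged sketch (field names State, Choice, Population are assumptions from the narration):

```python
# One template, three charts: only the layout action differs.
base = "{layout} x(State) color(Choice) size(Population) label(State)"

bubble  = base.format(layout="bubble")
treemap = base.format(layout="treemap")
cloud   = base.format(layout="cloud")
```

Because the layout action is independent of the data bindings, swapping chart types never forces you to restate which fields drive size, color, or labels.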
Eventually what we're going to do is separate this into two data sets, one with the items and another with the rules. And finally, we extract only the rules whose consequent is survived, yes or no. So let's go ahead and execute that. Again, we're going to wind up with two data frames, as they're known in Spark, and we will use these to show the network of the associations between those items in Brunel. We'll use the edge element on the rules, connecting the antecedent and consequent, and then the network will be the individual items, and that will give us a network graph. So below, we can see those resolving into survived-yes and survived-no, and what is driving them. Now, we also have confidence levels for each of those rules, so one thing we can do is apply opacity to that level of confidence in the network graph, and the result will be that rules with higher confidence appear darker than the ones with lower confidence. And of course we can add a tooltip as well, and see what all those values are for each of the rules: the confidences, the frequencies, and so forth. So hopefully that gives you a sense of how the language works, a little bit about the individual commands, which we call actions in Brunel, and how they're orthogonal to each other and can be used in conjunction with each other. That makes the language fairly robust, and you don't really need to get down to the highly detailed level of specification that can often be required for visualization software and visualization languages. Hopefully this also gives you a good sense of the possibilities with the integrations. Different data science languages have particular strengths, and people might use one language for one thing and another for another, but you always tend to want to draw pictures.
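The demo computes the rules in Spark; as a self-contained, pure-Python sketch of what the confidence value mapped to opacity actually means, here is the standard definition applied to invented toy transactions (not the real Titanic data):

```python
# Toy passenger "transactions": each is a set of items, invented for the sketch.
transactions = [
    {"male", "adult", "crew", "no"},
    {"male", "adult", "third", "no"},
    {"male", "adult", "first", "yes"},
    {"female", "adult", "first", "yes"},
    {"female", "child", "second", "yes"},
    {"male", "adult", "crew", "no"},
]

def confidence(antecedent, consequent):
    """confidence(A -> B) = support(A and B) / support(A)."""
    with_a = [t for t in transactions if antecedent <= t]
    with_both = [t for t in with_a if consequent <= t]
    return len(with_both) / len(with_a)

# A rule like {male, adult} -> {no}; in the chart, higher confidence
# would be drawn darker via an opacity(Confidence) action.
conf = confidence({"male", "adult"}, {"no"})
```

Mapping that single number onto opacity is what lets the strongest rules visually dominate the network without any extra styling code.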
So having that consistency of being able to produce pictures across different languages, in a similar kind of language, is, we think, a helpful thing. So for data scientists to get access to Brunel, we have these notebook integrations. You saw the ones for Python and Spark; the ones for Python are available on PyPI today. For Spark, the %AddJar magic command will allow you to use it in the Toree notebooks. We have availability for R notebooks, and we also have a Docker image deployment that includes a sample Jupyter notebook with Brunel running in it as well. For those that are more adventurous, of course, this is an open source project: all the source code is available on GitHub under the Apache license, fully usable, redistributable, and so forth. We also provide, and we're going to show an example of this in just a second, an online application where you can use Brunel directly to get used to the syntax, see different examples, and try different things out. That's available at the link given there on the slide, and we hope it will help people understand Brunel better and share ideas on different visualizations. We provide a language tutorial online as well; this will give you a good introduction to the syntax and the different capabilities, and it's a live tutorial, so you can actually try things while you're reviewing the features of the language. So, for our next type of user here, we believe that Brunel can be a useful tool for data journalists. Over the last several years you've probably seen, in newspapers and blogs and so forth, a lot of articles written about data. These can almost be like little mini research projects; there's a lot of work involved in developing an article you might read about a certain topic of data.
And if they go deep enough, they can get into research tools, find conclusions, and then publish those results. We think that Brunel can serve a helpful purpose here, because tools can help. Sometimes when you're using analysis tools to do what you want for your article, the results aren't necessarily compatible with the medium you want to publish in, often the web. We think Brunel can help here because it allows the person who's defining the content of the article to think in terms of the visualization while defining the visualization, and at the same time, the output is essentially JavaScript and D3, which can be much closer to the medium these things are often published in. So we have an example here, another demonstration. Graham is a big Doctor Who fan, so in this case he found some data on the web about villains in Doctor Who, and he wants to look at this data, examine it, and write a blog article about it. We have a demonstration video here showing how we could do that. We start off by going to the public application for building Brunel, brunelvis.org/try. Now we're going to upload a data set consisting of the Doctor Who villains, taken from the Guardian's public data sets. Whenever we load a new data set into Brunel, it tries to match the fields from the new data set to the fields from the old one to create an overall chart. It does a pretty good job here, but we need to modify a couple of the fields just to make it a little bit better. In Brunel, you can just edit the text and press return, and it'll automatically reprocess it and rebuild your new chart. Now I can try some variations, for example by splitting up the bubble chart into a more hierarchical version; you know, that's a little bit small. Maybe adding some tooltips will help, or modifying the tooltips to be a little more specific.
Now, you know, I'll leave the tooltips, but I think I'll get rid of that motivation field. That looks pretty good overall. Maybe one final thing: I'll just re-sort them so that the earliest villains are in the center of the chart and the later villains at the edges. Yep, I'm pretty happy with this. I think I'll go off and publish this on my WordPress site, so let me go to the deployment stage. I'll publish this in an iframe, which means it's going to be interactive and fully operational within my WordPress site. Let me head off to brunelviz.org and start a new post. To put the iframe in the page, all I need to do is paste it in here and then edit it a little bit. WordPress doesn't like pure HTML, so I'm using a plugin for iframes; I just need to reformat it to make that work. A little bit more text and we should be ready to go. Okay, looking pretty good. Yep, all the interaction is working as it should. I'm ready to go and add some more text and turn this into a real article. So hopefully you can see it's relatively easy to define a visualization and publish it. I want to mention a few other things about that online application if you go and visit it and try it out. Specifically, in case it went by a little too quickly: there's a gallery and there's a cookbook, so there are a lot of example visualizations there that you can peruse. And as you're doing that, if you're interested in seeing how your data looks in a given visualization, that's what the upload feature can do for you. For any visualization that's there, including any syntax that's been pasted, you can essentially just upload your data as a CSV file and it will give you that same visualization applied to your data.
And typically you might want to move the fields around a little bit; it tries to do a decent job, but it can be a good way to get started and get familiar with the syntax, and it can also be a quick start for getting some graphs that you might want to use. Even for data scientists, you could easily take the resulting syntax, bring it back to your notebook, and work with it further there. So we're hoping that application can be useful for people in several ways. The other thing I'll mention is that there are other ways to deploy as well. You saw in the video taking the iframe syntax, but there are a couple of other ways to deploy visualizations. You can actually get at the underlying HTML, JavaScript, and CSS, so if you're more of a low-level type, working with JavaScript and so forth, you can do that: you can take that code away, and essentially the visualization will run as just standalone JavaScript at that point. Okay, so we think Brunel is a fairly innovative language for data visualization. We think it's at a high enough level to provide simplicity, but it also provides flexibility. So it's intuitive. We think the individual language elements are expressive, and they're not biased towards any specific language it might be integrated in, so you can easily use it across different languages; that's one of the goals we'd like to achieve. It's robust, as you can see: it will often work with different kinds of actions put together, and it will figure out the best graph that it can for that. We try to include some best practices for visualization so that some things happen automatically for you; you can often override those things if you're not happy with them, but it can save time. And finally, it's engaging: the interactivity is part of the language.
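The two deployment paths described above, an iframe embed versus standalone HTML/JavaScript/CSS, can be sketched as follows. The URL and the exact markup are hypothetical placeholders; the real embed code comes from the application's deployment page.

```python
# Hedged sketch: generating an iframe embed snippet like the one pasted
# into WordPress in the demo. The URL here is a placeholder, not a real
# deployment endpoint.
def iframe_embed(viz_url: str, width: int = 800, height: int = 600) -> str:
    """Build an iframe tag that hosts an interactive, deployed visualization."""
    return ('<iframe src="{}" width="{}" height="{}" '
            'frameborder="0"></iframe>').format(viz_url, width, height)

snippet = iframe_embed("https://example.org/my-brunel-viz")
```

The alternative path skips the iframe entirely: you take the generated HTML, JavaScript, and CSS files themselves, and the visualization runs as standalone client-side code wherever you host it.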
What we strive to do is provide some common types of interaction techniques so you can specify them quickly, without having to get into detailed event-type programming and things like that. Okay, so our last example of how to use Brunel is the application integrator. In this case, what we're talking about is somebody who's trying to create software for somebody else to use, and that software is going to include some visualizations within it. There are a couple of ways you can use Brunel to do this. A fast way to get started is to use it as a little prototyping tool. Sometimes you just want to see a particular graph in the application; well, the end result after using our web application is pure JavaScript, so you can essentially take that JavaScript, pull it off the site, and put it in the application to quickly see how some visualizations might look inside of your application. That's one way you can use it. Another way: oftentimes you're providing a software application that has some number of visualizations you want to have pre-created for the user, carefully curated and so forth. You can do that using Brunel, but often what happens is that end users wind up asking for one additional visualization, or one twist on a particular visualization. So we feel that if you have the type of users, maybe power users, who can work with the syntax, which we think is really not much harder to learn than Excel, then you can open up the language to those end users to get that kind of flexibility right within the application, without having to go through a cycle where you have to release a new version of your software. So we think that can be a potential use for Brunel as well. Now, to talk about that a little further, I'm going to turn it back over to Graham to talk about some of the details of the architecture and integration. Thanks, Dan.
A little bit more about the technical details now; for those of you who aren't interested in the technical details, hopefully we've provided a lot of links you can go off and browse while you just generally listen in. But a lot of people are concerned about these details, and there are a lot of necessary features. When we're talking about something which is essentially a capability delivered as a service, the ability to show visualizations is really a capability to show data and help people act on data. So it's really being used by people who are interested in the data, not necessarily the visualization per se. Pretty pictures are great, but our goal really is to create visualizations which will work for people: to help them make decisions, provide value to their customers, and so on. So the details of the integration can be quite important. Brunel provides a set of Java and REST APIs, which we've used in a number of different situations. We have standalone thick-client solutions which just have it built in; if you have a thick-client browser solution like JxBrowser, you could write a straight Java application which uses it and shows it internally. You can write an application which will then push it out to the web. And a very common instantiation, as you've seen, is a server-type instantiation. The notebooks themselves actually run the Java locally, so everything is completely local: if you run the solution with a notebook, your data is kept on your client machine. You run it, you show it only on your client; it doesn't go anywhere. You could also run it as a service on the web, and we package it up: if you go to our GitHub site, you can grab the WAR files and everything you need, so you can just install it in a standard web server and then use the standard REST APIs. The basic input is pretty simple.
It's the Brunel syntax, which you've seen a lot of, something like this top statement here, which is just simple plain text; the colorization we've added is just for convenience. It's basically straight text, like any other simple domain-specific language. For the data, currently we have the native forms: you can pass straight arrays of objects if you're in JavaScript or Java. You can also send it as CSV files, and we'll interpret that. And we translate from a number of data frames: R data frames, Python data frames, Scala Spark data frames. We're also looking in the near future to make sure we work with Python 2 as well as Python 3, and also PySpark data frames. Why all these variations? Basically, because we don't have a lot of strong requirements on the data; we really just need sets of objects. And we have an optional smart data typing mechanism which will say, oh, this looks like it's a date, I'll just assume it's a date if you want; you can turn that on and off as well. So you have a lot of different options. We also looked at other formats: people have graph layouts, and we have graph layouts here; people might have GraphML or other XML-based layouts. Essentially, our recommendation there is to use any of the standard sets of tools for converting those into these simple formats. We don't do a lot of that conversion; we really focused on the data-science-specific conversions, and for any of the others there are usually lots of facilities in these languages to tie it all together. So that's the input step, what goes into the Brunel system. What comes out is a set of JavaScript, CSS, and HTML, the sort of things you put together on a web page, designed to be used with the D3 rendering engine. So the goal here is we have this information, we put it in there, and we get JavaScript coming out; that's the way it works.
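The "smart data typing" idea mentioned above, guessing that a column of strings is really a date, can be sketched in a few lines. The list of formats here is an assumption for illustration, not Brunel's actual heuristics.

```python
# Hedged sketch of optional smart typing: sniff a column's values and
# decide whether to treat the column as a date. Format list is invented.
from datetime import datetime

DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d %b %Y")

def _parses_as_date(value: str) -> bool:
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False

def looks_like_date(column) -> bool:
    """True if every value in the column parses under some known format."""
    return all(_parses_as_date(v) for v in column)
```

Because the mechanism is optional, a column that merely resembles dates can still be kept categorical by turning the sniffing off.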
So let's work through that in a little more detail so you can understand the flow of it; specifically, those people interested in data flow can see how it works. First of all, we've got a client-and-service-based system. The service is really the Brunel service: the part that will take our existing data, take the input information, and build the things that are needed by the client. This has been specifically designed as a two-step process. The client will assemble the information needed and send it off to the service, which may be local or may be a remote service. The service will then process that and send it back to the client, and they'll be completely disconnected at that point; all the interactivity will not require the service after that. This is a little bit different from other systems. I've seen some comments; I think people are asking about R's Shiny. The Shiny service, and a lot of other systems, require you to be constantly connected to the service: the interactivity is done by sending messages back to the service and requesting updates. We don't require that. So we can be used in those kinds of environments, like a Shiny-type environment or some other system where the service is continually providing data, but we have a slightly more interactive and responsive way of doing it. We make sure that all the interactivity we need can be done within the client, and the service is not actually needed at that point. So taking this example: we have a little bit of Brunel at the top here, and we send that out to the service. The service will look at that and say, okay, we understand this. It will do some language parsing and understand what's going on, and then pass it to the data analysis stage, which asks: what do we need? In this case, it says we need variables called Summer and Region, and we're going to summarize these by count. So we know the minimal data table we'll need.
Well, there are a number of ways we can then proceed. One, the client could have sent us the data, in which case the little box here labeled big data engine is actually just a local file. Or it may be something that resides in our data service engine; if we're attached to a more complicated scenario, we can use the data there. Big data engine is also a slight misnomer when you're working in a notebook: there, it's really the data frames living in the notebook. We can go take those if, as in this case, we need Summer, Region, and a count. And if the client doesn't tell us what data to use, we'll just look for those fields in the local data frames; if we find a data frame which has them, we'll use it directly. Again, the emphasis is on simplicity and making it work really, really nicely. So from this possibly big data engine, which could be a Spark system with billions of rows, we'll build this small working data set: regions, counts, and sizes. That information then gets compiled up, and the build generation step sends two sets of things off to the client. Essentially we have not only the viz commands we need to build it, the CSS for styling, and the JavaScript building on top of D3, but also the data build commands. So we pass down the minimal data needed. Since this chart is pretty simple, we don't need much. But if we had a filter step on it, or if we had two charts communicating with each other, we couldn't just pass the rolled-up data for each chart. Imagine that we have this chart here, but we also said, oh, we also want to allow you to filter on another chart, or we want to have two bar charts, so that when you select from one bar chart, you see the results in the other bar chart. Then we need to send more data than just the aggregated data.
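The data-build step described here, rolling a possibly huge row set up into the small working table a chart needs, can be sketched in a few lines of plain Python. The names are illustrative; in Brunel the equivalent data-build commands are generated by the service and may be pushed to a big data engine such as Spark:

```python
from collections import Counter

def build_working_data(rows, group_field):
    """Roll a row set up to the minimal working table for a count-by-group
    chart. Sketch only: the real build commands are generated, not
    hand-written, and can run against a remote back end."""
    counts = Counter(row[group_field] for row in rows)
    return [{group_field: k, "#count": v} for k, v in sorted(counts.items())]
```

However many rows the source has, what travels to the client is only this small summary table plus the commands to rebuild it.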
So what we send to Brunel on the client side is what we call semi-aggregated, or working, data. It's not only the basic data, but also the commands so that each chart can re-aggregate that smaller set of data to create the chart. That allows us to do this interactive linking and interactive filtering, and that's why Brunel works in this fully disconnected situation: once we've passed down to the client all the data that is necessary, the client can do the interactivity on its own. So, to work through this in a little more detail: when we have this working data, we might have, for example, two elements in the charts, one with one set of build commands and the other with a second set of build commands. And this filtering allows us to take those and reprocess them right there on the client. We really think this is a necessary step. I've worked with a lot of interactive systems, and I've actually used the server round-trip style of interactivity, which can work okay if you've got a local service; it doesn't work too badly at all. But for a lot of people, and especially for these more modern web-based systems where you're trying to chain things together, as you might with Bluemix or a more service-oriented architecture, we think it's really important that the visualization interactivity is a local facility and does not require the server. Otherwise, you're tied to the speed of the server. You can never get to that roughly ten-updates-a-second interactive speed which makes things feel smooth; instead you're resorting to making a selection, doing something, waiting for the update, and getting it back down again. One other thing, and I should be moving on, but I just saw a question asking how many data points it can work with. On the big data side of the service, you can do whatever you like and set that up.
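The semi-aggregated idea can be sketched like this: the working rows are aggregated just finely enough (here, by Region and Summer) that the client can filter and then re-aggregate them for each linked chart without going back to the service. A toy sketch with made-up field names, not Brunel's actual client code:

```python
def reaggregate(working_rows, group_field, keep=lambda row: True):
    """Re-aggregate semi-aggregated working rows on the client after a
    selection or filter. Because the rows carry counts at a finer grain
    than any one chart needs, each chart rebuilds its own rolled-up
    view locally, with no server round trip."""
    totals = {}
    for row in working_rows:
        if keep(row):
            totals[row[group_field]] = totals.get(row[group_field], 0) + row["#count"]
    return totals
```

Selecting "Summer = Yes" in one chart just changes the `keep` predicate, and the linked bar chart re-rolls its totals instantly from the same working data.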
When things come down to the client, they're really limited by the abilities of the client at that stage. Our worst-case scenario, the oldest device we test as our minimum supported platform, is an iPad 2. It'll work fine there, but it tends to lose steam at around maybe 5,000 data points. My five-year-old MacBook Pro will quite happily deal with about 20,000 data points, even up to 100,000, so long as you're not drawing a scatterplot of all of them. So as long as your data rolls up to that kind of size, it works pretty well. We've also had good success with maps, as shown here. We have a new-ish map service which allows you to build maps where it'll automatically find the right map for you and automatically draw it. All you need to pass down are the key names; we compare those against our database of world names and find the best maps for you. For that, the map polygons can be quite complex. At the moment, I think our polygons are a little overly complex; they're designed really for high-end browsers, so we'll be improving that a little, and we'll mention it later. But overall, in terms of data set size, for the stuff which comes down to the client you should really be thinking of maybe a thousand up to maybe tens of thousands of data points. The back-end service which builds that working data, that really is your back-end Spark server, and we've had people do prototypes on effectively infinite amounts of data. So, summarizing some of the key technical points here: this is a 100% open source solution under the Apache license. That basically means you can download it and use it for free, for anything you like. We provide this because we believe that a core facility we need in the modern world is to be able to understand data.
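The key-name matching for maps could look roughly like this: normalize both the data's keys and a reference list of region names, then look each key up. This is a deliberately naive sketch of our own; Brunel's real matching against its database of world names is far more tolerant of abbreviations, codes, and name variants:

```python
def match_region_keys(data_keys, known_names):
    """Match user-supplied key names against a reference list of region
    names, ignoring case and punctuation, so the right map can be chosen.
    Unmatched keys map to None. Toy version only."""
    def norm(s):
        return "".join(ch for ch in s.lower() if ch.isalnum())
    lookup = {norm(name): name for name in known_names}
    return {key: lookup.get(norm(key)) for key in data_keys}
```

The user never specifies which map file to load; the best-matching map is inferred from the key names alone.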
And we think that a lot of applications will build on that, and good ways to do it should be generally available. Frankly, we shouldn't be making money off just displaying the data; we should be making money off how we use that in applications and how we make the whole task easier. Brunel is one part of that step. Brunel runs in Java to create the solutions and then builds browser artifacts, so that once you've built a visualization using the Java side, you don't need to go back to Java again. We also have a very small footprint, which is extremely important since our goal is to run client-side code. The vital support library we need to do the extra work, the aggregations and those kinds of features, is about 80K of uncompressed data, so it's a very reasonable footprint. A survey of the top 100 or maybe top 500 websites shows that they average 1.5 megabytes per page pull-down, and a lot of that is cached; the Brunel client-side runtime is likewise cacheable. So on top of the actual code you need to run the specific chart, which is pretty minimal, we're talking probably about 80K of library code there. Also, because we've designed around a best-in-breed visualization system, D3, which is itself designed to be very efficient, and we have built specifically around it, there's no extra cruft, no extra things you have to do. You don't have to reformat anything or do anything complicated. The JavaScript we produce is actually human-readable, so you can go and look at it. In fact, one thing we didn't mention here is that you could take the JavaScript we create from the Brunel service, just copy it, and use it in your application, completely ignoring the fact that it came from Brunel.
So, although we'd like you to use Brunel all the time, if you feel like you just want to use Brunel to create a one-off chart, you could take that JavaScript and modify it yourself, and that's a perfectly reasonable approach. And really, this whole thing is built on the basic premise that we just want to make it easier and faster to build great visualizations. We don't want people to have hassle and fuss in doing it. When we build other applications to analyze things, we want this to be a step where we can say: okay, let people just do it. Okay, I'll hand back to Dan now. We've got about 15 minutes left; we'll wrap up some of the interesting things to show you, but we'll also make sure we stay online and answer questions as well. Anyway, Dan, tell us a little bit more about future plans and ways people can get involved. Right, thanks, Graham. Okay, so just to summarize: Brunel Visualization is an open source visualization language designed specifically for those that work with data, essentially designed by people that work with data, for people that work with data, if you will. We believe that it is simple: we think the syntax is really simple to learn, simple to use, and flexible. There are lots of different things you can do. Even if you haven't seen the exact kind of chart that you want, it may be possible to use the syntax to create that chart as well. It might involve a new feature here or there, but there are quite a lot of things you can do by combining what is there right now. It's interactive: the language itself encompasses the interactive features. And smart: it'll try to apply best practices for you when possible. Also, it's integrated with modern tools used by data scientists. We've made a start on that; we'd like to do a lot more, and we think it can be very useful for that audience.
Also, it's accessible to both professional and aspiring data journalists. If you want to write a blog article about some data, you can gather that data, analyze it, really easily create a graphic in Brunel, and publish something interesting about what you've done. So this is an open source project, and successful open source projects grow around a community and an ecosystem. We're interested in growing a Brunel ecosystem, and there are lots of ways we think Brunel could be used to grow one. Just a few ideas are on this slide here. One obvious one is new and different language integration opportunities. We have integrations today, but there's a lot more that can be done. I saw a lot of mentions in the chat about Shiny and so forth, so there are a lot more improvements that could be made to our R integration, if somebody's interested in doing that. Julia is another language integration that could be done as well. You saw an integration for Spark in Scala; an integration for Python Spark could be done as well. And similarly, we have Python 3 support, but not yet Python 2 support. There are also other notebook technologies out there; Apache Zeppelin is an example of a different type of notebook technology, so having Brunel graphs work within Zeppelin would be interesting to us as well. Any kind of plug-in opportunity is interesting and potentially very useful, especially with things like blog tools. You saw the video with Graham embedding a Brunel visualization inside of WordPress; anything that can make that easier would be an interesting integration opportunity. And of course, any tool that's using data, managing data, manipulating data, analyzing data, any tool that touches user data, is potentially a very useful integration opportunity for Brunel, we think.
And finally, since right now the output of Brunel is D3, there are a lot of tools out there today that have some form of a D3 plug-in; D3 is enormously popular. So anything that can accommodate D3 visualizations is potentially a candidate for Brunel as well, because Brunel can drive those D3 visualizations and plug them right into those tools. So there could be some really interesting ideas that people come up with in that respect as well. Some of you have been asking what we're planning on doing next. Well, here are a few of the things we're thinking about right now and starting on, specifically graphics features. One is chart-in-chart capability: things like small multiples, faceting, panels, that kind of thing. This can multiply the number of charting possibilities by an enormous factor. You see a couple of examples there on the right, maps with graphs embedded in them, faceting; essentially, a large number of additional chart types become possible with just that one feature, so that's something we're taking a close look at. We also want to make improvements to how our data pipeline works, especially for integrations into things like Python. It's working today, but there are improvements we can make in terms of performance and speed, and more efficient use of things like data frames, so that's one of the things we're looking at as well. You saw some examples of networks and maps, but there are more features we're interested in adding that can open up an even wider realm of possibilities there: for example, custom layouts for graphs, more types of maps, and general improvements to both. And lastly, oftentimes incoming data can contain multiple-response data sets.
In other words, a single cell in a data set could actually be a set of responses, multiple choices and so forth. Having native support for that within Brunel can really save a lot of time compared with manipulating that data into some sort of relational format, if you will. That can be really useful for things like surveys and lists. So those are a few of the things we're considering; there are potentially others as well. On our site on GitHub, we're certainly interested in hearing the kinds of things that people would like to see. There's a place on GitHub where you can enter issues, and that's one way to communicate with us about the kinds of features and integrations you'd like to see, your ideas, and so on. Of course, it's also a great place for reporting problems and issues. We're also making use of Gitter, a chat room service that integrates with GitHub. So if you've got some specific question, feel free to go to the Gitter chat room and ask; we'll do our best to get back to you on quick questions and have discussions in that forum. So we'll leave you here with a thank you. We've got about 10 minutes left, and I think we can continue to answer questions in the chat. The slide here has several links. You can find us on developerWorks Open, and you can find our GitHub project. We are on PyPI, so you can directly install Brunel and use it within Python. We have a blog site, brunelviz.org, where we will post information about Brunel. We'll also do what Brunel is for, and that is blog about data that we find interesting; so we'll drink our own champagne there, if you will. Feel free to visit that site.
And of course the online application, where you can directly use Brunel, try it out, get used to the syntax, and then publish graphs, is available as well. There's also a link to the tutorial. And recently we've put up a YouTube channel; it includes some of the videos that you've seen here today, plus others, some introductory videos and so forth, and we'll try to keep it populated with the latest content. So I guess with that, we'll hang around probably till the end of the timeframe here and answer questions; there's no audio for those, but we'll continue to answer the questions through the chat. And I think I'll turn it back over to Kathy. Thanks again. Thanks, Dan. I just want to remind everyone that we will be presenting calls regularly on different developerWorks Open projects, and that our next call is April 13th. Our topic will be Agentless System Crawler, a cloud monitoring and analytics framework that gives you deep visibility into cloud platforms and runtimes. So make sure you register for the call, and we will send you a reminder. You can learn more and register by going back to the developerWorks Open browser tab that you used to launch this meeting. Okay, and I'm going to ask that, if you don't mind, you please fill in the poll question and provide your feedback on this talk. We'll stay on; Graham and Dan and Peter will continue answering your questions. I just want to thank you again for taking part in the developerWorks Open Tech Talks, and don't forget to visit the developerWorks Open website to see more open source innovations. Okay, we'll just hang on and keep answering your questions in the chat and also live, so I'll go back on mute and we'll stay on. Okay, we are at the end of the hour and I believe all of the questions have been answered. So at this point, I will go ahead and end the meeting, and I want to thank everyone.
I want to thank Dan and Graham for presenting, and thank Peter for helping answer the questions, and we'll go ahead and end the meeting. You can continue posting your questions; these questions will stay here, and the replay will be available within just a couple of minutes. Thank you, bye.