Well, thank you very much indeed. So I love technology. I've got a 400 DPI screen in my pocket, and I can order pizza from my console. But I've got a letterbox inside a box, inside a letterbox here. It's beautiful. I'm not quite sure how that's happened, but we'll just have to run with it. So I submitted this with the title Architecting Visualizations, and then almost immediately regretted it. It's a little bit grandiose, really. So I'm going to refactor that a little bit to Building Better Charts. And what I mean by better is charts that are as effective as they can be at communicating, but that are also easier to build and easier to maintain. So just out of curiosity, who's got a formal computer science background? Right, OK. And who's got a design background, or works as a designer? OK, so it's about a split down the middle. I think because of the web, and particularly because of the value of interactivity to data, like it or not, we're pretty much all in the software development game now. And developing charts is kind of a small space. I think it's been very distributed, because you've had people in lots of different languages working on this, but there hasn't been much sharing between them. And I think now it's coalescing more around JavaScript, so there are more people running into the same problems. So what I want to talk through today is some of the things I've learned and how I think best practice is developing, through my vast half decade of experience, and sort of present that as fact and then see what you guys think. Before I go on, there'll be links to everything I mention at the end, so don't worry about any references as I go along. So I've actually always worked as a software developer, but I actually trained as a graphic designer.
And I was really lucky because my teachers were hardcore Bauhaus types who had actually trained there. So they took this really formal and rigorous approach to information design. And during my course, I found myself producing larger and more complex pieces of data as I went along. And I was doing this mostly by hand in Illustrator, like adjusting the dimensions of 2,000 boxes. And it got pretty painful pretty quickly. I'd done a little bit of PHP and things like that before, so I started effectively using programming to pull these things together. Initially that was JavaScript, ironically enough, in Illustrator. And then Processing and Ruby. And as you go along and do these things, and I think most of us in terms of visualization programming are self-taught, you learn how to do it better because it's less painful that way. So you learn not to use magic numbers all over the place, because then when you go back six months later, you can't understand why the hell anything works. And you learn that there's no such thing as software that doesn't have tests. It's just whether it's you doing the testing or the computer doing the testing for you. And slowly, essential good practice sinks in. And you get a little bit better. And I was particularly lucky because I had some really good mentors along the way that really helped. So I've been in this career of data visualization developer for a little while now. And that's involved everything from database design to distributed event systems for a bunch of different firms, a couple in finance, and now in security, and for the Guardian. And there have been some patterns that emerge. So what everyone kind of needs is reusable charts. And I ran with donut charts because they're kind of like the kitten pictures of visualization presentations these days.
And they want charts that they can reuse and modify and tweak as necessary for different contexts, because developing these things is expensive. It's time consuming. You kind of want to do it once. And if you're a reasonably competent developer, you're going to say, OK, so this is the problem I need to solve. I'll take a look around and see what tools already exist to help me solve this problem. And the funny thing about charting is, if you look around, there are a lot of tools to do it for you, and not just quite a lot, a ridiculous number of them. So if you search on GitHub for JS PubSub, which has got to be one of those things that everyone implements, there's something like 200 or so. You look for charts, there's like 900. It's ridiculous. There are thousands of people implementing the same charts again from scratch. And it's kind of weird, because that doesn't happen in most software areas. It kind of coalesces around a couple of tools and then everyone uses those. And I didn't really understand why this is the case for a long time. It didn't really fall into place until I was working at The Guardian. So if you're not familiar with the organization, The Guardian is a UK national paper. And it has a fantastic graphic design team. So you've got this team of people who produce amazing graphics like this, basically at newsroom pace. They're knocking out charts every single day for stories, as well as putting together big things like this. And they're craftsmen and craftswomen. They take what they do very seriously. And they work really, really hard to take really complex pieces of data and make them understandable to a really general audience, which, when you're dealing with complex financial stories and things like that, is a really difficult thing to do. And my role there was to help them put more of this content online and to do it in a better way.
And we were lucky enough that at this time we actually had Irene with us for a few months on secondment. And this was kind of our brief, really. So we sat down with the graphics team. And they had these huge pasted-up books, which would have every single chart they'd ever done. It was kind of their way of keeping track of how their craft was evolving over time. And you go through a few pages and you start seeing some patterns pretty quickly. So here's a classic house style line chart. And it's a pretty simple thing. You look at this as a developer and you think, OK, so we need some kind of API so you can create one of these. You're going to have to pass in some data, obviously enough, so you've got a series there. That's straightforward enough. And then you look at the y-axis and you think, OK, I can't just work off the data, because we've got stuff that's below the data here and we've got stuff that's above the data here. So maybe I need to set a min and a max. But then I need to set, say, a number of intervals, because I don't want weird intervals of like 37.6, 96.4, because then it's really hard to read the damn chart. So I'm going to have to either set a min and a max and a number of intervals, or maybe I can pass them in as an array. So there are a couple of configuration parameters around that as well. And then you look at the x-axis. And this is where it gets kind of funny, because it's a time series, right? But it's set in months. And months are kind of awkward, because months are a unit of time, but at the same time months aren't consistent, because there might be 28 days or 31 days. If you put months on the bottom of the chart, you get these weird misaligned baselines and people think you fucked up. So you've got something which is a time series but not actually laid out over time exactly. So it's a little bit fiddly there as well. You need to think about how you're going to implement that. But it's not the end of the world.
You're a reasonably competent developer; it's easy enough to knock together. So you go through a few more pages and you find something like this. You think, OK, that's simple enough. I can put an area underneath and job done. And then there's the annotation. And that's interesting. So now you're going to need a separate set of data, which is your annotations. And then you're going to have to specify where they sit. And that's kind of awkward, because if that 11th of April was on that spike over there, it'd be sticking out of the top of the chart. Now the shape of my chart's changed. So maybe what I need to do is, for every annotation, say whether it's above or below the line. And then if I had two lines, that'd get really awkward. Or maybe I need to specify the size of the tick. But then also, is that a piece of text, or is that actually based off a piece of data in there? Is that formatted somehow? And then how long is that tick? Do I want that to be short, or do I want that to be automated, or are they always the same? And it starts getting a bit more fiddly. And then I look at this bit at the bottom. And I think, shit. If you really want to annoy a web developer, create a design where everything overlaps slightly. Because one of those rules is you draw in the damn boxes. So you've now got this problem, because this line, let's say it's drawn in SVG, is now going to be over the top of the x-axis. But you've then also got to consider how this is going to play out internally. If you want to draw that so it can go off the bottom but not be displayed, you're actually going to need two scales. You're going to need one scale which is your actual data, and one scale which is like a max drawing scale, so that the 150 becomes the bottom and then anything below 150 actually gets drawn at 150. So it starts getting a bit more complicated. And then you turn a couple more pages and you find this. And then you start getting worried.
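[Editor's sketch] The two-scale idea described above can be illustrated roughly as follows. This is a minimal sketch, not the code from the talk: `linearScale`, `clampedScale`, the 150 to 400 domain and the 300 pixel plot height are all assumptions made for illustration. (D3's linear scales also offer a `clamp` option that covers the same need.)

```typescript
// A scale maps a data value onto a pixel position.
type Scale = (value: number) => number;

// Plain linear interpolation from a data domain to a pixel range.
function linearScale(domain: [number, number], range: [number, number]): Scale {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  return (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Same mapping, but the input is first clamped to the visible domain,
// so nothing is ever drawn outside the plot area.
function clampedScale(domain: [number, number], range: [number, number]): Scale {
  const lo = Math.min(domain[0], domain[1]);
  const hi = Math.max(domain[0], domain[1]);
  const inner = linearScale(domain, range);
  return (value) => inner(Math.min(hi, Math.max(lo, value)));
}

// Hypothetical chart: the y-axis shows 150..400 in a 300px tall plot.
// SVG y grows downward, so 150 sits at pixel 300 and 400 at pixel 0.
const dataScale = linearScale([150, 400], [300, 0]);
const drawScale = clampedScale([150, 400], [300, 0]);

// A value of 140 is below the axis minimum: the data scale places it
// below the plot area, while the drawing scale pins it to the baseline.
const belowAxisTrue = dataScale(140);
const belowAxisDrawn = drawScale(140);
```

Keeping both scales around means labels and tooltips can still report the true value while the drawn path stays inside the box.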
So you've got a whole bunch of other shit kicking in here. You've got these gradients coming in all the way through. So you've got to think, okay, this time series now has three data points for every point: the min, the max, and the middle. And then you've got this time series here, and it's suddenly a dotted line. Ooh, I won't go any further that way. So now, for one of these time series, you're going to have two data points, because one is whether it's now in dotted mode or not. Or maybe you do it as two series, but then you've got another series there, and then you're going to have to keep the labels consistent, and it's going to turn up as two on the legend. So that's going to be nasty. And your code's starting to get a little bit fiddly at this point, and then you turn another page and find this. And I'll just stop for a second, because I actually love this chart. So this is actually graphing government estimates of GDP growth, or shrinkage in this particular case, as it went into the financial crisis. And you can almost see the point where the invisible hand of the market is taking a civil servant's head and shoving it in the sand. And each one of these estimates is getting more and more horrific as they're forced to admit just how bad things are going to get. It's a beautiful thing. But you've now got to think, once again, each one of these time series has now got three points in it, but you might have one that doesn't. So if you're trying to put all of this in one chart, it starts getting very, very complex. And the really horrible thing is, you start coding this and you realize that there isn't a linear relationship between the complexity of the features you're trying to add and the complexity of your code, because these features interact on top of each other.
And so if you want to support, say, having that area graph, you've got to think how that's going to interact with having support for the dotted form as well. So how the hell do those two go together? Because you're not drawing a dotted line any more, you're drawing, like, dotted areas. And so you get this sort of cumulative effect in terms of complexity, and it starts looking pretty horrific. Lest this seem like some kind of hypothetical scenario I've put together, take a look at Highcharts. So who's worked with or looked at Highcharts here? Yeah, a fair few of you. Have you looked at the Highcharts API recently? It's got over a thousand configuration parameters. Not just over, like comfortably over a thousand configuration parameters. And it's everything from the left border width of the hover data labels to reversing the legend in certain circumstances. It's insane. And then you go on their forums and it's full of thousands more feature requests. They just keep coming. And it's not just additional features, it's more and more granular features. And it's like staring into the abyss. I can't imagine what coding this thing must be like. It must just be becoming this intensely complicated ball of hell internally. And I can see why they've made such a commercial success of this, because there cannot be any joy in programming that thing, right? This is exactly the space where you can make money out of software. So despite the fact there's like 300 open source competitors to what they do, they're making bank off this thing. And I can see exactly why. And so coming back to that original point, if you imagine this is the problem space of line charts, I think what happens is that every time a dev comes along, they try a couple of tools and there'll be one feature they need that isn't there. Maybe they need dotted lines, or they need diagonal labels, or they need some kind of different data format.
And so they create a chart that covers their problem space, and then someone creates one for theirs, and theirs, and theirs. And so everyone ends up creating these libraries that cover the part of the line chart problem space that matters for them, but not for everybody. And so you end up with more and more of these tools. So that's kind of where we are now. We've got this ridiculous number of charting libraries. And then on the other side, we've got D3. And D3 is kind of amazing. I mean, it's built around this really simple idea, which is that you have a direct binding between the properties of your data and the properties of your chart. It's a really simple idea, but it's really well executed. And part of that comes down to its lineage. So D3 is developed by Mike Bostock. And before he was working on D3, he was working on Protovis, which was basically built around the same idea with a thicker abstraction between you and the drawing API. And Protovis emerged from the ashes of Flare, which was an earlier library that implemented the same idea on Flash, and which was a re-implementation of Prefuse, which implemented the same idea on Java. And I think what's happened is, as it's been iterated, it's been refined and chamfered as a concept and an API to the point where you've got this really beautiful intellectual diamond of an API that gives you exactly what you need to do charts quickly. And I've come to think of D3 as kind of like the jQuery of data vis, because it enables you to do what was already possible, but far easier and far faster. It takes care of the really fiddly bits and allows you to focus on building exactly the thing you want. And it gives you a set of tools that make it really easy to do that. But web app development has moved on from just building apps with jQuery, for a good reason. And I think the concern I have a lot of the time with D3 charts is around reusability.
And I'm not the only person. I mean, there are quite a few people that have started building charting libraries on top of D3. So you've got, like, NVD3, DEX charts, D3 simple X charts, there's a whole bunch of them floating around. But I think my concern is that they're in danger sometimes of falling into exactly the same mistakes as the earlier charting libraries. So you could say this is classic feature bloat, right? You're trying to support the kitchen sink, and that's why you're running into trouble. What you should do is build an MVP, or a really lean thing that just supports what you need. But the reality, certainly in a situation like the Guardian, and I think anywhere where you're really trying to do the best charts you can, is that you actually need all this functionality, because creating a really effective chart is a craft. And people aren't doing these odd things because they want to be different. They're doing them because it makes it better. It makes a better chart. It makes it more communicative. It makes the point you're trying to make clearer. It allows you to focus on the data that matters. And so I don't think it's a reasonable compromise to say, well, that's too complicated to implement, or that's too fiddly to implement. We have to find a way through this. And I think it's also interesting to go back to what Amanda was showing this morning, where you had the video of the development over time, and the fact that actually getting roughly the right chart is pretty easy. But the vast majority of the work, most of the time, is actually in the detailing. It's in the finessing. It's in the polishing of these things and getting the details right. And you want to reuse that work. I mean, if you look at what Doug was saying in terms of ARIA, getting all that ARIA functionality in place is not an easy thing to do, but you want to do it. And you want to reuse that stuff, because doing it every time is pretty painful.
So, you know, we need to find a way forward. And so where do we go from here? Well, on one hand we've got D3, and on the other hand we've got these high-level charts. I think what we need to try and do is find a better way of building charts that can be reused. So done again with the titles, but: embracing the idea of modularity and composition. What I mean by that is to start breaking these things up a bit. So rather than building a chart, it's breaking the damn thing up. And it's not just at a visual level, but very much at a code level, breaking these things into discrete components. And this has two key benefits. The first is obviously that your code is a bit more modular. That means that you can work on individual parts, and potentially they can be reused across charts without having to write them again. But the other part is that you start formalizing internal interfaces within your chart. Because if you have to think about what's essentially the role of your axis, what's the role of your main plot, you then actually separate out what falls on each side. And as soon as you create that API, you create an opportunity for someone else to say, well, I need a completely different axis, and they know how to implement that and drop it into place. And that's how we can start getting towards these things being a bit more modular, and that enables extension over configuration. Because you're never gonna be able to provide enough configuration parameters for people. And I think this is what you see time and time again: people think, oh, okay, there's another need, we'll add another configuration parameter. But it doesn't scale. You'll never be able to provide everything that people want. The only way you're gonna be able to provide something that's genuinely reusable is to provide inflection points where people can drop in their own code. And part of it is also about changing the way we write code.
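[Editor's sketch] As a rough illustration of extension over configuration, the sketch below splits a chart into components behind one small interface, so a different axis can be dropped in without adding a configuration parameter. Every name here (`Component`, `linearAxis`, `explicitTicksAxis`, `linePlot`, `composeChart`) is hypothetical, and the "drawing" is reduced to strings so the example runs without a DOM.

```typescript
// Shared context handed to every part of the chart.
interface ChartContext {
  values: number[];
}

// The one internal interface: a component turns context into draw output.
interface Component {
  render(ctx: ChartContext): string[];
}

// Default y-axis: evenly spaced ticks between a fixed min and max.
function linearAxis(min: number, max: number, intervals: number): Component {
  return {
    render(): string[] {
      const step = (max - min) / intervals;
      const ticks: string[] = [];
      for (let i = 0; i <= intervals; i++) {
        ticks.push(`tick ${min + i * step}`);
      }
      return ticks;
    },
  };
}

// A drop-in replacement written by "someone else": ticks given as an array.
function explicitTicksAxis(ticks: number[]): Component {
  return { render: (): string[] => ticks.map((t) => `tick ${t}`) };
}

// The main plot, also just a component.
function linePlot(): Component {
  return {
    render: (ctx: ChartContext): string[] => [`line through ${ctx.values.length} points`],
  };
}

// The chart itself only knows the Component interface.
function composeChart(parts: Component[]): Component {
  return {
    render(ctx: ChartContext): string[] {
      const output: string[] = [];
      for (const part of parts) {
        output.push(...part.render(ctx));
      }
      return output;
    },
  };
}

const ctx: ChartContext = { values: [3, 41, 97] };
const standard = composeChart([linearAxis(0, 100, 4), linePlot()]).render(ctx);
const custom = composeChart([explicitTicksAxis([0, 37.6, 96.4]), linePlot()]).render(ctx);
```

The second chart needed an unusual axis, and got one by swapping a component rather than by growing the options list of the first.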
This is a bit more of a D3 specific thing, but here's a chunk of D3 source code, in fact. So you've got a function here for converting HSL to RGB. It's a fairly straightforward thing. And then inside that you've got two locally scoped functions, descriptively named V and VV. Now, if I wanted to, for some reason, change the range that this comes back with, so change the granularity of the color from being 256 to, say, 128 in VV, I'd have to rewrite this entire function or copy this source code into my code, because there's no way that I can modify that internally. The better way of doing this is to make every one of those functions exposed. Because if they're exposed, I can then just tweak the smallest amount of code possible to actually write the best thing possible. So, I mean, this isn't a flaw of D3, but I think a lot of people look at D3 source and use that as a model for building charts. I don't think that's necessarily a good idea. You know, this is quite beautiful in the way you get scoped functionality. But there is this massive downside in terms of reusability that people need to be aware of. So, the next thing I wanna cover is separating data from charting. And this is kind of key as well. So you've got your data, you've got your chart. And I think most people don't tend to think about this particularly much. You kind of think, okay, you get your data in, and maybe we push some stuff back out at some point. You know, it's quite neat. But I think the reality is that unless you quite carefully think about the interface between the two, you end up creating implicit dependencies. So you start thinking about when things load and what format they're in. And if you've ever tried to pick up someone else's chart and you've spent half your time trying to work out the data format to get certain things to work, therein lies the problem.
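[Editor's sketch] One way to make the boundary between data and chart explicit is a small data-source interface that the chart depends on instead of a file format. The sketch below is an assumption-laden illustration: `DataSource`, `getSeries`, `csvSource`, `jsonSource` and `describeSeries` are hypothetical names, and the CSV and JSON layouts are invented for the example.

```typescript
interface Point {
  x: string;
  y: number;
}

interface Series {
  name: string;
  points: Point[];
}

// The formal interface: the chart only ever calls getSeries.
interface DataSource {
  getSeries(name: string): Series;
}

// Assumed CSV layout: first column is x, each later column is a named series.
function csvSource(csv: string): DataSource {
  const rows = csv
    .trim()
    .split("\n")
    .map((line) => line.split(",").map((cell) => cell.trim()));
  const header = rows[0];
  const body = rows.slice(1);
  return {
    getSeries(name: string): Series {
      const column = header.indexOf(name);
      if (column < 1) {
        throw new Error(`Unknown series: ${name}`);
      }
      return { name, points: body.map((row) => ({ x: row[0], y: Number(row[column]) })) };
    },
  };
}

// Assumed JSON layout: { "seriesName": [[x, y], ...] }.
function jsonSource(json: string): DataSource {
  const data = JSON.parse(json) as Record<string, [string, number][]>;
  return {
    getSeries(name: string): Series {
      const pairs = data[name];
      if (!pairs) {
        throw new Error(`Unknown series: ${name}`);
      }
      return { name, points: pairs.map(([x, y]) => ({ x, y })) };
    },
  };
}

// Stand-in for a chart: it never sees the underlying format.
function describeSeries(source: DataSource, name: string): string {
  const series = source.getSeries(name);
  return series.name + ": " + series.points.map((p) => `${p.x}=${p.y}`).join(" ");
}

const fromCsv = describeSeries(csvSource("month,sales\nJan,10\nFeb,12"), "sales");
const fromJson = describeSeries(jsonSource('{"sales": [["Jan", 10], ["Feb", 12]]}'), "sales");
```

Both calls produce the same description, which is the point: swapping the data source does not touch the charting code.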
So, once again, it's about creating a formal interface so that rather than just loading in that data, you have some kind of like get series function or something very simple. And as soon as you create that delineation, it becomes really easy for someone to drop in a different data source. And that way, if you know, you're pulling stuff out of a CSV file and they wanna pull it out of a JSON file or an API or something else, it becomes really easy because there's a clear inflection point where they can do that. And that once again, it just helps promote, by splitting these things up, you help promote reusability. And I think that's key. And so that kind of covers like individual charts to a reasonable degree, but I think it gets a bit harder when you start structuring sort of larger interactives. So, if you put things on a slide, they kind of look like statements of fact, it's great. So, I really like state machines when it comes to building visualization. So like here's a really simple example of like a Viz. So you've got some kind of intro screen or maybe it's like an overview screen and then you've got one level of detail and then you've got like a drill down or a pivot screen. You know, it's a fairly straightforward model you come across all the time. It could be like an interactive story piece. We've got an intro and then you go into detail or it could be like an analytics pivot pane where you get like an overview and then you go into details views that split down certain sets of your data. And if you try and build this out with an MVC framework, you kind of, you know, you've got your controllers and they have methods under them. And your controllers act as a group of logical functionality and your methods sort of activate certain pieces of functionality. So, let's say we're going from that first screen to that second screen. So, you know, we implement this with our MVC framework. 
Okay, so, you know, our action comes in, goes to the drill down controller, goes to the show action and that does the job. But I think a lot of the time with interactive pieces is a bit more fiddly than that because so imagine this is time. So that's where, you know, on the left is where the user clicks, on the right is when that transition is completed. You've actually got a lot of things that may be happening in there. So you might need to remove some event handlers from the old ones so people can't click on it again while the transition is happening. And then at the end you need to put the event handlers on the new ones so people can start interacting with it. And then maybe you need to load some data during that process as well. You've got certain elements fading out. Maybe you've got some other elements fading in. There's a lot of things going on in there. And, you know, you're a good coder so you split those things out into functions and you put them all in your controller and your method and, you know, it works. And then your designer comes along and says, oh, by the way, there's gonna be this link in the text on the intro that means you can go straight to one of the drill down views. And now you've got a problem because all the code that you've written to go from one view to the other is based on assumption that you're going from the view on the right and not the view on the left. And all of the code to both pull down that old view and build up that new view is in one lump. So, okay, you can split that out into separate functions and you can reuse those functions across, you know, both of these methods now. But the problem is that as you find yourself doing that, you find you have to parcel these parameters around to kind of say which one you're coming to or from because you can't make any assumptions anymore. And that means you have to pass around all this additional information. 
Your code quite quickly, I find, gets quite bloaty, gets quite fiddly, and it's quite buggy. And the fundamental reason I think this happens is that you've got essentially a conceptual mismatch between what you're trying to do and the way you're implementing it. And this is why I like state machines, because they allow you to have a much closer match between the conceptual map of your code and the conceptual map of what you're implementing. So imagine each one of these are your states and each one of these are your transitions. And what this allows you to do is, let's say you need to pull down those event handlers. You can just associate that with exiting that first state. And it doesn't matter how you exit that first state. Whenever you exit it, those handlers will be pulled down. And the same with, say, putting those handlers up when you enter a new state. And then let's say you've got two different transitions between the same two states. One is quick, one is slow. Each of the animations can be associated with only one of those transitions. It doesn't matter which one. And that way it's easy to implement multiple transitions between the same states, or multiple routes through this thing, and carefully associate your functionality only with the parts that matter. And it's keeping that conceptual map as close to your code as possible. And so who's familiar with the idea of a code smell? A few people. OK, so it's the idea that there are certain patterns you notice in code which are, generally speaking, a bad idea. And I've come to think of it as a code smell when you go to change the design, and you change the design here and you have to change code here and here and here and here. That's a sign that you've got a problem. And I think state machines are a great way of solving part of that problem when it comes to visualizations. So in terms of actually implementing this, you've got a few options.
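[Editor's sketch] Before looking at real libraries, here is a minimal hand-rolled state machine showing the idea of tying work to entering a state, exiting a state, or one specific transition. It is a sketch under assumptions, not the API of any of the libraries mentioned in this talk: `createMachine`, the state names and the transition names are all hypothetical.

```typescript
type Hook = () => void;

interface StateDef {
  enter?: Hook; // runs whenever this state is entered, by any route
  exit?: Hook; // runs whenever this state is left, by any route
}

interface TransitionDef {
  from: string;
  to: string;
  run?: Hook; // work specific to this one route, e.g. an animation
}

function createMachine(
  initial: string,
  states: Record<string, StateDef>,
  transitions: Record<string, TransitionDef>
) {
  let current = initial;
  return {
    state: (): string => current,
    fire(name: string): void {
      const t = transitions[name];
      if (!t || t.from !== current) {
        throw new Error(`Cannot fire "${name}" from state "${current}"`);
      }
      const exit = states[t.from]?.exit;
      const enter = states[t.to]?.enter;
      if (exit) exit(); // tear down the old state
      if (t.run) t.run(); // route-specific work
      current = t.to;
      if (enter) enter(); // set up the new state
    },
  };
}

// The three-screen visualization from the talk, plus the shortcut link
// from the intro straight to a drill-down view.
const log: string[] = [];
const viz = createMachine(
  "intro",
  {
    intro: { exit: () => log.push("intro: remove click handlers") },
    detail: { enter: () => log.push("detail: add click handlers") },
    drilldown: { enter: () => log.push("drilldown: add click handlers") },
  },
  {
    openDetail: { from: "intro", to: "detail", run: () => log.push("slow fade") },
    drillDown: { from: "detail", to: "drilldown" },
    jumpToDrillDown: { from: "intro", to: "drilldown", run: () => log.push("quick cut") },
  }
);

viz.fire("jumpToDrillDown");
```

The intro's teardown ran even though the shortcut was added later, because it belongs to the state rather than to any one route out of it.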
So the first one is JavaScript State Machine, which gains a lot of points for having possibly the most literal name ever. It's been around for a long time. It's probably not the most feature complete, but it's really easy to work with, and it's got a really quite simple API. The next one is machina.js. I'll put up links for all of these at the end. So machina is probably the coolest one. It's got some really, really interesting functionality beyond the basic state machines. You can do things like wire it automatically into an event bus. You get all sorts of neat events all the way through for everything. And it's borrowed some ideas from Erlang, which are really cool, where you can have states, and then those states can implement methods, and they can implement the same methods in different ways. So you can have an offline and an online state, and they can both have a save method, and those can work in different ways. And then your UI doesn't have to worry about anything other than calling a save action; everything else gets handled by the state machine. It's really, really elegant. It's a really nice one to look at in detail. And I'd be derelict if I didn't mention Storyboard, which is one Irene and I have been working on. This focuses a little bit more on trying to map things around more complex transitions and animations, and it's really good if you're trying to build out more complex storyboards. That's something we've been trying to focus on with it. So particularly with that one, if you give it a try, let us know. We're looking for feedback. So the last key thing I'd like to talk about is eventing. And it's very simple, really: event the hell out of everything. You can't have too many events coming out of a chart.
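[Editor's sketch] As a rough picture of what "eventing everything" can buy you, the sketch below has each chart emit events for its interactions, and then links two charts with a single listener. The emitter and chart here are hypothetical stand-ins (`createEmitter`, `createChart`, and the event names are assumptions); in a real chart the `hover` and `click` methods would be called from DOM handlers.

```typescript
interface ChartEvent {
  type: string;
  index: number; // index of the data point involved
}

type Listener = (event: ChartEvent) => void;

// A tiny publish/subscribe helper.
function createEmitter() {
  const listeners: Record<string, Listener[]> = {};
  return {
    on(type: string, fn: Listener): void {
      listeners[type] = listeners[type] || [];
      listeners[type].push(fn);
    },
    emit(type: string, index: number): void {
      (listeners[type] || []).forEach((fn) => fn({ type, index }));
    },
  };
}

// A stand-in chart that announces everything that happens to it.
function createChart(name: string) {
  const events = createEmitter();
  let highlighted = -1;
  return {
    name,
    on: events.on,
    hover(index: number): void {
      highlighted = index;
      events.emit("hover", index);
    },
    click(index: number): void {
      events.emit("click", index);
    },
    highlight(index: number): void {
      highlighted = index;
      events.emit("highlight", index);
    },
    highlighted: (): number => highlighted,
  };
}

// Two charts over the same data: hovering a bar highlights the matching
// point on the line chart. Neither chart knows the other exists.
const bars = createChart("bars");
const line = createChart("line");
const seen: string[] = [];

bars.on("hover", (e) => line.highlight(e.index));
line.on("highlight", (e) => seen.push(`${e.type}:${e.index}`));

bars.hover(3);
```

Because the coordination lives in the listener, either chart can be replaced or reused elsewhere without changes to its own code.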
The more points you add, the more events that are bouncing out of a chart when anything happens, if people mouse over things and they touch them and they click them, it gives you more points to hook onto when you're trying to work with those charts, particularly when you're trying to get multiple charts to work together. It makes it much, much easier to make that happen. Oh, pardon me. So that gets a lot easier. So that kind of covers the broader space of what I'm looking at today. But I guess the thing is that this is an area that more people are working in now, and you still don't actually see much discussion in public around how we actually implement this. So I'm gonna open this up to questions in a minute, but I'd particularly like to hear about any approaches that people here are taking, or any libraries or tools or techniques they've found particularly effective for building visualizations. Because I don't think there's been enough discussion between us, those that are working on these things, about how to actually go about it in a good way. So if there's anything you're using, I'd love to hear about it now, and kind of open this up to a bit of a discussion. One last thing: I'd be derelict not to mention that my employer, CrowdStrike, is looking for a UI designer at the moment. So if you like the idea of trying to find needles in very large haystacks, come talk to me or my boss, Brian, who is somewhere in the audience I can't quite see. So thank you very much. So I'll open this up to questions. Yeah. Yeah. How do, well, MVC frameworks and state machines. I think, well, you can use a state machine as a replacement for what a controller would usually do in an MVC framework. So you'll still have views, you'll still have a model, but you use a state machine to control the way you move through an application rather than controllers. That's the short answer. Yeah.
I have it self because the way I've usually seen people implement it in the case of visualization is, and yes, a model event comes in and they'll just redraw the chart, which is probably not the point of MVC unless you are controlling each bit in the chart on a change event. So how do you do it? So I think it's more that you don't. A redraw tends to be a more complex operation than simply updating the screen, because you often have a lot of animations and other actions going on that make it much more of a complex change. I'm not entirely sure I follow your question. Sorry. So suppose you have a bar chart, okay, it has Trekkie bars on it, right? Yeah. So if the model changes behind the scenes, Yeah. You do not want to redraw the whole chart, because then that's not really using the model view controller effectively. Suppose one of the data points changes, then you want to just change one bar in your chart. That's probably a more effective use of the MVC paradigm. However, I see in the case of visualization that could be a problem, because if one data point changes, your axis might have to contract and shrink, and there's all kinds of interactions that could happen. So MVC in data visualization is kind of a complex topic to me, and I was trying to figure out how other people do it. That's a good question. I'd love to hear how other people do it. I mean, Yeah. I mean, is there anyone that wants to answer that one? So it's not going to touch other things at all. It's going to wait someone one. It's hard to do this over a server, because you might have to make a huge AJAX call or get a small call or something. But if you load everything on the client, which I think more and more people will start to do, it's pretty straightforward. You know, you just change the data point. It's always way forward. So I don't know, for me, the answer lies in kind of front-end or client-facing frameworks. Yeah, I'd agree with that.
I mean, data binding does solve some of the problems, but you do need to be careful about what you're saying in terms of axes changing and everything else. I mean, you change the shape of your data, you change the shape of your chart, and you need to be very careful that you cover all those scenarios, I guess. Any more questions? Yep, at the very back. So which is where D3 turns into something like ggplot2, which is great. Put it in coupling and all this other nonsense. Yeah. Get that at D3 and then gone. So I kind of hand-waved that one. I mean, I've tried this a couple of times. I know a fair few people that have as well. I think it's difficult to get right. I think you probably need to do it on a basis of discrete visual components. The difficult thing comes when you want to integrate multiple charts on top of each other particularly. I mean, I think with D3 it's quite hard to find the right point to do this, because the API is really not built for it. And you need to kind of push it into an odd shape to actually get the right bits exposed to be able to expand things nicely. I think it's something that almost needs to be done through more of a bazaar approach than a cathedral approach. We kind of need to try a bunch of different levels of abstraction and detail and see exactly which one is right. But I do believe that there is a level which is sensible without going to either extreme. Because I think building it from scratch every time, that's always appealing to programmers, isn't it? You always want to do that, because it's a green field and you can do everything properly this time, until you finish it and then you want to do it again. So there has to be a middle ground that I think will work better than where we are right now. Any more questions? Yep. Well, you can test everything up to the rendering layer quite easily.
I mean, if you're working with something like D3, you don't really have to worry about testing that your scales are working correctly because that'll happen anyway. In terms of automating testing to the level of actual display, I've never seen a positive ROI on that. I think it's one of those traps in which you can waste a spectacular amount of time to achieve very little. Any more questions? All right, I think that's all. Thank you very much.