Our first speaker today is Adam Harvey, who will be talking about visualising scientific data with HTML5. But I think first up he wants to give a reminder to everyone to go to a web address and give away your personal details, and it's a redirect address, so you can trust him. And the only interesting part from his made-up bio is the fact that his mother, Adam Harvey's mother, referred to him as an unshaven, unkempt mess with terrible hair, whereas for me, I got the same description, but from my wife. So without further ado, Adam Harvey. Thank you.

Thanks, Peter. Good morning, everyone. So before we actually get started properly, I need to gather some data, because I need it for a demo later on. As I said to the people who were here early, I'm doing a live demo with user-submitted data in a session. Clearly I've lost my marbles. So go to this address and enter where you're staying in Brisbane. Now, if it's not one of the preset options on the right-hand side, just be very generic. You know, just pick a suburb or something like that. We don't necessarily need sort of bedroom-level detail here. But if you could do that, that would be really, really good, please. So there's your address. I'll just give you a few seconds to memorize it. It is very long, yeah. Yes, sorry, it does need to be a capital K. Which is also fun, but probably not quite what you're after. Is everyone good? Sweet.

So as I was saying, and as Peter was saying, I'm Adam Harvey, and this is a very inaccurately named talk, because HTML5 of course no longer really has a five in the title. Thank you, W3C. Of course, I'm lying to you anyway: when I say HTML5, what I really mean is HTML plus JavaScript plus CSS plus SVG plus various other technologies. So it's not particularly HTML5 as such, but since everybody else seems to like using it as a buzzword, I'm going to use it as a buzzword as well.

You all know that. Your first problem, of course, is that you have data, reams and reams of data, and you want to be able to visualize it somehow, which
makes complete sense. The problem is that you have data in a lot of different formats. This was a subset of what I had to deal with in the last 12 months. There are in fact more mentions of XML further down, with different words. Basically, I mean, you can see there that you've got the problem immediately that a lot of these things aren't going to talk very nicely in a web context, which is unfortunate, because the web happens to be where an awful lot of data exchange happens these days. You've also got the problem that a lot of these formats are not very well supported cross-platform. So, I mean, for example, Access. What a lovely format that is. What a brilliant database system. But if all you want to do is dump some data somewhere, it happens to be easy enough for people to use. The only one on that list that I think is actually useful when you want to do this sort of thing is CSV. I should have rehearsed that. The XML ones aren't so bad if you can figure out what they actually mean, but you do have that small problem that you have to figure out what they actually mean. My favorite on this list are the fixed-width text files which are not fixed width, because some of the fields have overflowed and they've just used printf.

So if in doubt, get it into a database. I'm not going to evangelize any particular database. Realistically, if it's something you can connect to from your chosen server-side programming environment, great. That's really all you need. The good thing is, if you can get your data into CSV, which with most of those formats I mentioned you can, with various tools, then you're pretty much golden at that point, because you basically have the ability with most modern languages (PHP and Python are the examples up there, but there are plenty of others) to natively pull out that data and slurp it in wherever it makes sense, or output JSON, or do whatever it is you actually want to do with it. If you have some sort of weird arbitrary database access, then ODBC and an appropriate
connector is often your friend in that regard. As I said, CSV is easy and interoperable. It's completely non-standard, of course, and you'll spend hours sometimes figuring out what the appropriate combination of quoting and escaping characters is, but you will at least get there in the end.

So, to serve it up: JSON, of course, is incredibly widely supported these days. It is effectively the de facto standard at this point for getting data to and from a server over HTTP. There are of course good reasons for this. It's somewhat less bloated than XML, you actually stand a chance of reading it, and it's quite lightweight. Small data sets, of course, you can embed straight into your web pages. There's no particular need to necessarily have a server side. So if you've got a small data set, forget about writing PHP or Python. If you're, you know, a reasonable web developer, just do it all in JavaScript. If you've got larger data sets, then you need to look at lazy loading, but that's often pretty simple. You don't tend to need to do much more than just ask for a particular bit of data, slurp it out, spit it back in JSON. Most of the demos I'm actually doing here are doing neither of those, because as I've already mentioned, I'm lying to you. So what I'm doing in most of these is actually I've got static JSON that I'm loading up via XMLHttpRequest. That also actually works, but I probably wouldn't recommend it in practice.

All right. So the way I'm going to structure this is that I'm really only going to give you a taste of some of the options that are out there. 20 minutes really isn't long enough to exhaustively go through the options. It's not even really long enough to summarize the options. So I've picked a couple of interesting libraries and written some demos, and I'm going to walk you through some of the interesting bits of code that actually have generated these demos. Or at least I think they're interesting. You may think they're interesting.

So the first library I'm going to talk
about is the JavaScript InfoVis Toolkit. Who here has used it? Seen it? Okay, a few of you. Well, that's good, I won't be boring most of you, then. Slightly unfortunate acronym, as it turns out. It's not terribly easy to Google for unless you use the full name, which is unfortunately quite long. It is basically a way of visualizing mostly tree and graph data. It can also do some charts as well. It has a wide variety of output formats, and they've really improved in the last 12 months. The website, which I should have linked if I'd been thinking about this, which is thejit.org (Google it), has a very nice demo gallery which shows you the range of things you can actually get out of this library, and it's really, really useful. So you have a whole bunch of trees and graphs and space maps and all sorts of things that are actually really, really handy, and which you can interact with quite well. It also works with IE now. It uses canvas for doing the drawing, but it also bundles excanvas and makes sure that it uses the subset of canvas that excanvas supports, and it all works really, really nicely. The only problem is, if you're doing lots of animation, as I discovered last year, and one of your target users is running IE6, it's kind of slow. Actually, really slow. Like, sort of one frame per second slow. But it still works pretty well.

So let's get into a demo, and then I'll show you some of the code behind it. Live demo time. This ought to be good. All right. So I've loaded up a data set here which is basically a subset of the distribution timeline. So this is just the Debian section of it. This is effectively just a map, or a graph, sorry, which basically just funnels out all the direct derivatives of Debian, the indirect derivatives via, say, Ubuntu or Knoppix, and continues on out from there. And if I'd spent a bit more time on this, I might have actually pruned some of those. So this is actually interactive, and this is really where the JavaScript InfoVis Toolkit comes into its own, is the way you can
actually interact with it. So if I click on that... Now, this is out of the box. I've done nothing to make this work. Well, basically; I have, but very little really. So you can see here it just goes all over the place. Let's maybe go a couple of levels at once. It's pretty cute, and it looks pretty good, and people actually seem to grok it, which is really, really handy. It's a very good way of presenting this sort of hierarchical data to potentially a lay audience. You know, when you're presenting to maybe your supervisor's supervisor's supervisor, who isn't actually interested in your project but is actually giving you money, then this is the sort of thing that actually comes in really, really handy.

All right. So how do we build this? As I said, this is just static JSON, so that's the format it's in. Now, this came as a CSV file, and if I had more time I'd show you how I did the transform, but it's not that complicated really. It just slurped in the CSV file, built up a JSON structure (I used PHP, but you can use Python, whatever), and then spat out JSON at the other end. Really nothing to it. Some of those children obviously have children of their own.

Now, the actual code to build this. I'm actually going to show you the whole thing, including boilerplate. So, well, almost the whole thing. It's missing a little window onload, but everything else. So this is actually how we built it. So it's the Hypertree module that we're using. We're injecting into the container. We'll just make it the size of the window. So we have to tell it how to create labels. This is a slightly unfortunate bit of boilerplate that we have to write, because in practice you really actually tend to want the labels to look the same, but there you go. The only thing I've had to do to basically make the visualization work is that line there. So we've got a click handler on the labels, and then from there we just call the appropriate method within the toolkit, and it does all the animation and basically reparents it for you. We also have this here, which
actually places the labels. Again, I'm not entirely sure why you wouldn't actually have a default implementation of this, but you don't. But it's going to end up looking something very similar to this. That's it. So that's how you define the object. You call those two methods. You're done. Happy days. Get on with your life. So that was four slides, about, what was that, about 20 lines of code, and you had that visualization. Pretty much all of the toolkit's chart types are basically that simple to use.

I'm also going to show you an extra bonus demo. I'm not going to show you the code for this, but basically just to give you an idea of the breadth of what you can actually do with it, here's something completely different. This is a very ugly (because I only did it yesterday) browser share chart over the last year. The bit I like is that IE has dropped below 50 percent now. Again, this is the same toolkit, but this is obviously a different visualization. And again, you can interact with it. So we've got little hover handlers there. So IE is down to 46 percent, Firefox is 30.7, Chrome's really increased. So Chrome earlier in the year was 7.2, now 13, now 14. And you can see all of that at a glance, and this allows you to basically have a nice interactive chart very, very cheaply. The amount of code that went into this is about the same as what you just saw for the Hypertree. It's really quite simple.

Okay. So that's the JavaScript InfoVis Toolkit. Now, the other library I'm going to talk about at a little bit of length is Raphaël. Who here is familiar with Raphaël? Okay, about the same number, but a different number. That's good. Raphaël is a vector drawing library. It's actually written by an Australian, who I really hope isn't here, or ever sees this talk. The reason why it's cool is because you can use it like SVG. The interface is very similar to the concepts in SVG. Your drawing primitives are basically the same. But it will also generate VML as well, which is
handy in Internet Explorer. Hopefully IE9 will take over the world, at least on Windows, and this will deal with this problem. But for the time being, if you actually want IE users to see it (and unfortunately all the people who are in control of the purse strings at my university run IE), then that's basically what you're going to need to give them.

So why would you use it when HTML5 gives you this lovely canvas element, which also has drawing primitives, which also works quite well? Because Raphaël gives you event handlers, and unified event handlers at that, across all the quirks of the various browsers, and God knows there are a lot of them. So it's very similar, as you can see, to really most of the major JavaScript libraries. So there's a click method. You give it a callback. It does stuff. This is pretty handy. That's actual code from the demo, which I will now show you.

Now, I'm not going to bore you terribly with the details of this, but basically... Gee, I wish this projector was brighter. Basically... All right, you don't need to see me for a minute or two anyway. Success. Okay. So what this is: this is the Duchenne muscular dystrophy gene. I work in genetics. And what this is, is this is every exon in the gene (and I'm not going to bore you with what an exon is), basically with the number of variations in that gene for each exon, or the percentage of variations in that gene for each exon. And what this demonstrates is that you've got certain hot spots. And these variations are in patients who are suffering from a form of muscular dystrophy, usually Duchenne but sometimes Becker. So you can see here that there are some hot spots that we've immediately been able to see.

So let's interact with this. I've got some handlers on this. So let's have a look at exon 14, which is a nice bright one. So click on that. Exon 14 has 191 listed variants in 102 base pairs, which is a hell of a lot. So you can see there, I mean, this is a very simple piece of code, and again, I'll show you most of the code in a minute, but it's, I
like it. You know, it's got little bits of interactivity, and, you know, it's sort of animated and kind of pretty. All right, let me figure out how to turn these lights back on again. Right.

Okay. So how have we built this? Again, we have a data set, which is just straight-up JSON. We have some code, which is basically here. So we're just starting the Raphaël element. So again, it just needs a container to draw into in HTML. That gives you a drawing canvas, if you like. So we'll draw the actual box, which is what you just saw, which just goes across the whole screen. Then we'll draw the individual exon rectangles. I'm going to skip through this quite quickly because I'm quite crunched for time, but you basically get the idea. We've got a little animation handler there. So we've got our hover handlers to do the highlight effect. Okay. And that was basically what you saw. There is a little bit of boilerplate there, but the code is actually pretty straightforward, and you can click on the demo links in my slides and this will actually come up, so you can have a look at the code in full. Please don't judge me. I wrote most of it at about two o'clock in the morning, after beers.

Okay. So you've got various options. I've only touched on a couple. There are plenty more out there. How do you really put it all together, though? In the end, you know, you've got this data, you need people to see it, and you want them to see it in a rich way. Cheat. There are a ton of good services out there now. I don't know about you guys, but I don't actually get much time at work to make things pretty. You know, it's all about the data. Data in, data out. But making it pretty is actually important. It's what gets you grant money. It's what keeps people interested. It's what will fire people up at conferences, hopefully. So you don't really have the time to do all of this. So use the services when they're available.

A good example of this is mapping. Now, mapping (I hope Daniel Nedasi is not in the room), mapping is not terribly hard. I mean, it is,
but it's not. It's not rocket science. You know, you have data. It's usually in some sort of form that you can draw in a vector form. You could invent your own mapping system. You know, plenty of people have done it. Some of them even did a good job. But why would you bother? I mean, you've got Google Maps, which has an interesting API, but it's quite functional. Bing Maps, which has a much friendlier API, but it's much less functional and also kind of evil. OpenStreetMap, which of course has the advantage that your underlying data is free, so if you're doing some sort of Creative Commons licensing, that can be really handy. So don't write your own if there's something that's already out there. Even if it's, you know, a web service, use it, I say. I don't see any point. Google Charts would be another good example of that. Don't reinvent the wheel. If you need a static chart, don't sit down with JpGraph and write something. Use Google Charts. Move on with your life.

So earlier on, I had you all basically give me your personal details. I hope you all entered locations, because otherwise, that's the data set. All right, let's see how this went. So what I've done here is a Google Maps visualization. Now, I used Google Maps, not OpenStreetMap, basically for time reasons. I already knew the API. That really isn't where it was meant to go. That's actually pretty entertaining. Thank you, whoever did that. There we go. Okay. So, Brisbane. Bris Vegas. So you can see here we've just got markers at the moment. We're using the clustering library that you can get from the Google Maps utility library. Yes, I know I'm running out of time. You can see that we've got 28 in Brisbane proper. Let's maybe zoom in a little bit more. Yeah, so we've got plenty of people staying at Urbanest, and I'd say the marks in there as well. And then we've got people sort of dotted around the place as we go. Now, that's by itself reasonably okay, but I think we can do better. Or at least I hope we can do better. The person who's put in
South Australia might have just mucked this up. It's not very HTML5-y, though. Of course, very few things are HTML5-y these days. So let's actually do something more on the client side than just setting markers.

Okay. Now, a lot of the time you really want to draw some sort of heat map over the top, or something like that. So ordinarily you'd do that on the server, but that requires a lot of resources. Now, where I work, we have one server that hosts most of our visualization and most of our projects. It's kind of overloaded, and the last thing it needs is for people like me to blunder in with a map like that and have it generate 40 tiles of a heat map every single time I scroll around, because funnily enough, that results in the load average spiking, the sysadmins getting paged, and my email mysteriously disappearing from my account. So what we need to do instead is just get the data (it's all about the data) and then draw it on the client side. So we'll draw an overlay on the client side. A tile server would be better, but it's trickier to get working on the client side, and I know this because I spent about four hours on this last night. It's easier on the server side, because Google Maps is just set up that way. But we'll use an overlay for now. We'll use canvas to do the heat map coloring. In actual fact, we're using canvas to do the drawing as well.

So this is a very, very posh version of the code you're about to see in action. So basically, we have some data, which is a set of coordinates, nothing more. We will do some boilerplate to get it into the form that Google Maps expects. We will draw transparent radial gradients, basically, to get an intensity, which we will then go through: get the image data out of the canvas, and we will colorize. How are we going to colorize it? We're going to go through and normalize the data, so that we actually know where our bounds are. We're not using an absolute scale here. There's no need. A heat map's really all about just seeing where the heat actually
is. We've got a heat gradient, which I will not bore you with the details of, and then we just go through pixel by pixel and just colorize as appropriate, if we need to.

All right, let's see if this actually works. I'm not terribly confident about this after seeing the earlier one. You see my point. I think I've just destroyed my browser. It's decidedly interesting. Ah, here we go. It was just having a good think about it. Now I'm about to spike my CPU, because it's about to do the drawing and coloring, which unfortunately is going to be fairly painful, I think, now, with the size of that data set. Actually, let's see if I can forestall that. Thank you, Adelaide person. So you can see there that we have, basically, a small dark spot in Adelaide. Funny, I've often said that. And then we have a big hot spot in Brisbane. And I'm out of time, but if you come and see me later, I will just get rid of that Adelaide one and zoom in some more and actually show you what that looks like properly, because... All right, come on. You can see it chugs, having to do more and more data. But you can see even on this there is a hot spot around the CBD. We've got people scattered around the suburbs, mostly the north. It gives us a good idea of the distribution of our delegates straight away, and you can see it at a glance.

This is not the greatest set of demos in the world, but hopefully it will inspire you to think about what you can actually do with HTML5. Go forth.

There's time for about one question. Can our next speaker just come up? Our speakers switch over while I take another question.
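
The heat map colouring step described in the talk (normalize the accumulated intensity against its own maximum, then colour pixel by pixel from a heat gradient) can be sketched roughly as follows. This is a minimal illustration and not the speaker's demo code: the function names `heatColour` and `colourise` and the three-stop blue-yellow-red gradient are assumptions made up for this sketch, and it assumes the intensity has been accumulated in the alpha channel of an RGBA buffer laid out like a canvas `ImageData.data` array.

```typescript
// One RGB colour as [red, green, blue], each 0-255.
type Rgb = [number, number, number];

// Hypothetical gradient (blue -> yellow -> red); the talk's real gradient is not shown.
const HEAT_STOPS: Rgb[] = [[0, 0, 255], [255, 255, 0], [255, 0, 0]];

// Linearly interpolate a colour for t in [0, 1] across the gradient stops.
function heatColour(t: number, stops: Rgb[] = HEAT_STOPS): Rgb {
  const clamped = Math.min(1, Math.max(0, t));
  const scaled = clamped * (stops.length - 1);
  const i = Math.min(Math.floor(scaled), stops.length - 2);
  const f = scaled - i;
  const a = stops[i];
  const b = stops[i + 1];
  return [
    Math.round(a[0] + (b[0] - a[0]) * f),
    Math.round(a[1] + (b[1] - a[1]) * f),
    Math.round(a[2] + (b[2] - a[2]) * f),
  ];
}

// Colourise an RGBA buffer in place, treating alpha as the accumulated intensity.
function colourise(pixels: Uint8ClampedArray): void {
  // Pass 1: find the largest intensity so the scale is relative, not absolute.
  let max = 0;
  for (let p = 3; p < pixels.length; p += 4) {
    if (pixels[p] > max) max = pixels[p];
  }
  if (max === 0) return; // nothing was drawn, so there is nothing to colour

  // Pass 2: colour every pixel that has some intensity; leave empty pixels alone.
  for (let p = 0; p < pixels.length; p += 4) {
    const alpha = pixels[p + 3];
    if (alpha === 0) continue;
    const [r, g, b] = heatColour(alpha / max);
    pixels[p] = r;
    pixels[p + 1] = g;
    pixels[p + 2] = b;
  }
}
```

In a browser this would sit between `getImageData` and `putImageData` on the 2D canvas context: read the image data after drawing the transparent radial gradients, pass its `data` array to `colourise`, and write the result back. Keeping the colouring as a pure function over the pixel buffer also makes it easy to test without a canvas.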