Alright, what do you think these dots represent? What kind of data? What do you think the data represents? Oh, no, but I do have one of those dots. There would be a lot more dots than that. Yeah. High-speed internet? No, it's actually got nothing to do with the internet. Sorry to disappoint you. No, but it is sheep-related, actually. Sheep accidents, no. Any other guesses? No, but related. Stockyard locations.

And I found it such an interesting map to look at, because when I saw it, I realised I'd never seen data like this. Who would make this map, and why would they make it? But it's really fascinating to see. Look at the Gisborne region, for example. I've got no pointer, but it's the top right, the big cluster on the right-hand side. I had no idea they had so many stockyards. You look at a map like this and think, what does that mean for the industries and the communities around it? You get to learn about our country through a lens that we don't otherwise see.

So, what is Wiki New Zealand? We are bringing all of New Zealand's public data together in one place online and making it visually explorable, so that everybody can use it. For example, this is data showing industry-financed spend on R&D as a percentage of GDP. If you look at that and ask what the trends are and what it says, it's going to take a bit longer than if you look at a graph like this.

I used to work at a think tank called the New Zealand Institute, and during that time we did a lot of research, whether on social, economic, or environmental issues, and we would always start by running the numbers and asking, how does New Zealand perform over time compared to other countries?
And then we would go out and present to a lot of people, showing really simple graphs, and people would frequently be really surprised when they got to see the data, whether it was New Zealand's GDP per capita compared to other countries, or that we have the highest rate of youth suicide. And so I came to realise that, whilst there is a huge amount of data that's already collected, it is too hard for most people to use. It's not accessible, because it's spread across hundreds of different websites and held within thousands of different databases and spreadsheets. So data like the stockyards is sitting there, and it's typically just too hard for most people to go in and use or map. So we are bringing it up one level, into visual form, but not taking it further. We're not trying to tell people what to think. We want to work with people on how to think about data.

And here's another little graph. This is a typical graph that we would have on Wiki New Zealand: number of visitors to New Zealand. You can see Australia there. China has moved just over the US in the last couple of years.

And it's been really interesting. I had the idea for Wiki New Zealand almost three years ago, actually, and we launched the current site, which you'll see at wikinewzealand.org. That was just me going, right, let's just start this and see what people think. We threw together a WordPress template with some Highcharts and stuff, and just went for it, to explore and see what people would think. And the response was amazing: to see how many different uses people have for data once they can actually get their hands on it and use it, without having to spend hours finding it and thinking of the questions to ask in the first place. And so we've been on quite a journey in the past couple of years.
And since then, we won the Australia and New Zealand Internet Award last year, and what we're doing is the first of its kind in the world. I didn't come from the data side of things, so the open government data movement isn't where I started, and I don't know that much about it. I just started at the user end and thought, what if everyone could use data? Since then I have realised that when people have been talking about how to make data open and accessible, the focus has been on current users. That tends to be people who want APIs or machine-readable formats, and so that's where the principles around data have been set.

But I started from a different place: what if a seven-year-old wants to use data? What would that mean for how we structure all our data systems? Or what if a 42-year-old tow truck driver in Hawke's Bay wanted to know how many accidents occurred in his region over time? What would that mean for data requirements? That has set up the whole structure of how we've been building the new back end for bringing data together and how we display it, which Nigel will demo soon.

So it's been really exciting to see, when you empower people with information they haven't been able to get before, the different kinds of questions they can answer, questions that you wouldn't even know to think of. The other day, and we share an office space with some jewellers, one of the guys came up to me and said, "Oh, you do stuff with data, right?" I said, "Yep." And he said, "Is there any way to know how many diamonds get imported into New Zealand?" And I said, "Yeah, give me 15 minutes."
And I went to the Stats NZ site and found the value of diamonds imported by month for the last eight years and sent it through as a graph, and he came over and said, "Wow, so we import x% of New Zealand's diamonds." I don't know what he's going to do with that information, right? But that data is sitting there, and you just have to know which 18 clicks to use to get to it, and you have to know how to make a graph. So that's our whole purpose: getting people to use data about New Zealand, and not just public data but private sector data too. Some companies collect huge amounts of data, companies like Chorus, and they want to put their data into one place as well.

It was interesting, actually, to realise that New Zealand's data is essentially organised by source. By that I mean, if you want to use data, the first thing you have to answer is who collected it. That makes a lot of sense if you're an agency: you've collected all your health data, you put it on your website, and you make it available to people in spreadsheets. But for a user, that's as silly as having a dictionary ordered by country of origin, where you open it up and first have to know where a word came from. If you don't know that the Ministry of Health exists, or that there is a survey called the Household Income Survey, you will never know how to get to that data. Rather than searching by source or by data set, you actually just start thinking of data naturally, around the topic you want to explore.

And in terms of what impact I believe it will have: I think it's like if someone had talked about something like Wikipedia before we ever used it and said, "I'm going to make an encyclopedia that anyone can edit, and anyone around the world can add to it."
I don't think anyone, whether or not they thought it was a dumb idea, could have really understood how much of an impact it would have, and how we would come to use it every day, multiple times a day, for lots of different purposes. Similarly, I feel that when we enable everyone to use data, we can't even begin to imagine what the outcomes will be. I sometimes think about how not everyone used to be able to read. That meant some people were writers of content, some were readers, and others had to rely on those intermediaries to be told what was written, what it said, and what it meant. Similarly, right now with data, most people have to rely on other people to translate it, analyse it, or tell them what it means. So I think it's a whole new language that everyone will be able to use, whether it's a seven-year-old writing a story, or someone running a business, or someone having an argument with a friend about how many possums there are and whether they're increasing or decreasing. It will really change how we can all think.

And if you've got any questions as I'm going, we're a pretty small group, so just yell out. Ah, but you'll need the microphone, I think, or I'll get in trouble. Okay, thank you.

In terms of Wiki New Zealand being a wiki, being something collaborative: right now when you see the site, that part of it is not really evident. But I strongly believe that the best representation of data about New Zealand will come from collaborating and drawing on experts. We will never be the domain experts on, probably, anything, actually. So to have a system whereby experts can contribute, and help with auditing content, adding graphs, making recommendations, and creating the different sections, is really valuable.
It is different in that I think data is a really dangerous thing for people to be able to play with live, and so there is an auditing and publishing process. But that, I think, is going to make for a really rich resource. One of the examples that happened early on: I had a graph that showed the official age of retirement for OECD countries. I got that data from the OECD, just as it was, with their label, and put it up on the site. Someone emailed me a couple of weeks later and said, "Just so you know, there is no official age of retirement in New Zealand," and sent me some links. I hadn't realised that the OECD had taken a shortcut in how they named the data set. So I was able to go in and edit the title and send it back to the person, and it was more accurate. But there is no way I would have known that without engaging experts around the table. So it feels pretty special, actually, to have people who are willing to contribute, who care about their content areas, and who come together to create a resource for everyone to use.

Oh, that was that one, sorry. But it has actually come to mean a whole lot more, in terms of the collaborative nature of the organisation. It feels like we've had hundreds of thousands of dollars' worth of people's time contributed to what we're doing. And I'm sorry I don't have lots of cool screenshots and stuff to show you, because we're going live with the new version in about three weeks, right?
But it's been amazing. Every person in the organisation who now gets paid started as a volunteer. I didn't really realise the true power of a crowd. When you're a company selling something, you go, "I've got this product and I want to sell it to you," and you try to solve it all yourself. I've worked in the private sector and in software companies before with that kind of model. When we came to making Wiki New Zealand, I thought, it has to be different. So we started bringing people on the journey and creating an environment where anyone felt they could contribute their ideas and help us expand it. The power of that is almost indescribable in terms of what we've been able to do: getting into government and getting them excited about what we're doing, and different agencies, and different technical experts. I need to reflect on it a bit more, actually, because it's something I never imagined when I first started: what actually happens when you say, "Hey, this is ours, do you want to help me make it?" and what that means.

So, some of the stuff that we've been doing. When we first started, we just whacked up the current site and threw together stuff in the back end to see what we could do. Every graph was made from a different Excel spreadsheet, and uploading was really manual and painful. But I didn't want to spend a lot of time and effort developing a whole system before we knew if we actually liked it. So what's been really exciting in the past six months is being able to say, right, people do want to use this, so let's actually design it a bit more properly, and working with Nigel and his team at Opcode to help shape that. So I want to show you something that we've only shown maybe six people before, outside of our
little team: what we've been developing. And it's really cool to finally see it live.

All the data that we have is collected in pretty rubbish, inconsistent formats, with Excel spreadsheets and merged files and extra columns and lines added for prettiness, and when it comes to doing something useful, it's not ideal.

We haven't even got to Silverlight plugins showing data, PDFs, you name it, random XML, all sorts of, well, rubbish. Yeah. Well, and formats designed by statisticians, the legit, valid ones, but they're so hard to understand, and certainly not for the average person. So if we solve that problem once, hopefully nobody else has to solve it. That is the idea behind the system.

We should probably show the spreadsheet first. I know you're going to do the easy one of these. I'll do the easy one first.

Alright, so, the system we are making. We have source documents: files such as spreadsheets, CSV files, and PDFs, as mentioned, that come from sources, which are places like Statistics New Zealand and so on. We've got a whole bunch of these in the system already. The easy one is this one here, time series 1996 to 2014, from the Ministry of Education. It's a whole bunch of data that the Ministry of Education produces about students in New Zealand schools. It starts out as a spreadsheet with goodness knows how many sheets in it. We'll show you an example of one of the ones that they give us later.

So we jump into here. In fact, the first sheet is a table of contents of all the kinds of data that they put in one big spreadsheet and publish. So if you actually wanted to know something from the Ministry of Education, this is the format you currently get the data in, right? You have to download a big spreadsheet, find the table you're interested in, and then try to do something with it, like use Excel's graphing function to chart it, perhaps. So we're going to extract some of this data out of this spreadsheet and into the database in a much
more usable format. What we have along the top here are the sheets; it goes on and on and on. This one is student roll by student funding year level, as at 1 July, 1996 to 2014. This is actually quite a simple table. It is a two-dimensional table: funding year levels run down the first column, and across this row is the year, from 1996 to 2014. So what we're going to do is extract all of this data. The first thing we do is... hey, how about you do the driving? Yeah. So I can hold this.

So you take the funding year column and you add it as a key. Data exists as tables. Tables are multi-dimensional and have a set of keys, and each combination of keys has a value. In this case, for example, funding year level Year 1 and actual year 1996 has the value 63863. This is just going to be a two-dimensional table, quite easy to understand. So we're adding that as a key, funding year level, and then we're adding the second key here, which is the year in which that data exists.

And so now I can select the type there. There'll end up being a lot of different units that are strongly typed.

Yeah, so one of the key things about this system that's different from a system you may have heard of before, called N, or other such examples of open data, is that often they grab just the raw data, the raw numbers, and publish those to you, and typing information is not included. So you get given a number, for example a dollar value, but you're not told: is this a New Zealand dollar? In 2014? Inflation adjusted? With or without GST? There are other examples of types as well. Some numbers are just numbers; some numbers are percentages. Such typing information about data is not often kept, and that's one thing that we do very well here. So we get to say that the years are actually years. We say that they're years, which means that if the spreadsheet had, like, year ended
1996, year ended 97, because some brain-dead person who created the spreadsheet decided that was a good way to write them, our extractor looks at those, you tell it those are years, and the extractor understands that it can in fact pull those out: that must mean 1996, that must mean 1997, and so on and so forth. So it all becomes very consistent.

And then just what the actual values are: the number on the student roll. So that's a number, right?

Yes. In this case the numbers in the grid are just straight numbers, numbers of people. And thus a table has been created, which you can see. What have we got here? We have funding year level values down the left here, then the year, and then the value for that. So the total in 1996 was this. The unit of those values is a number, and the number is called the student roll at that time. How many, 400, isn't there? If you scroll down a bit, we can see that it's grabbed all of them. All of the individual cells have been taken out and are stored in a nice consistent format. They're all strongly typed. They're all ready to go. You can drop the Excel spreadsheet at this point; the data is in a nice consistent format in the database, and from here we can move on to doing more interesting things with it.

Alright, so that's a pretty simple one. Here's another example, of a spreadsheet that's less awesome. Can you see my little cursor? Yes, because it's there. These are actually merged cells, and this one here is the heading, so it says gender and then male and female, but these cells have been merged together. This is real data from the MSD, the Ministry of Social Development. That's right, New Zealand benefit numbers. And so this is what they do, right? They create a table and then they go, we're going to put formatting around it. They've clearly managed to clear out the lines that divide up the cells, so that they've only got grids in the place where the data is. Some of them do... what, they put tiny little columns
that are green to make fancy borders. These are done with actual cells, and they colour the background of the cell, because they don't know how to drive Excel. This is the kind of thing that we are working with and trying to get around.

We were thinking of making a mix tape of our reactions when we come across datasets. It's pretty interesting. You get a lot of "why would they do that?" coming along.

So we imported that same spreadsheet here. One nice thing about extractors is you can see all the formatting has disappeared. We have only extracted the numbers out of it; we don't care about the formatting. It would be nice to fix up the cell widths a little bit, perhaps, but in general the important information is there and all the extraneous fluff is gone. Alright, so when we're coming to extract this... we can talk about it. Yeah, you have a talk about that.

Okay, so I'm going to add this selection as a key and call it minor categories. Can you guys read what's in the cells, or should I read it out? Okay, cool. And that's text. But of course some of these are headings, so what we can do is split this key and then record the specific ones that we want to be marked differently. So that's the gender one, the ethnic group label, the age group, and the continuous duration. Start recording, and call this the major categories, and this is also text.

Then we have this here, which we will add as a key, which is September 2014. It's the quarter, so: quarter. In the future the type will be added here, where that will become a strong type, but for now I'll just put text, otherwise I'll break it. Here are the benefit types, and some of these are merged cells again, but we add the selection as a key: benefit type. And this one, you'll see it's number, percentage, number, percentage in alternate rows, which isn't ideal when you actually want to come and use it. So if I add the selection as a key, call it type, and save that as text, then I can say: use
this key when selecting the value units. When I don't have that selected, there's only one option for a value unit and value label, and when you select it, there are two. It says, where it equals number, so I'll call that number and select number, and where it's percentage... again, we haven't added percentage yet, so I'll just select number for now, but that will probably be there next week. When I press save extraction, there are going to be some things that are broken, but I'll show you what we get so far. It says that there are some problems with the extraction, but that it has been split out. Age group has now been separated out from the major and the minor categories, the benefit type, whether it's a percentage or number, the value, etc. But I think there'll be two things broken. One is that it failed to parse the number NA, and the other is that the minor categories can't be empty. So I just want to show you how I'd fix that.

Yeah, so with NA, the person who made the data has actually put NA in the cell instead of a number. Yeah, you see that kind of thing around here. We get a lot of it. Yeah, all sorts of rubbish people can put in there.

We were trying to figure one out yesterday. There's a sheet that Amy's been working with, where cells that have N indicate less than 5, but everything in the spreadsheet has been rounded to the nearest 10, and so it seems that N should really equal 0, but that also is incorrect. So there are lots of little things that we're trying to build in and understand, which again causes a little bit of colourful language.

Yeah. Stats New Zealand, for example: in lots of the data that they publish, some of the cells will have S for suppressed, where for some reason that number's been suppressed, or C for confidential, which is another thing that can happen from time to time. So we need to be able to deal with it. You can't just assume the grid is simple numbers. Not at all.

So to get around that, I can just add a rule on the values, so that when it is equal to
NA, then skip this row or column, and done. The other thing is that right now some of the splitting of the keys isn't coming through. This key here goes from B6 to B23, so it still includes the darker red cells. So I need to add a rule that when it's empty, because it's been assigned to something else, we skip this column or row. And that should now work. Yeah, so all those NAs have been removed.

And at this point you're just pulling the data apart, before you start reinterpreting it? Yeah, exactly. So that's the extractor.

Was it a question, or are you on it now?

Is there any sort of audit trail for your conversion process, like who did it and what decisions you made along the way about what the data should mean, or possibly mean?

I'm going to hand it to you. Got it. It's fully intended that there will be one. There's not one existing in the system now, but the way it's done at the moment, every step is stored, so we know what was done and what happened. We also know who did it. And we plan for that information to be available all the way through the system. A part that hasn't been shown here, and we'll talk about it a little bit, is the part where charts are made from this data. The plan is that you could potentially click on "tell me about this chart" and it will say, well, this chart was created from data that was created in this way from this spreadsheet, or in these ways from these spreadsheets.

So that's the extractor part of the back end. From here, do you want to talk about the tables and the chart designer?

Oh yeah, sure. So we've got an extraction out of here. Something a little bit tricky about how these work is that they're constantly updating the data, right? Every three months the Ministry of Social Development publishes the next set of benefit numbers, from the last three months, and so they put out a new spreadsheet, and it would be a pain in the butt to have to go through this all the time. What's more, the new spreadsheet has only got the latest three
months of data, and, you know, it has data from the past five years, basically a rolling five-year window, but we want to keep it forever, right? So having made an extraction, what we really want to do with it is create a table. A table is the canonical store of these extractions, and is made up of one or more extractions. I won't create one from this, but you'll see a table looks very similar to an extraction: same thing again, a list of keys and the values for those keys. But a table can be made up of several extractions. For example, when we start importing MSD's data, we could get the five years of data out of the first spreadsheet, which they published in December 2014, and then when they publish their update in March 2015, we can go in there, make an extraction of the last three months, and add it to that table, so that the table is just extended further with this data. So tables are really the store of data. The extraction thing is nice, but ultimately it all ends up being a table.

From tables, we have a public API as one output. This is an HTTP JSON API, so you can request all of the data from a table, and eventually we will make it so that you can request, say, all the data from this table where the year is 1983, and so on. You can take all this into your system and do with it what you will. We will also include the strong typing information, so that you can compare across data sets, which is a nice thing. Say the Ministry of Social Development uses one definition of the New Zealand dollar, where all of theirs are 2010 inflation adjusted, and the data from some other place is never adjusted for inflation. We will provide it out of the API all adjusted in the manner you require. Same with things like miles versus kilometres: if this data comes from America and all the distances are in miles, but New Zealand's data is all in kilometres, then we will provide a mechanism such that they can all be converted to
kilometres, so you can compare them consistently across data sets. So that's tables and the API.

The API will be available for everyone, but we will also use it to feed into Wiki New Zealand, the front end, and that will ultimately have tens of thousands of graphs and simple maps that you can play with and explore. We will be building those using something called the chart designer, which we're creating at the moment and which will ultimately be something other people can use as well. Do you want to talk about the details of that, or do we not really know all of them yet?

Yeah, so we have a hypothetical view of it. We're designing it now. The general idea is that you can take sets of data out of these tables and just go, "I want to chart that, please," and that's all you have to do. It will put them on axes for you and show you them, and then you can adjust it and say, "I want the data from 1990 to 2010 only, please," and so on. Where there were NAs, what would probably happen is that, if it was a line chart, say, the given year would probably get a hole, just automatically. The idea is that you can take data from any table and start playing around with it in the chart designer, and make all sorts of different types of charts with data from multiple data sets. So if you want to graph benefit rates against cheese consumption, for example, it should be as easy as finding both of them, putting them together, and then starting to play with how it looks.

Yeah, that's kind of one of my questions as well. Is there some way to let the data be intelligent, in a way, where you can actually have dynamic correlation? Let's say you've got all these various data sets, and you just throw them all together and see which data sets start to show correlation. So, like I said, income versus cheese consumption versus violent neighbourhoods, or whatever the case may be, so you can actually start challenging some preconceived... you know, the "correlation is not causation" kind of thing. Is
that possible?

Yeah, so there's that website, isn't there? I forget what the guy's name is. Gapminder? I can't remember. With all the charts, the crazy charts. Yeah, crazy charts that show things like consumption of cheese in America being 99% correlated with reports of nightmares, or something like that. Really? Yeah, so this guy does that, and we were looking at that initially and going, it would be really awesome if the system could do that. I think in time, potentially, it could. It's not something that we do right now, and it probably won't be this year. But absolutely. And also, you could take our API and do that yourself. Yeah, and then let us use that, maybe.

But instead of coming to Wiki New Zealand and it just being a chart designer, where you have to figure out what data to use to build something... I think the difference from a lot of sites that have data you can do things with is that I believe it's important for people to arrive and see something to start with. If you don't know how to think about it, or you don't know how to build a graph, it becomes too hard. So we will grow a team internally and make lots of graphs ourselves and put them on the site, and then others can come along and either create their own or edit ones that we've already created.

In terms of what we're doing organisationally, we have been working with some government agencies, and we're doing a proof of concept at the moment to demonstrate the value of what we're doing, with the goal of it being an all-of-government service, where we can be the accessibility and dissemination platform for all of New Zealand government, which is really exciting. Question?

Have you ever had any issues with copyright, particularly Crown copyright?
We haven't yet. But in saying that, we are taking a lot of data that's already sitting there, that's already been made available but not accessible. And we're trying to really work with the agencies, not just stand there and take things. At the moment, part of the proof of concept is defining a methodology for how we engage with everyone, and getting them to sign off on the methodology, because some agencies are really concerned about the metadata that follows the data through the system, which it does, and about what people can do with the data when they take it out of context. So we haven't had big issues, but we're really cautious of that. We'll work through that.

While they have Crown copyright, that doesn't mean that they get to keep it to themselves. Exactly. They are not required to release it. They are now, but there are baby steps to get there.

Hi, I was just wondering, how does your site deal with big data sets, like stuff that's available from data.govt.nz or referenced there?

When you say big data sets, what do you mean exactly? Because there aren't many big data sets, like true big data sets, especially public ones. Like a lot of the data from data.govt.nz... Yeah, which kinds of data sets?

I don't know, I can't pull an example out of the air, but I know that there are some quite big data sets available. I was just wondering, is there a size limit to the data that you guys can deal with, or have you thought about that?

I'll just borrow this one. Yeah, so there's no real practical size limit yet, but we're also not dealing with the type of data that blows out to, say, many petabytes or anything like that. A lot of the stuff, as you can see, is already aggregated. So number of children on the school roll in this year is already an aggregated number. If we were to take in, as might be nice one day, a raw feed of that information and anonymise it and suchlike ourselves, we would have problems. But right now the data that we're getting is often already aggregated: number of cancers by
year and such like that. These are already aggregated numbers, and with that stuff we don't really have a problem right now.

Not sure if we're in question time or if you've still got more to go. Just curious what your business model is, how you're funded, and also what your outreach is like. This is really interesting; how are you going to get it out there into the hands of people so they know about it and can start to use it?

A few things I missed saying. We're a registered charity, in the same way that Wikipedia is a charity, and I think people would treat it quite differently if it was a commercial organisation. I set up Wiki New Zealand to be a charity from the start, but I do think of it more as a social enterprise in terms of how we behave. I don't think New Zealand is large enough to fund what we would need to do this well just through individual donations, so we've worked towards a couple of different revenue models. One is people paying for the service of putting their data on, and we're working with government at the moment to prove that. So they're paying us to start doing that with a bunch of data sets, and part of that will be a cost-benefit analysis report to say what this would look like going forward. Some of the private partnerships we have are similar, where they are paying us to put their data on, and then it might say "proudly presented to you by Chorus" or whatever. In addition to that, and this is part of the discussion that I'd like to have, to hear what you think: what we're developing in the backend now with the platform Grace is, I think, really valuable for lots of uses. When I first showed it to someone, they said, I already know who in the UK would want to buy this for the government there to use. And I don't know what to do about that, because I agree that there is a commercial model for packaging part of this up as a product and selling it into other countries or large companies that have lots of data that want to
do things with it. But we're not a commercial organisation. We're really purpose-driven: our purpose is to get people to use data about New Zealand. And we've benefited hugely from open source stuff; that's how we started, throwing things together that already existed and were shared with us. I don't know what the best thing is to do with what we have now. At the board level we've been having some interesting discussions, because we're not in it to make a profit, that's not why we're here. But theoretically, if there was an opportunity to make money from what we have that could be directed towards the purpose of Wiki New Zealand, that's what, as governance, you're supposed to do. So I'm really curious about whether there are models, or what you think about how something like this could best be treated, so that it is freely open for people to use and can still have a sustainable model that makes sense. Does anyone have any thoughts on that? Have you got a question rather than an answer to that?

Another question. Interesting. Media: have you been used as a source by media? And the other question is, how are your output and your work licensed?

Sorry, I didn't answer the rest of your question before either. So we've got two main audiences to start with, which are schools and SMEs, so small to medium business owners. We haven't been trying to do big pushes for people to use it. I think we've had about 130,000 users in the last year or so, kind of making up numbers, but we haven't tried to do a big push. Now that we are building the new front and back end, we are thinking about what we're going to do for that. Part of that will be going into schools and talking to schools and universities about how people can use it, and teachers have been really excited about that. Part of it is partnering with people like NZTE or MBIE to go around doing workshops with business owners to say, this is
how you could use data. But I think one of the interesting things is that we're almost going to end up with hundreds of marketing arms, because when people put their data on Wiki New Zealand, they start driving people to us, saying you have to go and find our data here now. They're kind of advertising that to all of their networks, and then the people that come will stumble across other data sets. When we started we got quite a bit of media attention, and I did a TED talk and stuff, and I've just kind of tried to halt that a little bit until we're ready to really sell it. But people seem to like it, it's free to use, and it's valuable, so the spreading of it is not something I'm worried about at the moment. In terms of, sorry, media itself using it, they're kind of the next ones down the list of who we would like to focus on. But I almost feel like we need to prove ourselves a bit more and have the brand a bit more respected and known before media will take our graph posts exactly as they are and use them. I've talked to people about being involved in some of the journalism training schools and things, and they're keen to get me involved in that, but I just want to be ready and have everything a little bit more robust before jumping into that.

Question? So through your work, are you actually trying to foster the adoption and use of permissive licenses like Creative Commons and GPLs and the like, both from the people that you source the data from and also in making the data available downstream, so you try to encourage the whole ecosystem to basically keep the data open?

Definitely. All our stuff is CC BY 3.0, kind of open Creative Commons stuff. If people have given us data that has different licenses, we'll display those and just retain whatever licensing they require, but to date we've been able to get everything as CC BY 3.0 and encourage that.
Have you seen your suppliers adopt permissive licenses because of the work that you've...

No, but that's probably because we're starting with low-hanging fruit and data that's easy to use. What I think will happen is that the people who are a bit nervous about it don't understand the whole system yet: they don't understand how people will reuse it and the metadata that will stay associated with the data. So once they start seeing that, seeing examples, and trusting the brand more, then I do think that will happen.

Yeah, there could be a snowball effect regarding that as well. You start out with people getting their data on it, and then they see what people can do with it and they go, wow, this is awesome, we want to be on this. And then we say to them, well, if your data is not Creative Commons then it's a little bit difficult, and they go, maybe we should do that.

You talk about the trust of the data. At the moment, if you're thinking about something like Wikipedia, you have lots of people guarding the data and making sure it's being checked, that it's correct, and that the source and the veracity of the data are good. How do you do that with data that people are generally going to upload for graphs into the Wiki New Zealand space? Will you do secured pipelines directly out of Stats New Zealand, where you maintain the pipeline and no one actually touches it, so you know it's a direct reflection of the data from that office? Or how do you plan to handle that side of things?
Yeah, that's a really important piece that we'll live and die on, right, getting all that right. You can't just arbitrarily upload a data set and suddenly have that as part of the system. It does need to go through our auditing processes and checking, and you can see who's changed what, and when, and why, and that's transparent for everyone. Some of those processes we haven't defined yet because we haven't had to. When we start scaling out and having 50 people sitting around the country importing data, those systems will need to be really robust. We'll probably have everything imported twice so that you can compare, and if they're the same then it's fine, and if not, someone's... Yeah, totally. The biggest way to get round that is just by operating entirely transparently, so that if people see a mistake, they feel they're involved in the community enough to go, hey, this is wrong, and then we figure out why. I think that's going to be the biggest way that we win, and that's it: win through collaboration.
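
The "import everything twice and compare" check mentioned in that answer could look roughly like the sketch below. This is only an illustration, not Wiki New Zealand's actual process: the function name `compare_imports`, the `Mismatch` record, and the sample figures are all hypothetical, and it assumes each import has been reduced to a simple mapping from a label (such as a year) to a value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mismatch:
    """One disagreement between two independent imports (hypothetical record)."""
    key: str
    first: Optional[float]   # value from the first import, None if absent
    second: Optional[float]  # value from the second import, None if absent


def compare_imports(first: dict, second: dict) -> list:
    """Return every key whose value differs between the two imports.

    An empty list means the two imports agree and the data set can go
    forward; anything else needs a person to look at it.
    """
    mismatches = []
    for key in sorted(set(first) | set(second)):
        a = first.get(key)
        b = second.get(key)
        if a != b:
            mismatches.append(Mismatch(key, a, b))
    return mismatches


if __name__ == "__main__":
    # Made-up figures, typed in by two different (hypothetical) importers.
    import_a = {"2010": 100.0, "2011": 105.0, "2012": 110.0}
    import_b = {"2010": 100.0, "2011": 150.0, "2012": 110.0}
    for m in compare_imports(import_a, import_b):
        print(f"{m.key}: first import says {m.first}, second says {m.second}")
```

Keeping the output as a list of explicit mismatches, rather than a single pass or fail flag, fits the transparency point made above: the disagreement itself can be shown to whoever reviews the data set.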