Won't take that. You'll need it, questions. Hello everyone, so I'm going to have a discussion, hopefully, with you today. I'll rattle through some state of the world and open data, hopefully quite quickly, and then we can jump into some discussion. My name is Gavin Starks. I'm the Chief Executive of the Open Data Institute. The ODI is just coming up for its second birthday. We're an independent non-profit company set up by Sir Tim Berners-Lee, the inventor of the web, and Sir Nigel Shadbolt. [inaudible] What's the web of data going to look like 20 years from now? The amount of data that's being produced is doubling every couple of years at the moment. But let's be very clear what we mean by open data. It's not big data, it's not personal data. Actually, explicitly, it's data that can be used by anyone, for any purpose, for free. So we explicitly exclude personal data, so your individual health records, for example, should never be open data unless you give explicit informed consent to that. And big data is a phrase that I think a lot of us would like to go away because it is a bit meaningless. It's bigger than yesterday, I don't know, bigger than you can put in a spreadsheet, bigger than you're used to, et cetera.
So the kind of question that I'm very interested in, though, is how are we going to scale society, and how can open data actually help us tackle some of the greatest challenges of our time? And what is that challenge? The challenge is to try and sustain more than 7 billion people on the planet. That's an awful lot of people, a lot of people who need power, who need energy, who need food, who need water. We need to deal with their education, their shelter, their transport, their health, their jobs. And at the same time we've got this rhetoric that's almost like we've hit peak peak. You hear about peak oil. There's also, you know, peak uranium in 2030, peak phosphorus in 2025, so that's quite bad for farming. Peak copper, although we can mine some of that back out of the landfill. So we're hitting sort of peak resource all over the place. And we're not very good at actually trying to work out how we can scale on a per capita basis. So the kind of rule of thumb here is, if everybody had the same sort of lifestyle as the US or the UK, we'd need several planets' worth of resource. So that's not going to work. We've also got cities scaling at a huge rate. More than 50% of the world's population currently lives in cities. This is going to grow to 70 or 80% over the next 20 to 30 years. In China they're creating, I think, more than 100 cities that have got more than 10 million people in them over the next decade or something. It's some phenomenal rate. And even in London you have maybe 10% growth in the population every decade. So for London that's an extra million people over the next 10 years, and the next 10 years, and so on. So that gives you some sense that we can't just keep going with business as usual. And at the same time, we've got this huge move towards open data. And that can be social data: it could be population, it could be health information, it could be transport, it could be your own user-generated content that you choose to make open.
It could be environmental data, so you could be looking at weather, you could be looking at farming, you could be looking at pollution or resource scarcity, as I mentioned. And you've got economic data being made available. You've got corporate information being published. Companies House in the UK is going to start publishing all of the corporate accounts in the UK as open data. So that's another level of transparency that we're not used to. It's interesting that the social contract is that companies were originally created to have limited liability in return for transparency. We seem to have forgotten that somewhere along the way. And this applies to new forms of trade and commerce like peer-to-peer lending. It applies to where assets are and much more corporate stuff. At the ODI, we incubate startups. We've had 17 startups through the programme so far. One of them identified a potential £200 million saving for the National Health Service by looking at drugs, looking at prescriptions and how many prescriptions of a particular class could be switched from patented drugs to generics. The price difference is £20 versus £70 for the drug in question. And even taking out the clinical cases where you need the patented drugs for the patients, you can still save hundreds of millions of pounds. And that was derived in a matter of weeks by a handful of people. The next stage of that, obviously, is what do you do about it? How do you engage with the 1.1 million person bureaucracy that is the NHS to make the change? Similarly, we've got a company called OpenCorporates that are mapping all of the companies in the world. They've got data for 78 million companies now. And they can do things like beneficial ownership maps of banks. So with Goldman Sachs, this map, which you can just see on the screen here, resizes the countries based on where the greatest number of subsidiaries are. And the large blob in the middle at the bottom there is the Cayman Islands.
So you can see a huge number. You've got the UK, Luxembourg, the Cayman Islands and the US. This map is for Goldman Sachs. If you look at Bank of America, it's a very different picture: most of the ownership there is inside the US. And you can draw whatever conclusions you like from that. We also brought together the peer-to-peer lenders in the UK and analysed £400 million worth of spend data. We looked at who was lending and who was borrowing. Unsurprisingly, most of the lending was being done from the southeast, but most of the borrowing was uniform across the country. And that really highlights that the high street banks are not lending to a huge chunk of the market here. And this helps the Bank of England to think about how it could be data-intensive and policy-light. So this is kind of open data hitting the financial world and helping them ask questions about how we could regulate differently. How could we use data to police, rather than coming down with policy that may kill markets? Similarly, analysing government and public spending across Europe, the UK is particularly bad at receiving money quickly out of Europe. There's roughly £22 billion worth of cash that takes nine months to get from Europe into the UK, compared with about three months for other countries like Poland. So we don't fare too well there. There are also tools that can help policy makers and citizens make decisions about their public services. So this is an analysis of the fire station closures in London, looking at the response times. The number on the top left there is the response time of a fire engine to your area. And you can go through and select which fire station you want to close, and it will tell you what the difference in response time is.
So it gives you a different sense of engagement around how you can not only help the policy makers but how you can engage the general public, to say actually closing this fire station isn't going to change the response time in your area. In this case, we also combined it with footfall data from Telefonica, looking at the actual population densities in different areas in the city, because they're radically different from the actual residential population. You might have 100,000 people living there but a million people that work there during the day. Skip through those. One of the big pieces here that we are very interested in is the cultural change. We don't really see open data as a technology problem. We see it as a behaviour change problem and a cultural problem. We've been closed by default really since the beginning of the industrial revolution, and our patent and IP laws have reinforced this notion that closed is better. There's a lot of evidence now that open is much better. I think we're all familiar with the open web. It's far more powerful and has had far greater impact through being open than through being closed. We are at a very interesting inflection point here, where we can choose to be more open or we can choose to maintain the status quo around whether we have open by default or closed by default. That's something we've embraced at the ODI, so these are some of the figures from our own accounts: how much value we've unlocked, how much income we make, how many people we've reached, how many people we've trained, et cetera. I think we're trying to lead by example here in identifying how we can quantify impact from a triple-bottom-line perspective, so looking at social impacts, looking at environmental impacts and economic impacts.
Social impact could be just greater transparency, could be better services, could be better public engagement like the fire station story. Economic impact could be open innovation, bringing operational efficiency like in the NHS example. And environmental impact: can we really pull together the different data sources? How would we find out, how would we know exactly how much oil, coal and copper reserves exist, and mash that into an application that interfaces them with a product design tool? So if you're going to make a new fridge, and there's going to be a billion people across India and Africa that want a fridge over the next 10 to 20 years, how are we going to make those devices? Because there simply isn't enough material. Where's that innovation going to come from? And so from our point of view at the Open Data Institute, we think it's time to really build the open data ecosystem at web scale. And I think what we've seen over the last 18 months of our own existence is a huge amount of interest. We've seen the G8, or now the G7, sign an open data charter mandating that certain datasets like mining, extractives, corporate information, maps, and genome information are open. That's likely to be adopted by the G20 this year, and that's likely then to be followed through by the OECD. So you've got this top-down pressure that we didn't have during the web. We got a lot of the engineering of the web right, I think, because the political establishment simply didn't understand what was happening. This is very different in that context, in that this is being driven by a political agenda as well as a grassroots agenda. We've found that there's a huge amount of interest from governments, from NGOs, but also from companies.
We've trained about 400 people over the last year, and a lot of them have been lawyers or executives in telecommunications companies, et cetera, who are really trying to understand what this means for their business and how they can actually engage in it and be part of this kind of revolution around open data. We've also been blown away by a lot of international interest. We've got 20 international open data institutes now, in 13 countries right across the world, including Moscow, Seoul, Osaka, Dubai, places you wouldn't necessarily think of as being particularly open. And we've got a programme with the World Bank. Again, it's all very top-down stuff, saying how can we train the world's political leaders to develop open data policies? We've done work in Burkina Faso, which has helped them map all the schools and release that as open data, so parents can make better choices about where the schools are and how well they're performing. Like I say, there's a lot of corporate engagement here as well. I'm going to finish on a couple of points before we break to questions. One is that in order to try and link together all the open data that's being published, there isn't, or hasn't been, a particularly good mechanism to do that. One of the things we've created is called an open data certificate. It's free, it's open source. You can use it as a data publisher to say: here is my URL, it has an open licence, and I will commit to it being available for a year. And there are questions around the metadata about what you're publishing that enable that data to be discovered by data users and used more by those users. And so we're very keen to see people experiment with these and adopt them. The UK government has adopted them, and we're starting to see now people like the Met Office, Ordnance Survey, et cetera, publishing data with open data certificates. We're also starting to see private companies publish data with open data certificates as well.
And what this enables, the ideal here, is that Google will start to be able to index these certificates and therefore enable the discovery of data much more easily than it is today. And the other call to action is, if anybody is working in open data, we have substantial amounts of funds to invest in communication. And we invest heavily in telling the stories. With some of the examples I gave earlier, we've had national coverage in the UK and around the world. We've been on the front page of the Financial Times and so on, really trying to raise the visibility of open data's relevance to everyone, not just as a fringe activity that's happening in the geek or sort of web community. This is really front and centre in the political community and really front and centre in the business community. So lots of challenges there. The main one is how do we scale? How do we get millions more people into cities? How do we do that in a sustainable fashion? How do we provide food, energy, et cetera, for everyone? But also how do we make sure that this remains open? So how do we avoid this net neutrality debate around open data? How do we convince companies that open is more powerful than closed? These are really big challenges and we are, to just finish where I started, on this inflection point. We have the grassroots community, who have been working on this for decades. We've got the political sphere, who sort of understand the principle, but they don't understand the technology. And companies, who largely are confused or threatened by change and are quite cautious, but will default to closed unless we really help them be more open. So lots of open questions there about how we can make this a global movement. How can we really engage in a broad ecosystem, and what sort of problems do we think we could solve to demonstrate and provide the use cases for the power of open data? Thank you very much. I know I did that very quickly indeed.
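As an illustration of the open data certificate idea described in the talk (a URL, an open licence, and a commitment to keep the data available for a year), here is a minimal sketch in Python. It is not the ODI's certificate implementation: the `DatasetRecord` fields, the `certificate_problems` function and the short list of licence identifiers are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from urllib.parse import urlparse

# Assumption: a small illustrative set of licence identifiers treated as open.
# A real certificate would use its own, much fuller, list.
OPEN_LICENCES = {"OGL-UK-3.0", "CC0-1.0", "CC-BY-4.0"}


@dataclass
class DatasetRecord:
    """Hypothetical record of what a publisher declares about a dataset."""
    url: str                # where the data can be fetched
    licence: str            # licence identifier declared by the publisher
    available_until: date   # date the publisher commits to keep it online


def certificate_problems(record: DatasetRecord, today: date) -> list[str]:
    """Return a list of reasons the record falls short; empty means it passes."""
    problems = []
    parsed = urlparse(record.url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("url is not an addressable http(s) location")
    if record.licence not in OPEN_LICENCES:
        problems.append("licence is not in the open-licence list")
    if record.available_until < today + timedelta(days=365):
        problems.append("availability commitment is shorter than one year")
    return problems


if __name__ == "__main__":
    record = DatasetRecord(
        url="https://example.org/data/schools.csv",  # placeholder URL
        licence="CC0-1.0",
        available_until=date(2015, 6, 1),
    )
    print(certificate_problems(record, today=date(2014, 6, 1)))
```

Note that the check says nothing about file format. That matches the point made later in the Q&A: a CSV file with an address and a licence is already findable and usable.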
OK, we're just going to open the floor to questions now. So if anyone has a question here, just please raise your hand. If not, I'm going to answer. Yes, hello sir. OK, just two seconds, just going to bring this to you.
So what are the main groups, like companies or organisations, against this open data concept, and how do you deal with that? Is it a problem? Are you trying to get new open data or are you just using the data that already exists?
We're both trying to demonstrate the value of what already exists, because that helps demonstrate what's possible, as well as open up new data sets. I wouldn't say companies are really against it, so to speak. It's more of a cultural thing: we just have closed by default as a way of operating. And it'd be really interesting if I was standing here saying, let's imagine a world where everything was already open. Like when we published our accounts, it was already open; when we published our research, it was all completely open. Let's imagine that was already the case, and I was standing here from the Closed Data Institute saying we should really close all this information down. You'd think I was completely crazy, right? But we have developed this over the last couple of hundred years as a way of working, largely down to our intellectual property laws. So I think the challenge for companies, particularly big companies, is that they're very risk averse. Actually, it might feel like big companies are these kind of behemoths who will try and own everything. But they're not going to do the innovation because it's too risky. So we need to bring together the start-up companies, the innovators, the NGOs and the rest of the community to say actually the sky isn't going to fall if this happens. What are the benign use cases we could start with, and then gradually get more confident as we go forward? So I think that's really the challenge. So that's one category.
The other category of people, which is the vast majority of the world, has no idea this is happening. So that's the bigger challenge. It's almost just raising awareness at all. And one of the big challenges I think we face is that when you mention data to people, they either think of care.data or personal data and the kind of bad PR you get around that, or Snowden or WikiLeaks or Facebook, or something that actually is quite different from what we're trying to achieve here. So there's a real threat, I think, of poisoning the well for a lot of the good work and good outcomes that open data can bring. So many, many challenges.
Is one of the problems not just a kind of moral thing, that people want to be closed or whatever, but that it's a bit of a hassle, or maybe perceived as a hassle and an expense, to release data for a public body or for a company, and then either perhaps people don't necessarily use it much, or there might be value that accrues to society from the use of that data, but the value is not accruing to the organisation that's got the hassle of releasing it?
Yeah, so the hassle and the cost question is something we encounter all the time. And there's some really interesting evidence now to say open is cheaper and it's better quality. As soon as you start engaging with people who might be interested in the data that you've got, they start to care more about the data they're getting, because they can provide direct feedback and fix things if it's broken. Same as Wikipedia. And the people who are producing the data in the first place suddenly realise that they have an audience, rather than just having a spreadsheet on their desktop, which is remarkably common still today in the academic world, in the public sector and in the private sector. The information simply isn't captured, and if it's captured and then put into a closed system, that closed system eventually atrophies, bit rot sets in, and we just lose the information forever.
So it's a major challenge. Again, it's back to behaviour change, to say open is cheaper, better and more usable than closed, and that's a huge thing to try and communicate through these sort of existing barriers. A lot of people are terrified that when they publish their data people will find mistakes, and we need to cultivate a culture of failure being okay, mistakes being okay, because the mistakes are already there. We just don't know about them, and if we don't know that the mistakes are there then nobody's going to fix them. So we're already using bad data, or we might be using good data, we simply don't know. But data isn't going to get worse in quality and have less utility if it's open.
My question sort of follows on from that, which is: don't you think that the technology, such as RDF and OWL 2, I'm talking about the semantic web end, say RDF and OWL and all those sort of things, is itself a problem? In other words, there's no real language, so people don't want to invest in it, because there's not a standard where you can sit there and go, well, I can write all this in OWL or OWL 2, and probably have to write the compiler, and then next year it will all be changed and I'll have to do it all again.
A bit like every software language that's ever existed, maybe. But I think the principle here, from my point of view, would be: put the data online, put it up there as a CSV file, but give it an open data certificate so we can find it. If it's online and it's licensed, the key question here isn't about the format. The format is an important question, but it's almost a secondary question right now. The major challenge we face is simply getting the data published and licensed as being open in the first place, and it's that licensing piece that is the big gap right now. In the US there is no Open Government Licence, there's no equivalent of Crown copyright. If you dig into it, public domain in the US means public domain inside the US.
So they're actively looking at the CC0 licence as a way of actually licensing some of their own public domain data. So the licensing piece is one of the big anchor points here, because a company isn't going to invest in using data if they don't know for sure that they've got the licence to use it. So if it's online, I don't really care whether it's a CSV file or RDF or XML or whatever the thing is, as long as it's addressable, and that's one problem that we have resolved, because we have URIs. If you have a URI and a licence, the data formats can then be machine read and machine manipulated after that point, and it will be messy, but it will at least be a mess that we can find.
Hi there, thanks. I just wanted to go back to the point earlier about value accrual, really. When you were talking I was thinking about a recent book by Thomas Piketty, Capital in the Twenty-First Century, and one of his theses in that book is that one of the potential dangers for us in the 21st century is that more and more value gets accrued to a smaller and smaller section of society, and I was wondering, do you see open data as a way of countering that? So that's the first question. And then, if so, is there a group of people who are actually against this, or who, when they understand it, are likely to oppose this shift?
Okay, so that's a great question. To answer the first question: broadly, yes, I do see this as part of the general web culture, which is more of a socially driven movement. It's more about many parts loosely joined than aggregating and centralising power. However, we see this in everything that we've ever created, and you see it in the web right now: we are creating centres of power in Google or Facebook or the CIA or whoever happens to have the connection point. So will this happen with open data?
Yes, it will in some way, because once we make all the data indexable, Google and Bing and all the others will create great search engines, which means we can find it. So my question then is, what's the additionality? What problems are we able to solve after that? Again, central to this is actually the net neutrality piece. We have to have an open web, and if we lose the open web we will lose all of the value, not only of the web but all the potential value, I think, of open data. So I think there's a really interesting tension there. And your second question, about who will oppose it: well, companies typically are opposed if they feel threatened. They will maybe, as is often the case, try and acquire the small startup who's going to disrupt their business model. But one thing that's different from, say, 1995 looking at the web is that we can look back at the web and say, well, this has happened before. We've done this in every sector. We've done it in the media and entertainment industry, we've done it in the newspaper world, we're doing it across a whole range of different sectors, and it's almost ironic in some ways that it's taken us 25 years to come back round to data and treat it as a category of information in its own right. So I think certainly there will be incumbent organisations that will feel threatened. Some of them may go down acquisitive and more aggressive roads. I think the more progressive companies will look at this and go, I think we could probably save half a billion pounds if we did, for example, open clinical trials in the health industry rather than doing closed trials, which cost fortunes to do and produce quite small sample sets. There's probably better research and there are better outcomes, and actually everybody wants those kind of better outcomes. The question then is what happens around the IP, and I think there's a lot to be learned from the open source movements, to say what transition did we go through to say that the
software IP isn't important but software as a service is the business model? It's separated out the licensing from the paying for the service, and in the long term open source has won, it's my view anyway, a lot of categories through the sheer force of numbers. And that's where we have a lot to really think about, because, as I've been trying to outline, this is different to the early days of the web in that it includes the companies and the governments already. They're already looking at this, and the governments are already legislating for this, so it's a really interesting balance.
I guess it's led by governments at the moment?
I think it's led by governments and grassroots, in the same way the web has been, and I think the governments are paying attention because of the grassroots piece. We can look at the work that the Government Digital Service has done in the UK, and the history of that, from the late 90s activists, through a pause while people digested what the hell was happening, to the Government Digital Service now existing and reinventing what the web looks like for governments in a very significant way. Being able to point at that as an example, to say this is the transition you've been through over the last 20 years, is really useful, to say, well, there's another transition we're going to go through here. So I think it has been driven by the grassroots community. The political sphere has adopted it because it ticks certain transparency and public good boxes. But there's also, if anyone remembers Yes Minister, if you watch the very first episode of Yes Minister, it's actually about open data, and they get to a point in the episode where there's some data being released which will make the Minister look bad, and of course the Minister then says, no, we shouldn't publish that data. And you see that pattern in governments. When governments come into power, the first year they love the open data because it points out how bad the previous guys were, and the second year they don't
like it as much because it shows how badly they're doing. And that just requires brave politicians, and we need to really support our politicians in their push to keep the data open. And actually Francis Maude in the Cabinet Office has been a massive supporter of the open data movement, and I've been very genuinely surprised by that.
Does anyone else have any other questions? I think just one more and then we'll probably just wrap up after that.
Is there anything from history that says why we have gone down the path that we currently are on?
The open data path or the closed data path?
Industrial revolution.
My favourite line is that the one thing we learn from history is that we don't learn from history. So I think the cycle here, for me, is very familiar. It's centralisation, decentralisation, aggregation, disaggregation. You could look at the British Empire and the way that it sort of tried to hoover up massive amounts of power and then disaggregated. You could look at it in the way that now even Scotland wants to become independent. You could look at the huge number of electricity providers that existed in the UK in the early 1900s, that then aggregated, and now we have six, some of which are not in the UK. And now we're seeing a transition to renewable energy, where actually we can redistribute the energy generation and we can do peer-to-peer energy, and we've got a start-up at the ODI called Open Utility who's trying to do exactly that. So we can look at these sort of cycles, and I think we're seeing the same thing with personal data. Personal data currently is centralised in Facebook and Google. It can be decentralised through the personal data store initiatives and the sort of my data initiatives, where you could get control of your data and then you choose to licence it back to Facebook rather than the other way around. So I think we just see these patterns all the time across everything we do, whether it's international trade, whether it's
energy or whether it's the web. It's just happening faster, and I think that's the big change when it comes to open data. If we think back 20 years, the web took quite a long time to reach where it is today because the technology wasn't quite there. We didn't have ubiquitous broadband and we didn't have one of these in everybody's hands. Now everybody has a supercomputer in their pocket, they have access to multiple supercomputers in the cloud for pennies a minute, and they can connect with everybody else in the world at a fast speed. So the pace of change is going to be faster than the web, in my view. There are a lot of other caveats around that, but I think the cycles continue, they just shorten, if that helps.
OK, I think that's pretty much it. I'm just going to ask one question of my own, if I can, please. You've spoken about open data, and in some ways it's almost like a gift from government: here's how you can hold us to account better. That's not the same as having a right. So what's stopping the government closing the data back down again when it stops making them look good?
Well, this is one of the most important things about getting companies involved in using that data, because if companies are totally dependent on government publishing open data, then there are jobs that are dependent on that data, and governments really don't want to be responsible for shutting down innovation and shutting down job creation. Just to give one extreme example of that: in the US, some ex-Google folks went and set up a company called the Climate Corporation, and they used US data around the environment, around weather, climate, soil, et cetera, and they created a micro-insurance industry for farmers, so farmers could get very specific micro-insurance policies for their farms across the US. That business was almost entirely based on open data. They didn't actually publish any open data, they consumed it and made stuff.
Monsanto bought them for $930 million. That's a very strong argument, in terms of open innovation, job creation and economic value creation, for a government to actually say, here is a tick. Now, the specifics you may have different opinions on. Maybe it would be nice if they published open data as well, and maybe that will come back around as we realise that everything becomes better if we're open.
Everyone, ladies and gentlemen, thank you very much. Thank you, folks.