I was reading for pleasure the other week a fantastic edited collection called Censuses, Surveys and Privacy, edited by Martin Bulmer in 1979. The arguments that I read in that book, which is now 36 years old, are still many of the arguments that are being brought out about the protection of data now. There's a lot that hasn't been done, and I also think that in terms of our law making we are starting to look quite regressive. The EU regulation is probably the tip of this legal iceberg that is going to make things more and more difficult in the future. But you don't want to hear from me; you want to hear something more and challenge our speakers on the issues that they've raised so far. So, I'm going to hand a microphone to somebody who's going to rove with it.

Chris Martin of Crystallize. Can I ask Andrew a question? You mentioned something about harnessing new methods to enable predictive analytics whilst protecting data. I wonder if you had any more information as to how that works. This seems to be the big issue that we're grappling with, and I wonder if there are methods being developed.

I think it's what you do with it, isn't it? The classic example that is often referred to, is it the LAPD or NYPD, is where through predictive analytics you end up in a situation whereby you are almost identifying communities and, sometimes by further inference, individuals as the types of people who you are saying are likely to commit more crime. And that is a dangerous thing that crosses various ethical boundaries. So, I think that's probably what I was referring to: a sort of probabilistic analysis that leads to a situation where you are, with a degree of surety and certainty, targeting individuals around things with which they don't want to be associated and with which they have a right not to be associated.

So, the other side of the coin to that same question.
Can I not use that same predictive analysis to make my case for resources for my community?

Thank you. So, Jackie Carter, University of Manchester. I was in a meeting recently in Manchester, Northern Powerhouse up there, where local authority people were desperate to get their hands on data and join it up in the way you were describing, and really keen to acquire the skills to be able to interpret that, gain insight and do something useful locally. So, my question, which I tweeted earlier, is: but who pays? We have this fantastic infrastructure supported by the ESRC and others for researchers, but at the local level the challenge seems to me to be who pays for that at a time of restricted budgets. I'm not suggesting any of you have the answer here, but I'd be really interested in your thoughts given the conversation and the narrative around the public good.

Well, I think in the spending review we did have an opportunity to bring together some of the digital and data capability in local government. Unfortunately, the investment's not there. Government's announced 1.8 billion in investment in digital transformation in central government. Unfortunately, it's going to be an even tighter financial settlement for local government, and it doesn't seem like there's all that much money there to fund the kind of capability building that you mentioned. But it's really important. It's really, really important.

I'll have to give my two penn'orth as well. It's something I feel super keenly. I've got a traditional analytics function. I work in the Greater London Authority, which, in comparison to those 33 boroughs which I've just described as broken, is a feather-bedded environment, actually. I'm not on the front line doing service delivery in the way that they are, and I fully acknowledge that.
And we need the invest-to-save models around these wonderful things that we talk about in relatively attractive rhetoric. In some cases they're just not there yet, and somebody needs to take a bit of a leap of faith. One of the things that I am thinking about at the moment with people in the boroughs, though, which might free up some capacity, is around traditional IT infrastructure. There are 33 server rooms out there in London, all of which are consuming vast amounts of energy, all of which cost a lot of money, and all of which have large teams of staff operating them. There's an ugly truth in there: they need not exist in such quantity. There's that thing called the cloud, in which you can put a lot of this data and in which you can start to harmonise some of it in the way that various of us have spoken about. That has the potential to drive significant savings, as simple as that. It's almost a politically attractive thing, I think, the savings that we could potentially be talking about. But through that you need to be saying we want to add an analytical layer over the top of that data harmonisation and infrastructure rationalisation. It's through that that you get some money to actually do some of the stuff that you're talking about. And it's a shame Paul isn't here to talk about his big windfall in the spending round and how he proposes to make sure that a significant chunk of that trickles down to local authorities, because I would argue that it's there that the biggest gains are to be made. Central government has had its bite of the cherry now, and I think it's in local government where you can exercise some financial generosity and get some significant gains.

Hi, my name is Jansef Jamal, and I'm here this evening interested from a variety of angles, really. I did postgraduate quantitative study, and thank you to the ESRC for the funding.
I've since been working in the private sector, but I'm also a local councillor, which is interesting following on from the last question. It's so true that there is such a desperate need for data. And it was interesting to hear, Andrew, some of the things that you were saying. In addition to the skills and the analysis, there's also, I think, a challenge with regard to the interpretation of data and what we're seeing. So just to give you a very quick example, I chair a committee. We set up a group recently looking at child sexual exploitation issues, looking to see what processes are in place. Are we capturing these things as effectively as we can? I was with some colleagues, and we were presented with some data by the local police force. It was a comparison of different boroughs across London. One of my colleagues looked at one of the bars from one of the boroughs and said, "I like that bar. That looks nice," compared with what it was showing for our area. And I said we should be very careful here, because we cannot know. Is this correct? Is this not correct? The data might have been collected very accurately in one area and might not be so accurate in another, so what does good look like? So I think there are a lot of challenges.

My question was going to be more about the public policy challenges. A number of you mentioned that putting a lot of the processes and the infrastructure in place to get to where I think we'd all like to get to is going to be a long-term project. One of the things that frustrates me about the political process is that it can be a very short-term vision that politicians have. So I was wondering to what extent you are hopeful that we can collectively put that long-term process in place?

Well, I was going to pick up on the first half of the question to start with.
Child sexual exploitation is a perfect example of an area in which freelancers agglomerate local data into national figures with very little guarantee that the national figure is actually accurate. And you can't blame them for wanting to do it. This is the NSPCC's annual report. It gets quite a lot of press. It sends FOI requests to all police authorities, the responses to which usually come with an explicit disclaimer saying there's no guarantee they're comparable to any other police authority's, and aggregates them up. There is a huge risk in that. If you compare that to the careful process that crime figures go through in order to be nationally published, we can see why there is value in data being properly managed and properly checked for its accuracy in the real world. On the other hand, you are getting a higher-resolution picture from these lower-quality figures. So there is a real question here about where we invest. The data chain is not just storage: it's how we manage it, how we quality assure it, how we analyse it, and then how we present it and make it useful to decision makers, whether decision makers means us as voters or ministers or police commissioners and so on.

On the "am I an optimist?" question: absolutely, yes I am. Our world is being transformed by the intelligent use of data. It's also being transformed by the unintelligent use of data, but we're doing our best to stem that tide. I think one of the biggest challenges there is overselling, and I am so bored of hearing data people claim things will be more dramatic than they will be, or faster than they will be. Whether it's a technical oversell or a political oversell, it's just boring, it's been done for too long, and it is actually deeply damaging.
So if we all agree that we are on a long-term journey here, and all of us know that some of these things will only be solved when some IT systems get so obsolescent that they finally have to be replaced, then we need to actually start making that pitch, I think, because utopianism is useful and not entirely misplaced, but some more realism about what the journey looks like has its place as well.

Yes, and to echo this, I'd say I'm partly optimistic just because I can't help it. I'm just an optimistic sort of person, which I think is useful in this area. It's never going to be an area that's going to get 6 million people really excited. You're just not going to find that many people writing to their local MP saying, can you please give some more money to local authorities in general for this, in the way that you saw with the recent discussions over benefit changes. It just doesn't happen. But I will observe that it is a small-p political process, and I say that without any lack of sympathy, particularly for local authorities, which really have been injured. It's all very well us all knowing it might save money in the medium term, but up front there's an investment. I say that simply because in my previous role I was trying to get money out of central government, which did better, and I saw how hard my analytic colleagues in central government were running just to stay still. It's not just a fairytale that they didn't always have the resources to get data sets deposited or whatever in time. So it is a small-p political process, and that's why I do come back and say it's not just for us up here, or for data analysts professionally, but for those of you in the wider world to think about whether there is a public benefit to some of these things. And again, it's not polarised by saying we don't care about confidentiality. It is about saying, well, we need to be writing letters, we need to be signing the Wellcome Trust from different organisations. It's really important. It's
not just medical researchers that are saying this will really affect research. So I do think it's partly about creating a climate that says we can do better for citizens if we do this.

A couple of observations. I used to work in that dark satanic mill that's come up a couple of times tonight called MORI, and Ben Page used to tell me, repeatedly, "If it's interesting, it's wrong." So it sort of responds to Bill's points as well about people overselling and misrepresenting things. There's an awful lot of that around this thing called big data. I'll leave that there. But my main substantive response to your question is that it's really interesting to look at the States, and to look at city governments in a place like Chicago or New York, both of which have had data-literate mayors. Chicago still has one, called Rahm Emanuel, and there's a guy there called Tom Schenk who is their chief data officer, and they have a chief technology officer as well. These debates are not central to the next mayoral election in London, because we'll talk about housing, rightly so, and transport and other things like that, which need to be supported by a proper debate underpinned by proper analysis. But those are highly professionalised and political roles in the States, and the most important P there is political. Tom Schenk has made data a political thing. It is something that is brought to the cabinet table, or the equivalent thereof, in a city like Chicago, and nothing happens without a proper, thoroughgoing analysis of the data at their disposal. These are seriously qualified people doing this, and I think we need to move towards something similar in our own city as well.

Chris Martin. I'm from a healthcare and bioinformatics background. I work with the finance industry quite a lot, and recently did some work with the Longevity Science Panel on a report on this topic. One of the big issues was this definition of the public good. I do longevity modelling, and the pensions
industry need to make estimates of how long people live using aggregate analysis of individual data. There is an issue about access to this data. Now, they are very strongly of the view that they are providing a public good and that lack of access to this data means that we spend more: our pensions become more expensive. I wanted to know what the panel's views are on what this definition of the public good actually is, because it's hard to find out, and what the key principles are that should be the deciding factors in what is and what is not a legitimate public good.

If you ask me whether I think there is an airtight, philosophical, a priori definition of public good, my answer is that I'm a social scientist and I don't believe that. What I do think is important is that there is an independent assessment, which would include scientific peer review but also some lay review, that asks: is this in the public good, and is it proportionate? I can't help it, I'm going to mention care.data, but I think frankly that was one of the issues there: not having an independent review that asks whether selling these data to health insurance companies so that they could more accurately set premiums is in the public good or not. Now, I know many economists who would argue that one way, and I know some economists and a few other people who might argue it another, so I'm not going to give you my answer to that. But I am going to say you need a proper scrutiny process. That's why I'm at pains to say much social science data already does this. Again, I'm not saying it couldn't be strengthened, but it does actually take seriously the question of whether this is going to be in the public good. One key indicator is publishing not only your findings but also how you reached them, so that if people want to challenge that and say, well, actually you've used silly algorithms or you've made heroic assumptions, that can itself become a matter of public debate. So I absolutely agree with
what you're saying about the use of these data for premiums. You wouldn't want to rule out, I think, all commercial use as not being in the public good. But what I'm very sure about is that locking something up so that big business, which transcends all these boundaries, can share data while public policy and local services can't does seem to me asymmetrically not in the public good.

I think one of the questions to ask is what the public view of a public good is, and how detailed an individual's understanding is, when they give up certain data, of the uses that it will be put to and the controls that they have over it. So I'll give you an example. We're doing some research at the moment looking at consent. We could give people some very, very detailed options. We could give them 42 check boxes: I consent for this to be used for academic research but not for commercial purposes; I consent for this to be used for healthcare planning but not for insurance companies. You could start to give some of those granular permissions. What we've found so far is that people actually make some very sophisticated trade-offs. Don't underestimate the public understanding of some of this stuff. But equally, I think going to that extreme of 42 check boxes would be overwhelming people with something that is quite a difficult concept to get across. So I agree, I don't think there is a definition of what a public good is. I think there are complex issues around identity and trust and consent, and there's a public perception and a public debate that we've got to have over the next few years, I'd say, as this becomes a bigger part of all our lives. We're not there yet.

I suppose I'd make a couple of observations. One is that I wholeheartedly agree, and I say this on instinct rather than on evidence, that the casual, careless and utterly heartbreaking destruction of trust by care.data frames this debate, I think, for me. And for patients to feel
betrayed by the NHS is a terrifying starting point for having a sensible debate about what we do with data in the future. That's compounded by the fact that Parliament is, and I say this having worked in a non-party-political capacity there for a while, terrible at legislating in any area with rapidly changing technology. We know this, and we've proved it for century after century with communications technology and with biology, and data will hit, and is hitting, exactly the same barriers. Every time I have a conversation about data it gets forked in very short order. The level of confusion, the level of nonsense around big data, as Andrew hinted not very long ago, and lots of other aspects of data is so high that I wonder if there isn't a role for a process similar to the one we went through in the context of the human fertilisation and embryology Bills and Acts, where Baroness Warnock was commissioned to do a report which has provided the underpinning for the legislation in that very fast and complex area of technology ever since. She provided the intellectual distinctions that have remained current even as technology changed. The Nuffield Council on Bioethics has had a similar role in providing an intellectual framework for some of these conversations, which I feel is missing in the conversations I have and hear about data. So I wonder if we don't need, if I can borrow their name and take it in vain, a Nuffield Council of Information Ethics, where people try to create a framework of concepts and distinctions which will remain even as the technology changes around us. So yes, there needs to be a public debate, and yes, ultimately this is a question for the public and not for anyone to hand down. But actually framing that debate and providing a useful intellectual framework for it is a task that I haven't seen anyone really step up to. Either that or I'm just not keeping up to speed. And one moderate suggestion is: if I can have a bank
statement, why can't I have a statement of everybody accessing my data? Because if I have that, that level of transparency, even retrospectively, will provide an awful lot of accountability for uses which we find acceptable, and in a sense there will be a way of things levelling out over time, as things tend to. But the way your data disappears and you don't know about the uses prevents that levelling-out process and raises the stakes of that initial consent to such a high extent that I'm not sure it is entirely helpful or, as Dan was suggesting, solvable.

I've been involved in some discussions recently within the OECD about data philanthropy and the relationship between how individuals might offer their data towards things and get some form of reward for it. To my old-fashioned mind this is almost as bad as trying to think about the monetisation of data, but it does demonstrate that there is a really straightforward public belief, and this is a majority, that data can be used for good. I won't say the public good necessarily, and I prefer to use the words public benefit, because just publishing research is in the public benefit. So I think there are challenges. There were challenges 30 years ago. We've got some way. We do have access to some data; we have access to quite a lot. Trying to join up is the message that I'm getting here: trying to join up some of the things that we do so that we're consistent, and so that we are robust and resilient and reliable. That's not just the UK Data Service as something funded by the ESRC, but the other data infrastructures as well. We can share some of these same standards. We don't have to always do everything in a really heavy-handed way, and we can join together, because we do really all have the same ultimate goal, which is to try and put data into the hands of people who can use it for some form of good. There are all sorts of challenges around that. They start with understanding how data is collected; they move on
to making sure that it is described properly, that there is the right sort of metadata, and that we understand the statistical properties of the data. We still have to look after it. One word that hasn't been mentioned yet, which is almost in my job title, is curation. We do need to be able to look after these data, not just for today or for tomorrow. There is an indefinite period of use of some data, well into the future. And for academics, for researchers, data that's two years old is better than nothing from yesterday. If a commercial organisation can't give anything because it is commercially sensitive, I think we need to play that game, but we do need to let some of the commercial organisations play our game too: publish, and get the public to respect what you're doing, what we as a researcher community do. I don't use the word academic if I can help it; I try and use the word researcher. It's somebody doing something, trying to understand something. There are formal definitions of this which make it a bit more difficult. So I'm taking away messages of more collaboration, to try and widen data access and to try and understand how we can put what is valuable to those who need it into their hands.