[Opening remarks partially inaudible] ... data science, and artificial intelligence. I just wanted to give you a quick bit of information and run through some housekeeping. There is no fire alarm test due, so if it does go off then it's genuine, so please follow the signs. We've got a hashtag which we'd love for you to use to tweet about the discussion that happens this evening, and that will appear on the next slide. And just before we kick off with this event, I wanted to let you know that we've got another one coming up on the 14th of September, looking at cyber security and the question of whether artificial intelligence is the new defence. That should be quite an interesting debate, so please do take a look at that on the British Library website. And lastly on housekeeping, the event is going to be filmed this evening, so if you don't want to be filmed or photographed then please do let a member of staff know. And if you ask a question in the Q&A then please just say you don't wish to be filmed and we'll make sure that doesn't happen. So it's my pleasure to introduce Timandra Harkness, who is going to be the moderator and the chair for this evening's discussion. Timandra is the author of the excellent read Big Data: Does Size Matter?, and she's got a show coming up on Radio 4, very appropriately titled How To Disagree, which will be out in mid-August. So I'll hand over to Timandra. Thank you.

Thank you very much. Yes, it's a pleasure to be back.
I return regularly to chair these data debates, which is great fun for me, but I also think it's really important that the role of data and AI in society is something that's up for wider debate, not just discussed by tech companies and policymakers behind closed doors. So it's lovely to discover the 100 or so people in London who either are more interested in inequality than football, or know that both England and Belgium are trying to lose, and therefore it won't be a good match. In tribute, however, I have brought my popular red and yellow cards, which I shall not hesitate to use on the speakers, and indeed on the audience when we come out to you. In case you haven't seen these before: basically, when a speaker has a minute to go, they get a yellow card, and when their time is really up and they should wind up now, please, they get a red card. They don't actually get sent off. Even if they get a red card, they are allowed to come back for the next debate. We discourage diving, obviously. So the way this will work, as you've probably guessed, is nothing particularly innovative. We have a wonderful panel of speakers. We have given them a ridiculously short amount of time in which to introduce their thinking on the subject of data and inequality. That's because we really want to make it a debate, so we're going to ask them to keep their remarks to seven minutes or less. Then, when they have all given their short introductions, I will abuse my chair's privilege to monopolise them for a few minutes with a little bit of discussion up here, but then I will come out to you. We've got until around quarter to eight, so we've got plenty of time; we'll come out and we want to hear what you have to say. What I will do is take probably around three points at a time. You don't really have to pretend it's a question. We don't play those games here. You can have opinions, that's fine. So I'll take about three points at a time. We have roving microphones.
If you don't want to be filmed, say "I don't want to be filmed", or disguise your voice, or pass your question to the person next to you. We don't mind at all. And then, obviously, we will keep coming back to the panel to get their responses and thoughts. That's how it'll all work. Right, let me introduce the five speakers, in the order in which they're going to give their introductions. First of all, we have Dr Sandra Wachter. I hope I'm saying your name right. Yeah, really good. You were checking the slides when I was down there. Dr Sandra Wachter is a lawyer, a fellow here at the Alan Turing Institute, and a research fellow in data ethics, AI, robotics, internet regulation and cybersecurity at the Oxford Internet Institute. Her research focuses on the legal and ethical implications of big data, AI, and robotics, as well as the ethical design of algorithms, predictive policing, and human rights online. So we're going to hear from Dr Wachter first. Then we're going to hear from Catherine Mayer, who is a writer, activist, speaker, consultant, and the co-founder and president of the Women's Equality Party. Her latest book is Attack of the Fifty-Foot Women: How Gender Equality Can Save the World, out in paperback early this year. Then we're going to hear from Dr Karen Salt, who is an assistant professor in transnational American studies at Nottingham University, an expert on sovereignty politics and the ways that discourses regarding difference influence narratives, decision making, and systems of governance. She currently leads or co-leads projects on reparative trust, collective activism, racial equity, and transformative justice politics. Next we'll go to Rob Berkeley, currently broadcast editor at BlackOut UK, a not-for-profit social enterprise run and owned by a volunteer collective of black gay men. He was director of the Runnymede Trust from 2009 to 2014.
Alongside his academic writing on education, social justice, and community organising, he's presented and co-produced short documentaries and written for the Guardian and the Independent on racial justice. Dr Berkeley was awarded an MBE in 2015 for services to equality. What a great reminder: please turn your phones to silent while tweeting. Does your MBE always get that response? And finally, it's a great pleasure to hear from Hetan Shah. He's the Executive Director of the Royal Statistical Society, of which, I should declare at this point, I am a member. It's a membership body with a vision for data to be at the heart of understanding and decision making. He's also a visiting professor at the Policy Institute, King's College London, chair of the Friends Provident Foundation, a trust seeking a fairer economy, and a member of the Social Metrics Commission, which is seeking new measures of poverty for the UK. So, that's the fantastic panel, and clearly it's an insult to them to ask them to keep their contributions so short, but that's the price we pay for also wanting to hear from you. Are you going to present from up there, because you have slides? So, first off, we're going to hear from Dr Sandra Wachter.

Thank you so much for the introduction. Yes, I'm a lawyer and researcher in data ethics, so I focus on the legal and ethical implications of machine learning, AI, and robotics, and I'm very excited to be on this very important panel to discuss this crucial issue with you. I've named my talk "Diversity as a Holistic Approach", because I actually feel that the problems that we have with inequality stem from a lack of diversity. I have listed here a couple of things that I would like to briefly discuss, because I think the other speakers are going to dive into them in more detail. I think it's very important to keep in mind that diversity is not just a box-ticking exercise, right?
It's not just something that we do because we have to. It's actually a common thread that runs through different topics. I've listed here a couple of topics that are very close to my heart, that I think are very important to keep in mind, and that are actually at the root of a lot of the problems we're facing right now. The first is obviously the lack of diversity in data sets. I think the other speakers are going to talk about this more. But of course data is biased because the world is biased, and if we don't keep that in mind, and don't have systems or researchers that are aware of it, we're just going to replicate the bad decisions we've made in the past into the future. So having a rich and diverse data set is very important if you actually want to tackle inequality. The second thing is that the tech community is not very diverse, and that's a big problem, because if you have a tech community that isn't diverse, their stereotypes and their biases will be embedded into the systems that they develop. So it's not just inequality and bias creeping into the data sets; it's also in the systems themselves, because the people who develop them are biased. The third thing, which is also very important, is the lack of diversity in research groups. We talk about a lot of problems with AI, and most of the time it's not just a tech problem, right? We think about how AI challenges our legal system: are the laws that we have still good enough to guard against those risks? That's a legal question. We think about how AI affects our workforce, for example; that's an economics question, so we actually need economists to think about it. Think about the ethical questions that arise from it: obviously philosophers are the ones that need to think about those. And how does technology shape political discourse and opinion forming? That's for political scientists to consider.
So you see that those problems come from technology, but the answers are probably in diverse research communities, and we don't see them so much yet. One of the reasons we don't have many diverse research groups yet is that there's still bias and a lack of diversity in funding and institutions. If you look at the funding landscape at the moment, a lot of money is pumped into STEM, but the social sciences and the humanities don't get as much funding any more, even though those are the disciplines that could help us solve all the problems I just talked about. The same is true for institutions: they're very siloed. There's not much dialogue going on between different disciplines, and that's a very big problem. But even if you have the funding, the institutions, and the people who want to work together, there's a problem in terms of publishing. Researchers wanting to publish their findings have trouble finding journals that cross disciplines. For example, if you want to publish a paper that is very heavy on tech and very heavy on law, it's very hard to find journals that actually cater for that, so you don't have the venue and the audience for it. And the last point is obviously policy solutions. As I said, the problems are very diverse, and that means we need not just different disciplines working on this together, but also different stakeholders. It's very important that when we think about strategies to mitigate those risks, we include people from industry, government, and academia, but also from civil society and NGOs, and most importantly, we actually need representatives of the general public. That could be, for example, union groups or consumer protection groups, because those are the people who are ultimately going to be affected by AI, and those people need to have a voice as well.
So I guess the most important thing I'm saying is to think about diversity as a holistic concept. If we keep that in mind, we can actually come up with solutions that harness the full potential of AI without falling into the pitfalls that we see. Thank you.

Thank you very much. I didn't even get to use my yellow card. What a good clean game we're running. So next, Catherine. Would you like to speak from there, or from over here? Would you like to walk around? Feel free, whatever makes you most comfortable.

I'll start seated, and with a timer, because you've got me so scared. And they didn't let me bring my electric shock machine; I can't get that back. So, sometimes a lack of data is very useful. If I had known the data about the probability of a new political party succeeding, I probably would not have founded the Women's Equality Party. I started it against all of the odds of a party succeeding in a first-past-the-post system, and in a system that has rules that are supposed to create stability but actually just end up enshrining the dominance of existing parties and upholding the status quo. When I say how a party succeeds, of course that depends on whether you measure success only in terms of the number of seats won. I took as an unlikely role model UKIP. This was 2015, when I co-founded the Women's Equality Party. We hadn't had Brexit yet, but it was clear even then that UKIP was creating seismic change without winning seats. At their peak they only ever had one MP. And it was because of the weakness of the big parties, which, instead of challenging and pushing back against UKIP's politics, saw what they were doing as a vote winner and started trying to steal their thunder. So I thought what I could in fact do is show that gender equality was a vote winner, so that they would start to copy us, and that turns out to work. But of course there's something else UKIP was doing.
We saw them using and misusing data: false data, to suggest a kind of Brexit dividend that would never happen, and of course there has been this whole issue around micro-targeting of voters, what that looked like and how it came about. If there is time I will talk more about Cambridge Analytica and that whole issue. Our success is continuing. We are growing as a party. We are having electoral success. We are having success in using data for social change. Data is absolutely essential to solving inequality. You can't prove inequality without it. You can't prove the benefits of resolving inequality without it. And you can't understand what the mechanisms of inequality are unless you also look at the data. I'll give you one example. Recently, of course, the biggest UK companies were forced to reveal their gender pay gaps. The efforts that some of them went to not to report properly, to obfuscate those gaps, told their own story, and it was very imperfect data. Sometimes what you were seeing was real pay discrimination, which of course is illegal, lumped in with all sorts of structural factors that were causing the gap, and, as I say, data that you couldn't challenge and interrogate, with all sorts of weaknesses. Opponents of that reporting suggested that what was needed was to get rid of the reporting again, whereas in fact what we need is more data, not less. We need it broken down by other characteristics as well, such as ethnicity and age and disability, and smaller companies need to be reporting it too. But as I say, data can actually help in solving inequality. However, and this speaks to something Sandra said, data inequality in various forms is a very real thing.
There's a power imbalance between individuals and the corporations and governments that hold and process their data, and of course there are questions about how we collect data and what the data is. As Sandra said, data is biased because the world is biased, and there's a danger that data will be used, or will inadvertently work, to create normative effects, or to exacerbate trends and make worse things that are already wrong. I recommend a book by Virginia Eubanks called Automating Inequality. The first big donation that we got for our party was to go on a software platform called NationBuilder. It's the kind of platform used by nearly all parties in the world; they've got Emmanuel Macron on their website at the moment. And that's because you need some way of interfacing between your website, the data that you collect when you're out canvassing, your mailing list, your membership, your donors, and social media. But those kinds of software are also ones that you pay for according to your means, and of course a small party like us has the bare minimum, while the big parties are already using micro-targeting in very major ways. They are scraping every legitimately available source in order to do it, and it is further adding to a system that is incredibly unfair. You know, there is no such thing as an equal starting place in democracy. And interestingly, in the responses to Cambridge Analytica, the suggestions that have been made about how to fix this are nearly all about ladling on additional bureaucracy, which of course will affect parties with less money more, and rich people and rich parties have always found ways to get around these things. So I really wanted to make that point as a starting point. But there is so much to say on this subject, in terms of how data can go wrong but also how it could go very right, and I'm a believer that we all have to engage with this in order to get the outcomes that we need.
Excellent, thank you very much. I'm beginning to fear my little cards are going to stay in their case, which is probably the best way.

I was 30 seconds under.

Look at that. See, that's data. There you go, although, you know, we should calibrate and check that we're all using the same... no, that was excellent, thank you very much. Anyway, Dr Salt.

Yes, thank you everyone for coming out tonight. I'm going to jump straight in to my thoughts. I've got lots of little arrows and things circled around here, and I'll try to narrow them down. Everyone's brought up some really key points, and I would probably say ditto to quite a few of them, but one of the crucial things for me, which starts right at the beginning, is information. We've been saying the word data, but we're actually talking about information, and quite critically for me there are questions of "for whom": who is the intended audience for this set of information? Catherine's told us about micro-targeting and the parties trying to get particular sets of data and information, but what is the intended aim of other sets of information that might be coming together? Imagine if you have corporate-controlled information filters and bubbles: you're going around consuming sets of information, and you think you may be giving access to various things, but there's another corporate group making a whole set of decisions over there about your content and moving that information around. Much less the things that I think Sandra was mentioning about algorithmic decision making that's proprietary: you'll never get access to it. If you're a CIC or a company, you're trying to get access to that information, but not necessarily in any way that lets you work with it; it's proprietary. So I think there's a really key place to start around data in the larger sense, and most definitely around what we might consider tech-driven information and its supposed
neutrality. Sandra has already pushed us to think about that from the development perspective, but I think we've got to really start to think about it, especially if we're thinking about something like inequalities. If we really want to understand inequality, then we need to work out how to do that; we need every set of tools to really get at it. And I'm going to agree with Catherine that data and various forms of information gathering could be a way forward. The claim that bias is simply everywhere... I appreciate it, but I try to live in a world that I want to imagine is full of justice. So I can't live in a world where I think bias is simply normative and therefore I should just sit comfortably with all of these biases around me. I want to work on adequately exposing them and ultimately reducing them, because that's the only way I can imagine a just future for myself and for all of you in this room. So there's this question about what tools we need to try to reduce inequality. Data could be one, but data and technology are not the answer, nor the solution, because for me something quite critical sits around all of this, and I have a feeling that Rob is probably going to pick up on some of it. We need critical analysis of all of these sets of information. We need adequate, really critical, reflexive tools that can debunk and deconstruct racialised notions of difference about people, hierarchies of being, and questions about power. Until we do that, we're just going to be replicating the system, no matter whether you come up with the most powerful analytics or the most tech-driven approach; it's still in the hands of decision makers who are just going to perpetuate the same boundaries that we've got set up. So I think it's really critical, if we're going to end up with justice, whether it's
digital justice or tech justice, that we think really critically about the decision making that we've got. Catherine mentioned, quite cogently I think, the parties, but there's a whole lot of decision making happening everywhere, from the very micro level in the local environment to the national and international level. If we turn all of our decision making over to an algorithm, or a set of corporations, or a set of people making lots of decisions about our lives, we run the risk of never knowing, and never actively engaging with, trying to transform the worlds around us. And that's a concern for me, a very deep concern. So I'm about the data, and I'm thinking critically about the inequality, but for me it's really about trying not to replicate the power dynamics in either of those places that essentially just keep us in the same space we've always been in.

Excellent, thank you very much. At precisely the point you mentioned you believe in justice, I drank your beer. Sorry. Hope it was good.

Well, don't do it again. Rob.

Great to be here. I'm going to echo some of the points that people have made already, but also throw in a voice of caution, hopefully not doom-mongering, but caution. I'm a bit of an interloper. In some ways I'm a really early adopter of new tech, and I get really excited about all the possibilities, possibly over-excited, so some of what I say may need to be taken with a pinch of salt. But I come to this after 25 years as an activist thinking about issues around racial justice, having qualified as a sociologist of education, having spent the last three years as an audience data specialist at the BBC, and really as someone who came to political consciousness in the 90s, a technocrat like all good New Labour types. So it sometimes pains me a little to talk down what data can do, and its position and role in creating social change. Let me start with the history of enumerating populations by
characteristics, by identifications that they may have or that may be ascribed to them. That history isn't a happy or pleasant one: think pink triangles, think yellow stars, think the request for "papers, please". The categorisations that are given are more often about the concerns of the powerful than about the drive towards justice, and they're inherently political. Think of the categorisations we grapple with in thinking about race and racial justice: "old and new Commonwealth migrants", remember that one? Or "BME", to lump all people of colour together, even though otherwise we would know much more about, say, second-generation Poles and how well they're doing in school. These categorisations are driven by policy and by political concerns, so when confronted with data it's always relevant to ask: why this data, and not something else being collected here? It's indicative of the degree to which politics pervades any discussion around data, particularly ethnic data, when you get campaigns of people driving to be included within a new set of categorisations, whether that's Cornish, whether that's Turkish groups, whether that's the Arab group being added to the census in 2011, not because of size but because of GCHQ; work out how and why that happens. But fundamentally, I think there's a basis on which data scientists somehow believe that evidence changes practice, and we have to question whether that is true or not: whether we are actually making evidence-based policy, or making politics-based policy and bringing in some evidence towards the end to justify it. My first foray into activism was a battle about data. I started at Oxford in October 1992; by January 1993 I had become engaged in a battle with the university about data. That's, I guess, the nature of being one of a few black students at Oxford, right? And
there's a table in the Oxford Gazette, table 9a if I remember rightly, that changed my life, terrible as that is to say. In it I could see that I was one of 16 black students in my year at that university. Last year it was four black British students, so actually we're going backwards rather than forwards with that group. But that table raised so many questions. White students had a one in three chance of being successful in admission, while black students and Bangladeshi students had a one in five chance. That year, 1993, and in almost every year since, there's been a battle with Oxford University about that data, so much so that I've got so bored of that conversation that I farmed it out to the very capable MP for Tottenham, who is doing very well with it currently. But it's a cycle: in 1998 the university would deny the truth of their own data and send us back for more; in 2008 they withdrew from publishing the data; in 2018 they're back to denying the data is true again. So we're in a kind of political round with data, and actually the data is not the important thing in that discussion any more, but rather a conception of what the good life is, what justice is, and so on. That pattern of disbelief, denial, and victim blaming is to be expected, because the data is released into a space which is racist: white supremacy will encourage and reinforce its pattern, and the way in which data is received, understood, and analysed in part reinforces some of those patterns of exclusion and inclusion. Finally, a point about how the data is never enough. I've been guilty in the past of confusing battles for transparency of data with battles for equality. They're not the same thing, and they often point in very wrong directions: trying to get more data, as if, if we just had the data, they'd just believe us and then things would change.
Actually, you know... So I completely applaud the work of the Race Disparity Unit at Number 10 currently; they've pulled together some fantastic data, some of the best work I've seen on mapping ethnic disparities. But what I don't know is what the relationship is between that collection and presentation of data and social change. I worry that waving "we've got the data" is used as a way of pretending that there's activity going on. So we should be careful not to think that producing data is doing the work. Similarly, if more data is published, and I hope that we get better and better data from AI and so on, there must be a demand for analysis. There's no benefit in establishing a Race Disparity Unit at Number 10 and then cutting the funding for university research, or cutting the funding for civil society to analyse that data, because no change will happen. So for me the question about data is not so much the quality of the data or how much data there is, but what that data can do to transform the relationships between people, because it's inserted into a political space where we have to think about the political change that we require, not just the scientific data collection. Thank you.

Thank you very much. You're all doing terribly well sticking to time; it's marvellous. Hetan, can you keep this clean round going for me?

Well, if I can bank everyone else's time, I think I'd like that. So, I'm from the Royal Statistical Society, and it's highly amusing that you've decided to spend your warm Wednesday evening in a room discussing data and inequality. Of course, if we'd given it its proper title, which is "statistics of inequality", none of you would have come. We as a society have been around for 184 years, and we're delighted that everyone else has sort of caught up, and that the flag of data is suddenly a kind of advertising agency for statistics. So welcome to my
world. I want to introduce you to... has anyone heard of the rock star statistician Hans Rosling, who died about a year ago? A few of you have. He had this word "factfulness", which I love, and the idea is that it would be a really good thing if we knew some facts. Because we all go around knowing the facts are out there, that we can access them at any time, but actually they're not in our brains right now. So what he does, and I'm not going to do it to you, is ask his audience: here are three choices, what's been happening to extreme poverty? What's been happening to this, that, and the other? And basically the audience gets it so badly wrong that chimpanzees just throwing darts would have been better at answering these questions. So we just don't actually have a good idea of the world that we're living in. I was looking, in advance of this talk, at some of the income inequality data, and, as I say, I'm not going to test you, but it's interesting. Many of us assume, I think, that UK income inequality has been getting worse and worse in recent years. Actually, it's been pretty static for the last 20 years. Now, that doesn't mean it's too high or too low; we can argue about that, but it's quite useful to have some facts. For the very top 1%, things are getting much worse, but for the other 99% it's been pretty static for the last 20 years. As I say, I'm not therefore saying what the political implications of this are, but it's quite useful to know. The other one: there's a fantastic website called Our World in Data now, which I really recommend to you. What they do is take the long-run statistical view really seriously and go back as far as the 1200s to ask what the very long run data trend is. I was looking at income inequality on their website, and, if I can find my notes, from about 1600 to 1900 the data suggests that in the UK 35 to 40% of income went to the top 5%, and
then it was in the 20th century that it really dropped, all the way to the 1970s, and then in the 70s and 80s it got massively worse again, and now it has sort of plateaued. This is interesting stuff, right? What we do with it, that's for us to argue about in terms of our politics, but it's quite useful to have some facts. So I really commend the Hans Rosling approach to factfulness, and I suggest you read his book. Having said that, while we have pretty good data in the UK for some things, there are areas where I think we're very weak. Wealth data is not very good: we've got reasonably good income data, but not very good wealth data. We could get better at tracking the very wealthy, so when we're measuring inequality we're not bad at the bottom end, but it's pretty hard to know what's going on in that top 1%. It would be very good to see more there. And our regional data is appalling. In a sense it's the EU referendum that has suddenly woken everyone in London up to the fact that there are these places outside of London, and that we really ought to know what's going on in them. We haven't got much further than that, but the theory is there, so let's work on that. So those are your traditional statsy type issues. I thought I might make a bit of a foray into Sandra's new tech territory as well. Bias is a really big issue. They're starting to use algorithms in the US to decide on your likelihood to reoffend if you're sentenced in the criminal justice system. Some brilliant journalists have shone a light on this and showed that even though the algorithm doesn't use race in that data set directly, it actually does through correlations. It is biased: if you're a black man it's much more likely to wrongly assume that you're going to reoffend than if you're white. So as algorithms start permeating through policy making, these risks become very
real and in the UK you might remember a few weeks ago uh there was evidence that a number of people were deported because they weren't deemed as speaking English language sufficiently well these were students uh but I think I mean I can't quite remember but the error rate on the algorithm was something like 20 so the estimation was that 7 000 people have been deported wrongly so error rates start it doesn't really matter if you're amazon and they send you the wrong good or they advertise the wrong book to you because who cares and the commercial word doesn't matter but once these things start creeping into public life it does matter that said you can turn these tools around to use them for usefulness as well so uh some people have recently designed a new HR algorithm which knows what human biases are you know you prefer the first and last candidates and you forget the ones in the middle uh all these sorts of things so they give you they don't tell you the name of the candidate so you can't kind of be biased against people because of their funny sounding name or because they're a man or a woman uh they randomize the order in which you you see the candidate question by question so you'll see questions six from candidate three and then uh question four from candidate seven and you can't be biased because you know you've got no idea who who or what is coming at you so these are ways of using uh some of these new data tools in a way to actually promote fairness but you've got to build it in and design it from the start as it were similarly satellite data you can use satellite data people are using it to spot sites for modern day slavery so there are opportunities but at the end of the day politics what what is it we care about that's what drives data which is I think the the thing that all the the panel have been saying I'll say one more thing which is that the the one thing that's worrying me at the moment is the the rise of private sector companies who are gaining a 
larger and larger share of data so historically what's happened is that the state has been the one that's amassed data through the census and those sorts of things and some of that has then been used for research purposes for public good aims to inform public policy where schools should be built etc etc now we're giving away more and more of our data to giant companies and again I don't care in the short run if if I give my data to amazon and they want to recommend a product to me that's not a big deal but where will we be in 30 or 40 years time there will be enormous amounts of personal data held by private sector companies probably three or four of them and we as the public will have no access to that academics won't necessarily have access to that unless the private sector decide that that's what they want to do but it's our data right so I would like us to sort of think about what where do we want to get to in 30 or 40 years time one idea that I floated is that in the way that intellectual property rights companies have a right to a new idea they've got a license over it for a short period of time but after that everybody has access so you know we've seen that with the ibuprofen neurofen used to have the patterns on it but now you can buy it for 16 pence in the supermarket because any company can produce it in the same way what if the the rights that companies had over our data were also time limited so they could hold it for five years or something like that enough time to sell us whatever the latest gadget is after which it would revert into some kind of public trust held by an arms length charitable corporation and then it could be used for purposes to actually say you know what's in the public good because there's all this amazing data we could do lots of good with it but as always the question is political as to who owns it thank you very much right they all kept abruptly to time so I am going to monopolise them for just a few minutes with my own thoughts 
and questions. Thank you. You have covered an enormous amount of ground there and raised so many issues; we're certainly not going to answer all those questions in the next hour, but it seems there are a few common threads emerging, one of which is bias. So I'd just like to unpack that a little, the idea of bias within the data and the AI. Where do you think we can address it? Because there's a whole process going on, isn't there: there's the collection of data, there's deciding what to collect, then there's the processing and the analysis of that data, and then there is, if you like, the application. There's who gets access to it, of course, but there's also how it then gets translated: does it get translated into a recommendation for sentencing, does it get translated into where economic growth happens? Can I ask you to unpack that a little bit for me, and maybe draw out the different points in the process where bias happens, and, if you feel so bold, suggest some things we can do about it? But even just a deeper understanding of, when we talk about "the data is biased", where does that come from and what do we mean? Who would like to jump in first? You're sitting nodding very wisely.

Well, clearly some very important points were made by Rob and Karen about what bias actually looks like, what we're talking about here. I'm engaged in politics in part of my life because I think politics is one way of exposing and tackling bias, but the political party I'm involved in also, for example, believes in campaigning for a less biased media, and in looking at education, at the very beginning of all of this, and so on. So on the one hand you're talking about the huge cultural issue of the cultural soup we're swimming in, who's determining it and how we change it; but you're also talking about things like, to Sandra's point, diversity not being a tick-box exercise. It really matters that the big tech companies are so unrepresentative, and that the technology being designed is being designed by such a tiny proportion of the population. If you look at the things people decide to design, you understand something about the lives they lead, because you see the things they think are incredibly important: a service that does your laundry for you, say, or the moonshot technologies that are supposed to solve great inequalities which they themselves are helping to create. So there is an awful lot of circular thinking there. In terms of those algorithms, they're separate things: there is the very human bias, which is there, and it's deep and it's cultural, and it requires that kind of rigorous analysis in order to understand it. When I was talking about data being important, I think data is vastly important, but I absolutely agree that how it is used and interpreted matters; it doesn't exist in a vacuum, and actually, to get people to act on it, it's then about the narrative you tell. But in terms of how we address the bias in algorithms, that starts with how we address who it is who's creating them and what it is they're drawing on, and that's where you get this problem where it exacerbates existing inequalities rather than solving them, so many times. The example you mentioned about prison sentences is a classic one: it's because the data they think are relevant to reoffending are data that they haven't even realised are racialised data, and so they create this outcome by feeding that data in. So you need to step way back from the process: why anybody thought that was a good idea, why anybody didn't notice it, and why, once people noticed it, it not only kept being used but is being used
increasingly widely, and to what effect.

This is actually a really interesting example, because, like probably all of you, I read about the algorithm, it's called COMPAS, designed by a company called Northpointe, and then I read the ProPublica coverage, which said: we have followed a number of the people, I think it was in Florida, who were given risk scores by this algorithm, and we followed their outcomes, and we can show you that, exactly as you said, if you are black you're much more likely to get a false positive, to be falsely flagged as high risk, and if you're white you're much more likely to get a false negative, to be falsely flagged as low risk. But then the company came back and said this is an unfair interpretation, and what they claimed, and I am told this by people who have gone into the statistics in more detail, is that this is because the underlying populations in fact have different rates of being rearrested, I think that's probably the fairest way to put it, and that therefore you are dealing with base populations that are dissimilar; and if you make the algorithm fair on the way in, so if you treat every individual the same on the way in, then it's unfair on the way out if you look at it by groups. And, sorry to monopolise slightly, I did a really interesting event at the Royal Society recently with Professor Cynthia Dwork, who is looking at precisely this as a technical, algorithmic issue, and she said the problem is that it's mathematically impossible to design an algorithm that is fair by every possible definition of fairness. So this idea that you can make an algorithm which somehow compensates for human bias, or for the unfairness of the world in which we're living, is fanciful, because the way you define the fairness you're looking for will define what you get out. But this, I think, throws up a really interesting question, because it basically throws the problem back into our laps: we have to define what fairness looks like, what equality looks like, in order to go back and design the algorithm. Robbie?

I think that was the point I was trying to make: we haven't given up on politics yet. These are tools which we might choose to use to inform the debate and discussion, but they don't replace the debate and discussion. And I have a real worry, as Karen described much more eloquently than I could, around our expectations of others: the suggestion that people are seven and a half per cent biased in this direction or 25 per cent biased in another direction, and that we can just correct for it by somehow fixing their mobile phone, feels to me not just fanciful but actually damaging to the kind of human empathy we need to create the kind of new politics and new society.

Yeah, I very much agree with you. I think it's two things. There is a belief that somehow data is neutral, that there is a good kind of data that is not biased in any way, and that's a fallacy: there is no such thing as neutral data. I was hosting a workshop a couple of months back, maybe 15 people in total, three of them women, and we were talking about biases in decision-making about who to hire. The demand was: we want to develop an algorithm that is very good and makes sure that we increase equality. How do you do that? Well, we get rid of all the data that is not neutral. And how do you do that? Well, we're not using names any more, we're not using pictures, we're just using neutral data. And I was like, what's that? Well, we're going to use the employment history and the salaries and so on. Like, what? And that's the problem. I think this is where bias also comes from, from the people, because apparently men just don't think about the problems
that other members of the community have, and you're just going to take that "neutral" thing like income or employment history to make those decisions, right? And the other thing is, even if you have those data, and even if you know that those inequalities happen, OK, we know that the world is sexist, we know it's racist; the next step is to do something about that, right? It's not just "I'm going to tweak the algorithm, make the algorithm less racist, or make the data look less racist". Actually you need to work on us being less racist, making less racist decisions and offering equal opportunity to those people. That's the real issue. Data might help you to figure out what a problem is, but if you want to change it, that's human action, not robot action.

Exactly, I'd agree with that. I'd also stress that there's a whole raft of people who've been writing on this issue. Safiya Noble is one, who's written a book called Algorithms of Oppression, and a lot of that is asking: what are the schemas, what is the corpus, what's the kind of knowledge being generated to populate these algorithms? It's not as if the algorithms dropped from the algorithm tree and then just sort of did stuff; somebody made them know stuff to be able to do stuff. So that knowing is critical, to try to figure out what the basis of all of that is, and I think people have to take ownership and responsibility for the things they're coding, the words they're creating, and the worlds they're imagining and producing, at all stages. It's not necessarily just the tech aspects of it; it's also the presumptions. I might walk into certain buildings in my university and I'm assumed to be the cleaner. No, I'm being very serious: that is the limitation of the world-building that some people can do. And, going back to Rob's point, in some cases it actually doesn't matter how much data you create. There are certain things in certain spaces, and we've been able to borrow this from people who have been working on structural racism and racist thinking, and from people who've been investigating climate change: there is a very large, pervasive aspect of ignorance, and it's actually understood now as a theory of resistance to data. We make the presumption that you just fill up the data tree with more information, put it out there, and everyone just goes, "Aha, I will now stop being a horrible, bigoted, racist, nasty person, because you've now told me, with all of this data you've assembled, and I'm good to go." Actually the resistance to that is really quite strong. So I think we've got to look quite critically, while we're also talking about these algorithms, at what we're talking about around inequalities: we're often talking about sustained, structural, systemic, long-term problematics. Different bodies might move in there, but as a process we're talking about stuff that's just not shifting, and not because people are bored or tired and just need more data. We've got to be much more creative to really try to analyse and understand that, and not just think that if we build a better tech ship we will therefore make the world equitable. It's a ludicrous concept, and it puts a lot of emphasis on the data to magically fix stuff, and it also presumes, to Sandra's point, that data is neutral.

Right, before I come out to the audience: in your day job you're trying to get the data, or statistics as we secretly call it when we're meeting alone in the Royal Statistical Society, and actually use it to do things, to shift policy, to make the world a better place. And this is the historical mission of the Royal Statistical
Society: not just to get better stats, but to use them to make the world a better place. Can you give us a few pointers about that before I throw it open to the audience? Because there's a nice warm debate going.

Let me just respond a little to what's been said. I mean, I agree with everything that's been said. Having said that, I want to make a comment.

I am trying to foster some disagreement.

I know you do, exactly. Partly that, but also I want to make some distinctions, because we all want to make the world a better place, but earlier we were saying all data is biased, and I'd like to unpack the ways in which it can be biased, because there are different ways. Also, the fact that data in and of itself doesn't change the world doesn't therefore mean that good data is not a useful thing. Good data is the bedrock of democracy, right? Because if we can agree on facts, which we then have a disagreement about, about what we therefore do about them, that is a foundation for a useful conversation, as it were. That's why statisticians, I think, play a really important role in saying: here are some facts, now you go away and beat the crap out of each other over what we should do about them, but let's at least try and agree some facts. And one of the worries at the moment, in the so-called post-truth world, is that we're not even able to agree on what society we're inhabiting, and that feels quite concerning to me.

On the issue of bias: yes, all data is biased, but there are different ways of being biased. There are things called official statistics, which are produced by the Office for National Statistics and other statisticians in government, and these are quality-assured. Now, these are biased in certain ways, because the government decides what to measure and what not to measure. Will Moy, who runs Full Fact, the fact-checking charity, yesterday gave a really interesting example. The Office for National Statistics brings out its baby names list every year, and this is the ONS at its most frivolous; it gets a nice media hit in August, you know, what are the top 20 baby names? This feels to them like it's not political, but of course one of the things they do is that they don't count the name Mohammed according to how it sounds: they distinguish the different ways it's spelt. As a result, Mohammed is not listed as one of the top baby names in the country, whereas if they went by how the name sounds rather than how it is spelt, because it's spelt in many different ways, it may well be the top baby name, or one of the top names. So what gets counted counts: even in official statistics, these political choices impinge all the time. That said, this data does tell you something; we've always got to question it, but there is quality in the survey design, there is representativeness, and so on. In the new world of data that we're in, where there's digital data floating around all over the place, people have started, for example, doing polling on Twitter: can we see what's going to happen in the election from what's going on on Twitter? No is the answer, right? Because no matter how good your techniques are, Twitter is a very small sample and a very biased sample. So there are biases and biases, as it were, and I think it's worth distinguishing between methodological biases and the kind of biases that are in any dataset, which reflect the world that we're in.

We want to hear what you have to say. Could we have a little bit more light on the audience, so we can check they're still listening, not just checking the football scores on their phones? Excellent, good, welcome, we can see your faces now. We have a couple of roving microphones, one over there and one over there, so stick your hands up. Oh, look at you all go. Excellent. Right, so I'm going to ask
you to be concise. Would you like to start with that microphone there, just because you can see that person, and over there, is there a hand on that side? I'm afraid you'll have to go back up to the top at the back there. Sorry, this is where the volunteers get incredibly fit by running around. Do you have the microphone? If so, start speaking now.

Hi, thank you for your talks, it's been interesting. My question is: how do you think we can, or should, use data to advance, and not set back, equality within society?

Good, excellent, a large question, but one with plenty of answers. I'm really worried about you now, are you OK? So if you'd like to get that microphone back, and if you'd like to give that one to that person there in the blue for next. OK, you at the back.

Hello. The government recently launched the Ethnicity Facts and Figures website. For me that was really interesting, because it had something like 150 different topics on how your ethnicity can affect your experience of public services, healthcare, education, and so on. If you were on that team, what would you do with that data to make it a powerful tool to improve society?

Marvellous, a lovely specific question, I like those. And who has the microphone there? Blue top, yes.

Hi. So it's come up that sometimes, if you're starting with the data, it is always reflecting these problematic biases in society, the criminal justice system data for instance. I don't think it's necessarily impossible to mitigate the biases: I've read an interesting paper by Kristian Lum in the US, which was about remodelling, and it actually seems to have come up with a way of potentially doing it. So I think it's a field where you have to keep watching and see if it comes up with that type of solution. But given that type of problem, do we need to move away from talking about data and just looking at what happens in real life? Do we actually have to shift to what an ideal world would look like, and actually model that? Do we need to talk more about modelling and less about data?

Whoa, OK, big question there: modelling, data, real life, how we want the world to be. OK, if you'd like to give the microphone to... oh, you see, they're coming up with this. OK, man with beard, and then person hiding on the end there, and then I'll come back to the panel, and then I'll come out again.

Hi. I would be curious for some observations from the panel: in a post-truth world, where our leaders make it up and decide what they want to believe in, how can data enable rebellion at grassroots level?

Rebellion-driven data? Sorry, data-driven rebellion. Excellent, good, see, there's a man with a beard for you. And there's, yes, the person hiding, no, on this side, yes, there you go, that little hand there. I'm not impugning the size of your hand, it's just that you're appearing from behind somebody else.

Yes, this is a comment, to the gentleman from the Royal Statistical Society. I remember a lecturer in my first degree saying that there is no such thing as a fact, that everything is open to anybody's interpretation. I would also like to take up an issue with the Doctor, is it Salt, or Solt? I don't know whether I misheard or something, but data is not information. You can have pages of tables of statistical data, but it only becomes information once it has been analysed, and that depends on whether the analysis has been objective and honest, which comes back to the issue of bias in data. And when it comes to considering whether data is biased or not, you have to look at who has actually paid for that data to be collected; and if it's been collected in a biased way, i.e. if the survey, or the questions in the survey, have been slanted in a particular way, you will get bias, and therefore everything thereafter becomes biased, so the results themselves are, in actual fact, biased
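[The point just made, that a slanted collection slants everything downstream, can be sketched in a few lines of code. The population, the opinion split, and the response-rate model below are entirely invented for illustration; nothing here comes from the debate itself.]

```python
# A small simulation of collection bias: the same population, measured two
# ways. All numbers are made up purely to illustrate the mechanism.
import random

random.seed(1)

# A population of 100,000 people; exactly half hold opinion A (coded 1).
population = [1] * 50_000 + [0] * 50_000
random.shuffle(population)

# Unbiased collection: a simple random sample of 1,000 people.
srs = random.sample(population, 1_000)
est_srs = sum(srs) / len(srs)
print(f"random sample estimate:   {est_srs:.2f}")  # close to the true 0.50

# Slanted collection: opinion-A holders are three times as likely to respond
# (a stand-in for, say, polling only the users of one platform).
respondents = [p for p in population if random.random() < (0.3 if p else 0.1)]
biased = respondents[:1_000]
est_biased = sum(biased) / len(biased)
print(f"self-selected estimate:   {est_biased:.2f}")  # drifts towards 0.75
```

Every calculation done on the second sample afterwards inherits that slant, which is exactly the questioner's point: the bias enters at collection time, before any analysis happens.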
Thank you. Excellent. Normally I discourage making two points at once, but your first one was very concise, so you got away with it. OK, so, panel, obviously don't feel you all have to answer everything, because you saw how many hands there were, but let me do a quick recap and then come back on whatever you feel like coming back on. There was a very big open question: how do we use data to make the world better? And a very specific one about the government's ethnicity unit and what they should be measuring. There was the really huge question about things like criminal justice system data: is it completely impossible to mitigate bias? Possibly not, but should we instead stop looking at the real world and start building models of how we want the world to be? I'm paraphrasing really badly, sorry about that. How do we use data for rebellion? Typical man with a beard; always trust a man with a beard if you're starting a rebellion. And then the comments: no such thing as a fact, data is only information when it's interpreted, and who's paid for the collection and the interpretation. So who'd like to pitch in on what first? Go on, Rob, off you go.

I'll take the argument with the beard, rebellion, but also respond to the points about how we should use data, and the race disparity unit. It's always really difficult to plot where policy decisions come from and where the practice emerges from, so I'm just going to claim it: it feels like the race disparity unit is following on from some work that Ronnie Mead started about a decade ago, thinking about how you get citizens to use data to hold their elected officials to account. We based that on a piece of work done in Chicago, in the States, where community organisers had given grades to congressmen based on the decisions they had made, and they would have members of the constituency turn up and say: you've got a D this year, explain yourself; you've got an A this year, explain yourself. Because it was about using data to create a dialogue. There's this great free-floating website, but where it goes next is the challenge. How you connect people to that data, and connect a dialogue and the politics back into something which is perceived to be scientific and distant, to me is key, both in starting rebellions and in how to use data.

OK, very good. You're itching to get in as well, go for it.

Yeah, well, I'll take on the one about information. For me, part of the critical understanding of thinking about some of this is that if I'm looking at information, let's say an actuarial sheet of enslaved persons that somebody had owned, I'm actually looking at a societal moment where an individual has bought and sold someone, and has maybe transported them, or passed that "property" on to somebody else. So for me that is information, the same as if I'm looking at how many babies were born in the country in a certain period of time, the same as if I'm looking at the 2:2 degree classifications for students who identify as African or African-Caribbean. What you do with that is a whole separate set of circumstances, but the one point I'm trying to make about calling it information quite specifically, and this is not to pick an argument with Hetan, is that, working on quite specific issues around racial equality, I need to make sure that data is structurally positioned, and to recognise that it's often talking about people. One of the problematics that tends to happen, I find, is that people neutralise the space and then it becomes just stuff, where I really try to put the politics back into it and say: this is actually information that is capturing a moment, and it is not neutral. It is, as Rob was saying, very politicised. So that's probably my one quarrel, not with the facts part, but with the sort of softening that I think you were applying to data. Because who gets counted as human, for the research I do, is quite significant. There are actual rolls of data and information that governments and empires have kept where certain sets of people are not present: they weren't counted. That did not mean they did not exist as people, or that they weren't born, or that they did not die; it's that the data doesn't capture them as being human. And for me it is so critical to reflect on that, and not to put back into the space of data this whole sense of neutrality, or even the quality aspect; I understand why we need to do it, but we have to think, because these histories are still alive. They have currency. You wouldn't have something like Windrush erupting as a scandal right now if these things weren't still present. It's not just that we're talking about 100 or 200 years ago, or colonies, or places far afield; it actually has a present, right now. And then, when people destroy records, ha, this now becomes an issue around data, because you have to evidence that you actually exist and are known, so suddenly the data has to take on an even more significant role, to make you present in a space that is actually denying your presence. These are the ways I'm trying to understand this issue about information, and really wrestle with it, whether it's the analysis part, the assembly part, or just the gathering part.

This is the great paradox of data, isn't it? On the one hand it has, as Rob said, this kind of scientific aura that makes it seem superhuman and unbiased and unchallengeable; and yet we do need it in order for
evidence to make arguments; and yet it also has that distancing thing that makes it hard for us to feel that we can control it.

And I should say, I work in data, I'm a researcher, and I do consultancy in advocacy and social justice community work, so I'm not anti-information; I'm just having to constantly grapple with it. I recognise my positionality, and the trust I have to have to work in that space, and to do it in a way that my ethics maintains itself as I'm interacting in that space and imagining policies for the future.

Yeah, I think I'm going to tackle one of the first questions: what should we actually do with the data, what's the end goal here? And I'm very much on the fence on that. I think a lot of people would say that if we have really good data, unbiased data, that could help us to make better decisions, right? Decision-making: let's get rid of that human factor that is so awful, that makes all those awful decisions. And people say, quite rightly, that algorithms are better than human decision-makers because you cannot bribe them, they're more consistent in what they do, they're not moody, whereas humans tend to be moody; they're also faster and more efficient. So there is a lot of promise in using algorithms for certain kinds of decision-making. But the very, very important question is: what does a good decision actually look like? And again, that's not a tech problem, and I think we need to have a very honest discussion about that, because at the moment the assumption is: we have that kind of data, we run an algorithm over it, and it's going to make the best decision possible. Actually, how machine learning works is that it learns from the past and predicts the future, and that's only correct, you only make good future decisions, if the future looks like the past, right?
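[That claim, that a model trained on yesterday's data quietly assumes tomorrow will resemble yesterday, can be illustrated with a tiny simulation. The "world", its single feature, and the threshold rule below are all invented for illustration; this is a sketch of the general point, not of any system mentioned on the panel.]

```python
# A minimal sketch of learning from the past: a rule fitted to historical
# data keeps working only while the world stays the same.
import random

random.seed(0)

def world(sign, n=5000):
    # One feature per person; the outcome follows the feature with the given
    # sign, plus a little noise. sign=+1 is "the past"; sign=-1 is a changed
    # world where the old relationship has flipped.
    data = []
    for _ in range(n):
        x = random.gauss(0, 1)
        y = 1 if sign * x + random.gauss(0, 0.3) > 0 else 0
        data.append((x, y))
    return data

# "Training": learn the direction of the relationship from historical data.
past = world(+1)
agree = sum(1 for x, y in past if (x > 0) == (y == 1))
learned_sign = 1 if agree > len(past) / 2 else -1

def predict(x):
    return 1 if learned_sign * x > 0 else 0

def accuracy(data):
    return sum(1 for x, y in data if predict(x) == y) / len(data)

acc_same = accuracy(world(+1))    # future resembles the past: high accuracy
acc_changed = accuracy(world(-1)) # relationship flips: the rule fails badly
print(f"future like the past:   {acc_same:.2f}")
print(f"future unlike the past: {acc_changed:.2f}")
```

The learned rule is exactly as good as the assumption that the future stays like the past; the moment the underlying relationship changes, the same model that looked excellent becomes confidently wrong.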
this is only a jump in on well finished but I want to come back on algorithms of moody because I think that it can be wait till summer's finished so the idea is like even though this is a very powerful tool we have to be very conscious of the limitations of machine learning in general because it only works if if the future looks like the past which is not a very good thing right so um I think we have to be very conscious to think about the where the opportunities and where the actual risk aren't there certain areas where would be very cautious to use algorithm decision making and criminal justice definitely one of them um for algorithmic sentencing which is a lot of mentioned also predictive policing right um not the least because even if algorithms are more consistent they are also very opaque and very hard to understand and in those concentrations you want to know why an algorithm assigns a certain risk score to somebody and there is a lot of tension to break up that black box and usually it's two arguments right it's either I don't I cannot tell you why the algorithm did that because the technology is so complex and um I just don't understand and all the argument argument is I don't want to tell you because it's straight secrets and such property rights and I think both those answers are not good enough if algorithms are making very important decisions about us so we have to be very conscious about what where we deploy it where it's helpful and think about the limitations as well so tell us about moody algorithms you want you wanted an argument didn't you on this so here's a stage in argument but algorithms can be moody so here's an example where Lufthansa became the dominant airline in its country and the algorithm was setting the price and it suddenly realised that it could start increasing the price really quite ridiculously and so the price of the seats kind of went totally sky high and the regulator had to step in and sort of said to Lufthansa what are you 
doing? You're behaving like a dominant company. And the answer was, well, it wasn't us, it was our algorithm, you know, like a moody teenager in the corner. So it plays to your point that if circumstances change, an algorithm can behave differently; in that sense they're not always consistent, they can behave slightly erratically. I'd also query the question of transparency. The question is, what are we comparing the algorithm to? Often it's a human decision maker who is also not very transparent. So I think one of the really exciting things about this conversation is that it's getting us to focus back on questions we should always have asked, such as: how do we make human decision making more transparent? Are our judges more lenient after lunch? There's some data which suggests that, though it's been disputed, et cetera. In a way, algorithms are biased, but humans are biased too, and I think that's actually helpful: this has blown open a conversation about bias and inequality in humans which I think has been quite helpful. Just to come back on some of the questions, if that's okay.

Can I jump in on that one? Because you did this with me, I'm going to do this.

Well, can I answer the questions first?

Okay, you answer the questions first and then you jump back in; you will get a chance eventually, Catherine. So, what can we do about post-truth? How can we be rebellious in a post-truth world? I mean, it's hard, but I would say one thing is to invest in good quality information. So, you know, I took out a subscription to the New York Times. I don't really read it, but it's an act of rebellion for me, solidarity with the States, paying for good quality, because journalism is basically dying. And you can crowdfund fact checkers. I'm not saying these things are perfect, they're not going to upturn the information system, but they are an important bedrock, as it were. So I think that's important.
And the final point was about there being no such thing as a fact. I disagree. I think that's dangerous territory, where you say everything is open to discussion. Well, at some level, yes, but that doesn't mean it's all unbelievable, and I think that's where we get into trouble: everything's post-truth, we can't trust anything, anything goes. I just think that's a dangerous way to go.

What did you want to come back on?

Just that this is something I always hear when algorithms are making decisions. People say, well, you don't understand how humans make decisions either, so why do you expect that from an algorithm? And I always say: I don't think human decision making is the gold standard we should orient ourselves by. If anything, I hope that we're developing technologies to be better than where we are right now, that we're striving towards something better, rather than saying, oh, the algorithm is just as bad as a human, so it's fine.

I'm not saying it's fine, but I think you do need a benchmark, a standard you're holding it to. There's a sense in which at the moment people criticise algorithms for not being 100% fair, and of course, as we know, fairness can be judged through many lenses, and nothing can be 100% fair across every lens, as it were. So that's politics again.

Yeah, but it could actually exacerbate inequality, because all of a sudden you have certain algorithms being developed by a couple of companies, and that just scales up the inequality, whereas if you have just one human decision maker in a certain area, you can minimise the damage. It's scaling up, in a way. So it's not just that it's not as good as humans; it might be even worse.

Okay, scaling up bias, that's an interesting new idea. Catherine?

I think this is all very useful, but the one thing our conversation isn't really doing is taking into account where we are now in terms of global economic and
technological development. To the point about how we use data: it's also about how data uses us. Everything we're talking about is about power imbalances that exist in the analogue world and refract into the digital world. As I said at the beginning, I think it's incredibly important to engage with these things, and that we are having these discussions, but we're also already very far along in terms of the way the world is changing: every single one of you has created a data trail just by being here, and the way that data is being collected and used then continues to shape the way the world develops going forward.

On the small point, or not so small point, about modelling: I think modelling is susceptible to all the same things that other uses of data are, but I also think it's impossible to get to the future you want without in some way modelling it. That's also what politics is. Politics is about articulating where you want to be and then showing how you get there; data is about understanding the mechanisms that are stopping you from getting there and finding ways to dismantle and reconfigure them. So that, to me, is what we need to be doing. But also, Heaton said very early on something about how we might think about owning and holding data, and he talked about having this sort of public, arm's-length entity. But you know what's going on in China, where basically all data belongs to the government. So these models are also part of the societies they're in, and when I say we need to engage with this, I think we very urgently need to engage with it on every single level possible, because the country we're in, the political system we're in, will also determine what happens.
Sometimes these things are global; sometimes you have something like GDPR come along and you get 7,000 emails you didn't want, but that's an attempt at something that isn't based in one country. Whatever happens, the society we're in will determine how our data is used and how our data uses us, and that's why I would entirely agree with the notion that we have to engage with these things very directly: through politics, through advocacy, through understanding, through knowing what's going on. And again, that's where data, in the form of the writing that's being done on this subject, is incredibly important.

Excellent. On that note, back out to the audience. Look at all these hands! Please keep your contributions concise and I'll try to get through everybody. Let's start at the front this time. We've got a hand, and, just to keep you microphone runners fit, there's a hand right here on the second row and one on the front row, and then we'll work our way backwards.

Hi. Just building on Catherine's point about the global nature of this market, my question, I suppose, is how do we balance the demand for growth and ethics, and within that, our desire for equality? I had to take a very sharp intake of breath this morning as somebody on a panel suggested that our liberal ideals are constraining our growth, whilst China, who, as Catherine points out, are using social scoring to try to balance the books across society, have set a ten-year strategy to be dominant in a digital world, and have a completely different social and cultural structure within which that digital design fits. So they are going to dominate; if we're not careful their technology will dominate our world, and we are constrained, as this chap was saying this morning, by our liberal ideals.

Very good point. So, are we in danger of holding back our development of these technologies by being concerned with ethics, and then at risk of just getting
overtaken by China, who are not so constrained? Very good point. Okay, off you go, and let's see more hands; then we'll get the microphone around.

Yeah, I've been interpreting the data debates as people complaining about my work.

What is your work, sir?

I'm a data scientist, developer type of person, and I write about technology, and people have been complaining about the stuff I've written for years and years and years. I don't actually very often hear anything remotely constructive at all. I've worked on projects which went wrong: one particular project lost billions and billions of pounds, and people were sad about that; and I've done some databases for the government, and some people got sad about that. When I've seen projects go bad, I see basically what's on the platform in front of me: I see people complaining, I see powerful users, loud voices, who want their system and want it to do a certain thing. One data science project I was on, they wanted it written in Java. They didn't actually know what Java was, but Java was cool, so it had to be written in Java. I hear these complaints, but very rarely do I hear anything which tells me, as a developer, what I should write and how I should write it. So, do you know of any constructive ideas, and if so, where do they come from? Because what I'm hearing currently is complaints. And the other thing that kills large projects, because I've done government projects, is who is on the management committee. The other thing I've heard this evening is that you lot want to be on the governing committee of the project. In commerce it's called "the business": we want business people on the committee of this large doomed project. In the civil service it's referred to as "the leadership team", which probably tells you something very important about government versus commerce. And they want to run it, and they want their voices, they want friends, they want business people like themselves, they
want civil servants like themselves. What they don't actually want to do is do it. Because I hear all these complaints, and I very rarely hear them from people who could themselves actually write the code.

Thank you. So, you've got the microphone there, and I don't know if you can get yourself up to the... oh, there's a hand up there, good, you go for that one. Yes, off you go.

So, most of you have mentioned that having more data is not a cure-all, but if some of the most expansive and extensive data sets being collected by the likes of Google or Facebook were made publicly available, on the basis that it is our information, and perhaps their algorithms as well were released after a period of time, how helpful would that be for your work, and how beneficial for society do you see it being?

There you are, you see, that was constructive: how could we use these data if they were suddenly made available? So, if you'd like to go over there, and meanwhile... oh, is there not someone at the back with a hand up? Oh no, that's somebody different. You keep putting your hands up and down, it's like whack-a-mole. Somebody put their hand up quick. Good, there, off you go.

Okay, so I want to speak because I'm an analyst in a team that is meant to provide evidence to inform decision making, and on the initial point, I can't remember who said it, sorry, about critical analysis: as an analyst I make a lot of decisions about how I'm going to analyse data, and I feel it's really important that I document how I'm going about it so that someone can challenge me on it. It's not always easy. You can write a nice methodological report and you can give your finding, though it's obviously not very easy to give the short message of your finding with all the caveats that go with it. But on the other data scientist's comment there, I feel it's actually incumbent on us, as the people doing the data science or the analysis, to be explicit about what we're doing
and the consequences of that: whether we're doing a piece of research, or monitoring, or evaluation, and answering a question that someone asked us to answer. I think until analysts own it and are clear about how they're doing it, it's just a black box.

Excellent point. That was a good point, not a question, actually, but that's fine, we don't mind; points are fine, just try to keep them concise and to the point. That was very good and very welcome, thank you. So, somebody here has the microphone, and then we'll go there, and then I will come back to the panel before they forget their own names.

So I guess, almost leading on from a couple of the comments already made: firstly, the idea that if we go for ethics and everything, maybe we're restricting growth. I think that's a very important point: how do we actually convince people that ethics and equality are beneficial for society? And then, on the point this lady just made: even if we have all of these tools for detecting bias, all of this data to make sure that we can have a more equal world, and everything is available for us to take it upon ourselves to make sure that we're not developing black-box-type algorithms and so on, I'm not 100 per cent convinced that everyone would actually do that. I still feel there are a lot of people in a lot of companies that don't have the proper incentive, i.e. money; the company in general is looking at making profit and making their shareholders happy, and that does not always equal what is most beneficial for society. So I guess my question is more around: okay, given we have all these tools, what are the incentives, or how can we create those incentives?

Okay, very good, an overarching question. So, we'll go to that person there, and then we'll come back to the panel, but I will try to come back
again to get everyone else in.

So actually, it's interesting, because I was going to make a comment, or mostly a provocative statement, along the lines of what the gentleman in front of me and the lady just beforehand said, which is that justice and diversity are not profitable. I think in a way that's the big issue, because Facebook, Google and all these big players own most of the data. There's also this statistic, which I might get wrong, so forgive me, that most of the wealth is basically owned by one per cent of the global population, or something like that. These people have the data, these people have the skills, and they have absolutely no reason to give them up just because we wave the injustice flag and tell them, "Oh, we want more justice." They are the ones holding the power. So I think there's something we need to do to incentivise those people who have the data, who have the knowledge, to actually do something in the right direction, or a more ethical direction. Thank you very much.

Okay, so, panel, we've had a wonderfully converging discussion here, including discussion between members of the audience, which I count as a sign of success, personally. So, we have this thorny question of whether we are in danger of holding growth back through ethics, and some discussion provoked by that. Over here, the plea for some constructive suggestions instead of just moaning. The question of, if we've got all this data and the algorithms from the big companies, how could we use them to help? Is ethics restricting growth? Do we need to convince people that ethics and equality are beneficial, and if we do convince them, how do we incentivise people to actually be ethical? Is the problem that justice and diversity aren't profitable, and that it's a question of power? And, oh yes, is it incumbent on data analysts to be explicit about their decisions, so that we can go back and hold them to
account? So, lots of clashing ideas here. Okay, Keith, don't feel you have to answer all of them.

No, it's fine, I've kind of got one answer to the whole lot, in a way, and then we can go to the pub. I suppose what I hope is that we can use data ethics as a competitive advantage. In a way, GDPR is a really good example of that: America and other places were going, "Oh, this privacy stuff's really annoying", and then post Cambridge Analytica suddenly they went, "Well, actually, we might want to model some of ourselves around that." And I think there's quite a lot of ethics work going on in the UK. There's a new Ada Lovelace Institute being set up, which we've been part of; the government itself is setting up a centre for data ethics; there are new ethics codes. And this is my constructive advice: there are some useful things here, because our community of statisticians are asking, "What do I do on Monday?", and that's what we're trying to help these organisations build up to. So I think there are some incentives for ethics. Google staff recently said they don't want to work on arms projects; we're in an interesting space where there's such a tight skilled workforce that the workforce have power, so actually, if we can help our data scientists see this as something they should care about, they may be the route to some change. I'm not being naive here, but I'm always looking for where the levers are. Markets are less of a good lever: post Cambridge Analytica, Facebook's share price went straight back up, so we as consumers seem quite happy to keep plugging away on Facebook. But ultimately, when the ethics incentives fail, governments are still there, and they're not powerless.

Okay.

Yes, just picking a few of those up. I very much agree with everything that you said. I'm going to start with the data richness point, the question of whether we actually need various kinds of sensitive data in
order to make better decisions. I'm a lawyer, so I'm cautious about that, a data protection lawyer in fact, though I'll admit I've been working with tech people a lot and I've almost been converted. In data protection law you usually say don't collect data that is sensitive, but data scientists tell me, "I need to know that information in order to be able to make the right decisions." Even if that's true, I think there's still a problem there: what you're actually asking of people is, "Give me all the detailed, intimate information that you have, everything that makes you vulnerable, and I'm going to protect you. So tell me everything about your sexual preferences, tell me about your race, your religious and political opinions, so I can make sure you're not discriminated against." That's, wow. Just to put it a bit more in context, Facebook did this recently in a very funny way, and I think it's a very good example. I don't know if people remember, but Facebook was trying to combat revenge porn on the internet, and what they suggested was, "Send us all your nude pictures; then once somebody posts something about you, we can take it down immediately." That was their suggestion, and it's ridiculous, but it's in the same realm: "Give me something that could potentially hurt you and I'm going to protect you." I think it's very important to keep that in mind, because even though we probably need some kinds of crucial, sensitive information to protect people, we need to think a step further: who holds that data, who protects the data, who gets access to that data, how is it being used? Then we can have the discussion around that, because just collecting sensitive information about everybody is very dangerous territory, and historically speaking, as you mentioned, that's very much been a problem.

The other thing, and I think I'll take those two
together. I'm very sympathetic, because I used to be a lawyer who just worked with lawyers, and now that I've started working with tech people I get the frustration that lawyers especially complain all the time and tell you, "You can't do that, you can't do that, do it better", but don't come up with any sensible solutions. This was actually something I tried to get across in my opening remarks: this is why you actually need different disciplines working together, so you can actually give people guidance. I think the last thing we need is yet another set of principles that don't give you any guidance: be fair, be good, be ethical, be just. Well, okay, how do you do that? There is no consensus on those things, so we actually need to break it down for specific applications, for specific sectors, and give guidance: this is what you can do, and this is what you can't do. But for that you need people from different disciplines working on it together and finding consensus.

And the last thing, which is also very important: the idea of ethics as a hindrance, or law as a hindrance, actually, that it's going to destroy the economy. I think that's a very dangerous discussion as well, because if we go down that road we're just going to have a race to the bottom; we're going to compete over things we shouldn't be competing about. The more countries or nation states come together and say there are certain uses we're not going to allow, we're not going to use data and algorithms for that, the more of a step forward that is towards making sure we don't abuse those technologies. But saying that ethics is a hindrance, I think that's very problematic. I very much agree with you that it's actually a competitive advantage, because I'd rather go to a company that is ethical. And then there's the law again. There is the law, if you don't want to rely on the
ethical conscience of a company, and this is why we have regulation where we need it. We have, for example, anti-discrimination law: we don't just trust people not to be racist, we have laws that force them. And sometimes we need to have a discussion about that too: do we need laws that force you to do the right thing if you don't do it on your own?

Can I? I mean, I agree with everything you said, and the Facebook example you gave is such a perfect example of what happens when you have too few kinds of people in a room thinking about something, because you damn well know that was a decision made by white men in a room. You absolutely know that. And to your point about complaints, I have to say I think yours is the only complaint I've heard this evening; I think everyone else has been talking about big issues and how to approach them. But I like the fact that you keep coming back to these debates even though you think we're all whingeing. I didn't think this was a job interview and that we were all trying to get on the boards of these things, but I think this panel would actually look like the beginnings of a good board, for exactly the reason Sandra just talked about: you need different perspectives, different disciplines. Yes, you need coders. I actually started the other day to try to learn coding, just because I keep having these conversations and, you know, you don't know how to do it. But I'm eventually going to know how to do it. I'll still be shit at it, but I'll know how to do it. The point about how we get to good solutions is precisely to bring in these different perspectives, and yes, we have to come up with intensely practical solutions, and again, that's what I think these discussions are the beginning of.

On the question about big data sets: they
might be immensely useful. Health is one of the most obvious applications, where big data sets might actually help find ways to tackle diseases more efficiently, and people looking for efficiencies in the healthcare system could do wonderful things that way. But used wrongly, they could also end up making whole groups of people completely uninsurable. So it goes back to that point: it's not the data, it's how the data is used, as well as how secure the data is, along with all the personal details that we know can sometimes be reverse engineered even when people say they can't be.

And on those two contributions, about critical analysis and from the data specialist or technician: I don't know how much you know about any of us in terms of the projects we might be on or the people we might be involved with, but I would encourage you to get involved with Digital Catapult if you're not already, because that is a place where you are seeing not just digital specialists but SMEs, companies who are driving an incredible amount of innovation, and governmental bodies struggling to figure out what to do, including at the local authority level, who are building relationships with different sets of people because they don't necessarily have the specialists in house, and then have third-party providers holding on to what might be local authority information. So this is actually going to need all hands on deck, but more importantly it's going to need everybody owning their positions, coming in and recognising that this is not a you-versus-them, them-versus-us kind of thing, because my whole point about information and critical analysis is that all of it plays a part, every single part. It's not just that tech is therefore bad and evil, and off you go, tech people, and try to make tech better;
it's the fact that it's serving a role within a whole system. And this is why I want to get us back to the fact that this is not a debate about technology; this is a debate about data and inequality. So we need to focus on this question of inequalities and really try to grapple with what we are actually talking about there. Are we throwing data at it as a solution? I think that was summed up very clearly earlier. Are we actually trying to understand that it's social mobilisation, all of you collectively coming together to try to come to the decisions about it? Is it a combination of both? Do you need to go back and mobilise? And we haven't mentioned this tonight: we've talked quite freely about the technology, but we also have a significant number of people left behind by digital by default, who may have very little access to tech devices or information, who are not linked up, but whose information is. And not only is their information linked up, but authority and power now require them to have a digital file: you can't now get a passport without certain sets of information. So we've created certain kinds of citizenship processes, and we're not just talking about inequalities like "Oh, you don't have so much wealth, so let's do this other thing"; we actually have processes now that we can look around at. All of you know you can still go to places and have no mobile connectivity, and you can also go to loads of places and have no internet at all; it's not just that your phone won't connect, you can't find a provider. So we've got to think that through. We've got a lot of these sets of inequalities, and one of the biggest is access to technology. We can't just build the bigger ship over here and assume we can drag everybody onto it, because those things are not necessarily connected. There are lots of groups trying to work
on this, but that means we've got to look at it from every direction possible if we're going to move the entire society forward on inequalities; otherwise we're going to be fixing the problems for a very small set of people who live in certain boroughs of London, and we're not actually going anywhere else to solve larger-scale inequalities.

We have, well, technically no time left, but if you're okay with this, we are allowed to run on for a few minutes. I really would like to hear from anyone in the audience who has not had a chance to speak yet and wants to, before I give the panel the last word. You see, now I've said that, you're all going, "Oh, we can go to the bar." Okay, someone on the right on the back row wants to speak. This is your absolute last chance to have a word, so stick your hand up now.

Hi there, one quick question. There's a bias in data that people don't often talk about, which is that human decision making is often not based on aggregates; it's based on big events, really big stuff, and those events vanish in data if you take any lengthy time period. There's only one tower block that burnt down recently, and there are good reasons to make decisions based upon that fact, reasons to do with race, with gender, with class, but if you look at a time series, that event will vanish, or at least diminish; it will go away in the analysis. If we make more decisions based purely on aggregate data, that seems like a big changeover in the way we make decisions, from these big events to this long-term aggregate decision making. How do we balance that? How do we make the decisions we would otherwise make by machine more like the decisions we would make as people, where that's useful, where it helps build greater resilience in our society? How do we prevent the default way that data sees the world from becoming the default way that we see the world and make decisions?

Well, that's a... is that a hand? Has somebody
like, texted in a question or something? Yes? Can we have the microphone around here? You microphone runners are going to be really fit; I hope you've got your step counters on, because steps don't count if they're not counted, you know that.

Thank you. So this is a question from Twitter, and it's quite a good one to end on: does the panel feel the public have anything to add to the debate on data and inequality, and if so, how do you go about making the debate more inclusive?

Does the panel think the public have anything to add, bearing in mind this question has been tweeted in? Do we know who by, or is it anonymous?

Someone from Nesta.

They don't count, they're not the public... now, now, I should keep quiet, I'm working for them in a few weeks. Okay, so, panel, bearing in mind, I have to say, that we're technically over time, and that I have spared you another bank of six questions, we have: how can we get the public involved more? Feel free to say no, the public shouldn't be involved, though I'd be surprised if any of you do after the discussion we've had so far. And how do we prevent the way data sees the world from becoming the default for how we see the world? Is that another question? Look, where have the microphones gone? All right, this person over here. I did say it was your last chance, but okay, it's got to be better than the one Nesta tweeted in, and as we know, Nesta aren't really the public, so they don't really count.

Sorry, it's a comment on your summing up, really, about involving the public, because a lot of the problems we have, and I'm responsible for some government data sets, are in getting the right people in the public to engage with the questions in the first place, which leads to some of the bias in the statistics, which means we don't know what people need, and it's a really hard question to answer. A lot of the problems with data stem from the fact that it's really hard to get people to take part
in the surveys that collect the data, which means we just don't know about big sections of the population. I work in food; our response rate is usually just 50 per cent, which means there's half of the people out there whose choices around food we don't know about.

There you go. Okay, well, that's excellent then, that's kind of tied in the question to end on. So, panel, be concise, be pithy, be whatever you want to be, but this is your last word. Shall I take you in the same order that you started off in? Just give us your final pithy thoughts to send us into the bar.

Yeah, I think the engaging-the-public question is very, very important, and also something very close to my heart, because I go to a lot of these events and panels, and very often you don't see a representative of the general public in any sense. It's hardly ever that civil society is there, and even less often that somebody from a union or consumer protection is there, and those would be the people who actually know what's happening on the ground. We talk about inequality and employment, for example: maybe you should talk to the people in the union. We talk about the fact that algorithms are being used for certain products we're not happy with: maybe talk to the people who buy that actual stuff. So consumer protection groups would actually be a good place to start a discussion. If you have access problems, I get that; I work with social scientists as well, and they all have problems getting good data via surveys or interviews and focus groups, I get that. But you could start with at least representative groups, and I think it's very, very crucial to get those people in, because we all talk about good decision making and AI being used for good, but you don't ask the people. So it's just an elitist community that decides what's good for society, when actually we should ask the public about that. Thank you
very much. Catherine?

To the point about aggregate decision making: I think that's just a really good illustration of why there always has to be a role for human intervention — for the right human intervention — which goes back to the question of what that looks like and how we get to that point. And that also links to this question about public involvement: through advocacy, through politics. I talked about education; these are things we should be learning about, discussing and understanding. Take something as small as the dreaded cookie consent: part of the reason that was such a nonsense is that nobody ever looked at it, or knew what it meant if they did look at it. As somebody said, knowledge-based decision making is the basis of democracy — the basis of everything — but you can't have it without giving people the tools for it. There are some organisations trying to address this — indeed, I believe Nesta is somewhat involved — and there are lots of people trying to find ways to engage the general public, which, by the way, is a phrase I hate, because there is no such thing as "the general public": we are all the public. That distancing — "oh, the public, over there" — is something you always hear from people in government. "The public"? That's us. And that's why I meant we all have to engage with it; we all have to find whatever ways we can to understand it, to spread that understanding, and to get people to understand that unless we engage with this stuff, it is going to roll right over us. Thank you.

So — I think I can speak to all three. To the comment about responsiveness: I would probably counter that that's not the case for every situation. There are plenty
of reactionary types of policy making that come really quickly for certain sets of people and certain sets of incidents, and other sets of people for whom it drags out: you have 57 inquiries, you have to have a talk, you have to have a thing, you have to consult, before finally people might actually go, "that was horrible, and we should change." I appreciate your point about data and aggregation, but I would take us right back to the inequalities question and press at that: to recognise when the letters and the evidence-making that a community has done for a very long time do not get taken up as data and evidence. Now we have this inquiry where this conversation is happening, to a certain extent, but that is not the first time anybody had heard about it. So I think we've got to really press at inequalities to understand how we handle and deal with that set of data — which goes back to the Race Disparity Audit information, and the previous conversation, because that's great as a set of information aggregated from information that's already been gathered, but it doesn't necessarily move us further towards actually using that information and then holding particular departments accountable. That's supposedly going to happen, but right now it's not. And it would be great to turn that back to people — various groups are doing this: some are working on data transparency and openness, some are mobilising around a citizen-assembly kind of approach to deal with issues around privacy. But you can see, from everything we've talked about today, all the different levels this might involve — criminal justice, education, healthcare — it's not going to be
just enough that I give you access to my information that's on my phone, that's on my Facebook. We have a whole set of things happening without anybody's knowledge or consent — whether we're talking about facial recognition in parks to keep certain sets of people out of them, all the way through to things done "for the public good" that don't necessarily need your permission, because they're imagined as being for the public good. That too is perpetuating a whole raft of issues around the power that different sets of people can actualise in a moment, and that gets right back to this question about inequalities, because it means certain people's voices are okay to come into certain arenas, and others just disappear — they're kind of in the noise. And this gets to the question about reaching people. It may not be an accident that certain sets of people do not want to engage with certain sets of people in authority and power, and we have to reckon with that. What does it mean when people are not disengaged — when they are choosing not to engage with you? This "hard to reach population" rhetoric needs to be turned on its head, and we need to actually ask: are we doing anything that makes anybody trust us, trust anything we've ever offered, of any kind, to anybody? It could be a university going into those spaces, much less an activist group. Because what's being sold? Empowerment, rainbows and unicorns: "give me my thing, my data, my stuff, and everything will happen." And then you give it, and it doesn't happen, and people's lives continue as before. That has an effect. I think there's some work to do in thinking about the fraying of these relationships over time — not just "I really wish I could get the people; if I just had the people, I'd get the information" — because lots of people
talk about that, and I'm like: maybe you need to reckon with the fact that you will never get them if you keep the sort of structures you've currently got in place. They're at home with the curtains drawn.

Follow that, Rob.

Look: in 1971 Bernard Coard published How the West Indian Child is Made Educationally Subnormal in the British School System. In the early 80s we saw the sus laws lead to riots in Brixton and Tottenham. We knew about patterns of labour market exclusion; we knew about poorer health outcomes; we knew about the racist treatment of people at borders. And we know about the grandkids of the generation Bernard Coard identified as being taken out of education — two-thirds of them are on a gangs matrix held by the Metropolitan Police. It's not a major surprise that people don't want to fill in a form. It's not a major surprise, in that context, that for many people the particular tragedy of the Windrush moment was being asked for papers after that set of experiences, built up over generations. I suppose I would just make the plea that we don't forget we're talking about people when we're talking about data — particularly when we're talking about ethnic data, we're talking about people's experience. I'm currently in the throes of setting up a new organisation, BlackOut, which works with black gay men. We don't feature in any of the census data — there's no cross-referencing — so we can't find out about ourselves that way. But I'm convinced that, as a community, we didn't own anything: there's no physical building, no infrastructure, no business infrastructure, et cetera. We didn't own anything. But what we might own now is our data. So there might be a way in which we can use the collection of that data, shared with each other, to start to build a community asset that
might actually have some impact back on our community. That's why I think this discussion isn't lost. This isn't just about complaining about what's gone before; it's about thinking about a new access to power, and a way of shifting power that we might be able to achieve through the ownership and control of data. Thank you.

Newton?

The thing that's really come out, hasn't it, is that data operates within power structures. Think about the word "statistics": it came from the word "state", because these were the numbers the state wanted from you, so even the genesis of the term is bound up in power structures, and I think today has been partly about uncovering some of those. The point is: if you want to create change, data is a tool, and we need to harness it — but it's not the only thing. Data is necessary for the creation of change, but certainly not sufficient, and I think the point made about the importance of civil society and movements is really critical. One interesting theme, which Rob touched on right at the end there, is owning our data. This is quite a seductive movement at the moment — the idea that the way forward is for us all to own our data, so we'd have rights and power over it — and I'm quite worried about that framing, because it's very individualistic. Those who are poorest, if they own their data, can then trade it away, whereas I, being more well off, can say: actually, I don't want to trade my data; I'll pay twenty quid a year for an email service, and I don't have to give Gmail my data. So I'd much prefer to see a discourse based on data rights than on data ownership.

Thank you all for coming. There is, as we heard, another event on the 14th of September in this series — please come back and be annoyed again; we valued your contribution, and we won't ask you to code anything in Java, we promise. Thank you to all the crew who are
putting this on, to the British Library, to the Turing Institute, and especially thank you to this fantastic panel. Thank you.