If technology could make a twin of every person on Earth, and the twin was more cheerful and less hungover and willing to work for nothing, well, how many of us would still have our jobs? I think the answer is zero.

Artificial intelligence, or AI, is everywhere. And yet few of us understand what it actually is, or what it can and cannot do. This makes AI seem complex and unpredictable.

I'm sorry, Dave. I'm afraid I can't do that.

Will it have the power to make our lives better or worse, or even to replace us? Will it take our jobs? Will it save our planet from climate disaster? Are we controlling AI, or will it eventually control us?

AI is a technology. It isn't intrinsically good or evil. Some people have hijacked the ability of the algorithms to very rapidly change people, because if you nudge somebody hundreds of times a day for days on end, you can move them a long way in terms of their beliefs, their preferences, their opinions.

Our world is already shaped by algorithms on our devices and on social media. But as with all man-made tools, understanding the promise and limitations of AI could help us stay in control of the tech, rather than it controlling us. To guide us, we have Stuart Russell, a professor of computer science at Berkeley and one of the world's leading experts on AI.

It gets to some of the most difficult current problems in moral philosophy. How do you act on behalf of someone whose preferences are changing over time? Algorithms are having a massive effect on billions of people in the world, and I think we've given them a free pass for far too long.

It's actually surprisingly difficult to draw a hard and fast line and say, well, this piece of software is AI and that piece of software isn't AI. Within the field, when we think about AI, the object that we discuss is something we call an agent, which means something that acts on the basis of whatever it has perceived. The perceptions could be through a camera or through a keyboard. The actions could be displaying things on a screen, turning the steering wheel of a self-driving car, or firing a shell from a tank, or whatever it might be. And the goal of AI is to make sure that the actions that come out are actually the right ones, meaning the ones that will actually achieve the objectives that we've set for the agent. This maps onto a concept that's been around for a long time in economics and philosophy called the rational agent: the agent whose actions can be expected to achieve its objectives. And so that's what we try to do.

Agents can be very, very simple. A thermostat is an agent. It has perception: it just measures the temperature. It has actions: switch the heater on or off. And it has two very, very simple rules: if it's too hot, turn the heater off; if it's too cold, turn it on. And is that AI? Actually, it doesn't really matter whether you want to call that AI or not. There's no hard and fast dividing line, as in: well, if it's got 17 rules, then it's AI, and if it's only got 16, then it's not AI. That wouldn't make sense. So we just think of it as a continuum, from extremely simple agents to extremely complex agents like humans.
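To make the agent abstraction concrete, here is a minimal Python sketch (not from the interview; the ThermostatAgent class and its thresholds are purely illustrative) of the perceive-then-act loop, with the thermostat's two rules:

```python
class ThermostatAgent:
    """A minimal agent: perceives a temperature, acts on a heater."""

    def __init__(self, low=18.0, high=22.0):
        self.low = low    # below this, it's "too cold"
        self.high = high  # above this, it's "too hot"

    def act(self, perceived_temp):
        # The thermostat's two rules from the interview:
        if perceived_temp > self.high:
            return "heater_off"   # too hot: turn the heater off
        if perceived_temp < self.low:
            return "heater_on"    # too cold: turn it on
        return "no_op"            # otherwise, leave it alone

# The agent loop: perceive, then choose an action.
agent = ThermostatAgent()
for temp in [15.0, 20.0, 25.0]:
    print(temp, "->", agent.act(temp))
```

Whether this counts as AI is, as he says, beside the point; a more complex agent just has richer perceptions and a longer decision procedure, somewhere further along the same continuum.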
This has always been the goal of what is called general-purpose AI. There are other names for it: human-level AI, superintelligent AI, artificial general intelligence. But I settled on general-purpose AI because it's a little bit less threatening than superintelligent AI. It means AI systems that, for any task that human beings can do with their intellects, will be able to, if not do it already, then very quickly learn how to do it, and do it as well as or better than humans. I think most experts say that by the end of the century we're very, very likely to have general-purpose AI. The median is something around 2045. And so that's not so long; it's less than 30 years from now. I'm a little more on the conservative side. I think the problem is harder than we think.

This is a very old point. Amazingly, Aristotle actually has a passage where he says: look, if we had fully automated weaving machines and fully automated plectrums that could pluck the lyre and produce music without any humans, then we wouldn't need any workers. That's a pretty amazing thing to say for 350 BC. So that idea, which I think it was Keynes who called technological unemployment in 1930, is very obvious to people. They think: yeah, of course, if the machine does the work, then I'm going to be unemployed. The Luddites worried about that. And for a long time, economists actually thought they had a mathematical proof that technological unemployment was impossible. But if you think about it: if technology could make a twin of every person on Earth, and the twin was more cheerful and less hungover and willing to work for nothing, well, how many of us would still have our jobs? I think the answer is zero. So there's something wrong with the economists' mathematical theorem.

And over the last decade or so, I think opinion in economics has really shifted. In fact, at the first Davos meeting that I ever went to, in 2015, there was a dinner supposedly to discuss the new digital economy. But the economists there (several Nobel Prize winners, and other very distinguished economists) got up one by one and said: actually, I don't want to talk about the digital economy. I want to talk about AI and technological unemployment, and this is the biggest problem we face in the world, at least from the economic point of view.

I think there's still a view among many economists that this won't happen, because there are compensating effects. It's not as simple as saying: if the machine does job X, then the person isn't doing job X, and so the person is unemployed. If the machine is doing something more cheaply and more efficiently and more productively, then that increases total wealth, which then increases demand for all the other jobs in the economy. And so you get this sort of recycling of labor from areas that are becoming automated to areas that are still not automated. But if you automate everything, then you're back to the argument about the twins: it's like making a twin of everyone, willing to work for nothing. And so you have to think: well, are there areas where we aren't going to be automating, either because we don't want to or because humans are just intrinsically better?

So this is one optimistic view, and I think you could argue that Keynes had this view. He called it perfecting the art of life: we'll be faced with man's permanent problem, which is how to live agreeably and wisely and well. And those people who cultivate the art of life better will be much more successful in this future. Cultivating the art of life is something that humans understand. We understand what life is, and we can do that for each other because we are so similar.
There's this intrinsic advantage that we have of knowing what it's like: knowing what it's like to be jilted by the love of your life, knowing what it's like to lose a parent, knowing what it's like to come bottom of your class at school, and so on. So we have this extra comparative advantage over machines, which means that those kinds of professions, the interpersonal professions, are likely to be ones in which humans have a real advantage, and actually more and more people, I think, will be moving into that area.

There's a big difference between asking a human to do something and giving that as the objective to an AI system. When you ask a human to fetch you a cup of coffee, you don't mean that this should be their life's mission and nothing else in the universe matters, so that even if they have to kill everybody else in Starbucks to get you the coffee before it closes, they should do that. No, that's not what you mean. All the other things that we mutually care about should factor into their behavior as well. The problem with the way we build AI systems now is that we give them a fixed objective. The algorithms require us to specify everything in the objective. And if you say, can we fix the acidification of the oceans? Yeah, you could have a catalytic reaction that does that extremely efficiently, but it consumes a quarter of the oxygen in the atmosphere, which would apparently cause us to die fairly slowly and unpleasantly over the course of several hours. So how do we avoid this problem? You might say, okay, well, just be more careful about specifying the objective: don't forget the atmospheric oxygen. But then, of course, the reaction might produce some side effect in the ocean that poisons all the fish. Okay, well, I meant: don't kill the fish either. And then, well, what about the seaweed? Okay, don't do anything that's going to cause all the seaweed to die. And on and on and on.

In my book, Human Compatible, the main point is that if we build systems that know that they don't know what the objective is, then they start to exhibit these behaviors, like asking permission before getting rid of all the oxygen in the atmosphere. And they would do that because that's a change to the world, and the algorithm may not know whether that's something we prefer or disprefer. So it has an incentive to ask, because it wants to avoid doing anything that's dispreferred. You get much more robust, controllable behavior. And in the extreme case, if we want to switch the machine off, it actually wants to be switched off, because it wants to avoid doing whatever it is that is upsetting us. It doesn't know which thing it's doing is upsetting us, but it wants to avoid that. So it wants us to switch it off, if that's what we want. In all these senses, control over the AI system comes from the machine's uncertainty about what the true objective is. It's when you build machines that believe with certainty that they have the objective that you get the sort of psychopathic behavior. And I think we see the same thing in humans.

You know, AI is a technology. It isn't intrinsically good or evil. That decision is up to us. We can use it well or we can misuse it. There are risks from poorly designed AI systems, particularly ones pursuing the wrong objectives, wrongly specified objectives. And that goes for algorithms in general, not just AI systems.
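A toy expected-utility calculation, entirely my own construction rather than anything from the book, can show why that incentive to ask emerges. Suppose the agent assigns probability p to a side effect being acceptable to us, and asking a human first costs a small amount of time; all the numbers below are made up for illustration:

```python
# Toy model: an agent unsure whether a side effect (say, consuming atmospheric
# oxygen) is something humans prefer or strongly disprefer.

def expected_utility_act(p_good, u_good=1.0, u_bad=-100.0):
    # Act unilaterally: good or catastrophic outcome, weighted by belief.
    return p_good * u_good + (1 - p_good) * u_bad

def expected_utility_ask(p_good, u_good=1.0, ask_cost=0.1):
    # Ask first: the human vetoes the catastrophic case, so the agent only
    # ever gets the good outcome, minus a small cost for asking.
    return p_good * u_good - ask_cost

for p in [0.5, 0.99, 0.9999]:
    act = expected_utility_act(p)
    ask = expected_utility_ask(p)
    best = "ask" if ask > act else "act"
    print(f"P(acceptable)={p}: act={act:+.4f}, ask={ask:+.4f} -> {best}")
```

Only when the agent is almost certain the side effect is fine does acting unilaterally beat asking. Certainty about the objective is exactly what removes the incentive to defer, which is the point of the uncertainty principle above.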
I think we've given them a free pass for far too long. If you think back, there was a time when we gave pharmaceuticals a free pass. There was no FDA or other agency regulating medicines, and hundreds of thousands of people were killed and injured by poorly formulated medicines, by fake medicines, you name it. Eventually, over about a century, we developed a regulatory system for medicines. It's expensive, but most people think it's a good thing that we have it. And we are nowhere close to having anything like that for algorithms, even though, perhaps to an even greater extent than medicines, these algorithms are having a massive effect on billions of people in the world. And I don't think it's reasonable to assume that it's necessarily going to be a good effect. I think governments now are waking up to this and really struggling to figure out how to regulate while not actually making a mess of things.

The problem with answering your question is that we actually don't know the answer, because the facts are hidden away in the vaults of the social media companies. And those facts are basically trillions of events per week: trillions, because we have billions of people engaging with social media hundreds of times a day. And every one of those engagements (clicking, swiping, dismissing, liking, disliking, thumbs-upping, thumbs-downing, you name it), all of that data is inaccessible.

However, if you think about the way the algorithms work, what they're trying to do is basically maximize click-through. They want you to click on things, engage with content, or spend time on the platform, which is a slightly different metric but basically the same thing. And you might say: well, okay, the only way to get people to click on things is to send them things they're interested in, so what's wrong with that? But that's not the way you maximize click-through. The way you maximize click-through is actually to send people a chain of content that turns them into somebody else, somebody who is more susceptible to clicking on whatever content you're going to be able to send them in the future. So the algorithms, at least according to the mathematical models that we've built, have learned to manipulate people, to change them so that in future they're more susceptible and can be monetized at a higher rate.

At the same time, of course, there's a massive human-driven industry that has sprung up to feed this whole process: the clickbait industry, the disinformation industry. People have hijacked the ability of the algorithms to very rapidly change people, because it's hundreds of interactions a day. Each one is a little nudge, but if you nudge somebody hundreds of times a day for days on end, you can move them a long way in terms of their beliefs, their preferences, their opinions. The algorithms don't care what opinions you have; they just care that you're susceptible to the stuff they send. But of course, people do care, and they hijack the process to take advantage of it and create the polarization that suits them and their purposes. And I think it's essential that we actually get more visibility. AI researchers want it, because we want to understand this and see if we can actually fix it. Governments want it, because they're really afraid that their whole social structure is disintegrating, or that they're being undermined by other countries who don't have their best interests at heart.
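Here is a deliberately simplified simulation, my own sketch rather than anything from an actual platform or from the models Russell mentions, of the difference between serving what the user already likes and nudging the user into a more clickable state. The susceptibility variable and its update rule are invented for illustration:

```python
# Toy recommender comparison: a user's "susceptibility" s is the probability
# they click. Myopic serving keeps s fixed; "nudging" content clicks slightly
# less often now but raises s, so over many interactions it earns more clicks.

def run(strategy, s=0.2, steps=300):
    total_expected_clicks = 0.0
    for _ in range(steps):
        if strategy == "myopic":
            total_expected_clicks += s        # serve what they like today
        else:  # "nudge"
            total_expected_clicks += 0.9 * s  # slightly worse fit today...
            s = min(1.0, s + 0.01)            # ...but a more clickable user tomorrow
    return total_expected_clicks

print("myopic:", run("myopic"))  # the user is unchanged
print("nudge: ", run("nudge"))   # far more clicks, and a changed user
```

In this toy setup the nudging policy wins by a wide margin, and the "cost" it never pays is exactly the one the interview describes: the user has been turned into somebody else.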
With social media, this is probably the hardest problem, because it's not just that it's doing things we don't like; it's actually changing our preferences. And that's a sort of failure mode, if you like, of any AI system that's trying to satisfy human preferences, which sounds like a very reasonable thing to do: one way to satisfy them is to change them so that they're already satisfied. I think politicians are pretty good at doing this. We don't want AI systems doing that, but it's a wicked problem, because it's not as if all the users of social media hate themselves. They're not sitting there saying, how dare you turn me into this raving neo-fascist? They believe that their newfound neo-fascism is actually the right thing, and that they were just deluded beforehand. And so it gets to some of the most difficult current problems in moral philosophy. How do you act on behalf of someone whose preferences are changing over time? Do you act on behalf of the present person or the future person? Which one? There isn't a good answer to that question, and I think it points to gaps in our understanding of moral philosophy.

So in that sense, what's happening in social media is really difficult to unravel. But one of the things I would recommend is simply a change in mindset in the social media platforms. Rather than thinking, how can we generate revenue, think: what do our users care about? What do they want the future to be like? What do they want themselves to be like? And if we don't know, and I think the answer is that we don't know (we've got billions of users; they're all different; they all have different preferences, and we don't know what those are), then think about ways of having systems that are initially very uncertain about the true preferences of the user, and that try to learn more about those while respecting them.

The most difficult part is that you can't just say: don't touch the user's preferences, under no circumstances are you allowed to change the user's preferences. Because just reading the Financial Times changes your preferences. You become more informed. You learn about all sorts of different points of view. And then you're a different person. We want people to be different people over time; we don't want to remain newborn babies forever. But we don't have a good way of saying which processes of changing a person into a new person are good. We think university education is good, or global travel is good; those usually make people better people. Whereas brainwashing is bad, and what cults do to people is bad, and so on. What's going on in social media is right at the place where we don't know how to answer these questions. So we really need some help from moral philosophers and other thinkers.

The three principles. The first one is that the only objective for all machines is the satisfaction of human preferences. And preferences is actually a term from economics. It doesn't just mean, what kind of pizza do you like, or who did you vote for? It really means your ranking over all possible futures, for everything that matters. So it's a very, very big, complicated, abstract thing, most of which you would never be able to explicate even if you tried, and some of which you literally don't know. I literally don't know whether I'm going to like durian fruit if I eat it. Some people absolutely love it and some people find it absolutely disgusting.
I don't know which kind of person I am, so I literally can't tell you whether I like the future where I'm eating durian every day. So that's the first principle: we want the machines to be satisfying human preferences.

The second principle is that the machine does not know what those preferences are. It has initial uncertainty about human preferences. And we already talked about the fact that this sort of humility is what enables us to retain control, and makes the machines, in some sense, deferential to human beings.

The third principle really just grounds what we mean by preferences in the first two principles. It says that human behavior is the source of evidence for human preferences. That can be unpacked a bit, but basically the model is that humans have these preferences about the future, and those preferences are what cause us to make the choices that we make. And behavior means everything we do and everything we don't do: speaking, not speaking, sitting, reading an email while you're watching this lecture or this interview.

So the upside potential for AI is enormous. But it's important to understand that in that future we will have a choice. I hope that we don't just choose to stay in bed, but that we have other reasons to get out of bed, so that we can actually live rich, interesting, fulfilling lives. That was something that Keynes thought about and predicted and looked forward to, but it isn't going to happen automatically. There are all kinds of dystopian outcomes, even when this golden age comes. But whatever the movies tell you, machines becoming conscious and deciding that they hate humans and wanting to kill us is not really on the cards.
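Circling back to the third principle, here is a minimal sketch, my own illustration rather than code from the lecture, of treating observed choices as noisy evidence about a hidden preference, using the durian example. The 90% noisy-rationality parameter and the two-hypothesis setup are assumptions made for the sake of the example:

```python
# Hypotheses: the human likes durian (H1) or dislikes it (H2). Under a simple
# noisy-rationality model, they pick their preferred option 90% of the time.
P_CHOOSE_PREFERRED = 0.9

def update(belief_likes_durian, chose_durian):
    # Likelihood of the observed choice under each hypothesis.
    p_if_likes = P_CHOOSE_PREFERRED if chose_durian else 1 - P_CHOOSE_PREFERRED
    p_if_dislikes = 1 - P_CHOOSE_PREFERRED if chose_durian else P_CHOOSE_PREFERRED
    # Bayes' rule: posterior probability that the human likes durian.
    numerator = belief_likes_durian * p_if_likes
    denominator = numerator + (1 - belief_likes_durian) * p_if_dislikes
    return numerator / denominator

belief = 0.5  # the machine starts uncertain (second principle)
for chose_durian in [True, True, False, True]:  # observed behavior (third principle)
    belief = update(belief, chose_durian)
    print(f"chose durian: {chose_durian} -> P(likes durian) = {belief:.3f}")
```

The machine never needs the human to explicate the preference; it stays uncertain, and each choice shifts its belief a little, which is exactly the evidence-from-behavior relationship the third principle describes.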