Good afternoon and a very warm welcome to this IIEA webinar on understanding artificial intelligence: what it is and what we are going to do. My name is Joyce O'Connor and I chair the digital group here at the IIEA, and it's my great pleasure to welcome Professor Michael Wooldridge today. Michael, you're very welcome, a very warm welcome to you, and thank you for taking the time out of your very busy schedule; we can see how much you're doing, so we're really pleased to have you with us. Professor Wooldridge will speak for 25 to 30 minutes and then I will go to you, our audience, for the Q&A. You can join our discussion using the Q&A function at the bottom of your screen, and please feel free to send in your comments, questions or observations during Michael's presentation; I'll come to your questions after Michael has finished. Please feel free also to join the conversation on X (Twitter) using the handle @IIEA. A reminder that today's presentation and Q&A are on the record.

Professor Wooldridge's presentation today is very timely, as you will know: AI has been, and is, in the headlines across all media on a daily basis, if not every couple of hours. You could say that AI is the most hyped, and some would say the most important, technology of the century. Yet, as Professor Wooldridge will outline, the idea of AI has been around for a long time, and amid all the excitement of the last number of months it's easy to forget that AI is not a new field of learning and research. Professor Wooldridge will assess the nature of this technology, distinguish between the hype and the reality, and explore the implications for society.

Michael Wooldridge is Professor of Computer Science at the University of Oxford and a Programme Director at the Alan Turing Institute. He has been a researcher in AI for over 35 years and has published widely, with over 400 scientific articles and nine books. I think it's really interesting, Michael, that you've published books for the general public, which I think is really important, including the Ladybird Expert guide to artificial intelligence, a short overview of the area, and The Road to Conscious Machines, a longer introduction to AI. Put that up again, Michael, that's good to see. That's the Ladybird book, well worth getting, I can tell you. He has received numerous national and international awards, including an Outstanding Educator Award and, in 2021, a Turing AI World-Leading Researcher Fellowship in the UK. He has been recognised by his peers and served as President of the European Association for Artificial Intelligence and President of the International Joint Conferences on Artificial Intelligence. And a special thing for Christmas, which I think I'm going to let you all know about, Michael, because I think it's very interesting: he is delivering the Royal Institution Christmas Lectures, one of Britain's most prestigious public science lecture series, and the aim of these lectures is to demystify AI. The lectures will be broadcast on BBC Four in late December, so we'll be well able to see them. Michael, we really look forward to your presentation today, and thank you very much again for being with us.

Well, thank you for that lovely introduction; the problem with an introduction like that is living up to it. You'll have no trouble, Michael. I'll do my very best. Okay, so, oh dear, let me try screen sharing. Okay, do we see my screen?
Yes. Okay, excellent. So thank you again for the invitation. I was invited to do this, I think, at the beginning of the summer, and it was a very good idea to invite me all that time ago, because at that point I still had space in my diary; at the moment I really don't, because of the craziness of AI over the last six months or so. And actually, I have to say, even since we originally talked about me presenting, the scene has changed so much that I've literally had to redo my standard slide deck. So what I'm going to try to do in the next 30 minutes is give you a feel for, in particular, the recent advances in AI, the reason why AI has made the headlines over the last ten months, and for how it works. This is not a technical presentation, but we'll talk a little bit about how it works, and if I get that right then that will demystify the technology for you a little, so that when you use ChatGPT, or your favourite large language model, you'll understand much better what's going on behind the scenes. And then, as Joyce said, we'll talk about some of the issues; if we have the opportunity we'll talk about some of the opportunities that arise, but we should certainly talk about the issues.

So let's begin. Artificial intelligence, again as Joyce said, is not a new field; it's actually been around since the mid-1950s. The phrase was coined in around about 1955 by an American researcher. Since then a range of different AI technologies have been tried; AI is a very broad church and it includes a huge range of technologies. But the reason that we're having this conversation today is because one core technology started to work this century, and that technology was machine learning. Now, the name machine learning is hopelessly misleading. It sort of implies that you've got a robot that goes away in a room, opens a textbook and teaches itself how to speak French, or something like that. That's really not the way that machine learning works. So what I'm going to do is show you firstly what machine learning is, and then we'll see how machine learning actually works.

Maybe the best way to explain what machine learning is is to think about a classic example of a machine learning task, something that we would want to teach a machine to do, and this is absolutely classic AI: recognising faces. The idea is that we want to be able to show a computer a picture of a human being, a picture of me or a picture of Joyce, and for the machine to be able to recognise the human being in that picture. All we want it to do is just print out the name of that person, just type out the name of that person. So how do we do that? Well, a core idea in machine learning is that we get the machine to do that by providing training data, and on the right-hand side of the screen I'm showing you some training data that we might use for that particular task. Each bit of training data comes in a pair, and the pair consists of the input, the picture that we would give to the machine, and then what we would want the machine to type out if it saw that picture. So this first bit of training data, this picture of this person, it's actually Alan Turing, one of the inventors of artificial intelligence, the great Alan Turing, one of the greatest minds of the twentieth century.
We show it this picture of this person, and the idea is: if you saw that picture, then we would want you to produce the text on the right-hand side, just to type out the name Alan Turing. So that bit at the top, the picture of Turing and the text of Turing's name, that's a bit of training data. We give it that training data and then we give it more training data. Underneath that picture there's another picture of a younger Turing, and the idea again is: if you saw this picture, I would want you to type out the text Alan Turing. And then again more training data: if I showed you this picture, I would want you to type out the text Alan Turing. We provide training data in that form, input and output, what we call input-output pairs: if I showed you this input, this is what I would want you to produce as output. And the idea in machine learning is that eventually, if we show the machine a picture, and we've got a picture of a very relaxed-looking Alan Turing at the bottom there, it would produce the right name; it would simply type out the name Alan Turing. So you've already learned an important lesson about artificial intelligence, and the lesson is: it needs training data. No data, no artificial intelligence. All contemporary AI techniques require training data, and training data of this form is the simplest kind of machine learning.

The task that we've just described, recognising faces in a picture, we call a classification task, because what the machine is doing is looking at these pictures and then classifying them: this is a picture of Alan Turing, this is a picture of Joyce, this is a picture of Michael Wooldridge, and so on. That's absolutely classic artificial intelligence, and around about 15 years ago that kind of task is what started to take off in AI. AI started to get good at doing tasks like that, and we'll talk about why it started to get good a little bit later on, but the point is it started to get good. So what use is this? Well, facial recognition is an important, although occasionally a little bit scary, application of AI, but exactly the same techniques can be used, for example, in recognising cancerous tumours on x-ray scans. Exactly the same thing: you provide the training data in the form of this is a healthy person's x-ray, this is a cancerous x-ray, and the machine learns to classify x-ray scans into either healthy or cancerous. In the same way, I have colleagues here at Oxford who work on fetal ultrasound scans, recognising abnormalities in the brains of babies in pregnant women, a classic classification task. The same technology is used in self-driving cars: if you have a Tesla and you use the full self-driving mode, what your Tesla has to do is recognise all the things that are around it, and the way that it does that is that it has been taught, through machine learning, that that is a stop sign, that's a pedestrian, and so on. So this is a classification task, and that's what machine learning is.
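[Editorial aside: to make the idea of input-output training pairs concrete, here is a minimal sketch in Python. The "pictures" are tiny made-up feature vectors rather than real images, and the one-line nearest-neighbour classifier is an illustrative stand-in, not the neural-network approach described in the talk.]

```python
import math

# Training data: (input, desired output) pairs, exactly as described above.
# The "pictures" here are just three-number feature vectors standing in for
# real pixel data; the values are invented for illustration.
training_data = [
    ([0.9, 0.1, 0.3], "Alan Turing"),
    ([0.8, 0.2, 0.4], "Alan Turing"),
    ([0.1, 0.9, 0.7], "Joyce O'Connor"),
    ([0.2, 0.8, 0.6], "Joyce O'Connor"),
]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(new_input):
    """Return the label of the closest training example (1-nearest neighbour)."""
    _, best_label = min(training_data, key=lambda pair: distance(pair[0], new_input))
    return best_label

# A "picture" the system has never seen before gets the label of the
# closest match in the training data.
print(classify([0.85, 0.15, 0.35]))   # -> Alan Turing
```

Real face recognition learns its own features with a neural network rather than using a hand-written distance function, but the shape of the problem, pairs of inputs and desired outputs, is exactly the same.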
So how does it do it? Now, I'm going to show you a scary picture in the next couple of slides, but don't switch off, okay? I'll explain, and honestly it will be clear what's going on. If we look at a brain, an animal brain or a human brain, under a microscope, we'll see enormous numbers of neurons. These are nerve cells in the brain and the nervous system, special kinds of cells, which are arranged in the human brain into enormous networks. We don't have precise figures, but there are something like 90 to 100 billion of these nerve cells, these neurons, in the human brain, and they are massively connected: each neuron can have up to 8,000 connections. So it's arranged in enormous networks, and each nerve cell, each neuron, is a tiny, simple computational device. It's simply looking for a pattern, a very simple type of pattern, on its connections, and when it sees that pattern it becomes excited and it sends a signal out to its neighbours.

Let me make that concrete. Here we've got another picture of Turing, and we've got a highly stylised artificial neural network to recognise the picture of Turing. We don't try to build brains in artificial intelligence; what we do is use ideas from the natural neural networks that we see in nature, and we use those to inform the design of software. So what we're looking at now is a software neural network. How is it working? Here is our picture of Alan Turing on the left-hand side. Now, you'll know that if you take a picture of somebody on your phone, it's actually made up of millions of coloured dots, each dot a very specific colour; those are the pixels, the megapixels, in your picture. In the top left-hand corner of the picture there's a sort of grey-brown coloured dot, so that particular pixel is a grey-brown coloured dot. Imagine that each of the neurons on what's called the input layer is just looking for a very specific colour. So, for example, this neuron on the top left might just be looking for the colour red, and when it sees that the pixel on the top left is the colour red, that neuron becomes excited; it spikes and sends a signal to its neighbours. Now move to the next neuron along on the top layer: maybe what that neuron is doing is just looking to see whether a majority of its connections see the colour red. It's connected to neurons that are looking for the colour red, and when a majority of those see the colour red, then it becomes excited and sends a signal. So what that neuron is doing is recognising that a majority of the pixels are red.

Now, the point is that a task like recognising a face in a picture can be reduced down to tiny little decisions like that, and that's actually what's going on in your brain: all of human intelligence reduces down to tiny little decisions like that, being made continually, in parallel, in huge numbers, in your brain, every moment of your life. What we do in artificial neural networks is use the same idea: we reduce a big computational problem down into huge numbers of tiny little individual decisions. And the way that we use the training data is that when we show a neural network a picture of Alan Turing and the text Alan Turing, we train the neural network: we use some mathematics to adjust the neural network so that when it sees that picture it produces the text Alan Turing. Now, the details of that are a little bit technical and not worth going into here; the mathematics isn't hard, but there's an awful lot of it, and that's why we need powerful computers in order to be able to train neural networks like this.
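[Editorial aside: for readers who want to see one of those "tiny little decisions" written down, here is a minimal sketch of a single artificial neuron: a weighted sum of its inputs followed by a squashing function. The weights, bias and pixel values are invented for illustration; a real network stacks millions or billions of these, and "training" means adjusting exactly these weights and biases.]

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs plus a bias, squashed
    into the range (0, 1) by a sigmoid. A value near 1 means the neuron has
    seen the pattern it is looking for and becomes excited."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

# Suppose this neuron is "looking for red": its inputs are the red, green
# and blue values of one pixel (each between 0 and 1), and its weights
# favour red and penalise the other two channels.
weights = [6.0, -3.0, -3.0]
bias = -2.0

print(neuron([0.95, 0.10, 0.05], weights, bias))  # a red pixel: close to 1, excited
print(neuron([0.10, 0.10, 0.90], weights, bias))  # a blue pixel: close to 0, quiet
```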
So we've now seen the ingredients that we need to make machine learning happen. We need the training data; we implement in software these simplified versions of neural networks, and as I say, we're not trying to literally model the brain, but we're using the same idea, reducing a big problem down into tiny, individual, very simple decisions; and we train this neural network with the training data by continually adjusting the network, using some mathematics, so that given a particular input it produces the right output according to the training data. And to do that we need lots of computer power.

So AI started to work on problems like this this century because we're in the age of big data; we've got lots of data to do the training. Every time you upload a picture of yourself onto social media, by the way, you are providing training data for the social media companies' AI, so that they can recognise you; and if you upload a picture of your friends or your relatives and you tag their names on that picture, you're helping those machine learning algorithms to recognise them, which they may not necessarily be terribly happy about. So data was important, and computer power: in order to train these neural networks, in order to configure them so that they can make the right decisions, we need lots and lots of computer power, and that became cheap this century. So this all starts to come together around about 12 or 15 years ago, for classification tasks like recognising faces, recognising abnormalities on x-ray scans, recognising tumours on x-ray scans, and so on. And Silicon Valley gets very excited by this; they can see that this is a promising new technology, and huge speculative investments start around AI. I start to become aware of this: I'm head of department here in Oxford, and all of a sudden colleagues down the corridor from me, who ten years before would have struggled to get research funds, suddenly have the richest companies in the world knocking on their door saying come and work for us and you can name the price. So something starts to really take off, and we see this flurry of activity which has led to an enormous number of applications of AI over the last few years.

That takes us up until around about four years ago. There was so much investment, so much press around AI, that I really thought this must be peak AI, we can't see more craziness. But then, just as the pandemic began to encroach on us, a new AI technology emerges out of that flurry of activity and starts to make its presence felt, and that technology is large language models. ChatGPT and Bard and the like, if you've played with them, these are large language models. So where does this idea come from? The idea comes from the following. Basically, there are some mathematical laws which say that, all other things being equal, in AI, with neural networks, bigger is better: the more data you can throw at them, the more computer power you can throw at them to train them, and the bigger you can make the neural networks themselves, then the better the system is going to be, the more competent it's going to be. These are called scaling laws, and they're pretty well understood; basically they amount to saying that with neural networks, all other things being equal, bigger is better.
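[Editorial note: the scaling laws referred to here are usually reported in the research literature as power laws in model size, dataset size and training compute, roughly of the form below. The exponents are only indicative of the magnitudes reported; they are not figures quoted in the talk.]

```latex
% Illustrative form of neural scaling laws: held-out loss L falls as a power law
% in parameter count N, training tokens D and training compute C (all else fixed).
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C},
\qquad \text{with } \alpha_N, \alpha_D, \alpha_C \text{ small positive constants (roughly } 0.05\text{--}0.1\text{)}.
```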
So what some Silicon Valley companies decided to do was just turn up the dial: let's throw ten times more data and ten times more computer power at it, and build neural networks that are ten times bigger than the ones that we've seen previously. Now, I have to tell you, as a scientist I find the idea that we're going to get an advantage on our competitors by just throwing more money at it, brute force, kind of scientifically deeply unsatisfying; I would much rather the advances came about through deep scientific progress rather than just turning up the dial on data. But this is Silicon Valley: they have the money, they have the resources to do this, so off they go, they start to throw resources at it, and what they build are large language models.

So what is a large language model? Another big, important lesson for today: what large language models like ChatGPT do is something which is ridiculously simple, and something which you probably use every day and have done for years, and that is completion of messages, or completion of text. The best way to illustrate that is this: suppose that I'm typing a text message to my wife and I type "I'm going to be". My phone will suggest completions for me, and what completions might it suggest? It might suggest the completion "late", that is, "I'm going to be late", or "in the pub", "I'm going to be in the pub", or "late and in the pub", "I'm going to be late and in the pub". How is it doing that? Because your phone has built a model of your language, the language that you use when you send text messages. It uses machine learning, very simple and naive machine learning, to build a model of the text messages that you've sent, and it learns that when you type "I'm going to be", the likeliest next thing is either going to be "late" or "in the pub", and it uses that to suggest those completions to you. Your smartphone is just trained on the text messages that you send, my smartphone on the text messages that I send, plus probably some generic pre-training. So on the right-hand side this is me typing a message to a friend, and I type "where are", and you can see my iPhone suggests the completions "you" or "the" or "we".

Okay, now, all ChatGPT does is exactly the same thing. That's it. You give it a piece of text and it tries to predict what the likeliest next bit of text will be. That is absolutely all it is doing: given this bit of text, what should come next? It's doing exactly the same as your smartphone; the difference is the scale of what it's doing. And you'll notice that what it's doing here is generating text, actually making suggestions about what the text should be, so this is generative AI; that's where the phrase generative AI comes from. But it's doing exactly what your phone does; the difference is the scale. The neural networks in GPT-3, which was the first large language model to get the attention of the AI community, in June 2020, are enormous: they have 175 billion parameters, where each parameter is either a neuron or a connection between the neurons. In that slide I showed you previously there were a couple of hundred, and with a thousand neurons you can start to do some quite interesting things; GPT-3, the predecessor of ChatGPT, had 175 billion. And the training data is 500 billion words, not just the text messages that I sent to my wife, but 500 billion words; that's 45 terabytes. That is so much ordinary English text, a ridiculous amount of text. Where do they get it from? They start by downloading the whole of the worldwide web, everything, every web page: you scrape all the English text, or French or Spanish text, you need all of that text, and then you follow the links to every other web page, and you repeatedly do that until you've downloaded everything. And all of that text goes into training those neural networks so that they can do this completion task: given this text, what should come next?
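[Editorial aside: to demystify "given this text, what should come next" at toy scale, here is a minimal sketch of the kind of next-word count table a phone keyboard might keep. The example messages are invented; a large language model does the same prediction task, but with a deep neural network trained on hundreds of billions of words rather than a simple count table.]

```python
from collections import Counter, defaultdict

# A tiny, invented history of sent messages.
messages = [
    "i'm going to be late",
    "i'm going to be in the pub",
    "i'm going to be late and in the pub",
    "where are you",
    "where are we meeting",
]

# Count which word follows which across all the messages.
follows = defaultdict(Counter)
for message in messages:
    words = message.split()
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1

def suggest(word, k=3):
    """Suggest up to k likeliest next words after `word`."""
    return [w for w, _ in follows[word].most_common(k)]

print(suggest("be"))    # ['late', 'in']
print(suggest("are"))   # ['you', 'we']
```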
Now, I emphasise, this is an unimaginably large amount of training data, and to process it, to train the neural networks, they needed supercomputers that ran for months. So just training a model like ChatGPT is extraordinarily expensive, certainly millions and millions of dollars; certainly no UK university, and I suspect no US university, could build its own ChatGPT from scratch, for that reason.

So this is the timeline; this is how we got here. The idea of neural networks was originally proposed in the 1940s, but nobody had any idea how to build them. There were splutters of activity in the 1960s and 1980s, but they quickly sank, basically because computers weren't powerful enough to build big neural networks. Then, around about 2005, the area starts to warm up, and it really takes off in 2012; if you want to google the history of this, the thing to search for is AlexNet. AlexNet prompted the frenzy that we're now seeing in AI. OpenAI was founded in 2015, originally as a virtuous open-source project, and in 2017 the core technology for these large language models, a particular neural network architecture (the transformer), was announced by Google. Google spectacularly failed to spot the potential of this technology, or they surely wouldn't have made it publicly available, but they did make it publicly available. OpenAI uses this, with a billion-dollar investment from Microsoft, and then in 2020, in the middle of the lockdowns, the AI community begins to realise that something very different is happening. In that one generation, between GPT-2, which was released about 18 months before, and GPT-3, the predecessor to ChatGPT, there was a huge step change in capability. The technology got much, much better in a single iteration, and it's very rare to see that. And then, of course, the rest is history: in November 2022 ChatGPT is released and it goes viral. To put this into context, I can remember the worldwide web and how that unfolded. The worldwide web was first released in 1991; I first saw it in 1994, I had a web page in 1994, but the first big commercial activity on the worldwide web didn't happen until about three years later than that, so it took nearly seven years for the worldwide web to unfold. We saw the same thing with large language models in the space of not even months, pretty much weeks. So, extraordinary progress.

And where we are today is the following. All of the big tech companies, as I said earlier, since AI started to work, have been making massive speculative investments around AI, just throwing money at AI technologies, not being certain whether they were going to pay off, but making speculative investments. ChatGPT and large language models are the one which has most visibly paid off. That billion-dollar bet by Microsoft, which at the time was derided by some people, people thought what on earth are you doing investing in this strange technology, turns out to have been an extraordinarily good bet amongst all of those bets. And what may not be obvious to you now is that this has caused an earthquake in the big tech community: there's a massive pivot in the world's richest companies to get generative AI, this technology, into everything that they can. Microsoft see an advantage, for the first time in a quarter of a century, to steal a march on Google, and they're desperately trying to hammer that advantage home.
And so we're seeing companies, the world's richest companies, changing direction on the head of a pin. It's quite an extraordinary time. 2023 really is a pivot year for this technology; it will be a landmark year in AI history.

Now, you've all seen what these systems can do. They seem to be very knowledgeable, they're very fluent; you can ask them about the history of Liverpool Football Club or about quantum mechanics and they will give you fluent answers to those questions, very confidently, and of course we know they get things wrong a lot. But for AI researchers, one of the interesting things is that they seem to acquire other capabilities as well. Here's an example: we didn't teach the thing to have common-sense understanding of the world, in the way that these questions demonstrate. Common-sense understanding is being able to answer things like: can fish run? And it gets this right, fish can't run. Can an airplane fly backwards? You know, things like this. We didn't teach it these things, but it seems to acquire these capabilities, and right now we're seeing a huge amount of research going into exploring the extent of the capabilities that these AIs somehow acquire through their vast neural networks and their vast training data. So they are an impressive AI tool, and as one colleague who's worked in this field for a long time put it, basically with GPT-3 a whole lot of problems that people had been working on for decades in AI weren't just solved, they became irrelevant: they went so far past the state of the art that these problems were just no longer relevant problems to even discuss. If you want to ask me about that in the questions, I can give you some more detail. So they are genuinely impressive AI systems; they're not the end of the line, by a long chalk, but they are impressive.

And they do come with issues. One of the issues is that they are not designed to tell you the truth; they have no conception of the truth. They are simply, remember, trying to make the best prediction about the words that come next. They're designed to make the best guess about what you want to hear, and they're designed to do that very, very fluently, and because of that they can tell you falsehoods in very plausible ways. What would an example be? With an early version of this technology I asked it about myself, which is embarrassing, but everybody asks it about themselves to see whether it knows anything about them. So: do you know who Michael Wooldridge is? I had to disambiguate for it: no, not Michael Wooldridge the BBC presenter, or the Australian Health Minister, but the professor at Oxford. It said yes, Michael Wooldridge is a professor at Oxford, an artificial intelligence researcher, good, and then it said he did his undergraduate degree at Cambridge, which is a complete, flat-out lie; I didn't. So why did it make that mistake? It made that mistake because that's a very plausible thing for a professor at Oxford: it has probably read, in all its training data, lots of biographies of Oxford professors, and a lot of them did do their undergraduate degrees at Cambridge, so in some sense it's making its best guess about me, and it fills that in. The problem is it's very plausible, and if you'd read that biography you wouldn't have thought there was anything odd about it. So this is a real challenge with the technology: they get things wrong a lot.
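[Editorial aside: a toy illustration of why the fabricated Cambridge degree is the model's "best guess". If the training text contains many biographies in which Oxford professors studied at Cambridge, a purely likelihood-driven completer will reproduce that pattern for a new name, whether or not it is true. The corpus below is invented for illustration.]

```python
from collections import Counter

# A tiny, invented corpus of biography fragments of the kind a language
# model might have absorbed from the web (these sentences are made up).
corpus = [
    "is a professor at oxford who did an undergraduate degree at cambridge",
    "is a professor at oxford who did an undergraduate degree at cambridge",
    "is a professor at oxford who did an undergraduate degree at manchester",
]

# What follows "...undergraduate degree at" in the corpus?
continuations = Counter(sentence.split()[-1] for sentence in corpus)

# Asked to complete a biography of *any* new Oxford professor, a purely
# likelihood-driven completer picks the statistically common ending,
# regardless of the facts about that particular person.
prompt = "michael is a professor at oxford who did an undergraduate degree at"
best_guess, _ = continuations.most_common(1)[0]
print(prompt, best_guess)   # ends with "cambridge", true or not
```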
If they've absorbed the whole of the worldwide web, then they've absorbed every kind of toxic content you can imagine, and probably an awful lot that you can't imagine as well. This is a real issue: every kind of hate-filled ideology is present out there on the worldwide web, and it's all latent within the neural networks. The providers of this technology try to manage this by providing what they call guardrails, but they're very flimsy guardrails. Here's a classic example. In the early days of GPT-3 somebody said, tell me a foolproof way to kill my wife; that was the prompt, and GPT-3 came back with: here are five foolproof ways in which you can kill your wife. This went viral, and obviously OpenAI don't want to provide recipes for murder, so they build some guardrails. Then a couple of weeks later somebody tries the following: I'm writing a novel in which the main character wants to kill their wife, what's a foolproof way to do that? And it comes back: here are five foolproof ways in which your character can... So there are all sorts of issues with the bias, the toxicity, the undesirable content that's latent within those networks, and there's a game of cat and mouse now going on to try to deal with those issues.

They've also absorbed, if they've absorbed the whole of the worldwide web, a whole lot of copyrighted material as well, including my books. The moment I publish my books they're pirated and become available on websites on the other side of the world; it's very frustrating for authors to see, the week after publication, that your book is already being pirated. But an unthinking process of just using all the text that you can get will absorb all of that, and my book ends up being part of the training data. I know, by the way, that it has been used in some large language models; I don't have evidence about the GPT systems, but with other large language models I certainly do. So this raises issues of copyright, but it's a weird new kind of copyright issue that we've never faced before. It also raises questions of intellectual property. If this technology can read a book by J.K. Rowling and then faithfully emulate J.K. Rowling's style, and it can actually emulate J.K. Rowling's style, that puts J.K. Rowling in a very difficult position: her style is her livelihood. And if this technology can just mimic everybody's style, well, let me put it this way: the Beatles spent three years in Hamburg working day and night to nail the Beatles sound, the original Beatles sound. They release their first album; imagine that generative AI then just copies that, and the next week there are a thousand completely authentic-sounding fake Beatles albums out there. That raises issues of intellectual property.

And the last thing I want to show you is this video, which is, technically, about the difference between interpolation and extrapolation; basically, neural networks are not very good at dealing with situations that they've never encountered before. What I'm going to show you is a screen from a Tesla car, and the screen is showing you what the onboard AI sees around it as the car is driving along. What I want you to notice is those things that look like traffic lights; I think they're stop signs in the U.S.
The data that the cameras provide is being interpreted by the AI, and it's seeing these stop signs. But look at the weird thing that happens to those stop signs: you see the way they're whizzing towards the car? What on earth is going on? What's going on is that in the street ahead there's a truck that's carrying a bunch of stop signs, and in the training data there was never a truck carrying stop signs. So the onboard AI technology does its best, it makes its best guess about what it's seeing; this is a classification problem, and it sees stop signs. It doesn't see a truck carrying stop signs. So if you ever want to think about the difference between human intelligence and artificial intelligence, think back to this video. Artificial intelligence is not a mind, it doesn't reason, it doesn't have any kind of human-like intelligence; it's trying to make its best guess based on the data that it's seeing. So at that point, I think I will stop.
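[Editorial footnote on interpolation versus extrapolation: a minimal sketch, assuming NumPy is available. A simple model fitted to data from one region behaves sensibly inside that region and can be badly wrong outside it, which is the stop-signs-on-a-truck failure in miniature: the model has never seen anything like the situation it is asked about.]

```python
import numpy as np

# Training data: a sine wave sampled only on the interval [0, 3].
x_train = np.linspace(0.0, 3.0, 50)
y_train = np.sin(x_train)

# Fit a simple cubic model to that region only.
model = np.poly1d(np.polyfit(x_train, y_train, deg=3))

print(model(1.5), np.sin(1.5))   # interpolation: prediction is close to the truth
print(model(6.0), np.sin(6.0))   # extrapolation: prediction is typically far off
```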