Today's speaker is Craig Mundie. Craig has had a very distinguished career in technology. It's too much to give the full introduction, so I'm going to give a short one. Craig is perhaps best known for becoming the head of research and technical strategy at Microsoft when Bill Gates retired. He had already worked there for a very long time, but he was at the helm of Microsoft's strategy for years and years, and we heard about him in the news all the time. He has advised no fewer than three United States presidents on technical strategy from various angles, and he had a very big influence on the telecom industry, particularly its security. At the moment, my understanding is that he is deeply involved in a few very exciting startups in different areas, one or two in biotechnology and one in machine learning, and with some institutes. It is really a great honor, Craig, to have you speak to us today. For those of you who haven't read the announcement, Craig will talk for approximately 30 minutes about artificial general intelligence, and after that we'll have a roughly 30-minute discussion where we can ask Craig some questions, which I'll try to moderate. Craig, welcome, take the stage.

Thank you, Peter. Good morning, everyone, and good afternoon there. This talk, while titled Artificial General Intelligence, has a subhead I added today: The Advent of Polymathic Machines. The reason is that as we seek to build machines that have general intelligence, I think we're going to see useful machines emerge along the way. I've taken to calling this the idea of polymaths on demand: if we have machines that are capable of learning at pretty high rates in many different areas, then when we want to pursue a specific project or area, I think we will increasingly have the ability to develop a machine that is, at one moment, an expert in many different areas, and that therefore has some ability to reason across those things in ways that humans find difficult.

So let me talk through this. A few years ago, when I first started talking about this, I created a slide that described what artificial general intelligence was. I often say to people that we stuck the word "general" in there because a lot of people had absconded with the term "artificial intelligence" to rebrand, in a sexier way, ideas that were more or less machine learning, and I think the two are qualitatively different. General intelligence, I think, will ultimately have four attributes. One is this polymathic capability; the question is, when does a narrow machine become a general machine? I don't think we really know the answer to that, but I think we're definitely going to get these polymathic capabilities along the way. Second, it's a system where you can combine both supervised and unsupervised learning, along with reinforcement learning, to master complex systems. Third, it's a new way of creating and representing knowledge that can be directly actionable. I've read some papers recently where people argue that it's important for the machine to have the ability to directly act on things, not just to build bigger and bigger models and then make predictions with them.
And fourth, I think it provides a new way to explore causation, and not just correlation. A lot of people can see correlations in classical statistics or bioanalytics, but that still doesn't lead us to understanding, and I think these machines may ultimately be a vehicle to get to that point.

So, to me, this is the path to these intelligent machines; some of these slides were given to me by my friends at OpenAI, where I spend time too. You obviously have to do some research to think through how you're going to approach the problem. Algorithms will continue to be an important part: just trying to brute-force this with bigger and bigger machines won't, in and of itself, take the algorithms we know and yield the answers we want. But once you start to build the machine and combine it with these algorithms, you find a requirement for a much more intimate flow of development. You need to build bigger and bigger machines, but we've learned that the machines we want for this work are really not the kinds of machines we've spent the last 40 or 50 years building for classical computational modeling and simulation, for example. And importantly, the way communication has to happen in these machines is quite different. As the models get larger and larger, and we try to approach capabilities that look more human-like, I think we're probably going to see an evolution toward a lot more sparsity in the representations, but that also produces some new challenges in the way you architect the machines themselves. As we develop each generation of these things, in the course of the work we're doing at OpenAI, we are trying to take human feedback at each stage in the process and factor it back in, along with the data we train on at the beginning. Of course, this has become a very expensive endeavor, and so we're trying to take the products that emerge from a pre-AGI capability at each stage, use them to solve economically interesting problems, and feed that revenue back in to allow us to build bigger and bigger machines.

One thing that has definitely become clear, at least in the work I'm closest to, is the combined work of Microsoft and OpenAI. They formed a partnership a couple of years ago that combined Microsoft's longstanding work in some aspects of this with OpenAI's work in developing these hyperscale models, and in particular they have combined their efforts toward building machines big enough to attack these problems. One of the things that's been fascinating to watch and experience over the last seven or eight years is the rate at which the computational capacity has grown: it has gone up by about 10x per year. This at a time when the classic Moore's Law improvement is, if anything, slowing down a little; it may not have gone into stasis yet, but it's definitely getting harder. The increasing capability has come largely through architectural change, not just brute-force faster circuits. So now it seems that growth in compute is more or less limited by money and your strategy for parallelization, and not strictly by the rate at which the underlying semiconductor technologies can improve. And so we had a 300,000x increase over that seven-year period.
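As a back-of-envelope check on those figures, here is a minimal arithmetic sketch; the 10x-per-year rate and the 300,000x total are the numbers quoted above, and everything else is illustrative:

```python
# Back-of-envelope arithmetic for the compute-growth figures quoted above.
# Assumption (from the talk): training compute grows ~10x per year.
import math

growth_per_year = 10.0

# Cumulative growth factor after n years of 10x/year growth:
for years in (5.0, 5.5, 7.0):
    factor = growth_per_year ** years
    print(f"{years:>4} years at 10x/year -> {factor:,.0f}x total growth")

# Conversely, the span a quoted total factor implies at that rate:
quoted_factor = 300_000
implied_years = math.log10(quoted_factor)   # log base 10, since 10x/year
print(f"{quoted_factor:,}x implies ~{implied_years:.1f} years at 10x/year")
```

Note that 300,000x corresponds to about 5.5 years of sustained 10x-per-year growth, while a full seven years would give 10,000,000x, so the quoted numbers should be read as order-of-magnitude figures rather than a consistent set.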
That's big, and it has yielded impressive results. One of the things we've been thinking about and trying to guesstimate at OpenAI is our forecast for that rate of improvement going forward. At least in the work we're doing between Microsoft and OpenAI, we can see a path continuing through 2020 and 2021, and now, as we enter 2022, we believe we will sustain that 10x-per-year increase at least through about 2025.

Now, what's interesting is that we've come up with estimates of how complex a human brain is, in the sense of how long it would take to train one. What I found interesting is that if you say a human brain gets relatively fully trained in, say, 15 years in the normal world, and you do the calculations based on the rate at which we're training things with the machines we've had, it looks like the crossover point to train a human brain, if we trained it the way we train our machines, probably happened last year, in 2021. That is, if you had a way to provide the same stimulus a human gets over roughly 15 years to the machines we've currently built, and you let them grind away for that entire period, they probably would have achieved roughly the same level of training as a human does in the first 15 years of life. What gets more interesting is to then forecast, as training time declines with algorithmic and machine improvements at, again, about 10x per year, that as the machines get out to about 1,000 exaflops in 2025, we think it might take less than two weeks to train a, quote, adult human brain.

To me, this is a very profound thing, because this is where the ability to create polymaths on demand comes from, in my mind. I don't have to believe this thing is generally intelligent. I don't have to believe it learns or knows everything a human knows. But if I can take an adult-human-scale brain and, every two weeks, or maybe double that time, start from nothing and build it up to a level where it has deep expertise in a number of related fields, does that yield the ability to create specialty machines that could do things no human, or even no group of humans, can do? That's what got me going down this path to thinking that, in a relatively short period of time, you need to start thinking differently about the role of researchers and their relationship to these machines in the pursuit of our current investigations and discoveries.

Many of you have probably seen this, but let me show a few examples of how these machines are starting to learn things in what you can only assume is really conceptual learning. We saw this in the early work with GPT-3, which was trained just on a lot of text from the web. Along the way, we realized that even with the relatively small amount of multilingual information it was trained on, and without any corpus of works presented in multiple languages from which it could learn one language's relationship to another, the machine had learned several languages. You could ask it questions in one language and it would respond in a different language. So in some sense it started to look like a young child living in a multilingual home: it doesn't know one language from the next, it just knows different ways of expressing the same concept.
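Picking up the training-time forecast above, a hedged sketch of that arithmetic follows. The 15-year figure, the 2021 crossover, and the 10x-per-year rate are from the talk; the assumption that wall-clock training time shrinks in direct proportion to throughput is mine, and, as the last lines show, it actually over-predicts relative to the talk's two-week figure:

```python
# Hedged sketch of the training-time compression described in the talk.
# My simplifying assumption: wall-clock training time scales inversely
# with effective training throughput, which grows ~10x per year.
import math

human_training_years = 15.0    # "relatively fully trained in, say, 15 years"
crossover_year = 2021          # parity point claimed in the talk

for year in (2022, 2023, 2024, 2025):
    speedup = 10.0 ** (year - crossover_year)
    days = human_training_years * 365 / speedup
    print(f"{year}: ~{days:,.2f} days under naive 1/throughput scaling")

# The talk's "less than two weeks" by 2025 implies a much smaller speedup:
implied = human_training_years * 365 / 14
print(f"two weeks implies ~{implied:,.0f}x over the 2021 baseline, "
      f"i.e. ~{math.log10(implied):.1f} years' worth at 10x/year")
```

The gap between the naive projection (hours by 2025) and the quoted two weeks presumably reflects the practical limits of parallelizing a single training run, which is consistent with the earlier remark that growth is bounded by money and parallelization strategy.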
Then, as we started to move into what we think of as multi-domain capabilities, where we provide images and even low-rate video, you started to see that you could stimulate the models, as they got larger and larger, with just prompts. For example, here the prompt given was: an illustration of a baby daikon radish in a tutu walking a dog. Now, we're pretty certain it never saw any of these images in its training set, but it knew what each individual item was. It ended up anthropomorphizing the radish, created the dog, put a leash on it, and drew all these different illustrations in a relatively short period of time. Similarly, here's another one: an armchair in the shape of an avocado, not something one would have expected it to be trained to draw. You could tell it to draw the same cat on the top as a sketch on the bottom, with different sketching techniques. A painting of a capybara sitting in a field at sunrise. And of course you can also say what style of painting you want the thing to use, and it will do that as well. More and more, as these things get pushed forward, we find the ability to take a conceptual description in text and have it generate, in this case, an image with specific attributes of something that didn't exist in nature, so to speak, and certainly wasn't part of what it was trained on.

That then leads to the ability to have it make other translations, if you will. Once the API was put out there, somebody took it and said: all right, just give me an image of a nutrition label. From the label it was able to find the text; from the text it was able to understand what the product was; and then, having been trained on things related to health, to goodness versus badness in diet, it produces a little emoticon to tell you whether this thing is good for you or not. And 100% of the input, after the training, is just a picture of a label. So you begin to realize there's a lot of different information in there, and the thing has to be able to process it in many different ways.

One of the more interesting things last year was a collaboration we embarked on with GitHub at Microsoft to create this thing now called Copilot, which is an AI pair programmer. Many people use pair programming as a technique to accelerate writing code, to reduce errors, and to develop good ideas for algorithms. So we took an IDE, a development environment, and integrated this Copilot thing. In this case, we took the basic text models and added a new level to the base training, which included billions of lines of programs from the public repositories of GitHub. In doing that, much as happened when we showed it many different texts from the internet and it broadly learned different human languages, here, after it was exposed to billions of lines of code, it learned 12 programming languages, and it seems to be quite facile at expressing algorithms in different languages. So in this pair-programming environment, it now literally writes programs. A programmer can sit there and, instead of talking to a human, talk in English to the machine and describe what they're trying to do. The system gins up samples of code, in different algorithms and different languages, and shows them to the programmer.
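To give a flavor of the interaction being described, here is a hypothetical exchange of the kind such a tool supports: the programmer states the intent in English, as a docstring, and the system proposes a candidate implementation. Both the prompt and the completion below are invented for illustration and are not actual Copilot output:

```python
# Programmer's side: intent stated in plain English as a docstring.
def moving_average(values, window):
    """Return the averages of each consecutive `window`-sized slice of
    `values`, e.g. moving_average([1, 2, 3, 4], 2) -> [1.5, 2.5, 3.5]."""
    # Machine's side: a plausible suggested completion follows.
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

print(moving_average([1, 2, 3, 4], 2))   # [1.5, 2.5, 3.5]
```

The loop described next is exactly this: the suggestion appears inline, and the programmer decides what to do with it.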
The programmer can accept a suggestion, or edit it themselves to improve it and make comments about what they liked and didn't like about that particular approach, and the machine will literally do another one. So now you've got a real-time, iterative code generator. Right now these things are at the subroutine level, but it seems pretty clear that this capability will be able to write de novo programs, probably on the order of a thousand lines of code, within a year, at the rate progress is increasing.

Each of these things, to me, when I stand far back, while not being an expert in any one of them, shows the same pattern: we start with, I'll say, a basic brain; we train it up to a fairly high scale; and then we start educating it at the margin about other fields or other domains. So it seems clear to me that if we organize our work around this idea, the machine can start to at least help us with, if not ultimately draw conclusions about, high-dimensional problems that currently defy human understanding.

One of the things that got me thinking a lot about this happened roughly four years ago, when my wife was diagnosed with cancer. This is actually a time course of her plasma proteins. I met Peter through Larry Gold, and I've been working with Larry and the people at SomaLogic on this high-scale proteomic capability; that was before I really got very involved in the AI work. I became fascinated by thinking of the proteome as a direct proxy for the state of all the organs in the human system. When my wife came down with cancer, we agreed to do an N-of-1 experiment on her: every time she was treated in any way, we took a blood sample, and the SomaLogic people processed it into a proteome. This was in the day when they were doing roughly 5,000 proteins per analysis. My wife and I had also been part of a program at the Institute for Systems Biology where we were trying to study humans in as many different biological dimensions or characterizations as possible. The one that was missing at the time was proteins; we didn't have a good proteomic analysis, but we had biobanked the samples.

So look at the top row of this chart; let me explain what it is. This chart is essentially a self-organizing map of the 5,000 proteins, where each pixel is roughly 20 proteins. They were self-organized using tools developed for people studying genomics: because the genome was so big and complicated, people were trying to figure out what relates to what, and in a sense it's a very difficult problem. The way to think about this, as those of you in physics and other fields will recognize, is that we've believed for many, many years that the best way to help humans understand these high-dimensional problems is through visualization, and we've spent lots of time on graphs and all kinds of fancy ways of visualizing things. But visualizations like this are really just dimensionality reducers, so that the human has some hope of ingesting a large amount of information through its only high-bandwidth mechanism, the vision system. Unlike the machine, though, the human can't remember the details of any of that information.
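The self-organizing map on the slide is one such dimensionality reducer. As a hedged sketch of the general idea, here is the same kind of reduction done with PCA instead of a self-organizing map, and with random numbers standing in for the real 5,000-protein panel:

```python
# Minimal sketch of dimensionality reduction for a proteome time course.
# Stand-ins: random data replaces the real ~5,000-protein panel, and PCA
# replaces the self-organizing map, but it plays the same role: compressing
# thousands of correlated measurements into something a human can look at.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n_draws, n_proteins = 40, 5000          # e.g. 40 blood draws x 5,000 proteins
X = rng.normal(size=(n_draws, n_proteins))

deltas = X - X[0]                       # express each draw as a delta from baseline

pca = PCA(n_components=2)
coords = pca.fit_transform(deltas)      # 40 x 2: one plottable point per draw
print(coords.shape, pca.explained_variance_ratio_)
```

The projection exists purely for the human; the machine is free to work with all 5,000 coordinates of every sample at once, which is the asymmetry the next passage turns on.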
It only deals with it in abstraction. And with this problem, in this case my wife's total systems biology proxied by 5,000 proteins, the machine is quite happy to look at every single protein and remember every single value of every feature in the time course, but a human stares at this and still doesn't know what to do. So the people at the Institute for Systems Biology were suddenly inundated with this proteomic data, which dwarfed the amount of information we had collected on my wife and all the rest of us using every other known measurement system available to us. Later on, we did a broader characterization and discovered that we had about 3,000 parameters we could understand about humans from all the non-proteomic sources; when we added the 5,000 proteins, we had a little over 30,000 different things that showed up and that we could correlate. So here was essentially a tidal wave of information, and despite perhaps being the best people in the world at systems biology, they looked at this data and never could see patterns in it that led them to understand what happened.

The last step in that process, and the one where we've been stuck up to this point, was this one, where we used this dimensionality-reducing visualization to cluster these things. Interestingly, the first three blocks show a pattern that's clearly mostly blue. These were all done as deltas, so the first one is just her baseline. Then you can see that in April 2016 something dramatic changed. Now, it wasn't until the end of 2017 that she was actually diagnosed with cancer, and it seems pretty clear, though we can't really prove it, that the cancer really started back between the 2014 sample and the 2016 sample. We can see the interventions: when they did the surgery, there was a momentary trend back toward blue, and then it rebounded again, and of course we've now seen her cancer metastasize. So at this level you can see a correlation between what really happened and this kind of visualization. But it made me believe that there are so many dimensions to these problems, especially in biology, but I think in many, many fields of science, that humans, despite their best efforts, are throwing a lot of the data on the ground in order to cope with our inherent weaknesses in ingesting and understanding it.

And that led me to this question: what if a hyperscale AI model could learn to predict your next proteome? If you did that, you'd say: OK, I'll start with a big model, pre-train it, and give it lots of basic medical and biological information. I could add the genome, individual metadata, biological constraints, et cetera, and use previous proteomes, and maybe an immunome assay, to give it a molecular model of your history and your current state. Then you prime it, give it an input of the environmental or behavioral evolution you want to consider, and have it predict the next proteome. If it were possible to predict this complex proteomic state from that information, then I would argue that to do it, the model must in some sense understand your biology. And if it does, then it gives us a new way to deal with biological complexity: it may give us the ability to probe the model, even when it understands something we don't, and to start asking it questions.
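Nothing like this exists yet, but the shape of what's being proposed can be written down. Here is a sketch of the interface such a predictor might expose, with every name and structure below invented for illustration:

```python
# Hypothetical interface for the proteome-prediction idea in the talk.
# Everything here is invented for illustration; no such model exists.
from dataclasses import dataclass, field

@dataclass
class MolecularHistory:
    genome: str                       # identifier for a sequenced genome
    metadata: dict                    # age, sex, clinical notes, constraints
    proteomes: list = field(default_factory=list)  # prior ~5,000-protein panels
    immunome: list = field(default_factory=list)   # optional immune assay

class ProteomePredictor:
    """Stub for a hyperscale model pre-trained on medical and biological
    information and primed with one person's molecular history."""

    def __init__(self, history: MolecularHistory):
        self.history = history

    def predict_next(self, intervention: dict) -> list:
        """Given a proposed environmental, behavioral, or therapeutic
        change, return the predicted next proteome panel."""
        raise NotImplementedError("this is the research program, not a library")

    def query(self, question: str) -> str:
        """Incrementally probe whatever the model 'understands'."""
        raise NotImplementedError
```

The one structural point of the sketch is that prediction (predict_next) and interrogation (query) are separable capabilities; the talk's argument is that the first may arrive well before we know how to get much out of the second.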
And it might give us answers, at least in part. If you just said, explain it to me, the machine might dump so much information on you that you wouldn't be able to understand it anyway. But we might be able to incrementally develop more understanding by querying it. And if you can predict with it, then many of the things we do in medicine today, for example clinical trials or drug exploration, could perhaps be done using the model instead, and it might allow us to explore interventions that would give us a new way of treating these things.

The more I thought about this, and the more I got involved in a number of other areas after working with the OpenAI people and the Microsoft people, the more I came to the conclusion that these AGIs, even if we only build the polymathic capabilities in the next few years, are going to give us really powerful ways of exploring different fields of science and engineering. So the question to people is: are we really ready? Are we prepared to accept a different role as scientists? In some sense, if what we're really doing now is raising polymathic machine prodigies, it'll take a village, just as it does to raise your kids. And if the machines learn the way kids do, just a lot faster, how do we do a good job as parents, such that when the machine grows up it's able to beneficially help us answer many of these difficult questions?

I've now started to see interesting paths to use this, for example, in materials science; I'm involved in a company doing a lot of work in that area, and it's obviously a very high-dimensional problem. And as Peter mentioned, you had a presentation from the DeepMind folks on the AlphaFold work, which I think is another great example. One thing that's been interesting as an observer is that in these hyperscale AI efforts, whether it's the work of DeepMind or OpenAI, there is a qualitative change in what you can do as the machines and the improved algorithms get really large. It's clearly not a linear progression of capability. Right now, very few places in the world have had the luxury of operating with machines and models as big as those two in particular, so many of the people who ask, is this really possible, could this really happen, are largely basing that on experience with machines that are many orders of magnitude smaller than the ones these bleeding-edge capabilities are being explored on. And that poses an interesting question: how do we deal with the cost of building these capabilities and providing access? If we don't find a way to deal with that, it's going to be a bit problematic. But let me stop here and open things up to a broader discussion.

Craig, thank you very much for this wonderful short lecture. The idea now is to lead in with a few questions from a few people I invited to be, let's say, on the panel: Daniela Bortoletto, who is the head of high energy physics; Chris Lintott, who is a professor of astrophysics; and Andre Lukas, who represents theoretical physics. I think we should start with a question that Daniela shared with us a few hours earlier, because I think it's a very, very interesting one. Daniela, over to you.

Well, actually, the question came from one of my students.
We were discussing what to bring here. So the question is: a primary goal of science is to find explanations for observed phenomena that generalize well enough to be predictive. If near-term AGI brings superhuman capabilities to scientific model building and data analysis, what will the consequences be for scientists' ability to extract these explanations? In other words, what do science and science communication look like in a world where the answers to scientific questions are so complex that no human can understand them? Is there a need for more discussion within the community about strategies to align a scientific AGI so that its reasoning respects certain quality standards? And is this a useful goal on the way to the much harder program of general AGI alignment?

I'm going to start with the last part of that and then back into the first part. One of the big issues people have had, even before we got to the AGI question, is whether these models are black boxes: I give it a bunch of stuff, it gives me an answer, and I'm totally unsatisfied because I don't know how it got the answer. That leads me to question whether it's a good answer, whether it's a correct answer, whether it has biases based on the training, et cetera. One of the things I've concluded myself is that the only way I can see to make the AGIs, I'll say, well-behaved, relative to all the questions you're asking, is to use another AGI in a supervisory capacity, and then, as we build these things, to require that whatever system we're building, if you could in some sense regulate it this way, always has this intimate coupling of the supervisory AGI with the domain-specific AGI.

The thing that got me thinking about this is, again, the incredible complexity you have to reason about, even in trying to settle questions of safety or efficacy or mores and values. That's a tough, tough problem. Even among humans we don't have a nice uniform specification for all those things, and if you're building machines that have to function correctly in a multicultural environment, there isn't even necessarily one good answer to some of those questions. It's an evolution of a problem we had at Microsoft over many years: we started out trying to build nominally one kind of product, and we delivered it in 192 countries, each version in 39 different languages, and it had to comport with the laws and regulations of each of those countries. We spent a lot of time trying to find canonical representations of some of these things, and that was an incredibly narrowly defined subset of the problem.

So more recently I've said to people: we need a complete academic program of research to figure out how to train the supervisory AGI, and then we need, I'll say, a well-informed engineering discipline for what would be deemed a sufficient coupling of the supervisory AGI into the system that's doing the reasoning or controlling some system, whether it's your self-driving car or other things. When people talk about autonomous vehicles, for example, they always pose these philosophical questions: the self-driving car is going down the street, a kid runs out from the right side at the same time somebody else is coming from the left side; which person does the car kill, and how does it decide?
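Architecturally, the coupling being proposed can be pictured as a mandatory gate: the domain model proposes, the supervisory model screens, and nothing reaches the outside world without passing both. A minimal sketch of that structure, with all names invented and both "AGIs" reduced to stubs:

```python
# Minimal sketch of the supervisory coupling described above. All names are
# invented and both models are stubs; the one structural point is that the
# domain model's output cannot reach the actuator except via the supervisor.
from typing import Callable

class SupervisedSystem:
    def __init__(self,
                 domain_model: Callable[[str], str],
                 supervisor: Callable[[str, str], bool]):
        self._domain = domain_model          # the domain-specific AGI
        self._supervisor = supervisor        # the separately trained values/laws AGI

    def act(self, situation: str) -> str:
        proposal = self._domain(situation)
        # The supervisor judges the proposal; it can veto but not be bypassed.
        if not self._supervisor(situation, proposal):
            return "VETOED: fall back to the safe default and escalate"
        return proposal

# Toy stand-ins for the two models:
domain = lambda situation: f"proposed action for {situation!r}"
supervisor = lambda situation, proposal: "harm" not in proposal  # placeholder policy
print(SupervisedSystem(domain, supervisor).act("obstacle ahead"))
```

Whether such a gate can actually be made non-bypassable, and what the supervisor should be trained on, is precisely the research program and engineering discipline argued for above; the sketch only fixes the topology.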
Humans don't have great uniform answers for that, and yet we still want to hold the machine to a standard better than the human. I think what we really have to do is recognize that if we want human values, as well as laws, to be uniformly represented in the way these systems behave or learn or answer questions, then we have to see the emergence of a discipline that's going to try to figure this out. I'm not much of a neuroscientist either, but humans have this part of the brain, the amygdala, which controls the fight-or-flight response. In a sense, that's a supervisory override in our own brains: it says that in certain circumstances you're going to do things you would otherwise never rationally think you could do or would do, like run out in front of a car to save your kid. If you look, you find that we have woven into our own intellectual fabric these special supervisory capabilities that intersect the normal autonomic functions and everything else. Part of the problem right now is that we have not started to try to parse what we want and find independent representations of each of these things in models that are equally capable to the ones we're training for other purposes. In each of these areas, that's what I'm now starting to advocate for.

Henry Kissinger is a good friend of mine, and four years ago, at age 94, Henry got interested in this based on a conversation he had with me and Eric Schmidt, and he just recently wrote a book with Eric Schmidt and Dan Huttenlocker about these kinds of challenges, more in the abstract philosophical sense. Henry, who is a historian and a philosopher, and I talked quite a bit about this point. He asks: where are all the philosophers? You're building all these computers, they're really getting powerful, and they're changing the world; but in the Renaissance there was a philosophical basis for changes like these. Where are those people now? I think that's a fair question, and one you should put to your colleagues at Oxford: how do philosophy and all the related subjects intersect this problem? The idea that it can be handled in a happenstance way, that the people training the cars will make some localized decision about how to make that life-and-death choice, is wrong. What we really need is a focused effort to take the same machine capabilities and train them on these broader questions of law, mores, values, goodness. And to some extent, my own personal view is that ultimately we're going to see some kind of hybridization between humans as we are and these machines; I don't see any other ultimate path, unless humans just throw in the towel and decide to give up. So I think we need a real research effort to use the same machines in a different way, and then to figure out how to intimately couple them together.

I think that's an excellent answer. There is actually an effort in Oxford to look at ethics and other aspects of AI along those lines. But are they actually doing it with the idea that their job is now to train the prodigy machine in those dimensions?

I don't think it's exactly that.
I have not found anybody yet in those fields who has thought: oh my God, my job is now to become the parent of a machine that's going to become a philosopher. They're really asking themselves these, I'll say, more general questions, which they would then like to project onto the engineers who have to build each individual system. And I think that's going to fail.

Yes. So are you emphasizing the orthogonality of the two problems? Jeff Dean said something along similar lines; he said the models have to be ethical, for example, for safety reasons and so on. But he didn't propose what you did, which is to introduce a secondary system to extract these other aspects out of the problem.

I'm just saying that if you leave it to each one, it's sort of like saying: hey, all programmers should write perfect programs. How did we do? Look at the cybersecurity challenges. I spent 20 years of my career at Microsoft trying to figure out how we were going to write safer software, and there, safety was just a question of keeping it from being hacked; it had nothing to do with correctness. So I think if we expect the union of all the models ever trained by individual people, on any of your problems, to suddenly be correct, it will go the same way the cybersecurity problem has gone: a bigger mess every day. We have to head this one off at the pass and say: no, there's a whole discipline spanning computer science and philosophy and law, and they all have to be projected into something we're willing to ratify as representative of our value set, which we then want all other machines to comport with. Then it becomes an engineering problem of ensuring that one can't bypass the other.

We should indeed have brought all the other departments to this talk. Let me find a question from Andre or Chris.

Sorry, I can go ahead if you want. I'd like to ask something that's perhaps related to what Daniela was talking about, in a slightly different direction. It has to do with the relation between experiment and theory in physics. Traditionally, this is a path that's been pursued by humans, and by a lot of intuition: physics theories have traditionally been developed from experimental data by humans having the right intuition about what kind of theory might lie behind it. In the 20th century, and in the 21st, this has in some sense become more difficult in fundamental physics; I hope Daniela agrees with that. The relationship between theory and experiment has become very indirect, in the sense that going from a theory to the experimental information is sometimes a very difficult process involving a huge amount of mathematics. So the top-down direction is very difficult, and that means the bottom-up direction, trying to infer the right kind of theory from experimental information, is even more difficult, because the inverse path is already very complicated. And that is related to the space of possible theories being a very, very large space. I know this, and I have tried to use methods, for example in the context of reinforcement learning, to address some of it, and that has not been entirely unsuccessful. But I'm wondering what your perspective would be on applying AGI to this kind of problem in science.

I definitely think it's going to be applicable. I've got a few examples.
I mentioned one in materials science, which I think is a good example, and we saw the DeepMind AlphaFold example; you could say those are not physics problems, but what's been interesting to watch is that each time someone with access to these high-scale models has allowed the machine to explore the space unbiased by human, quote, knowledge, the machine has rapidly produced a better answer. That was true in the AlphaGo work, where AlphaGo Zero completely smoked AlphaGo, which had been trained on 3,000 years of the best human play, and then cleaned it out. I was with someone recently who was talking about the deCODE project in Iceland, where they have all the genomes and have been studying them for a long time without machine learning. The man running the project said recently that what they realized produced the best results the quickest was not exposing the machine to any of the things the humans thought were the answer, and just letting it explore the space itself; it came up with better answers, quicker. In the materials science space we're seeing this now, in this case in inorganic materials: we train it, and we think the machine can then very quickly predict structures and ratios of atoms, in significant numbers, to produce materials with a specific property, and then perhaps we only finally have to refine them.

Another area that I think is similar: I started Microsoft's program in quantum computing about 17 years ago, and unlike virtually every other effort, we decided to invent a new kind of qubit based on topological physics. It's taken us 16 years to sort through this, and we've had really good results lately. I would say the thing that created those results was that we stopped taking the intuitions of the experimental physicists, which had guided us for many years and never would converge, and just focused on building incredibly complicated models. Then we let the models explore the high-dimensional space using week-long runs on 60,000 machines, and in months we were able to produce better results at the physics level than we had produced in 16 years guided by all the world's best work in topological physics.

At some point you just have to recognize what we keep bumping into, over and over again, in biology, in materials, in topological physics: everywhere, we can now produce data at such a high rate that the human can't digest it, and if we don't know what the model is, we don't know what to do with it, and you're stuck with the problem you described. So I definitely think so, because I've now seen it in each of these areas: you get a surprising result when you resign yourself to the fact that our job is to give the machine the best possible training, not in what we think the answer is, but in the raw data, and then let it explore the space. I say to people that humans have these few problems: our heads can't get any bigger, we're not going to get any more circuits, we're slow, we're carbon-based, so we've all got a 30-hertz clock instead of a three-gigahertz clock, we don't have any good memory, and we have really crummy I/O.
So other than that, we're great as a computer. The big machine has none of those problems: it can remember everything to the last digit, and it runs many orders of magnitude faster, so its ability to explore these high-dimensional spaces is just so much richer than what humans can do. I think we really have to start turning our attention to how to use these machines, in some combination of modeling and simulation, to get a handle on these high-dimensional problems.

Can I follow up on this?

Yes, of course.

I think I want to say something in defense of humans here.

That's a natural reaction.

I appreciate everything you said about exploring large spaces, and about the capabilities of AlphaGo Zero, for example, and how quickly it exceeded the previous programs and the human players. But a lot of physical theories have something else. They're not just based on exploring large numbers of possibilities and picking the one that matches the data best. They have been, at least traditionally, based on ideas of beauty and simplicity and natural mathematical structures. I don't know what role that will play in the future, but it has clearly been a very powerful traditional guiding principle.

I agree with that. It's a bit like the math question: what about math and theory?

Well, theoretical physics is a lot of mathematics, and the way physics describes nature is through mathematics, so mathematical structures come into play automatically. My question really is: what kind of role would these principles play here? You have emphasized the exploratory capabilities of these machines, but there are other features that are important.

The thing is, I'm in no way convinced that the machine can't ultimately deal with the maths too. The kid is only a few months old right now, in terms of its learning, but it's already starting to do grade-school math, in conceptual terms, and we really haven't focused that much on trying to teach it math. This is an area the OpenAI people have been working on, and I think in the next couple of years particularly... It's hard: humans have a tough time understanding anything that grows exponentially, and when the exponential growth is 10x per year, it's really hard to accept that something that seemed impossible last year is suddenly here, and that you have to think of it as 1,000 times better a few years out. It isn't at all clear to me that these things won't be able to do that kind of conceptual reasoning too. That may be farther down the road; the reason I've focused on this polymathic idea now is that it's pretty clear these machines will win on the exploratory element. Whether it gets there in the pure theory sense you're describing, in terms of mathematical representations or beauty, I don't know. But it's interesting.
There are people now... I read a paper the other day about someone using the API on GPT-3 who is now commercially making artwork. He has mastered the way he describes, in English, the features he wants in a painting, and the machine then creates, de novo, a painting, and the painting is deemed beautiful enough to be sold as commercial art. So is that beauty or not? Somehow the machine learned what humans think is beautiful, and it can generate it on demand.

So there's an amazing amount of hope here, indeed, that this will end well. We are actually very keen to get Chris's question, maybe from an astrophysics angle, into the discussion. Is it all right to interject that here, Craig?

Oh, sure. We can go on in another meeting.

Thanks. I've been listening to this wonderful vision of the future, which I'm very much looking forward to. But I'm by nature a cynic and an observer, so I require quite a lot of convincing. Take for granted that I think lots of the uses to which AI has been put are great; the protein example you showed is a brilliant one, and the exploration of higher-dimensional data sets to find unusual things is something we're working on ourselves. So take for granted that all of that is great. I struggle to join you in jumping from that to AGI for science. And you said something in your talk that made me want to ask whether this is the difference. When you talked about GPT-3 and the radish illustration it generated, you said we can only assume that what the network is doing is, in some sense, conceptual, which I took to mean that it's thinking in some way. Can you say more about what you meant by that? Because I think it's a crucial step in your argument that there's something more there than pattern recognition, something that would let you predict you're going to go on to general intelligence and this childlike intelligence you talked of so warmly.

With the conceptual comment, I wasn't trying to imply at that point that the thing was thinking; though I don't even know whether we know what thinking means. What strikes me, and the same was true in human language: if you grow up a native speaker of multiple languages, in a multilingual household, you don't think in one language and then translate, you just have different ways of saying the same thing. I only speak English, but I have synonyms; the linguistic equivalent of a synonym is an alternative language. What I was struck by, watching each of these fields, is that the machine comes up with some representation in which it doesn't distinguish between text in multiple human languages and images, in its ability to accept input or generate output. Right now you can describe any picture you want in English text and the thing will draw it. So to me it's more than a simple mapping. That's why the daikon radish matters: no one ever showed it a picture of a radish in a tutu walking a dog, so it couldn't just dredge one out of its memory and bend it around a little. It actually had to understand the concept of the radish, what anthropomorphization meant, what walking the dog meant, what dressing a particular way meant.
So I don't know that I'd call that thinking yet, but it's clearly not information retrieval in the classic computing sense.

Yeah, okay. It's not: go to point B in memory and retrieve whatever's there. I agree with that.

And so the question is: how does the human do it? At this, I'll say, fairly finite level of capability, and this is why I talk about it as if it were your kid: can you prove it isn't doing it the way you did when you were two years old? And that's why I ask: when the machine gets 10x, 10x, 10x bigger over the next three years, and my ability to train it to the same level is compressed in time, does that in itself represent a qualitative change in how we should think about exploiting these capabilities, even if they have not yet bordered on anything we might consider general intelligence?

Sorry, Peter, yes, go ahead.

In this direction, one thought that keeps going through my mind is that in science there is a very disciplined form of thinking. For example, people learn geometry, and they learn algebra, and then they're ready to learn algebraic geometry. These compositional ideas about knowledge and derivations are, by the way, sometimes promoted for AI, and sometimes they are deliberately held at bay: let's not tell it too much about how to do it, it might figure it out. Will we ultimately be able to get back the bits and pieces that make up the AGI's mind? Because for scientists the explanation might be more interesting than the answer. In society generally it may be different; people might want the answers and perhaps not care so much exactly how it works. But the internals of these mechanisms are, I think, of great interest to us.

A lot of effort is going into figuring out how you can interrogate the model, ex post facto, to try to, quote-unquote, understand it. The thing I struggle with myself is this: say the thing gets a good answer in some high-dimensional space. Take the materials science problem. I say: hey, I'd like some handy-dandy new material for making lightweight airplanes, something that weighs nothing and is stronger than steel. The thing grinds around and says: here's the formula; out of this huge high-dimensional space, try that one. If I then go and say, okay, tell me why it's that one, I don't know that, even if it could in some piecewise way tell me every aspect it chose, I would understand it at the end. That's a difficult thing to grapple with for us, who have been at the top of the food chain, thinking-wise, for a long time. But I think you have to allow for it. And so I just don't know.

Sorry to interrupt. I was just going to say that when we teach our students, or when we talk to people about science, one of the skills we try to teach is communication, because you don't have much impact in science if you can't communicate your thoughts, whether to peers or anyone else.
So one piece that seems to be missing when I talk about AGI for science is that if we want to change physics, to change the fundamental theories of the subject rather than just apply them, maybe we need an AGI that learns how to communicate about its discoveries with its peers, or with us.

If I can expand on that: I was also thinking about what is going to change in the way the university teaches any subject if you bring this upon us. You can see already that there has been a switch; for example, a lot of the developments are now happening in industry, where you can do this kind of research and where these facilities exist more than in a university setting. We can see, even at CERN, that we are losing brilliant people to the likes of Microsoft or Google. So this is maybe going back to the university rather than to the communication of science, but how are we going to teach if the model is changing so much?

Well, I mentioned what I think is a big problem, and I've watched it before in the high-performance computing space: as the biggest machines for classic modeling became more expensive, the university systems basically were too poor to own their own, so we came up with a few national machines and tried to timeshare them, and so on. I think one of the things we have to reckon with as a society is not only that the nature of science is going to change, but whether it can be done at all without access to these capabilities. It's a bit like saying: hey, you're studying biology, but you don't get a microscope; we never had microscopes, just keep staring at that petri dish and tell me what you see. What's happened with the role of computation writ large is that we got the warning sign with the HPC problem and never resolved it institutionally, and now you're watching a qualitative change in the role of computation as it relates to all fields of scientific endeavor, and you're even farther behind the power curve. As far as I know, the machines that the Microsoft-OpenAI and Google-DeepMind pairs have dwarf every other machine anywhere in the world, including those owned by governments or anybody else. I just think this is a fundamental issue. Unless we're willing to acknowledge the role of computation, not merely as the next little tool... This is why I think such a mistake is being made when people take all the intellects, like the people on this call, and say: hey, go work on AI. And then: okay, what do I use for computing? Well, get a GPU, you've got one in your laptop, or there's a little server under your desk or down the hall. These things are just toys. They are many decimal orders of magnitude less capable than the machines being used to push the state of the art in the AI field itself.
So then that begs the question: even if we start giving you these pre-trained models, and all you have to do is train them at the margin, how are we going to set that up? To put it in perspective, a single training run of a hyperscale model at one of these companies costs upwards of $10 million, just in operating costs, and that assumes you already bought the hardware, which costs billions. There's just such a disparity now between what will define the future and how society is equipping its intellectual enterprise that, unless something changes, you're going to watch the best people go where the action is. It used to be that many people would go to the university because they liked the intellectual environment. But if it turns out that a handful of people at DeepMind are going to basically smoke everyone in the field of protein folding after 30 years of their best effort, you can't even say you're going to be leading your field in the science. So I think this is a huge issue.

It may be a huge issue, but the protein-folding example is actually a beautiful counterexample to one aspect of what you say, namely that it was done with a very, very humble set of computers that might even be owned by our university rather easily.

Yeah, I think that's very exceptional.

But there is still good work we can do without asking the hyperscalers to do it for us, I think.

That's true, but you don't know how much the Google people had to learn by brute force about how to do this before it could be reduced to something done on the smaller machines. And I think this is another huge question: where is the computer science department in this? There's the architecture question; you asked me about the power question: how naturally are the machines architected to do these things at lower power or higher performance? What's the next generation of physically compacting them? Should we be running them cold? There's an endless set of opportunities once you start to think: no, I have to organize myself to optimize for getting these tools. And lots can be done; inferencing, for one, is a lot simpler than training. That's what's happening now: OpenAI putting an API on GPT-3, for example, was radical, in that we had these hyperscale models that were sort of like: here's your kid at age 15, now go teach them some new subject and ask them what you want. That marginal training is way cheaper than training the base model; you had to grow the kid to age 15 first, and then you could send them off to learn a specific subject. This kind of partitioning of the work has to be well developed now. But everywhere I look, I don't see any systematic program in any of the major research enterprises organizing itself around this concept.

I think that is an extremely worthwhile observation for us to reflect on, and I want to thank you very much for it. Very, very important.
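The "kid at age 15" economics Craig sketches can be illustrated numerically: nearly all of the parameters sit in a frozen, pre-trained base, and marginal training touches only a small task-specific layer. A toy illustration follows; the base size echoes GPT-3's widely publicized 175 billion parameters, while the head size and the cost reasoning in the comments are assumptions:

```python
# Toy illustration of why marginal training is cheap relative to pre-training.
# The 175B base echoes GPT-3's publicized size; the head size is invented.
base_params = 175_000_000_000    # frozen pre-trained base: the "kid at age 15"
head_params = 5_000_000          # small layer trained at the margin for a new task

trainable_fraction = head_params / (base_params + head_params)
print(f"marginal training touches {trainable_fraction:.5%} of all parameters")

# If training cost scales roughly with (trainable parameters x data seen),
# a marginal run over a modest task dataset sits orders of magnitude below
# the ~$10M quoted for a full base-model run, which is what makes a shared,
# API-fronted base model economically viable.
```

This partitioning, one expensive shared base and many cheap specializations, is exactly the division of labor the talk says no research enterprise has yet organized itself around.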
We need to wrap up in a few minutes, but I wanted to pick, out of order, one question from the chat, because I think it's such a beautiful question. Val, would you perhaps like to ask it yourself?

Oh, hello. Thank you, and thank you, Craig, for a wonderful talk. I had the opportunity to attend a talk by Google a few years back in Spain on a somewhat similar topic, and at the time the speaker suggested that he would start teaching his young kids things different from the normal curriculum at school. So what I was wondering is: what are your thoughts on where to focus the education of young kids today, so that it best complements the world that is coming, since they will have to deal with AI in day-to-day life? And what things do you think machines will never be able to do, so that we could focus the education of the younger generations on those areas? Thank you.

Well, I was with you right up until the last clause, which asked whether we should focus on training kids in the things machines will never do: I just don't know what the machines will never do. On the other hand, I'm 72 now. When I was at Georgia Tech doing my undergraduate work, there was no computer science school; there was a library science school. So in one man's lifetime I've watched computing go from small machines that I had to boot through the switches on the front panel to the things we're talking about today. So the question is how machines will evolve over these young children's lifetimes; they will go through an experience similar to mine, several macro-generational changes in what the computational capabilities are. I look at my grandkids these days, who at ages five and six are completely facile users of multiple computer systems; for them it's just like breathing or talking. And yet I know many people my age and others who still struggle to do email. So I think the most important thing is to start teaching kids how to, I'll say, work with machines. That's the real qualitative difference that's emerging now. Up to this point we've treated computers the same way we treat other instruments, whether in scientific exploration or in business automation: they were there to help us get insight, to show us something so that we could understand, make the decision, refine the theory. Now we have to ask whether the machine is really going to be much more of a partner, like the AI pair programmer. Ultimately, in certain fields... take any of the fields you are experts in: if I sat with Chris or Daniela, you would all look like super-intelligent machines in your domains compared to anything I know in those fields, so I, in a sense, have to trust you, or learn from you.
And so the question is: can we get the kids to accept the idea of partnership with these increasingly intelligent machines, not to be afraid of it, and to go into it thinking: hey, this is a partnership. Just as I have a partnership with teachers or parents, this machine is my partner in learning, not just because it brings me web pages, but because it's something I can talk to, interact with, and get ideas from. In general terms, the more we can get young people comfortable with that idea, the more easily they'll make the transition to a world where it's only through that partnership that the biggest problems will ultimately get solved. And I think that is a qualitative change.

Thank you very much.

You're welcome. Well, thanks, everybody, for spending time with me this morning, or this afternoon, as it may be. It's always fun; I'd love to have more of these conversations, and they could go on for hours. And I learned a lot from the questions you asked; I appreciate that.

Well, we would like to thank you. It was a fantastic talk, with messages at a very different level from what we normally get: many of our talks are very technical, and also super interesting, but this one looked at humanity and the machines as a whole, and I think we've all greatly enjoyed it. Thank you very, very much.

Okay. Have a good day. Bye-bye.

Thank you. Bye-bye, everybody.