Welcome. My name is Ben Bennett. I am the Director of HPC Strategic Programs here at Hewlett Packard Enterprise. It is my great pleasure and honour to be talking to Professor Mark Parsons from the Edinburgh Parallel Computing Centre. We're going to talk a little about exascale and what it means, less about the technology and more about the science, the requirements and the need for exascale, rather than a deep dive into the enabling technologies. Mark, welcome.

Hi Ben, thanks very much for inviting me to talk.

So I'd like to kick off with, I suppose, quite an interesting look back. You and I are both of a certain age, 25 plus, and we've seen these milestones, I suppose the SI milestones, of high performance computing come and go: a gigaflop back in 1987, a teraflop in 1997, a petaflop in 2008. But we seem to be taking longer to get to an exaflop. So I'd like your thoughts: why is an exaflop taking so long?

So I think that's a very interesting question, because I started my career in parallel computing in 1989. EPCC was set up in 1990, so we're 30 years old this year, and the fastest computer we had then was 800 megaflops, just under a gigaflop. So in my career, by the time we reached the petascale, we'd already gone pretty much a million times faster. And the step from a teraflop to a petaflop-scale system really didn't feel particularly difficult. And yet the step from a petascale system to an exaflop is a really, really big challenge. I think it's actually related to what's happened with computer processors over the last decade, where individually a processor core, like the one in your laptop, hasn't got much faster. We've just got more of them, so there's a perception of more speed, but actually it's been delivered by more cores. And that happens in the supercomputing world as well. In 2010, I think we had systems of a few thousand cores. Our main national service in the UK for the last eight years has had a hundred and eighteen thousand cores. But looking at the exascale, we're looking at four or five million cores. Taming that level of parallelism is the real challenge, and that's why it's taking an enormous amount of time to deliver these systems. And not just on the hardware front; vendors like HPE have to deliver world-beating technology, and it's hard. But there's also the challenge to the users: how do they get their codes to work in the face of that much parallelism?

If you look at the complexities of delivering an exaflop: you could have bought an exaflop three or four years ago, but you couldn't have housed it, you couldn't have powered it, you couldn't have afforded it and you couldn't have programmed it. But you could have bought one.

We should have been so lucky as to be able to supply it.

The software, I think, from our standpoint, is where we're doing more enabling with our customers. You sell them a machine, and the need for collaboration then seems to centre more and more on the software. So it's going to be relatively easy to get one exaflop using Linpack, but that's not exascale. So what do you think an exascale machine versus an exaflop machine means to people like yourself, to your users, the scientists and industry?
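As a brief editorial aside before Professor Parsons' answer: the core counts he mentions above follow from simple arithmetic. The sketch below shows why a peak exaflop implies parallelism in the millions of cores; the clock rate and flops-per-cycle figures are illustrative assumptions only, not figures from the interview or from any vendor.

```python
# Back-of-the-envelope: why does a peak exaflop imply millions of cores?
# All numbers below are illustrative assumptions, not vendor figures.

PEAK_FLOPS = 1.0e18      # one exaflop: 10^18 64-bit operations per second

clock_hz = 2.0e9         # assumed 2 GHz core clock
flops_per_cycle = 32     # assumed wide-vector core: 2 FMA pipes x 8 doubles x 2 ops

flops_per_core = clock_hz * flops_per_cycle   # 64 GF/s per core under these assumptions
cores_needed = PEAK_FLOPS / flops_per_core

print(f"{flops_per_core / 1e9:.0f} GF/s per core -> "
      f"{cores_needed / 1e6:.1f} million cores for a 1 EF peak")
# With these assumptions: ~15.6 million cores. Accelerators with far more
# arithmetic throughput per device reduce the count, but "millions of
# threads of execution" is the right order of magnitude either way.
```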
What is an exaflop versus an exascale? So I think supercomputing moves forward by setting itself challenges. And when you look at all of the exascale programs worldwide that are trying to deliver systems that can do an exaflop or more, it's actually a very arbitrary challenge. You know, we set ourselves a petascale challenge of delivering a petaflop, and somebody managed that. But the world moves forward by setting itself challenges. I think we also use quite an arbitrary definition of what we mean by an exaflop. In your world and mine, we first of all see a flop as a computation: it's a multiply or it's an add, whatever. And we tend to look at that as using very high precision numbers, 64-bit numbers. We then say, well, for an exaflop, you've got to do a billion billion of those calculations every second. And that is, at the end of the day, an arbitrary target. You know, today, from HPE, I can buy a system that will do a billion billion calculations per second. And it will either do that as a theoretical peak, which is almost unattainable, or using benchmarks that stress the system and demonstrate a real exaflop. But those benchmarks themselves are tuned to do just those calculations and deliver an exaflop in quite a sterile way, if you like. So we've kind of set ourselves this big challenge, the big fence on the racecourse, which we're clambering over. But the challenge in itself should actually be much more interesting: what are we going to use these machines for, having built them? So getting into the exascale era is not so much about doing an exaflop. It's a new generation of capability that allows us to do better scientific and industrial research. And that's the interesting bit in this whole story.

I would tend to agree with you. I think the focus around exascale is to look at new technologies, new ways of doing things, new ways of looking at data, and to get new results. So eventually you will get yourself an exascale machine. One hopes sooner rather than later.

Well, I'm sure you'll have to sell me one then.

It's got nothing to do with me. I can't sell you anything, Mark. But there are people outside the door over there who would love to sell you one.

Yes.

However, if you look at your exascale machine, how do you believe the workloads are going to be different on an exascale machine versus your current petascale machine?

So I think there's always a slight conceit when you buy a new national supercomputer. And that conceit is that you're buying a capability system, and that many people run on the whole system. Now, in truth, we do have people that run on the whole of our ARCHER system today; that's 118,000 cores. But I would say the people that run on over, say, half of that can be counted on a single hand in a year. And they're doing very specific things; it's very costly simulation they're running. So if you look at these systems today, two things stand out. One is that it's very difficult to get time on them: there are baroque application procedures, all of the requirements have to be assessed by your peers, and you're given quite limited amounts of time in which to do your science. And people tend to run their applications in a sweet spot where their application delivers the best performance.
And, you know, we've tried to push our users over time to use reasonably sized jobs. I think our average job size is about 20,000 cores nowadays, which is not bad. But that does mean that as we move to the exascale, two things have to happen. One is that I think we've got to be more relaxed about giving people access to the system. Let's give more people access, let people play, let people try out ideas they've never tried out before. I think that will lead to a lot more innovation in computational science. But at the same time, I think we also need to be less precious. We need to accept that these systems will have a variety of sizes of job on them. We're still going to have people that want to run on 4 million cores or 2 million cores. That's absolutely fine, and I absolutely salute those people for attempting the really, really difficult. But then we're going to have a huge spectrum of use, all the way down to people that want to run on 500 cores or whatever. So I think we need to broaden the user base on an exascale system. And I know this is what is happening, for example, in Japan with the new Japanese system.

So Mark, if you cast your mind back to almost exactly a year ago, after the HPC User Forum, you were interviewed for Primeur Magazine, and in that article you alluded to the needs of scientific and industrial users requiring an exaflop, an exascale machine. It's clear from your previous answer regarding the workloads that some would say the majority of people would be happier with, say, ten 100-petaflop machines: democratisation, more people with access. But can you give us examples of the type of science, and the needs of industrial users, that actually do require those resources to be put together as one exascale machine?

So I think it's a very interesting area. At the end of the day, these systems are bought because they are capability systems. And I absolutely take the argument of why shouldn't we buy ten 100-petaflop systems instead. But there are a number of scientific areas, even today, that would benefit from an exascale system. These are the sort of scientific areas that will use as much access to a system, as much time and as much of the scale of the system, as you can give them. An immediate example would be people doing quantum chromodynamics calculations, theoretical particle physics calculations. They would just use whatever you give them. But I think one of the most interesting areas is actually the engineering space, where many people worry that engineering applications over the last decade haven't really kept up with the sort of supercomputers that we have. I'm leading a project called ASiMoV, funded by EPSRC in the UK and jointly funded by Rolls-Royce, working with the universities of Cambridge, Oxford, Bristol and Warwick. We're trying to do a whole gas turbine engine simulation for the first time. That means looking at the structure of the gas turbine, the aeroplane engine, and how it's all bolted together; the fluid dynamics of the air and the hot gases that flow through it; the combustion in the engine and how fuel is sprayed into the combustion chamber; the electrics around it; and the way the engine deforms as it heats up and cools down, and all of that. Now, Rolls-Royce has wanted to do that for 20 years.
Whenever they certify a new engine, it has to go through a number of physical tests, and every time they do one of those tests it can cost them as much as $25 to $30 million. So these are very expensive tests, particularly when they do what's called a blade-off test, which simulates a blade failure and has to prove that the engine contains the fragments of the blade. It's a really important test, and all engines have to pass it. What we want to do is use an exascale computer to properly model a blade-off test for the first time, so that in future some of these tests can be done virtually, rather than having to spend all of the money that Rolls-Royce would normally spend. It's a fascinating project, and a really hard project to do. One of the things I do is serve as Deputy Chair of the Gordon Bell Prize this year, and I've really enjoyed doing that. It's one of the major prizes in our area, announced at the Supercomputing conference every year, so I have the pleasure of reading all the submissions. This is my third year on the committee, and what's been really interesting is the way that big systems like Summit, for example, in the US have pushed the user communities to try to do simulations nobody has done before. We've seen this as well with papers coming out of the first use of the Fugaku system in Japan, for example. These are very, very broad: earthquake simulation, large eddy simulations of boats, a number of things around genome-wide association studies, for example. So the use of these computers spans a vast area of computational science. And I think the really, really important thing about these systems is that they're challenging people to do calculations they've never done before. That's what's important.

Okay, thank you. You talked about challenges. I nearly said "when you and I had lots of hair", but that's probably much more true of me. Let's talk about grand challenges. We talked, especially around the teraflop era, about the ASCI Red program driving the grand challenges of science. Possibly to hide the fact that it was a bomb-designing computer. But they talked about the grand challenges then, and we don't seem to talk about that much now. We talk about exascale, we talk about data. Where are the grand challenges that you see an exascale computer helping us with?

So I think grand challenges didn't go away. Just the phrase went out of fashion.

Much like my hair.

I think it's interesting. I do feel that science moves forward by setting itself grand challenges, and it always has done. My original background is in particle physics. I was very lucky to spend four years at CERN working in the early stages of the LEP accelerator when it first came online. And the scientists then, I think they had worked on LEP for 15 years before I came in and did my little PhD on it. I think that way of organizing science hasn't changed; we just talk less about grand challenges. What I've seen over the last few years is a renaissance in computational science, looking at things that people have previously said were impossible. So a couple of years ago, for example, one of the key Gordon Bell Prize papers was on genome-wide association studies on Summit. It may have been one of the winners, if I remember rightly.
And that was really interesting because, first of all, genome-wide association studies had gone out of favour in the bioinformatics and biosciences community, because people thought they weren't possible to compute. But that particular paper showed that, yes, you could do these really, really big combinatorial problems in a reasonable amount of time if you had a big enough computer. And one of the things I've felt all the way through my career, actually, is that we've probably discarded more simulations because they were impossible at the time than we've actually decided to do. I sometimes think we need to challenge ourselves by looking at the things we've discarded in the past and saying, oh, look, we can actually do that now. And I think part of the challenge of bringing an exascale service to life is to get people to think about what they would use it for. That's a key thing. Otherwise, I always say, a computer that is unused should just be turned off. There's no point in having an underutilized supercomputer. Everybody loses from that.

So let's bring ourselves slightly more up to date. We're in the middle of a global pandemic. And one of the things in our industry that I've been particularly proud about is that I've seen the vendors, all the vendors, offering up machines and making resources available for people to fight this current disease. How do you see supercomputers, now and in the future, speeding up things like vaccine discovery and helping doctors generally?

So I think you're quite right that the supercomputing community around the world did a really good job of responding to COVID-19. Speaking for the UK, we put in place a rapid access program, so anybody who wanted to do COVID research on the various national services we have, down to the Tier 2 services, could get really quick access. And that has worked really well. In the UK, I mean, ARCHER is an old system now, as you know; we didn't have the world's largest supercomputer, but it has happily been running lots of COVID-19 simulations, largely for the biomedical community, looking mostly at drug modelling and molecular modelling. And that's just the UK. In the US, they've been doing really large combinatorial parameter-search problems on Summit, for example, looking to see whether or not old drugs could be repurposed against this new problem. So I think, in some respects, COVID-19 has been, and this sounds wrong, but it's actually been good for supercomputing, in as much as it has demonstrated to governments that supercomputers are an important part of any scientifically active country's research infrastructure.

So I'll finish up and tap into your inner geek. There are a lot of technologies being bandied around currently to enable the first exascale machine, wherever that's going to be and from whomever. What are the current or emerging technologies that you're interested in, excited about, and looking forward to getting your hands on?

So in the business case that I've written for the UK's exascale computer, I actually characterize this as a choice between the American model and the Japanese model. In America, they've very much gone down the CPU-plus-GPUs route. So you might have an Intel Xeon or an AMD processor, or an Arm processor for that matter, and you might have two to four GPUs.
And I think the most interesting thing that I've seen is definitely this move to a single address space, so that the data you have will be accessible by both the GPUs and the CPU. I think that's really been one of the key things that has held back the uptake of GPUs to date, and that one single change is going to make things very, very interesting. But I'm not entirely convinced by the CPU-plus-GPU model, because I think it's very difficult to get all the performance out of a GPU. It'll do well in HPL, for example, the High Performance Linpack benchmark we were discussing at the beginning of this interview. But in real scientific workloads, you still find it difficult to get all the performance that is promised. So the Japanese approach, which is the CPU-only approach, I think is very attractive, in as much as they're using very high bandwidth memory and a very interesting processor, which they've co-developed over a 10-year period. And this is one of the things that people don't realize: the Japanese program and the American exascale program have each been working for 10 years on these systems. Now, I think the Japanese processor is really interesting because, when you look at the performance, it really does work for their scientific workloads. That interests me a lot: this combination of a processor designed to do good science, high bandwidth memory, and a real understanding of how data flows around a supercomputer. Those are the things that are exciting me at the moment. Obviously there are new networking technologies too, and in the fullness of time, not necessarily for the first systems but over the next decade, we're going to see much more activity around silicon photonics. I think that's really, really fascinating. In some respects, the last decade has been quite incremental in its improvements, and I think where supercomputing is going at the moment, we're at a very, very disruptive moment. And again, that goes back to the start of this discussion. Why has exascale been so difficult to get to? Actually, because it's a disruptive moment in technology.

Professor Parsons, thank you very much for your time and your insights.

Thank you. Pleasure.

And folks, thank you for watching. I hope you've learned something, or at least enjoyed it. With that, I would ask you to stay safe. And goodbye.
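An editorial endnote on the peak-versus-sustained distinction Professor Parsons raises above (HPL doing well where real workloads struggle): the minimal sketch below contrasts a compute-bound dense matrix multiply, the kernel at the heart of the High Performance Linpack benchmark, with a memory-bound vector update that is closer to many real scientific codes. The problem sizes and the achieved figures are machine-dependent illustrations, not benchmark results.

```python
# Illustrative contrast between a compute-bound kernel (dense matrix
# multiply, the core of the HPL benchmark) and a memory-bound kernel
# (a vector update, closer to many real scientific codes).
import time
import numpy as np

n = 2000
a = np.random.rand(n, n)
b = np.random.rand(n, n)

t0 = time.perf_counter()
c = a @ b                     # roughly 2*n^3 floating-point operations
t1 = time.perf_counter()
gflops_dense = 2 * n**3 / (t1 - t0) / 1e9

m = 20_000_000
x = np.random.rand(m)
y = np.random.rand(m)

t0 = time.perf_counter()
x += 3.0 * y                  # 2 flops per element, but 3 memory accesses
t1 = time.perf_counter()
gflops_stream = 2 * m / (t1 - t0) / 1e9

# On most machines the dense multiply achieves a large fraction of peak,
# while the memory-bound update achieves only a few percent of it.
print(f"dense matmul : {gflops_dense:8.1f} GF/s")
print(f"vector update: {gflops_stream:8.1f} GF/s")
```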