So we have the supercomputing conference here in Denver, and who are you? My name is Heidi Poxon and I'm a member of the programming environments group at Cray. And right now I'm showing you the ARM system, which you can scale up depending on your application. So we're introducing our new support for ARM, the Cavium ThunderX2 processor, on our XC line. And we're talking a little bit about the software stack that we have available, so if you have any questions specific to it... So it's got a whole bunch of Cavium ThunderX2, that is, eight? So basically what we have is dual-socket nodes. So we have two sockets per node and there are four nodes on a blade. And then they're connected to our custom Aries interconnect. And they plug into our current Cray XC50 system. So this is your fast networking system? Very fast. Yeah, this is our Aries interconnect, and that's used today on the Intel-processor XC50 and is now introduced with the Cavium ThunderX2. And what does a node mean? A node? Because there's a lot of talk about nodes in servers and supercomputers, right? Yeah, so basically a node is two CPUs, and the node is the grouping. It's typically running an instance of Linux, or an instance of the operating system. It has a number of cores, and in HPC you typically run either MPI or OpenMP jobs within a node, and you run MPI jobs across nodes. So you're able to have two CPUs in one node? Right. So you run one Linux with two CPUs? Right. And each of them has 48, or how many cores? Yeah, it depends on the configuration. Some of the processors we support have 32 cores, some have 48; it depends. And this is ARM. This is a big deal, right? It's something different. It's not Intel. So what are you doing to make it work? So it's not the x86 ISA. It's a different ISA. It's actually the ARMv8 ISA, with what ARM refers to as NEON. And what we've done is we've built a software stack to support the processor.
So we have a custom Cray compiler that is designed to work on ARM. We have a set of scientific libraries like BLAS and ScaLAPACK, FFTW, that are designed to work on ARM. And we have performance profiling tools. And we have some debug support tools as well. So it sounds like a whole bunch of software. Cray is not just doing hardware, right? Right. And one of the things is that a lot of people sell commodity hardware these days. And so one thing that's very important to Cray is how to extract the most performance from the system and allow users to scale to larger jobs than anywhere else. To larger jobs? Yes. So performance at scale is very important to us. For example, our profiling tools and our MPI implementation can run at very large job sizes, over 200,000 MPI ranks, for example. So we want to be able to run at scale. And we want to be able to extract the most performance for the user application. So maybe if you can stand just on the other side here. This looks beautiful and awesome. How soon is this shipping? So the current plan is to make this available in spring of 2018. Spring? Yes. Right now the plan is April of 2018. April 2018? Who's going to be a customer for this? Anybody who's planning to make a giant supercomputer? Well, I'm not the right person to ask about a giant supercomputer. I don't actually know that. But I do know that we did have some information from Simon McIntosh-Smith, who's done some work on our early access. And they want to launch 160 nodes. So maybe more significant. Is that what they're testing right now? Right. He's here right now, actually. Something big. Yeah, it would be better to talk directly with Simon. Alright, thanks a lot. Hi, so who are you? Hello, I'm Simon McIntosh-Smith. I'm the PI of the Isambard project, which is the first production ARM-based supercomputer. Isambard, where has that name come from?
So there's a famous Victorian engineer in the UK, a guy called Isambard Kingdom Brunel. He's a fascinating person. You should look him up. And in the 1800s, he designed lots of fantastic technology in the UK. He designed and built Paddington train station, the whole rail network in the south of the UK. He was designing bridges, tunnels, ships. So he was the innovator of his age. Like a Da Vinci guy? Very much a Da Vinci kind of guy. So we've named this system in honour of him. And is there some stuff on this wall? Where do you say stuff? There's a slide going around. So you're working with Cray, getting the supercomputer to work. You're at Bristol University, right? Yeah, I'm at Bristol University. The Isambard project is a collaboration of four universities. It's Bristol, Bath, Exeter and Cardiff. Those are four universities in the south of the UK. They're all quite close. All connected by the railways that Isambard built. So that's one of the other nice connections. And we're also partnering up with the UK Met Office. So this is the organisation that does the weather forecasting for the UK. They have a big supercomputer and they have a nice big data centre. And they've actually offered to host the Isambard machine for us. And they're partnering on the project. And they're actually trying the machine themselves as well. So if this works out, the weather reports are going to be more accurate? Actually, I think what we're looking at is, could we use ARM-based supercomputers for running real weather forecasting, and do it at a rate that we've never seen before. So it's really looking at whether this might offer advantages in terms of performance, whether it's performance per dollar, all sorts of things. That's what we're really looking at. And you gave a talk at the ARM user group meeting, and you said you've been working on it for two years. Why is that? That's right. So we started the Isambard project about two years ago.
So I go back to the Mont-Blanc project, which is a European project, one of the early ones exploring whether ARM was a viable option for high performance computing. And during the Mont-Blanc project, it looked like, yes, this was possible. But that was very much a prototype kind of situation. So we thought what we really want to do is build a real production machine to see if that works and find out what any of the issues are. So we had that idea about two years ago. And it's taken two years of working with Cray, working with Cavium, and sorting out all the funding to actually make this possible. Working with ARM? Working with ARM as well. ARM have been super helpful, helping make a lot of this happen. They're also part of the project. They've even given us some funding to help do some of the software side of the work that's required. So ARM's been a brilliant partner in the project as well. How about working with Linaro? I think we're not working directly with them, but I think we're benefiting from a lot of the work they've done. So a lot of the software stack we're using is stuff that those guys have already done. And so lots of things are just working. So that's quite nice. And that might be their ambition, that their stuff is just under the hood. You don't even need to know it's there. It just works. And that's been our experience so far. So you've had early access to this Cray stuff, but do you have some beautiful ones like this already? No, we don't have any of these. So this will be what we have for real with our production machine, which we'll get in about May next year. But so far we've just had early access nodes. They're basically white boxes. But the ones we've got have come from Cray. So we have the Cray software stack on those white boxes as well as all of the open source software. So we have three different compilers we can use, which is really useful. But we're mostly focusing on the node-level work because of that.
So going multi-node is not so interesting with just white boxes, because you've just got 10 gigabit Ethernet. When we get a whole bunch of these we're going to have 160 nodes, with four nodes in here, so we'll have 40 of these. It's basically a whole cabinet as well. So I don't know if you can see the cabinet just like this. Like this? Just walk around. You're going to have a nice big one like this kind of? Yeah, it's a whole bunch of these. Do you have the key? I don't think so. Can we open it? Hey, oh yeah, sure. So it will have something beautiful like that. It's going to be very much like that. So it'll be a whole one of these. It'll be 40 blades. Every blade has four nodes, so about 160 nodes. So that will be Isambard. So we'll have about 10,000 cores, 10,000 ThunderX2 cores in that one box. So that's considered a pretty cool supercomputer. What do you think? That would be very cool. 10,000 cores was a very deliberate size, because for the UK national supercomputer, ARCHER, most of the jobs we run are up to 10,000 cores. So by having 10,000 cores we can do some really nice comparisons with the ARCHER national supercomputer. ARCHER national supercomputer, that's one of the good ones in the UK? What is that one? Yeah, in the UK that's the main science machine that we use for all of our chemistry simulations, weather simulations, things like that. So it's a 118,000-core x86 Cray XC30, actually, that's in Edinburgh. And that's the main machine that we use in the UK when we're doing our science. So how is this going to compare with one of those? So this will be like a small chunk of ARCHER, but like I say, the way we use ARCHER, we don't really have people use the whole machine at one time; they just use up to 10,000 cores of it. So what we'll have with Isambard is something that lets us replicate that situation: where somebody might have up to 10,000 cores of ARCHER, they could have up to 10,000 cores of Isambard and see how that compares to ARCHER.
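The sizing described here (40 blades, four nodes per blade, two sockets per node) can be checked with a little arithmetic. The short Python sketch below is an editorial illustration, not something shown in the interview. It assumes the 32-core ThunderX2 configuration, which is one of the two core counts mentioned earlier and the one that lands near the quoted "about 10,000 cores"; the constant and function names are made up for this example.

```python
# Figures quoted in the interview.
BLADES = 40
NODES_PER_BLADE = 4
SOCKETS_PER_NODE = 2
ARCHER_CORES = 118_000

# Assumption: the 32-core part. The interview only says "some are 32, some are 48".
CORES_PER_SOCKET = 32


def total_nodes(blades=BLADES, nodes_per_blade=NODES_PER_BLADE):
    """Number of nodes in the cabinet."""
    return blades * nodes_per_blade


def total_cores(cores_per_socket=CORES_PER_SOCKET):
    """Number of cores across all nodes for a given cores-per-socket choice."""
    return total_nodes() * SOCKETS_PER_NODE * cores_per_socket


if __name__ == "__main__":
    print("nodes:", total_nodes())
    print("cores (32 per socket):", total_cores())
    print("cores (48 per socket):", total_cores(48))
    print("share of ARCHER: {:.1%}".format(total_cores() / ARCHER_CORES))
```

Under the 32-core assumption the cabinet comes out at 10,240 cores, which fits the "up to 10,000 cores" job size used for the ARCHER comparison; the 48-core configuration would give noticeably more than that.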
So do you have a bunch of students working with you, and are they really happy to work on this kind of stuff? Are they computer science students? Yeah, sure, so I have a whole team of guys, from PhD students to postdocs, working on this, but the GW4 project actually has people across the whole four universities, so we have researchers, we have professional software developers, we have sysadmins, all working to make this a reality. So let's just, one more second, over here. So would you consider it successful, what you've been doing thus far, and how much more is there to be done? Sure. When we dreamt up the project two years ago it was at the time a little bit crazy, and probably quite risky, because there was nothing anywhere near this yet, but it felt like the trajectory was going in the right direction, and that's why it felt like a risk but a worthwhile risk. And where we've got to now: as soon as we started seeing what ThunderX2 should be like, the specs sounded good, and that was building our confidence. And then when we started to get some remote access back in the summer, just around ISC time, we started to get some very early numbers on some remote hardware, and that started to look good, and at that point we started to get excited. And then a few weeks ago we got our early access nodes, so this is the first hardware we've had in-house, and that's worked really, really well. So we've actually done almost all the early evaluation we wanted to in just two weeks, which is fantastic. So things are looking really good, we're in pretty good shape. When we get the real machine, our goal, our ambition, is to try and go into a production live service as soon as we can over the summer, and just open the machine up for scientists to run on it and just get science done, and so far that's looking very promising. You had a hackathon already? The first hackathon already, on this kind of early release hardware, with our eight early access nodes. They all sign NDAs or?
This was all under NDA because it hadn't been launched. The next one we do, now it's all public, won't need to be under NDA, but the last one was all under NDA, and we've been sat on the numbers. Actually, we were still tweaking the numbers up to Sunday, but we are so excited to be able to tell people at last what's happening, and it's looking very promising. What do the numbers look like? Basically it's better than the Intel stuff, or what? What did they say? The ThunderX2 processors have got more memory channels than most of the other processors shipping today, so for codes that are very memory-bandwidth bound you get really good performance, and it beats the best from everyone else at the moment, which is really cool. If you have more compute-bound codes then it's a much closer-run thing, because the current ARM processors only have 128-bit-wide vectors, whereas other CPUs that are shipping have 256-bit or 512-bit-wide vectors, so there are some codes that really do benefit from that. So there are some cases where we wouldn't expect to win, but for anything that's primarily memory-bandwidth bound, which is many of our real scientific codes, these things are going to be the ones we'd expect to have the greatest performance, which is great. So that's great for the UK, ARM, even though now it's Japanese-owned, but it's great for Europe to have something exciting, and all these students coming out of this project, and supercomputers, the market is exploding also, and it's all related to servers too, and eventually to phones. It feels like a very interesting time, and I think it's great to have this diversity. Hopefully that should be good for the health and the vibrancy of the whole ecosystem. Competition is always great; everyone generally responds well to competition, everyone ups their game, the rate of innovation increases, the cost effectiveness improves, so it's really great to see this happening. I'm really thrilled. And you were talking about 10,000
cores, right? 10,000 cores, yeah. But would it be cool if some of your team and maybe somebody else says, okay, let's just build something to beat the Chinese and be number one, and it should be ARM powered? Does that make any sense? At least one of the Chinese exascale machines has already been announced as being ARM based, so some of the Chinese exascale machines will be ARM based. The Japanese RIKEN exascale machine from Fujitsu is ARM based. I'm sure there will be ARM-based big machines in the US, and clearly in Europe we're very interested in this as well, so I think ARM will show up all over the world, including China. And all these people want to know your numbers and hear what you've been doing, and they're all talking with you right now. You're very busy at this conference, right? Yeah, we're running around like mad things. It's been great, we've kind of been mobbed by people with interest. That's why it's great that the numbers are all online. If you go to the goingarm.com website, which was for the user group yesterday, goingarm.com, it has all the talks from yesterday and the slides for my talk. If you look for my name, Simon McIntosh-Smith, we've got early results for Isambard, all the numbers. It has all the specs of the benchmarks, the specs of the hardware we were using, so we're public at last. And I have a video of your speech. You did that, okay, good, you have that too. Cool, thanks for having me. Good.
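The remarks in the interview about vector width and memory-bandwidth-bound codes can be made concrete with a small calculation. The Python sketch below is an editorial illustration only: the function names are made up, the roofline-style bound is a standard textbook model swapped in for this example rather than anything the speakers describe, and the peak and bandwidth figures in the demo are placeholder values, not measurements from Isambard or any real processor.

```python
def lanes_per_vector(vector_bits, element_bits=64):
    """How many elements (default: 64-bit doubles) fit in one vector register."""
    return vector_bits // element_bits


def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Roofline-style bound: limited by compute peak or by memory traffic."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)


if __name__ == "__main__":
    # Vector widths quoted in the interview: 128-bit (ARM) versus 256/512-bit.
    for bits in (128, 256, 512):
        print(bits, "bit vector holds", lanes_per_vector(bits), "doubles")

    # Placeholder machine numbers, chosen only to show the two regimes.
    peak, bandwidth = 1000.0, 300.0
    print("low intensity :", attainable_gflops(peak, bandwidth, 0.125))
    print("high intensity:", attainable_gflops(peak, bandwidth, 16.0))
```

At low arithmetic intensity the bound is set entirely by the bandwidth term, which is why more memory channels help such codes regardless of vector width; at high intensity the compute peak takes over, and that is where wider vectors matter.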