All right, let's get going. Thanks everybody for coming. Great turnout. So I'm up here with Jin and Wojciech Turek, from Monash and Cambridge. We're going to talk about how we've approached Lustre integration for our OpenStack HPC environments. The way we're doing this talk is Wojciech's going first. He'll talk about how things are done at Cambridge and some of the benchmark results they've got with their setup. Then we'll do the same for Monash, and at the end we'll come together, talk about the challenges we faced doing this, and take questions. OK, over to you.

Hello, my name is Wojciech Turek. I work at the University of Cambridge, where I'm the team leader for research computing platforms. I'll start by telling you a little bit about what we're doing in Cambridge. I'm in the Research Computing Services division, which is part of the main University Information Services department. We have a broad remit to provide research computing to the whole university. We have over 700 active users from 42 university departments, and our systems are 80% utilised all the time. Our focus is on providing cutting-edge research computing facilities to our researchers and supporting them, but we also do a lot of industrial outreach: we work with industrial partners on projects with Jaguar Land Rover, Rolls-Royce, and many others, and we work with vendors on developing solutions for research computing.

I'll talk a little bit about our research computing infrastructure. Two years ago we built a new data centre. It was a very big project: four data halls, high availability, very high resiliency. We've got 180 cabinets in total in the data centre, three megawatts of power, 24/7 staffing, and very high security standards. The research computing part is, as you may guess, taking most of the data centre; we're using 100 racks at the moment, and we keep growing. We have two megawatts of power available. We're not using that much yet, but we're quickly consuming more and more as our infrastructure grows. And we're using very new technology to cool the racks. We have some racks drawing 35 kilowatts, so we're using cooling doors and evaporative cooling, and we're managing to get a very good PUE of 1.15.

Currently we have a number of research computing platforms. We've got a 600-node compute cluster called Darwin. It was installed in 2012, and at that time it took 93rd place in the Top500 list of the biggest supercomputers in the world. We've also got a 128-node GPU cluster, which has 256 NVIDIA K20 GPUs. We've got a total of five petabytes of storage currently, provided as Lustre parallel file systems, and 20 petabytes of tape for cold storage. We also recently built a new platform for our new project, the biomedical cloud, where we have 200 nodes. This is a hybrid infrastructure, which I'll be talking about a bit later; it has HPC, OpenStack, data analytics elements, and storage. We also have a large Hadoop system with 200 nodes, and OpenStack. So we're providing research computing services to the entire university, with a very wide range of scientific disciplines using our systems. We've got astronomers; we're involved in the SKA project. You might have heard my boss talking about the biomedical cloud during the keynote. And we also have departments from physics and engineering with large projects requiring very large amounts of compute power and storage.
For most of these, we're using a high-performance parallel file system, and we're using Lustre. Probably not all of you know what Lustre is, so I'll give you a quick introduction. Lustre is a high-performance, highly scalable parallel file system. It consists of a server side and a client side. The server side is made up of object storage servers (OSSs), which have object storage targets (OSTs) that store the actual data, plus a metadata element with metadata servers (MDSs) and metadata targets (MDTs). The beauty of Lustre is that it can be scaled up as needed by just adding more object storage servers for more capacity, or more metadata servers and storage if we need more metadata performance. It's presented as a single namespace, can grow to tens of petabytes, and can provide hundreds of gigabytes per second. The file system supports multiple network types, so it works with interconnects like Omni-Path, InfiniBand, RoCE, or just standard Ethernet. So it's very, very flexible.

One of the projects that we did in 2016 was building a biomedical cloud. This was a project done with the School of Medicine and the School of Biology at the University of Cambridge, and the requirement was to build a new computational platform to give them better facilities to analyse their data. We ended up building a hybrid infrastructure with HPC components, a Hadoop cluster, and OpenStack, with a range of storage backends: a large three-petabyte Lustre file system, 1.2 petabytes of NAS storage, and a large tape backend for storing cold data. These platforms are connected via private networks to the hospital and to the departments around the hospital, so data can move to our data centre and then be processed, analysed, and used by the researchers.

One of the first projects that we implemented, now in beta testing, was with the Wolfson Brain Imaging Centre. They were replacing their brain imaging facility: installing new scanners and also replacing the compute facility. We worked with them and agreed that, because of the requirements on isolation and security, we would build this new computational infrastructure inside OpenStack. So we designed the platform. Essentially, it is a cluster with compute nodes, login nodes, a scheduler, and different types of storage. On your left, you can see the green boxes; this is external storage, which is Lustre, tape, and NAS storage that we then enable inside our OpenStack tenant. For that, we used provider networks: we created VLANs within OpenStack and on the physical network to provide access inside the tenant. And we are managing that tenant, so we are the system administrators of that tenant. It's all deployed with Ansible playbooks, so it's fully automated and can be rebuilt, recreated, and changed as we need to.

I'll talk a little bit about our OpenStack. Our OpenStack infrastructure consists of three zones. We've got 50-gigabit Ethernet, which we use for Cinder storage access, for SR-IOV instances, for fast-network and RDMA-accelerated instances, and also for the VXLAN tenant networks. We also have a 10-gigabit network on each hypervisor for provider and external networks, so our Lustre file systems are accessed via that 10-gig network. And we have a one-gigabit network for management and provisioning. We use VLANs for the SR-IOV networks and for the provider networks, but inside tenants we use VXLANs to separate the networks.
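To give a flavour of what this looks like from a client inside the tenant, here is a minimal sketch of mounting a Lustre file system over the provider network and adjusting striping. The MGS address, file system name, and paths are hypothetical, not Cambridge's actual configuration:

```bash
# Mount a Lustre file system from inside the tenant (hypothetical
# MGS NID and fsname; the provider-network VLAN carries this traffic).
mount -t lustre 10.20.0.10@tcp:/biocloud /mnt/lustre

# Check capacity and usage across the OSTs and MDTs.
lfs df -h /mnt/lustre

# Stripe new files in this directory across 4 OSTs with a 1 MiB
# stripe size, so large files are spread over multiple servers.
lfs setstripe -c 4 -S 1M /mnt/lustre/projects/imaging

# Confirm the layout that new files will inherit.
lfs getstripe -d /mnt/lustre/projects/imaging
```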
We use Mellanox switches, their latest range, called Spectrum. We've got 100-gigabit switches, and we split those ports into 2x50-gigabit ports. We've also got Spectrum switches which provide 10-gigabit ports. This is the specification of our OpenStack nodes: we've got three controller nodes, 80 compute nodes, and three different types of networks. We've got a small Ceph pool for the Cinder backend and also for Glance, and then a large Cinder pool, which is actually two large Cinder pools using NexentaStor. And we use Red Hat OSP 8, which is essentially Liberty.

On to our Lustre infrastructure: we have multiple file systems, but on this slide I only describe our latest file system, the one we use for the biomedical cloud. As you can see, it fits into one rack. At the top of the rack we've got the metadata servers with the metadata storage, and the rest of the rack is filled with object storage servers and object storage. The entire rack provides two petabytes of usable capacity after RAID. We've got two metadata servers and six object storage servers, and we expect to achieve six gigabytes a second on the 10-gig network and 20 gigabytes a second on the InfiniBand network from the bare-metal clusters. We use Intel Enterprise Edition Lustre on our production file systems; the current version is IEEL 3.0.1, and we use Red Hat 7.2 as the operating system.

We've done some benchmarks using virtual instances on our biomedical cloud OpenStack platform, with Lustre connected via provider networks. We created 12 instances with a small flavour, using CPU pinning. The first test we did with small instances, four cores and two gigabytes of RAM, and we noticed that the write performance was very, very slow; that's the blue line. Read was actually pretty good. So we increased the size of the instances to eight cores, and the write improved a little bit for small thread counts, but as we increased the number of threads it actually went down again, and we didn't continue because it was already pretty bad. So we moved to a bigger flavour and created 12 instances using 12 cores and 16 gigabytes of RAM, and the behaviour changed dramatically: suddenly we were getting four to six gigabytes a second on read and write, and very consistently. We ran a test with 144 processes. We used the IOR benchmark, a very popular HPC benchmark which uses MPI to synchronise the threads on the compute nodes, and as you can see, we achieved very good performance on the large instances. We also ran some metadata tests. This test creates one million files on every run, and you can see that with just a single instance the metadata operations are not great, but as we increase the number of instances and processes per instance, the file creation and file removal rates increase. So it's scaling very nicely.
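For reference, an IOR run along the lines just described, 144 MPI processes spread across 12 instances, might look something like the sketch below. The hostfile, block and transfer sizes, and target path are assumptions for illustration, not the exact parameters used at Cambridge:

```bash
# Sketch of an IOR run: 144 MPI ranks across 12 instances (12 each).
# -F: one file per process; -t: 1 MiB transfers; -b: 1 GiB per rank;
# -w/-r: do a write phase, then a read phase.
mpirun -np 144 --hostfile instances.txt \
  ior -w -r -F -t 1m -b 1g -o /mnt/lustre/ior-test/file
```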
That's all from me. I would like to thank my colleagues from my team, Paul Brown and Matt Raso-Barnett, because they did most of the heavy lifting in getting these numbers. Thank you very much.

Last one. Do you want me to do that? OK, well, you already know who we are. So here's what I'm talking about: a quick overview of Monash, particularly some of our imaging facilities at the Clayton campus; the "21st-century microscope" abstraction that we use as our catch-cry at the moment; our research cloud, and particularly the Nectar Research Cloud federation in Australia, which we're a member of; and then on to MASSIVE, which is a particular HPC facility that we run, and M3, which is the latest cluster in the MASSIVE project.

OK, so Monash quickly, because people in Europe and the US may or may not have heard of us: we're in the top 0.5% of universities worldwide, a very research-intensive university. I think we're actually the largest university in Australia by number of students, with six campuses around the world. The Clayton campus, which is where we work, is a globally unique hub, particularly for medical imaging. We have a lot of interesting microscopes here; I'll talk a bit more about a couple of those later on, particularly the cryo-electron microscope and the lattice light-sheet microscope. Microscopes these days, though, are not just the facilities actually dealing with light and so on. There's a full integrator's stack, where HPC in particular is a big part of it, and that's where you see MASSIVE in the picture there, and also R@CMon, which is our research cloud.

R@CMon is a node of the Nectar Research Cloud in Australia. We have a whole bunch of regular commodity infrastructure-as-a-service compute and a big pile of Ceph storage. We also have a bunch of specialist capabilities: SSD, high memory, GPUs, that sort of stuff. The Nectar project is worth mentioning here, because we've been running this for quite a while now. Nectar went live in January 2012; the first release of OpenStack that we installed at Monash was Diablo, for a test cluster, and then I think we went live on Essex. We now have eight different nodes around Australia, ten data centres, and over 40,000 cores available for public researchers to use. That's the public part of the research cloud. We also leverage the Nectar Research Cloud model to build our own resources, particularly for Monash or other projects; for example M3, which I'll talk about as we go on, is one of those. We're using Nova cells: at Monash we have three cells, two of which are public, and one purpose-built for our HPC. And there's the rest of the nodes around the country; you can see those on the map there. Underpinning all of that, our NREN, AARNet, connects us all up nicely, so we also run a globally distributed storage cluster across those sites, which provides all the image storage and so on for the cloud. I didn't even realise we had transitions on these slides.

All right, so the MASSIVE project. MASSIVE is a special HPC facility for characterisation, specialising in imaging and visualisation. There are a few vital statistics up there, but one of the defining features of the MASSIVE project is that it supports a large cohort of users who are new to HPC, people who have never touched a batch system, and so there are a few key projects that MASSIVE runs and develops to help those users, bring them into HPC, and support their applications.
The other defining feature of MASSIVE is that it has a very strong instrument-integration agenda. You can see that depicted here: we have a number of instrument facilities which run special data-capture software that helps stage data into the HPC file system for the image processing that needs to occur to support those facilities, and then subsequently manages the raw data that's come off the instrument and the resulting processing, helps the researchers share that sort of thing, and provides it into desktop environments where they can analyse it further. This is just a quick, illustrative example of one of those, for the imaging beamline at the Australian Synchrotron, which happens to be just across the road from us and is one of the instruments on that earlier slide. I'm not going to go deep into the details here. One of the things I mentioned before was the number of users who are new to HPC, so scientific desktops play a big part in the way we cater to those users, and when Jin talks later, she's got a little demo at the end of her results of the software that the MASSIVE project develops for that, because we've seen a bit of interest in it before. I'll make a note here too: fast file system access is quite important for these applications, of course, and many of them are not tailored to parallel I/O, so individual client performance matters, because they don't necessarily know how to do MPI-IO, or are not necessarily even multi-threaded. Damn you, Wojciech. Not you.

OK, so the MASSIVE project started off with two HPC clusters: M1 at the Australian Synchrotron, and M2 at Monash University. They're probably getting a bit long in the tooth now, but they're still very useful resources. At the time, they were quite new and pioneering in bringing in a lot of GPU compute, which, given the amount of image processing that goes on in MASSIVE, is understandable, and we've continued that with M3. M3 was announced earlier this year, funded by Monash University, and it's specialising in next-generation data science. We've got 1,700 Haswell cores, a whole bunch of K80s, a few smaller GPUs to support lower-end visualisation workloads, and about 1.2 petabytes of Lustre; all the details are there. We also have a whole bunch of the pre-release Mellanox Spectrum gear. One of the key instruments M3 was built to support is the Titan Krios, a cryo-electron microscope; we recently had a new centre built for it out at Clayton. This thing can produce terabytes of data per day, and it can image down to the cellular and near-atomic level by freezing the specimen, essentially. We also have another interesting facility just coming online at Clayton, which has the latest lattice light-sheet microscope, used for live imaging, and that's going to start using M3 as well. As I mentioned before, file system performance is a priority for M3. The M1 and M2 resources are standard, typical bare-metal HPC facilities, but with M3 we decided to leverage the research cloud. We'd already done this with Monash's own internal campus HPC facility, MonARCH, and with M3 we've continued that.
The interesting thing about this, as Wojciech mentioned in his slides before, is that it allows the HPC cluster to be built as a virtual cluster, using those cloudy techniques for deployment; M3 is also deployed with Ansible. It also allows us to take the same physical cluster and repurpose small parts of it for particular users as required. For example, the bioinformatics folks like to have Ubuntu, whereas we normally use CentOS, that sort of thing. The other interesting thing about M3, a bit different from a regular HPC facility, is that the network is all Ethernet, although it's high-bandwidth Ethernet, and we've got a fairly high oversubscription ratio. We're running a leaf-spine topology, which has been sized basically just for the file system bandwidth. We're not expecting and not planning to run any MPI jobs across top-of-rack switches, because we can fit about 1,000 cores within a rack, so that's enough.

OK, so virtualised HPC. This is something I've talked about a bit in some Scientific Working Group meetings recently, so I thought I'd put a couple of graphs up here quickly just to show the sort of numbers we get once we apply a little bit of tuning. It's been discussed for a long time in the literature, but it's interesting that there's not a lot of production adoption yet. The NFV folks are very active at the moment in OpenStack land, and they have very similar requirements, I think, to HPC users. The main thing is that we just don't want to end up with the uniform memory access layout we have in that little diagram there, where the guest virtual machines can't see the topology of memory or CPU on the host. And fortunately, OpenStack now makes it pretty easy to do all of this. We use, for example, just image properties to select the NUMA topology and pinning that we want, and that gets passed down to libvirt on the host to set up the vCPU pinning appropriately. And there are, of course, a few other features to disable on the host as well; you probably don't want to be overcommitting memory and that sort of thing.
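By way of illustration, the image-property approach looks roughly like the sketch below. The image name and exact values are hypothetical (a flavour's extra specs can express the same thing); the properties themselves are standard Nova image metadata keys:

```bash
# Hypothetical example: tag a guest image so Nova builds guests with
# a dedicated (pinned) CPU policy and a two-node NUMA topology that
# mirrors a two-socket, 12-cores-per-socket host.
openstack image set centos7-hpc \
  --property hw_cpu_policy=dedicated \
  --property hw_numa_nodes=2 \
  --property hw_cpu_sockets=2 \
  --property hw_cpu_cores=12 \
  --property hw_cpu_threads=1
```

On the host side this pairs with the hygiene mentioned above: reserving cores for the hypervisor and not overcommitting memory.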
So I've got a couple of graphs here. These were run on just a single host: a two-socket box, E5-2680s, 256 gig of RAM, K80s in it, Mellanox NICs, all that stuff. We ran both High-Performance Linpack, which is MPI-based, and Intel Optimized LINPACK, which uses their MKL libraries and is just an SMP benchmark. Our host environment here is using Trusty, because we're running OpenStack Liberty components, but we use a Xenial kernel and grab QEMU from the Mitaka cloud archive. The reason we've gone for that version of QEMU is actually to do with PCI passthrough, because there were some issues using K80s with anything lower than 2.3. We have kernel same-page merging disabled and transparent huge pages disabled here. And the guest is a CentOS 7 guest with a 3.10 kernel. It's one of our large GPU-computing flavours, so it takes up a whole host, whereas if the host were set up for interactive desktops, it might have four virtual machines on it.

So this is what we see with High-Performance Linpack. The yellow-orange line is the naive configuration of the guest: 24 sockets of one core each, one thread, no pinning, and one big 240-gig single NUMA cell. The blue is the optimised version, where the CPU topology and the NUMA topology have been exposed in the guest and the virtual cores have been pinned to the physical cores, and the memory as well. You can see the average there is about 98.5% of bare-metal performance across the different Linpack problem sizes. Of the other couple of lines, the green line is one instance where we pinned in reverse: we've got the compute cores pinned to the opposite NUMA cell, so that green line basically illustrates the impact of NUMA locality for you, on top of the unoptimised version. The red line demonstrates an interesting issue that we bumped into, which we still haven't really got to the bottom of but have found a workaround for. Inside the guest, the hardware locality (hwloc) library that's part of Open MPI, when you pin based on the normal Haswell core layout with alternating cores per socket, so one, three, five, seven on socket zero and two, four, et cetera on socket one, thinks it's got overlapping CPU sets, errors out, and then can't decide how to pin itself properly. Fortunately, if you just give the guest a sequential core numbering and pin to the physical sockets in the same way, then you get the nice blue line. That trend continues for the Intel MKL SMP Linpack: the blue bars are the bare metal, that's all in gigaflops, and you can see that the optimised guest comes up to basically bare-metal performance there too. And I should mention these are all averages over a large number of runs. The standard deviation here is very low, except at the small end of the problem sizes, where in the virtualised setup the first couple of runs, which are sub-second runs at these problem sizes, are slower than bare metal; the bare metal jumps up to full performance straight away.

OK, so now for high-performance file system integration. We've approached this a couple of different ways, and for M3 we've chosen the direct SR-IOV path. The M3 nodes have a number of different provider networks available to them: an internal management network, a public network, and they can attach ports for any of those things, of course. But all the flavours also have an SR-IOV virtual function assigned to them, and the hypervisor is pre-configured with virtual functions attached to the data VLAN. This is OK at the moment because we essentially have a single tenant on these machines, but going forward we actually plan to use the Mellanox ML2 driver so that this can be orchestrated properly; the plumbing is essentially the same. So we have this data VLAN, which is RDMA-capable: the network has been set up to be able to carry RoCE traffic. And the Lustre here is bare-metal Lustre, the Intel Enterprise Edition Lustre like Cambridge are running, and it's configured to use the Lustre networking (LNet) o2ib driver.
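As a sketch of what that client-side LNet configuration might look like for RoCE (the interface name and address are hypothetical): the o2ib LNet driver is pointed at the RDMA-capable Ethernet interface, instead of letting LNet fall back to plain TCP.

```bash
# Hypothetical LNet setup for Lustre over RoCE: bind the o2ib (RDMA)
# LNet driver to the RoCE-capable Ethernet interface, rather than
# letting LNet default to the tcp driver on that interface.
cat > /etc/modprobe.d/lustre.conf <<'EOF'
options lnet networks=o2ib0(ens2f0)
EOF

# Bring LNet up and confirm the node's NID uses o2ib, not tcp.
modprobe lnet
lctl network up
lctl list_nids   # expect something like 10.30.0.5@o2ib
```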
And, given the audience, I figured somebody would eventually ask "why not CephFS?", so I thought I'd throw this in. Well, for one thing, CephFS hadn't even been announced production-ready when we started planning this project, and that pace of development is perhaps one of the reasons why we chose Lustre here. Also, as I've mentioned a couple of times, file system performance is very important for these applications, and at the moment we're not confident that CephFS write performance will be able to match Lustre, given you've got the overhead of replication on the network. That's something we're certainly interested in playing around with. And of course, support. OK, so that's me done. Jin's going to come up now and show you some of the benchmark numbers we've got on our setup.

Hi guys, I'm Jin. I'm here to talk about Lustre as a high-performance file system. It's great that I don't have a lot of marketing slides, so I'll just jump straight into the technical details. So, M3 Lustre hardware. We've got two MDSs and two MDTs; as Wojciech mentioned before, we need the MDS, MDT, OSS, and OST to run the Lustre file system. In this case, on M3, we're using Intel Manager for Lustre to manage the file system, and we've got a total of 1.1 petabytes of usable file system dedicated to the M3 HPC users. This is the actual M3 Lustre layout. You can see that we have the management network as well as the data network. The Lustre clients talk to the OSSs and the MDSs directly on the Mellanox Spectrum network, and the management network is used to manage the servers. You can also see the OSSs are paired so that the OSTs can fail over.

One thing I like about Lustre is that it supports various types of network: the ones I'm familiar with, for example IB, GigE, IP, and RoCE, and some that I've never come across. LNet is a set of protocols and APIs to support the high performance, high availability, and recovery of the file system. As a sysadmin, you can use LNet self-test, a tool to measure performance across the hosts' networking. And as a user, you can actually stripe your files to decide how you want to split them across the OSTs. So much for the LNet configuration.

So how did we actually set up the Lustre file system on M3? Basically, you have to install the Lustre software on both servers and clients. Next, you need to make sure you have the udev rules set up: you want the interface names, the MTU, and the DNS, DHCP, and IP addressing to be consistent across all the instances, and I would suggest you set that up as an Ansible role for Lustre. Next, you need to make sure you have the modprobe file set up so that the NIC is actually using the o2ib driver, because otherwise both server and client pick up TCP instead, and we are using the RoCE network. Then you'll want to consider writing a systemd script for mounting and unmounting Lustre on the clients, as well as adding an NHC (node health check) script to check file system and network availability for Slurm, the scheduler.

So I did two tests as well. I'd hesitate to call it a benchmark; it's more like a performance test in my scenario, because the system was actually quite busy while I was doing all the tests. I used mdtest and IOR, as was mentioned before. mdtest uses MPI to create, stat, and remove files. In this case, we're using Open MPI, which we compiled with the Mellanox MXM accelerator that comes with the OFED installation. It measures performance in operations per second, and the next slide is just the commands used to perform the tests.
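Those commands aren't reproduced here, but an mdtest invocation of that general shape might look like the following sketch; the rank counts, per-rank file counts, and paths are assumptions, not the exact parameters used on M3:

```bash
# Sketch of an mdtest run across two nodes (24 ranks total),
# measuring file create/stat/remove rates on the Lustre mount.
# -F restricts the test to files only; -n is files per rank;
# -i repeats the test for averaged results.
mpirun -np 24 --hostfile nodes.txt \
  mdtest -F -n 10000 -i 3 -d /mnt/lustre/mdtest
```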
And in this case, I only performed the tests on files. So this is the graph, a very simple one, showing the file creation, file stat, and file removal operations compared between bare metal and virtual machines. As you can see, the results are quite promising, especially since the VM is actually not optimised: we didn't apply any of the NUMA tuning mentioned earlier. All the tests ran across two nodes. IOR is something that I like to use because it's really easy to set up, and you can define how you want to run it to accommodate your small, medium, or large file sizes as well as the throughput. The good thing about IOR is that you can use MPI to sync the tasks and define how many processes and how many tasks you want. In this case, it's a rather small file size, because in our environment we don't have a lot of large files; it's 12 gig using 24 cores. And this is not what I expected when I finished running the test: for some reason, the blue line, which is the virtual machine, actually performed better than the bare metal. So yay, we're quite happy, but I have no idea why it happens. One thing I want to point out is that the file system was really busy, fully reserved, when I was doing the test, and I wasn't running the test on the same machines that run the instances. But the reason I put it in the graph is to show you that the more MPI tasks and threads you add, the more it actually increases the performance of writing to the file system. In this case, we get close to six gigabytes per second for 24 tasks.

OK, so Blair mentioned that visualisation is very important in MASSIVE. And in Monash, we have an in-house development, a piece of software called Strudel. It's named after the dessert, but it's not one: it's our remote visualisation tool. You can go to the link if you have an account; just go to desktop.massive.org.au. It's a cross-platform application as well, so you can run it on a Mac, Windows, or Linux machine. The backend technology is just a simple SSH tunnel that launches a TurboVNC session. Sorry that it's not a live demo, I don't trust the network here, so I've thrown in a video of how a user can go in and request a desktop session. You log in as the user; this is using the federated identity system we have in Australia. You can select which desktop session you want to run on which cluster. You can accept the default configuration, or change how many processes and how much memory you want for your desktop session, and this actually integrates with the Slurm job scheduler, so it depends how many resources you're allowed in your project. And then you just have to click Show Desktop.
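Under the hood, the kind of tunnel Strudel automates can be approximated by hand roughly as follows; the hostnames, display number, and port here are hypothetical:

```bash
# Roughly what Strudel automates: forward a local port to the
# TurboVNC server running on the allocated compute node, then
# point a VNC viewer at the tunnel. Hostnames and the display
# number/port are placeholders, not real M3 endpoints.
ssh -L 5901:compute-node042:5901 username@m3-login

# In another terminal, connect through the tunnel.
vncviewer localhost:5901
```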
No, I think your laptop died. Oh, seriously? Yes. I've still got caps lock, it's good, it's good. All right, even plan B doesn't work. Oh no, I can use my laptop. Yeah, we need your last slide anyway. Yeah, sorry guys. If you want to ask a question, can you come up to the mic, please, so we can record it? Set up your laptop. Can I ask a question? Do you have data to support your conclusion that Lustre is better than CephFS, or is it just an assumption? Well, no, I didn't say Lustre is better than CephFS. I said that's what we chose for M3, because of the reasons that I listed there. So the primary issue was, for one thing, that CephFS wasn't production-ready at the time we actually purchased this system. But also, one thing that we're unsure of and would like to do some testing of, because we're also a big Ceph user, is single-client write performance: whether or not that can ever be as good with the CephFS architecture as with a system like Lustre. Because for every chunk of data you write to CephFS, you have to write it N times for the replication, and you incur that latency every time.

OK, so I'll just fast-forward to the end. And then you get the desktop session and a list of the applications that your project is allowed to use, and all the modules are loaded when you launch the applications. So this is really helpful when a researcher comes in and says, "I just want to see the image and analyse it on the desktop." And it's very easy for people who don't have a lot of experience with writing job scripts as well.

OK, so lastly, nothing is perfect. There are a lot of challenges and problems that we've hit. For security reasons, with Lustre, we only want to deploy on managed, trusted tenants. And we've found that the performance is really poor for the small flavours. For multi-tenancy, we'll still need to upgrade Lustre, because subdirectory mounts and Kerberos authentication are only available in version 2.9, which is still fairly new. Getting the network working is not easy; the others can jump in here. It requires the right combination of NIC firmware, switch firmware, and OFED. Well, both Monash and Cambridge, for the systems we've talked about, were early adopters of Spectrum, and there was at least one instance where a switch firmware update wasn't compatible with the NIC firmware: suddenly host ports stopped working until the firmware was upgraded, that sort of thing. We've also found, because we're using SR-IOV, that in some cases a firmware update on the NIC will remove the SR-IOV port. Say you want to do a host upgrade but keep the guests on that host alive and bring them back up afterwards, so you take the host down for half an hour. If you still have nova-compute running when you do that and the SR-IOV port disappears, then Nova will delete it from the database, and the host comes back with its port gone. So that's a bit of a gotcha. And this last point is one of the challenges Wojciech's team have been working on at the moment too, with VXLAN performance not being quite what's advertised. Yes, we're supposed to have offloading on the NICs, and we're still working on getting the performance we're supposed to be getting with that offloading. Yes, and although your deployment was done a few months earlier than ours, I think we went through the same buying process; the level of support we received was good, we have a working configuration, and we're working on fixing these performance issues, so I think we're on a good path.
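For context on that multi-tenancy point, the Lustre 2.9 subdirectory-mount feature lets each tenant mount only its own slice of the namespace rather than the file system root. A hedged sketch, with hypothetical NID, fsname, and paths:

```bash
# Hypothetical Lustre 2.9 subdirectory mount: a tenant's client
# mounts only its own subtree of the namespace, not the whole
# file system, which is what makes multi-tenancy practical.
mount -t lustre 10.30.0.10@o2ib:/m3fs/tenants/projectA /mnt/lustre
```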
All right, so questions please. Hey, you seem to have done a lot of tuning of the configuration of Nova, the images, the instance types and so on, to map to the hardware. Is that something that could be published as a reference architecture on OpenStack? Yeah, definitely. So Stig, who's co-chair of the Scientific Working Group, has been working on a bunch of reference use cases over the last couple of months, one of which I've contributed to. It actually doesn't yet have all that level of detail in it, but it will be a living document and we will continue to flesh those details out. There's also a book being put together for Supercomputing, and I think when that happens it will go live somewhere on the OpenStack site too, so we hope to continue adding to and building on that. And did you find anything that was missing in Nova that you would have needed? No. What did I say we were running, Liberty? I think since Kilo, all that stuff's been there. OK, thanks. We're being told time; apparently we all have to go to beer. Yeah. Go ahead. We have a Krios that's currently in boxes waiting to be unboxed. Did you have to make special provision for it? Well, I think what happened is it showed up and we were like, oh, shit. How do you mean special provision? I mean, yeah, it has its own room, it needs all sorts of physical power. Just to guarantee the throughput that it requires? OK, so I guess we kind of knew the capture rates we'd get from the cameras on it, and from the cameras they're planning to get as they upgrade. The acquisition rate, though, doesn't necessarily correlate directly to what you need the file system to do. I think we just knew we needed about 1,000 cores, the main thing being that we want to be able to keep the thing acquiring at a reasonable rate. I don't know how many samples they can get through in a day, but they might do maybe three or four samples a day; is that right, Steve? Great. The other quick thing was: you're using SR-IOV, so I assume that means you're not bothered about live migration. Correct. Thank you very much.