OK. Thank you all for coming. That's a lot of faces out there. It's great. Good to see you all. My name is Dave Holland. I'm a principal system administrator at the Sanger Institute. If you don't want to hear about this, then you're in the wrong place; go somewhere else. I've got quite a lot of slides, some people would say too many, so I'm going to skip through some of them. If we can keep questions till the end, that would be great. The slides will be available on our blog; I'll put the website address at the end as well. So I'm going to tell you a bit about the Sanger Institute and why we're using OpenStack now. I'm going to tell you where we are, how we got to where we are, some of the exciting things that happened along the way, some of the things we've achieved and where we might go next.

So the Sanger Institute was set up as a genome sequencing centre, founded in 1993 as part of the Human Genome Project. That was a ten-year international project to sequence and assemble one human genome. Twenty-five years ago, that's all we could do; I'll come back to that point later. We celebrated our 25th anniversary this year by sequencing 25 genomes of interesting, unique, special animals around the British Isles. And as you'll see, sequencing lots of different things becomes a theme for us. We have some core scientific research programmes and, supporting those, various facilities. I'm in the IT one, fairly obviously, and there are various other things going on.

So, genome sequencing. Basically you put in a sample of blood or a tumour or some other body part and you get out a sequence of A, C, G, T, the DNA letters. So far so good. But then the analysis comes in. It's like a jigsaw, typically: you've got all these little fragments and you need to put them together, you need to compare them, you need to find out how the pieces fit. Now, that's embarrassingly parallel with all these little fragments; it's a really traditional parallel HPC task. So that's what we've done: all these cores of compute, all this Lustre high-performance storage. So far so good. But there are downsides to this approach. It's POSIX, so you get what POSIX gives you, which is OK but not brilliant. You get limited flexibility. The users, the scientists, don't have root on the compute farm, so they get what they're given, or they get what they ask for if they ask really nicely. Finally, reproducibility, and portability in particular, is hard. Scientists and informaticians tend to build things that rely on certain facts about their environment, whether that's database locations, or firewall rules that they expect, or path names that they think should be a certain way. You can't usually just lift that and run it on someone else's cluster.

So, enter OpenStack, enter cloud computing. On the left, that's what we've got, that's what we like. On the right, that's what OpenStack can bring us. Sounds great. Where's the catch? Well, a lot of things can go, I wouldn't say wrong, but let's say differently. If you're using things like Lustre or NFS, those file systems trust the file system client in certain ways. If your tenant has got root on the instance, which is traditional in an OpenStack cloud, they can become another user and read someone else's files. Or, you know, if they're root, they can read anyone's files. And we need something that's high-performance. Typically, if you're looking at cloud, that means virtualisation. That means extra layers of abstraction, and that means extra latency. Doesn't that mean it'll be slow?
Let's see what we can do to make it fast. So, we started our learning curve towards the end of 2015. This slide is the 18 months before production, rather than the 18 months in production. As you can see, we started off in December 2015. We worked through Kilo, we worked through Liberty. And then, in February 2017, there was the actual announcement of our flexible compute environment, as we call it; that was opening the doors to everyone.

Everyone likes some specs and some pictures of hardware, so this is what we bought for our production hardware. We started with just over 100 compute nodes. We thought we might want to run some tests side by side, so we bought six controller nodes, so we could have three in an HA cluster and another three doing something else. We probably went a bit OTT on the spec without really knowing what we were getting into; one of my colleagues is known as Captain Overkill, so perhaps that's where we went with this one. We started with Red Hat OSP 8, that's Liberty. We had Ceph for the storage. We started off with nine storage nodes; I'll come to very shortly how quickly we needed to expand that. People really liked object storage. We started off with Ceph Jewel running on Ubuntu, because that's what the guy running Ceph was most familiar with at the time. The initial benchmarks were OK, but then if you think about the number of storage nodes there, they were only OK; the numbers get better later on. For networking, we had some great support from Arista. They supplied us with a seed kit of equipment so we could get started, and we have built on Arista for further expansions later on. There it is, in all its glory, in its first home in one room in our data centre. If you're eagle-eyed, you'll see there are more than nine storage nodes; that is not an original photo, that's after the first round of expansion.

Some of the technologies we used: I mentioned Red Hat OpenStack already. Some of the pitfalls we found were maybe due to our ignorance and inexperience, maybe due to how it was packaged at the time, maybe due to some assumptions. What we found was that scaling is hard, scaling is really hard. Defaults for things like file descriptors and numbers of threads are always way too low. We had some exciting times with hardware acceleration in the NICs, things like offloads. We got as far as talking to Mellanox about that, and it turns out there are some good combinations of kernels and drivers and there are some very bad combinations as well. You can guess where we started out by mistake. Something we found as soon as we let the users on in anger is that, OK, I can start one instance, but can I start 100? Can I start 200? As soon as they start doing things at scale, you find out where some of the corners are. We found interesting races to do with starting many instances simultaneously; we mitigated that in Liberty by directing all the API calls to just one of the three HA controllers. We also had really quite a lot of problems with RabbitMQ to start with, and we eventually diagnosed that down to damaged fibres, which were not showing as faults on the network but were causing enough latency for Rabbit to have problems. As you probably know, when Rabbit goes away, everything else falls down like a set of dominoes. Other things we found: there's a default in Nova for how much memory it reserves, how much memory it holds back as an insurance policy, if you like. We had that set way too low to start with. You need a buffer to make sure that if you've got other workloads, like syslog or agents, or if you've got any memory leaks going on, you've got a bit of headroom against that. We set it too low; easily fixed once we realised what was happening.
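As an aside, that "can I start 200 instances at once?" question is easy to turn into a script, and it's well worth running it yourself before your users do. Here's a minimal sketch of the sort of thing, driving the openstack CLI from Python; it assumes you've already sourced an RC file, and the flavour, image and network names are made up:

```python
#!/usr/bin/env python3
"""Crude concurrency smoke test: boot lots of instances at once via the openstack CLI."""
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

COUNT = 100                       # how many instances to launch simultaneously
FLAVOR = "m1.small"               # invented flavour, image and network names
IMAGE = "bionic-server"
NETWORK = "smoke-test-net"

def boot(i):
    name = f"smoke-test-{i:03d}"
    result = subprocess.run(
        ["openstack", "server", "create",
         "--flavor", FLAVOR, "--image", IMAGE, "--network", NETWORK,
         "--wait", name],
        capture_output=True, text=True)
    return name, result.returncode

def main():
    with ThreadPoolExecutor(max_workers=COUNT) as pool:
        results = list(pool.map(boot, range(COUNT)))
    failed = [name for name, rc in results if rc != 0]
    print(f"{len(results) - len(failed)} booted OK, {len(failed)} failed")
    if failed:
        print("failed:", ", ".join(failed))
        sys.exit(1)

if __name__ == "__main__":
    main()
```

Remember to delete the instances afterwards, and expect the interesting failures to show up in the Nova and Neutron logs rather than in the command output.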
Over time, we did various package upgrades, for bug fixes, for features, for security fixes. We did those all by hand, because having tried minor upgrades in particular, we had endless problems upgrading the Red Hat-packaged OpenStack; I've heard the same from other people. We're keeping a close eye on Red Hat's proposed upgrade plans for the future. We had some problems with port channels, and we worked with Arista on it, and eventually we decided it was simply down to a slightly dodgy NIC.

Ceph. What's to say? It's really rather good. It had a bad reputation when we first started; some of my colleagues said, "Ceph, you don't want to do that." We've found it really quite robust. On one of the earlier slides there was a power figure for one of the racks of equipment, about 20 kilowatts I think, which is suspiciously close to our power budget for the racks, which was about 20 kilowatts. We managed to flip some breakers, lost a rack full of Ceph, and it carried on. There was really no drama; it was just what we wanted. I won't dwell on that. We found a startup race, again in the packaging, to do with Ceph: there was a single lock, so if you had a lot of OSDs, each OSD taking the lock in turn to start up meant the later ones would eventually time out, so we mitigated that. We've had some wrinkles with ceph-ansible as well, and we've submitted patches upstream, so that if you're making changes to, say, the OSDs, it doesn't also try to restart the mons, things like that.

We're using Ansible for customisation. That's partly due to our own experience with it, and partly it's a strategic choice for IT within Sanger. You can see there's a whole catalogue of things we've changed. The real problem is that some of these get overwritten by the Red Hat deployer, the Puppet machinery. Whenever we do anything like a scale-up or a scale-down, the deployer effectively reruns and overwrites your customisations. That can be a bit of a nuisance, and we're looking at ways to mitigate it.

Monitoring: it's all very well building something, but it's no good if you don't know what it's up to. We started with Nagios, with the absolute bare minimum: are the hosts there, are they up? We've evolved it. Now we've got custom views for customers and users, we've got views for us, we've got availability reports. We have a scorecard that goes to management: here's OpenStack this month, how healthy is it? We've written our own scripts, we've written our own checks. For metrics, we're using Grafana; we get some really nice dashboards, and some fairly ugly ones as well. For logging, we're using rsyslog and ELK; nothing special, simple syslog. This is quite an early example of a user-facing dashboard. Simple lights: is it green, is it red, is it good, is it bad? There are a few subtle things there. One of the lights is labelled "instance creation". It's an end-to-end test using Heat to bring up an instance, plumb it into the network, attach a volume and, eventually, can I ping the instance? If, and only if, all of those succeed does the test go green. So that is the über health check, if you like, for the OpenStack. This is the sort of thing we use as engineers. We have it on a big screen in the office, like an ops display board, and you can see it's not just our OpenStack on there.
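If you're curious what that instance-creation light does behind the scenes, the shape of it is very simple, even though ours drives Heat and attaches a volume along the way. Here's a stripped-down sketch, written as a Nagios-style plugin around the openstack CLI; the image, flavour and network names are invented:

```python
#!/usr/bin/env python3
"""Nagios-style end-to-end check: boot an instance, ping it, tear it down again."""
import subprocess
import sys

NAME = "e2e-healthcheck"
BOOT_ARGS = ["--flavor", "m1.tiny", "--image", "cirros",      # invented names
             "--network", "healthcheck-net", "--wait"]

def run(*cmd):
    # Run a command, capturing output so a noisy failure doesn't spam Nagios.
    return subprocess.run(cmd, capture_output=True, text=True)

def check():
    try:
        if run("openstack", "server", "create", *BOOT_ARGS, NAME).returncode != 0:
            return 2, "CRITICAL: server create failed"
        addresses = run("openstack", "server", "show", NAME,
                        "-f", "value", "-c", "addresses").stdout
        ip = addresses.split("=")[-1].strip(" []'\n")   # crude parse, one network assumed
        if run("ping", "-c", "3", "-W", "2", ip).returncode != 0:
            return 2, f"CRITICAL: {NAME} ({ip}) not answering pings"
        return 0, f"OK: {NAME} booted and pingable on {ip}"
    finally:
        run("openstack", "server", "delete", "--wait", NAME)   # always clean up

if __name__ == "__main__":
    status, message = check()
    print(message)
    sys.exit(status)
```

The important property is the "if and only if everything succeeds" structure: each stage that fails gives you a different CRITICAL message to start chasing.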
As well as OpenStack, that board has our Lustre file systems, LSF, various bits and pieces. It's the "has anyone noticed it's gone red?" moment: if you can just see it out of the corner of your eye, it's the early warning that something is not quite right, and if you're lucky you catch it before the users do. Behind the scenes you can drill down. These are all the different checks, some of them against the front end, some against individual controllers or compute nodes.

Metrics: again, we're not doing anything terribly clever. collectd talks to libvirt, and we've got a bit of Python there to send things off to our Graphite service. We've got some other scripts that take data out of Nagios; they aggregate it, do some mathematical messing, and spit out a score for each tenant: how efficient is your usage? Efficiency is difficult to define in terms of OpenStack. What we've gone for is: all of the CPUs you've been allocated, running full tilt. So 100% efficient is really hard to achieve; some people get quite close, which is good to see. The only problem we've had with scaling that up is that you need to increase the number of carbon-cache processes: the more data you send in, the more carbon-caches you need.

So here are some dashboards. This is quite an early one. The problem with collaborative working is that everyone likes to stick their oar in, so you start with a nice simple overview, then wouldn't it be nice if we added the number of instances? And we should add the load, and we should add the efficiency. Eventually your overview has got every single detail in it. It's not the best-designed dashboard, but you can do lots of things with it. This is a slightly cleaner one, intended for the users to look at their tenant usage. The workload shown here is actually me running stress on a bunch of new hypervisors before they were added into the cloud; 97% is not too bad for something running full tilt on CPU.

We've found a lot of interesting ways you can use Grafana to display data as well; it's not just little line graphs. We found a situation where the OVS vswitchd daemon had a memory leak, and it would grow and grow over time. Eventually, as I said before, Nova has this insurance policy of reserved memory; it would overflow that, and the out-of-memory killer would start killing processes on the machines. Instances were dying, users didn't like it, and we needed to track it down. Grafana gives you heat maps. The top one is a heat map of how much memory the OVS vswitchd daemons are using over time, and the bottom one identifies which ones have leaked. It makes it really easy to go around and remediate.

Another monitoring interface we use comes from Arista. This is purely for the network switches; we didn't see any particular need to reinvent that wheel, and it works fine. I don't know how well you can see from the back, but one of the graphs there is showing a broadcast output rate of 6,000 packets per second, and we were seeing that on every port in the entire estate. This is kind of strange. We looked a bit more closely and they were ARP packets. It turns out there was a bug in that particular firmware version on the Arista switches: VXLAN-encapsulated ARP packets would just be broadcast, and broadcast, forever. The undying ARP packet. Fixed with a firmware upgrade as soon as we found it.

So, we started with 100 compute nodes and we grew it. We also deployed the Newton version side by side. That was to see how easy it is to run two separate OpenStacks on one set of network switches and one Ceph cluster.
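Going back to that per-tenant efficiency score for a moment, because people sometimes ask what the sums are: there's nothing clever in it. The idea is just CPU time actually used divided by CPU time allocated over the reporting window. This is a toy illustration of the idea rather than our actual script, and the numbers are invented:

```python
#!/usr/bin/env python3
"""Toy per-tenant 'efficiency' sum: CPU actually used versus CPU allocated."""

# Hypothetical per-instance samples for one tenant over a one-hour window:
# (vCPUs allocated, average CPU utilisation of the instance as a fraction).
samples = [
    (16, 0.92),   # 16 vCPUs running nearly flat out
    (4,  0.10),   # 4 vCPUs mostly idle
    (8,  0.55),   # 8 vCPUs doing middling work
]

def efficiency(samples):
    # Weighted average: total used vCPU-time divided by total allocated vCPU-time.
    allocated = sum(vcpus for vcpus, _ in samples)
    used = sum(vcpus * util for vcpus, util in samples)
    return used / allocated if allocated else 0.0

print(f"tenant efficiency: {efficiency(samples):.0%}")   # about 70% for these numbers
```

100% would mean every allocated core flat out for the whole window, which is why it's so hard to hit.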
The answer to that turned out to be: quite easily, if you're careful. That was a good experience for the sysadmins. We only let a few users onto it, as guinea pigs, to check we'd built something that works.

The current production system is running Pike; that's Red Hat OSP 12. We were really pleased to see that a lot of our customisations, tweaks and default changes are no longer necessary. Things like file handle limits have been vastly increased, so that particular pitfall is out of the way. One thing we did find is that the side-by-side migration took quite a long time; it's fair to say weeks or months. For some reason we couldn't get live migration working in our Liberty installation, so we had to do a lot of negotiation with users: I need to take this instance down, I need to migrate it, I need to either shelve and unshelve it, or can you just kill it and rebuild it somewhere else? The users were generally quite accepting of that, but it did take time. Something else we found, having jumped quite a long way forward from version 8 to version 12, is that the overcloud services were by and large containerised. That means there were different ways to manage them, different ways to debug them in particular, and different ways to customise them. One thing we did implement was that tenant networks could be done as a VLAN rather than an encapsulated VXLAN, using the Arista ML2 plugin. We still haven't gone ahead with that for most purposes: it's complex, there are extra moving parts, and there are bugs. So we're using provider networks instead for some things, which I'll talk about shortly. Having had some experience of running without CPU overcommit, we turned on CPU overcommit. We've done it by host aggregates, so there's an aggregate for "you get everything you asked for" and an aggregate for "well, it depends what the neighbours are doing; it might get a bit busy in there". The users really like that: they know exactly what they can expect. We enabled jumbo frames, which solved a few problems around Docker and encapsulation overheads, but it brought a few other problems because the rest of the Sanger internal network doesn't use them, so there were a few ugly corner cases there. For me personally, one of the biggest time savers is that instance live migration works out of the box. That's a brilliant win: you can tell Nova to evacuate a host, come back in five minutes, and the instances are gone; they're somewhere else.

In terms of Ceph growth, it went absolutely ballistic. We were really surprised. It's possibly a result of our history of using iRODS, the integrated Rule-Oriented Data System, which is a bit of an object store with a lot of metadata on top. As a result of using that, our customers, our users, knew exactly what put and get mean, as opposed to POSIX, so they were able to use the S3 RADOS Gateway interface to Ceph quite happily. Currently we've got 1.3 petabytes of S3 objects and a little bit less than a petabyte of Cinder volumes. That Cinder amount has really gone up in the last month or so due to one particular project; before that, probably 90% of the usage was S3. So there's our Ceph, and it looks quite happy. That many OSDs is quite a lot, some people say; that's not very much, other people say. This is Ceph on a good day. I think it's running a benchmark; that looks like an artificial workload. Maybe you can see from the back that we're peaking at nearly 15 gigabytes per second, which I'm quite happy with. This is Ceph on a bad day. We had electrical maintenance and, for a variety of reasons, we lost two racks.
Ceph did not like that. The graph at the top is showing degraded, misplaced and unfound objects; what you can't see is Ceph running around going, "what happened there?" But to be fair, once we plugged it all back together, it was fine.

Another thing we did was physically relocate the entire system, mostly driven by power constraints. The original data centre hall, as I said, was about 20 kilowatts per cab. The new hall we're in has water-cooled cabs; we can go to over 35 kilowatts per cab, which is quite a lot. We've recently added another 67 compute nodes and we've got another 32 to go in. When that's all in, it will be about double the original size. There it is in its new home. That picture is slightly out of date; another two cabinets have gone on the left-hand end since the photo was taken. That's the back of one of the cabs with the door open; you can see the plumbing coming down the door for the hot and cold water.

OK. Some little bits and pieces that we did ourselves. Ceph is great. It has an S3 interface, which is great: it pretends to be Amazon, it's really cool. But it's not quite the same as Amazon's S3; there were some subtle differences. ACLs are there, but in the Jewel release that we have currently, bucket policies are not. We had a developer who wanted to allow, effectively, anonymous upload to a bucket. We used pre-signed URLs. He had a JavaScript front end on his web interface, he could generate the URLs, fantastic. Except it didn't work. I personally had first-hand experience of working with Ceph upstream and with Red Hat to get the bug fixed and pushed through to a release, and it was really good to see everyone working together to get our patch accepted. That's just a diagram of what happens: the user talks to the server, they get a signed URL, and they can PUT to that URL without needing any other permissions on the bucket. As the usage took off, we started to wonder: we've made this S3 interface available to the world, so what are people doing with it? To be fair, most people are using properly authenticated buckets, but we found quite a few that were publicly open and, on one occasion, publicly writable. So we developed our own audit system, a little monitoring thing. Every night it goes and checks all the buckets: are they publicly readable, are they publicly writable? Better tell the users. It sends them an email.

Provider networks we use quite a lot. For anyone who hasn't met them, it's basically plugging a VLAN into the back of your OpenStack, so you can hang anything you like off there, whether it's a file server, a database or some piece of hardware. We're using them for several things. Secure Lustre, which I'll talk about in a moment, is a multi-tenant, isolated version of Lustre that we developed with DDN. We've got networks hung off the back of our OpenStack with scientific instruments connected to them, sequencing machines, microscopes; the data comes off the machine and goes straight into servers on OpenStack. Farm 4, the fourth incarnation of our LSF batch compute farm, is actually virtual: it's running as instances inside OpenStack, plumbed into a provider network, so they're on the Sanger internal network and not hidden behind a firewall. We have had some interesting niggles with security groups. If you try to deploy an instance with a port on an ordinary tenant network and a port on a provider network, and you've turned off port security on your provider network, it doesn't work. You have to do it in two steps: the tenant network port, then the provider network port.
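To spell that workaround out, it's roughly the following shape: pre-create the two ports separately, then boot the instance with both attached, rather than asking for everything in one server create. A sketch driving the openstack CLI from Python; the network, flavour, image and instance names are invented:

```python
#!/usr/bin/env python3
"""Workaround sketch: pre-create both ports, then boot the instance with them attached."""
import subprocess

def openstack(*args):
    # Thin wrapper around the CLI; assumes an RC file or clouds.yaml is already set up.
    return subprocess.run(("openstack",) + args, check=True,
                          capture_output=True, text=True).stdout.strip()

# Step 1: an ordinary port on the tenant network, with security groups as usual.
tenant_port = openstack("port", "create", "--network", "tenant-net",
                        "-f", "value", "-c", "id", "demo-tenant-port")

# Step 2: a port on the provider network with port security disabled,
# created separately rather than at server-create time.
provider_port = openstack("port", "create", "--network", "provider-net",
                          "--disable-port-security", "--no-security-group",
                          "-f", "value", "-c", "id", "demo-provider-port")

# Step 3: boot the instance with both pre-made ports attached.
openstack("server", "create", "--flavor", "m1.small", "--image", "bionic-server",
          "--nic", f"port-id={tenant_port}",
          "--nic", f"port-id={provider_port}",
          "--wait", "demo-instance")
```

The same idea should work from Heat or similar tooling; the point is simply that the provider-network port gets its port security settings before Nova ever touches it.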
My personal concern with provider networks is that it's a hammer, but you're hitting a screw. You can solve lots of problems with them, but you're stepping away from the software-defined network, because you have to define all of these by hand, at least in our experience. We might think about using them for faster S3 access, but we've got other plans for that.

I mentioned Secure Lustre. We worked in collaboration with DDN to use two features of recent versions of Lustre: a subdirectory of a file system can be exported, and user ID mappings can be enforced for particular clients. With those you can build something like this. You've got the Lustre file system over on the right-hand side, the dark-coloured servers. There's a Lustre router which knows which clients are allowed to go to which subdirectories of the file system. The tenant networks come into the router, so even if a tenant is compromised, they're not on the other tenant's network, so they can't send packets from the right place, so they can't even see that the data belonging to the other tenant exists. We're really happy with this as an isolation mechanism for multi-tenant Lustre.

But what's it all for, said my wife? So, we've got lots of bits and pieces running on OpenStack. Some of them are used by the IT department, things like CI runners and Farm 4, which I've already mentioned. Some of them are run by the scientists, of course; that's who we built it for. We've got things like the Mutational Signatures project, which I'll speak about in a moment. The Human Genetics Informatics people are running Arvados, which is a location-aware data scheduler, if I've got the jargon right. We've got a project called CellPhoneDB which, despite the name, has nothing to do with cellphones. It's about cellular phenotypes: I'm looking at this cell, so what is this cell? They've got a database of different cell types running there. CloudForms is an orchestration system for less technical users; it's a bit like "click to buy my instance": I'd like a database server, click, there it is, ready for you. And various other things.

The Mutational Signatures project in particular speaks to me. This is becoming something along the lines of personalised medicine. You can take a person's DNA sequence, or a fragment of it, feed it in, and compare it with known changes in DNA caused by tobacco smoke, or ultraviolet in sunlight, or lots of other carcinogens or pathogens. So this is a step along the road to personalised medicine based on DNA sequencing, and I think this is the future. If you'd asked me 20 years ago, can we do this? No, no, no. But it's today, it's happening. You may have seen things like the MinION, the very portable USB-connected sequencer; with one of those in the doctor's surgery and one of these platforms available, who knows what might happen in just a few years.

Other things that people have built: this is RStudio running in a web browser, and it's personalised. A user without any knowledge of OpenStack can come to a web portal and log in; a personalised instance running on Kubernetes spins up inside our OpenStack, and they've got RStudio. They can analyse data, they can upload, they can download, and it's all hands-off. There's no management needed, which is brilliant.
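While I'm on self-service things, one quick code aside. People sometimes ask what the pre-signed URL trick I mentioned earlier actually looks like. With boto3 pointed at an S3-compatible endpoint it's roughly this; the endpoint, credentials, bucket and key here are all invented:

```python
#!/usr/bin/env python3
"""Generate a time-limited pre-signed PUT URL against an S3-compatible endpoint."""
import boto3
import requests

# Placeholder endpoint and credentials for a RADOS Gateway S3 service.
s3 = boto3.client(
    "s3",
    endpoint_url="https://s3.example.ac.uk",
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# The bucket owner generates a URL that allows exactly one object to be PUT,
# valid for an hour. The uploader needs no credentials or bucket permissions.
url = s3.generate_presigned_url(
    ClientMethod="put_object",
    Params={"Bucket": "uploads", "Key": "incoming/sample-123.cram"},
    ExpiresIn=3600,
)

# Whoever holds the URL (curl, a browser, a JavaScript front end) can now upload.
with open("sample-123.cram", "rb") as data:
    requests.put(url, data=data)
```

Once the URL expires it's useless, which is what makes it a reasonable substitute for bucket policies.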
One thing I didn't mention before is a service we call Science as a Service. We used CloudForms orchestration on top of OpenStack and VMware to give an orchestration service for spin-out companies, because something Sanger is increasingly trying to do is make use of the DNA data. It's all very well sequencing stuff, but if the data is there, we need to actually use it. So we're looking at spin-offs for how we can actually use this data, and we thought, OK, we can do this, we can provide an orchestration service. We had a pilot implementation, and the initial users seemed to like it quite well.

A project that's been taking up quite a lot of our time recently is the Vanguard project for the UK Biobank service. UK Biobank is a medical data source. The Vanguard project is for 50,000 whole genome sequences to be done over 18 months. The full project, and it's not yet decided who will be doing it, but fingers crossed it's us, is 500,000 genome sequences, and that's quite a lot of data. The machine that looks like a washing machine is a NovaSeq 6000. It can produce six terabases of sequence over two days. Naively, let's call a terabase a terabyte, and you can do the sums as well as I can. This is a data flood, it's not going away, and we've got dozens of these machines. I'm pleased to say we won an award for this project as well. Just the other day, the chap on the right, my boss's boss, Tim, was collecting our award for the best use of HPC in the cloud at Supercomputing, so it's really nice to be acknowledged like that.

But then we stand back and compare and contrast. Twenty-five years ago the Sanger Centre, as it was then, was founded, and it took ten years to do one genome. Now we're doing how many in a day? It cost how many billions back then, and you can do it for barely thousands these days. Then you look forward to things like the Darwin Tree of Life project, which is to sequence 66,000 different genomes in the UK, different animals, birds, bacteria, and, wider afield, the Earth BioGenome Project, which is going to sequence 1.5 million different species. Maybe we can solve everything by sequencing, maybe we can't; we're going to find out one way or the other.

But you can't do all these things. They don't magically happen without helping people, without teaching people, so education is a part of it. We've done some in-house courses and some external courses. We've sent sysadmins on Ceph courses and OpenStack courses. We've sent users on HashiCorp courses for things like Packer. We've had bespoke training written for us. It's all pieces of the jigsaw; it all fits together. Different people learn in different ways. Some people learn just by reading the documentation, and it's fair to say there's an awful lot of documentation out there, some of it good, some of it bad. But what I've found carries the most value is to have in-house documentation describing what will work on your system: if you do this, it will work, it's supported. If you want to try something else, great, have fun, but at least here's a starting point. So that's the front page of our Confluence wiki; if you start here, you can find out almost everything we know. Some people don't read documentation. That's fine, we can talk to them. We have regular coffee mornings. We bring the scientists in, we talk, we have coffee and biscuits, we present, we talk, we complain, more coffee. User engagement is absolutely the key for some people: they won't go and read documentation, and they won't use the platform if it's not, dare I say, handed to them. But once you've explained it, once you've helped them, they're off and running.
We have a Slack system for internal communication, which some people love and some people hate. It's great: you can get a really fast turnaround on some problems, or an early heads-up. You can also get buzzed on Sunday night when there's a problem; maybe that's not so good. Looking outside, there's the wider community. We're all here, it's an OpenStack Summit, and they're great. I've been to lots of different conferences, and I have to say the OpenStack Summits are by and large the most worthwhile in terms of meeting useful people who know useful things. There are special interest groups; I'm in the Scientific SIG, and I see some of the members here, it's lovely to see you. The openstack-operators mailing list is always good for little background nuggets of information; I think that's soon to be renamed openstack-discuss.

So that's where we've got to and how we got there. What's next? Well, yes, it's an upgrade. Everyone's upgrading, always upgrading. We're looking at Queens because we want to be on LTS releases. Despite what Red Hat was saying the other day about releasing every three months, I think that's kind of enthusiastic, a bit too speedy; we think we can probably get away with staying on LTS. But the crystal ball is cloudy, who knows? We've already started doing test deployments, and again we'll do the sideways upgrade: we'll install Queens side by side with Pike and migrate the users across. One thing we are going to do differently is to have dedicated network nodes, and we hope that keeping the tenant networking separate will prevent some of the RabbitMQ problems we've had. We're also trying to move the customisations out of Ansible and into the deployment templates, again with the hope that that will reduce some of the "oops, the deployer just overwrote my changes" moments.

We're going to add some new features, and we're trying not to turn on the shinies just because they're shiny. Barbican has been asked for for a long time. The particular thing that enables is encrypted volumes, so data encryption at rest; for some of the data access agreements we've signed up to, if you can't encrypt the data while it's sat on disk, you can't have the data. Some people are doing that with things like the Linux LUKS feature at the moment, or other encrypted file systems; having it in the platform will be a lot more convenient. I'd like to offer Manila to users, but unfortunately our Ceph is not yet Luminous and won't be Luminous in time for this upgrade, so that will get pushed back. Octavia we're looking forward to, for the few odd use cases that the existing load-balancer-as-a-service doesn't cater for. And Sahara, I'm not sure. People talk about Hadoop, a few people play with Hadoop, but I'm not sure we'll see enough widespread take-up for it to be worth our while. The Ceph upgrade will go from 10 to 12, Jewel to Luminous: bug fixes and features that will make it capable of stable CephFS in the future. For the back-end storage format, we're going to go from FileStore to BlueStore; that's from the XFS-file-system-based back end to the binary database on disk, and that should hopefully give us a really good performance uplift. If you were at the Ceph day on Monday, you'll have seen Stig Telfer saying we've got two times, four times, maybe six times improvement by going to BlueStore, and we can certainly use some of that. We'll do an in-place upgrade; we think that one will work nicely, we've got tests underway at the moment, and we're quite confident. In some ways it's a victim of its own success.
When you have a casual conversation in the corridor and find someone who's running a tier-one, vital, external-facing service on your OpenStack, it's like: OK, that's good news, I'm glad you're enjoying using it, but I wish you'd told us. So we're looking at things like disaster recovery and business continuity. We've got small OpenStack and Ceph clusters at the Janet shared data centre in Slough. We're looking at global load balancing: we could do proper load balancing across two sites, or we could do active-standby. We don't know yet what we're going to do about identity management or federation. Shall we run a second Keystone down there? Shall we try to copy the database? Shall we do something with cells? I don't know; I need to talk to people. So if you've done something like this, I'm all ears, come and talk to me. We also need to investigate Ceph replication. I know there are things in the RADOS Gateway that can do that, so that's probably the way we'll be going.

We're working on persuading the scientists that this is the data flow model we want them to follow. A scientist who is used to a POSIX file system can just analyse in place, if you like. We try really hard not to let them analyse on NFS, because that's not great for your NFS file server; Lustre is OK for that. So what we're trying to get to is object storage, iRODS or S3, we don't mind: pull the data out, analyse it, push it back; pull it out, analyse it further, push it back; then push it into some sort of archive when you're finished. Some pipelines, particularly the ones being written from scratch, are absolutely doing this, and it's working well.

Beyond that, and above my pay grade, is the evolution and federation of data analysis. There are lots of clouds out there: some public, some private, some hybrid, some collaborations between institutions. The Global Alliance for Genomics and Health is looking at standardised APIs and standardised ways of advertising "I've got some data"; they call it a beacon, because it's signalling "here's the data". If we can have standard ways of getting the data analysis to the data, by shipping an image rather than shipping terabytes or petabytes of data, that's got to be a win. This sort of thing is already starting to bear fruit. This is a slide from a colleague of mine about some cancer genome analysis; maybe you can read it from the back. At the top, 2,800 whole genome sequences needed analysing. That's quite a lot for any one site to do, so OK, let's share the analysis. This is great: we can package the analysis up, either as a Docker container or an OpenStack image, and we can ship that around. That's easy, that's megabytes. We can't ship the terabytes or petabytes of data. And it's working. I'm no mathematician, but at the bottom I can see "happy scientists" in the sum, and if happy scientists are there, then my boss is happy.

To summarise: it sounds a little bit trite, but OpenStack is opening doors for us. It's opening so many new possibilities, so many different ways of doing data analysis, so many different ways of working. Of course it brings challenges, it brings pitfalls and speed bumps, but we're working with it and working with other people, and I think, touch wood, it's all going well and will continue to go well. I've finished a couple of minutes early, so if you have any questions, that's great. Otherwise, my email address is there, we've got a blog, and the slides will appear there shortly. Thank you very much.