So welcome to this talk. We're going to talk about how to use OpenStack in a research environment, and also how to manage OpenStack with a small team, or a one-member team, more or less. I'm Hans-Refis, from Naturalis Biodiversity... oh, it's not working again. From Naturalis Biodiversity Center, which is a research center based in the Netherlands. The website says: at Naturalis Biodiversity Center we want to describe, understand and explore biodiversity for human wellbeing and the future of our planet, which is a very nice way to say it. But basically we do research on life in a very broad sense, from DNA sequencing to geology; all the parts are there. We also have a very big collection of objects: a very big tower with many, many insects and stones and whatever you can find. And since a few months we also have a T. rex, a real T. rex, not a replica. I mean, I still like replicas, but in real life the real thing is better. We are one of the first institutions outside of the U.S. to have a real T. rex, so that's quite cool. My job is being a sysadmin and a DevOps engineer at Naturalis, and I've been working there for about five years now. It's been nice. Our current infrastructure, just to give some context of where we are now: we have a high-performance computing cluster of 35 nodes, a backup-to-disk cluster of 18 nodes, an OpenStack cluster of 18 nodes, and an old VMware cluster that is still running. Our default configuration is OpenStack with Ceph as the storage backend, and instances booting from local disks. About four years ago we were just a simple IT organization within Naturalis. We had some Windows servers running on VMware, and IT was not really a big thing at Naturalis; it was just a service for people to do some stuff. Then came a change, and IT became big within Naturalis. So we wanted a better infrastructure, and we started thinking about changing our IT infrastructure.
And it started with providing compute to scientists. Some of our scientists run quite intensive calculations, and the solution in the beginning was just to buy them a big enterprise workstation so they could do the calculations on there. But they tend to break their computers, and install software in ways that don't work, and so on. So it's a really hard solution to support: it takes a lot of time, and it's also very inefficient. You have one scientist with a desktop workstation who uses it for 24 hours in a week, and the rest of the week it just sits idle behind the desk. So it wasn't a very efficient solution. We thought about different solutions, and one of them was OpenStack. Why OpenStack is a good solution: since it's a cloud system, it's easy to share resources. Instead of a workstation at your desk, you have some servers in the cloud, or in your data center, or wherever you are. And you have a more or less sandboxed environment, so scientists can play around with their OS and destroy it or whatever; we don't care, they can just play around with it. It also gives them a lot of freedom: they can play with multiple instances, et cetera. At the time, which was three years ago I think, maybe four, we also considered CloudStack and OpenNebula, but at that moment OpenStack had the best feature set. And looking back, I'd say OpenStack has been a good choice. Getting your science department onto OpenStack is not a trivial thing, though. First, for some context: the word HPC can mean quite a lot of things, and most of the talks about high-performance computing here at the summit have been, I think, about parallel calculations on big clusters. But at Naturalis, most scientists use single-threaded, high-memory instances, so parallelization is not really a big thing at all. They just want a high-performance desktop in the cloud to do their work on.
So with that in mind we created our first OpenStack cluster. Reasons for scientists to go with OpenStack: they can share resources more easily, which drops the cost of your systems. They have a lot of freedom; they can do what they want. They have access from anywhere, so they can work from home, or somewhere in a jungle, or wherever, as long as they have internet. And, I think a very important part, they can replay their calculations: since you can use the API, or cloud-init, or whatever, you can recreate your research environment in an exactly reproducible way. A downside of using OpenStack, at that moment and still a bit today, is that scientists also need some knowledge of computing, like how to install an operating system. You can help them with that, but a Linux command line shell can still be hard for them to work with. So to get your scientists working with OpenStack, you really have to teach them and help them quite a lot. Even simple things, like creating an SSH key pair, are concepts they don't understand, so you really have to help them learn this stuff. They're not app developers; they're scientists who want to do science on biology or geology or whatever. You have to create really good documentation to go with it. We also did a lot of workshops: just getting people in for a whole day, helping them with their work and getting them into OpenStack. And maybe one of the most important things is to recruit some key users, some early adopters: scientists who are already more or less on the IT side of science, who can make amazing things and also help other, less experienced users into OpenStack. Because as IT guys we can help them with installing software, et cetera.
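To make that reproducibility point concrete: cloud-init is one way to pin down an analysis environment so the same instance can be recreated for every run of an experiment. This is a minimal, hypothetical user-data sketch; the repository URL and package list are made up for illustration, not anything actually used at Naturalis:

```yaml
#cloud-config
# Hypothetical cloud-init user-data for a reproducible analysis instance.
# Booting a fresh instance with this file rebuilds the same environment.
packages:
  - python3
  - samtools        # example bioinformatics tool, version from the distro
runcmd:
  # fetch and run the (fictional) analysis pipeline
  - git clone https://example.org/lab/pipeline.git /opt/pipeline
  - /opt/pipeline/run.sh
```

Because the whole setup lives in one file, a scientist can keep it next to their data and paper, and replay the calculation later on a brand-new instance.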
We don't know a lot about science software; I don't know how to correctly configure a DNA sequencing setup or whatever. So you need this group of users to really get the others into your OpenStack. What was still missing, and still is a bit, is 3D. You can do 3D and GPU stuff in OpenStack now, but as far as I have seen it's not really out of the box yet, so that's still a thing missing in OpenStack. And we actually do quite a lot of 3D work at Naturalis. Also, on the biology side, cloud-ready science software that can run over multiple nodes, multiple clusters, et cetera, is still not there, or it wants one GPU full-time; it's just not really fully cloud-ready. So these are still downsides of OpenStack at the moment. But over the last three, four years we've seen people using OpenStack more and more, and in the end they really like it. They also have the ability to boot a Windows instance and just point and click like they do on their desktop. So they have all the freedom. The second part is about maintaining your OpenStack: how to keep it simple so you can manage your OpenStack more or less on your own, as has happened to us. We started three years ago using it for science, and now we are almost done moving all our workloads, also our applications, web servers and all that stuff, to OpenStack. We're almost there; we still have a few services running on VMware, but we're almost done. OpenStack has grown from, I think, 10 nodes to 17 nodes in about three years, and that's going to be a lot more in the future. So here are some tips and thoughts which I think really help to keep your OpenStack easy to maintain.
As an IT nerd it's very tempting to build some really fancy configuration, but I think it's very important to stay away from exotic setups and just choose the general defaults that most people use with OpenStack. They may not always be the fastest or the cheapest, but, for example for the hypervisor, go for KVM, or maybe Xen, and not for some strange hypervisor that may be really fast but leaves you in trouble in the end when you need support or help from other people. It's also important to pick a sensible configuration. You can do a lot of things, for example you can go for HA, highly available controllers, and so on, but you really have to ask yourself, especially for a science workload: do I really need that? Does my science department really need 24/7 availability of OpenStack? If you remove high availability from your cluster, it just gets simpler, easier to debug, and easier to maintain in the end. Likewise: do you really need shared storage for booting your instances? We boot almost all our instances from local disks, which is much cheaper and also much faster, not always, but in general. So really take a good, hard look at what you actually need in your OpenStack. Also very important, especially if you come from the traditional VMware mindset: get rid of your pets and breed your cattle. I think most people are familiar with the pets-versus-cattle thing. Try to get rid of your very big Windows installations that take many hours to set up, and try to split your servers. If you have a Windows server that does many things, like web server and file server and whatever, try to break it into smaller pieces and make them separate servers. That's a very important thing to make your life easier as an OpenStack admin. Also very important: go for configuration management, with tools like Puppet or Chef or Ansible or CFEngine.
There is a really big array of configuration management tools, and OpenStack is really a perfect platform for working with them. We have almost our full infrastructure as code, in our case using Puppet, and we can recreate our whole infrastructure just by replaying this code. One thing we learned about configuration management is that you have to invest the time to do it really well, and only really start using it once you know it's good. Don't do half work, or it's going to be a pain in the ass in the end. So when you do it, do it properly. Also, and this maybe belongs on the previous slide: try to go for systems that don't depend on highly available storage, so web servers behind load balancers, et cetera, things like that. What also helped us make life a little easier is using the OpenStack API. You have the Horizon dashboard, of course, and the command line tools, but you can also write scripts on top of the API to help you manage your system. For example, we use it to create flavors, to create projects, and to assign access to projects. We also use the API for synchronization with Active Directory; we don't use the direct LDAP connection, just to keep it a bit more simple. Because if you have a lot of users and somebody leaves the company, you generally forget to remove them from OpenStack, et cetera, so it's really handy to use these tools. I have some examples of that. Here is some YAML configuration for creating a flavor: we just define how many CPUs, how much RAM, and how much disk. The nice thing is that we put these configurations into GitHub, so you can refer back to previous configurations and you get change management of how your OpenStack changed over time. This is the YAML file which defines the configuration, and here is the piece of Python code which creates these flavors. It's about 20 lines of code.
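The slides themselves aren't in this transcript, but a sketch of that "flavors as code" idea could look like the following. The flavor names and fields are my guesses, not the actual Naturalis schema, and in the real script the definitions would come from a YAML file in Git (for example via PyYAML's `yaml.safe_load`); here they are inlined as a Python list to keep the sketch self-contained:

```python
# Flavor definitions; in practice these would be loaded from a YAML file
# kept under version control, e.g. with yaml.safe_load(). The field names
# and flavor names here are illustrative.
FLAVORS = [
    {"name": "science.small",   "vcpus": 2, "ram_mb": 8192,  "disk_gb": 40},
    {"name": "science.highmem", "vcpus": 4, "ram_mb": 65536, "disk_gb": 80},
]


def sync_flavors(nova, flavors):
    """Create every flavor from the definition that doesn't exist yet.

    `nova` is any client object exposing the novaclient-style interface:
    nova.flavors.list() and nova.flavors.create(name, ram, vcpus, disk).
    """
    existing = {f.name for f in nova.flavors.list()}
    for f in flavors:
        if f["name"] not in existing:
            # novaclient's positional order is: name, ram, vcpus, disk
            nova.flavors.create(f["name"], f["ram_mb"], f["vcpus"], f["disk_gb"])
            print("created flavor", f["name"])
```

In a real deployment `nova` would come from `novaclient.client.Client(2, ...)` with Keystone credentials; keeping the sync logic duck-typed like this means it can be exercised without a live cloud.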
So it's really simple to use the API this way. We also use it for creating our projects. This is the first piece of the YAML, and the next slide shows the next piece. It defines how the project is named, which groups have access to the project, the quotas, et cetera. And here's the network configuration: how many subnets and floating IPs it can create, and the flavors it has access to. I don't have the code for that here, since it's more complex Python; the Neutron part especially was quite different from the rest. And the last thing that I think really makes life easier is third-party support. We ourselves use Mirantis, but which party you use is totally up to you and your situation. In the end you always hit complex bugs, or bugs that need to get fixed upstream, and if you deployed your OpenStack entirely yourself, it's really hard to push those bugs forward. We ran into that with an older release of OpenStack: the first OpenStack cluster we built ourselves with Puppet, and we deployed it ourselves, but getting upgrades in was really complex, so then we chose to go with a third-party supplier. One caveat, though: I found that even though OpenStack is free and a system without vendor lock-in, once you go for third-party support it gets harder to move to other parties. If you go, for example, with Mirantis, then Ubuntu won't support you; you'd have to really move your whole cloud to a different cloud, et cetera. So choose wisely. And then the question part. First one thing: you can also ask me questions afterwards personally, and there's a colleague of mine here in the corner who can also answer your questions. So, are there any questions? Yeah?
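For reference, before the questions: the project definition described above, split over two slides in the talk, might look roughly like this. All field names are illustrative guesses based on what the talk describes (groups, quotas, network limits, allowed flavors), not the actual Naturalis schema:

```yaml
# Sketch of a project-as-code definition; field names are illustrative.
project:
  name: dna-sequencing
  groups:                 # AD/LDAP groups granted access via the API sync
    - sci-genomics
  quotas:
    instances: 10
    cores: 40
    ram_mb: 262144
  network:
    subnets: 1
    floating_ips: 5       # how many floating IPs the project may allocate
  flavors:                # flavors this project has access to
    - science.small
    - science.highmem
```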
In most cases, the scientists use some compute, and then the machine reservation stays, but the machine is idle. So your question is, I think: how do you manage the resources on your hypervisors in an efficient way when scientists only use them 10 percent of the time? And your first assumption is that the scientists don't pay for the hypervisors. Well, in our company they do pay for them, more or less indirectly. And since at the moment we still have enough resources, we don't really care much about inefficient use, so in our situation it's not really a big problem. But you can, of course, teach the scientists to delete or snapshot their instances when they're not in use. Any other questions? Yeah? I know, that's also a solution, a very good solution even, I think. At the time, though, it was a more complex situation than writing our own scripts, so it's just a matter of how you use it. And I believe you also have to update your AD schema to allow Keystone to use it, but I'm not sure. Any other questions? Well, we use a Mirantis-based deployment, which uses Fuel as the deployment tool, which in the end is, at the moment, still Puppet. And our first OpenStack cluster used plain Puppet. Yeah, the Fuel scripts are written by Mirantis, not by us. Any other questions? No? Okay, well, that was it then.