Hello everyone, I'm Abraham Martin. I work for the University of Cambridge, as you may have guessed from the huge logo on the screen. I was thinking about making a little introduction, but everyone knows what the University of Cambridge is, so I have little to say about that. Instead I thought I could show you some pretty pictures of where I live and work: this pretty place, this nice architecture, this nice river where you can punt with your friends during summer, well, the two days of summer we have a year. The classic Mathematical Bridge. And, as you may know, we have a lot of clever people there, some Nobel Prizes, people who walk around the city dressed like this, in gowns. It seems weird, but when you go there you really do see these academics walking around. That all seems pretty classic, pretty old, but we also have some nice new buildings like this one, the new University data centre, which is one of the top data centres in the UK. It's pretty big, it's green, and there is a lot of innovation inside it. We even have an HPC service which in 2013 had the second greenest machine in the Top500, so it's not just classic buildings and architecture, we also have some cool things.

This is the Computer Laboratory, where I used to work. My relationship with the University started at the Computer Lab, where I did my PhD and then worked as a postdoc. The building is called the William Gates Building because Bill Gates paid for half of it. I now work in this building, the University Computing Service, where we provide IT services for the rest of the University. Both buildings share a common history: they used to be the Mathematical Laboratory, where this machine, brownie points if you know what it is, the EDSAC, one of the first computers in the world based on the von Neumann architecture, was built. We still have some pieces of it in the Computer Lab as a museum. Other things were built there too, including, in the part where I work now, the University Computing Service, Exim, which you probably know about because around 50% of mail servers still use it. So we have pretty cool people working there. I'm not one of them, but we have some very good people who also work on a lot of open source projects.

What I want to explain to you today is a service that was proposed many years ago, the Managed Web Service, and it was born out of a problem. We have a lot of researchers in the University, as you may know, and many of them hold conferences, do research, and want a simple website to show statistics or results from their research, or to run questionnaires, and so on. So they end up running their own web servers under the desk. It was a cheap computer, running under the desk, usually not maintained, because the academic used it for the conference and then left it under the desk; the software was not updated, and then we got security problems, hacked servers, the things you know about. The proposal that the IT service and the University made to solve that problem was to centralise these web services. The solution was to provide a service where you don't have to worry about maintaining the OS or the software; you only have to worry about maintaining the web application.
So we maintain the OS, we give basic web hosting capabilities like external providers do, you don't have to worry about backups, and you get some dedicated resources for your web app. That idea is very, very old, and when I say old I mean about 15 years ago. The first version of the Managed Web Service used Solaris 7 running on a Sun machine, so you can see it was running very old versions of Apache, PHP and MySQL, and it used a chroot-style setup to maintain the separation between the different websites. The second version, which came soon after, provided newer software, Solaris 10, Apache 2 and so on, and it started to use Solaris Zones, which is a kind of virtualisation inside the Solaris machine, a kind of container. So we were using containers before it was cool, but it's still pretty old. It also had more enhanced features, like database-driven scripts, so a script could act based on information stored in a database, which made it easier to manage centrally. Storage was on an NFS server, a very classic shared file system, which also provided snapshots, which is good. And users were able to create vhosts, aliases, et cetera. But the problem was that everything was manual: users sent us an email saying "I want this", and then we made the changes, we executed the script, and the script made the changes. Everything was manual, so it needed a lot of human intervention. When it started to grow, and we currently have more than 200 users and more than 400 websites, it became a bit difficult to manage because it required a lot of time.

Before we got around to making a new version of the Managed Web Service, another solution appeared in parallel, the Falcon service, which is Drupal-based. You only get a Drupal instance; you don't get access to a server or anything. It's just a CMS as a service, and we also have around 200 websites there. So if you go to any University website, you will probably end up on either the Falcon service or the Managed Web Service. For example, one of the most visited websites inside this service is Stephen Hawking's website.

So we decided to build a new service from scratch and rethink what we had done, because we don't have any more Solaris machines: the Solaris machines are ageing and we don't have a replacement for them. We also thought, let's do more automation, so it requires less of our time. We decided to go with classic dedicated VMs, but to keep the same principles as the previous versions: no root access for users, and everything maintained by us. And when I say by us, I mean by Ansible, because we don't touch anything manually; we will see that later. And to put an end to the emails arriving in our inbox saying "can you please install this package, can you please install that", we created a web panel using Django where we delegate some power to the users, so users can do things without having root access.

The architecture is basically a Debian 8 machine. We install the basic packages that we have been installing up to now, like Apache, MySQL and PHP, which is the most common combination, but we support the other available Apache modules too, like mod_wsgi if you want to install Python and Django, et cetera. We have a list of system packages that you can install and that are pre-approved, so you don't end up with a machine full of packages that you don't need or that are strange to have there.
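To make the pre-approved package idea concrete, here is a minimal sketch of how such a panel endpoint could look. This is not the actual MWS code: the `PackageRequest` model, the package names and the URL layout are all hypothetical, only the shape of the check is the point.

```python
# Hedged sketch: delegate package installation to users without root.
# The view accepts only packages from a pre-approved list and records the
# request; Ansible would apply pending requests on its next run.
from django.http import HttpResponseForbidden, JsonResponse

from panel.models import PackageRequest  # hypothetical model storing pending requests

# Illustrative pre-approved list; the real list is maintained by the admins.
PREAPPROVED_PACKAGES = {"php5-gd", "php5-curl", "libapache2-mod-wsgi", "python-django"}

def request_package(request, site_id):
    package = request.POST.get("package", "")
    if package not in PREAPPROVED_PACKAGES:
        return HttpResponseForbidden("Package is not in the pre-approved list")
    PackageRequest.objects.create(site_id=site_id, name=package)
    return JsonResponse({"status": "queued", "package": package})
```

The design point is simply that the user never gets a shell or root: the panel records an intent, and the configuration management applies it.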
And we give them the power to grant authorisation for their sites, create vhosts, apply for domain names, install TLS certificates on the machines, manage their backups, reset passwords, do power management, et cetera. So we give them the power to do a lot of the things we were doing before. They have a panel; don't blame me for the design, it's an in-house design, and if you visit any Cambridge website you will see that all of them look exactly the same. It's just a panel with some options to manage your site. When you create a site, you get this web panel based on Django, with options to create vhosts, ask for domain names, et cetera. And you get an extra VM, a test server, so you can clone your production server to a test server and try things there without compromising your production server. That's good, especially for people who have Drupal installed on the managed web servers: when they upgrade Drupal, really bad things can happen. So you can test it first, and if it goes right, you clone it back to the production server.

So the architecture looks like this, and we will go through it piece by piece and see how we built it. Be aware that this is not a talk about OpenStack, nor about Docker, so don't expect any of that. Most of it uses Python technologies, and we did the project in a few months, and it's still not finished, we are still working on it. Most of it has been done with 1.2 or 1.3 FTE, so it's not many people. So the amount of resources it requires, although it seems like a huge service, is not that big.

Here we have the VM architecture. The VM service is separated from the rest of the stack, so let's start by describing it. The VM infrastructure is just a VMware solution; you may know these VMware setups, it's just ESXi servers, and you manage those ESXi servers using the vSphere control panel. We have a standard backup server where we do the backups, but it's not replicated, so if something happens, we rebuild the VM and recover things from the backup server.

The flow is easy: a user enters the Django web panel and authenticates, so we know who they are, and then they ask for a new managed web server. A hostname and IPv4/IPv6 addresses are allocated to the site, the VM API creates a new VM and installs the OS, and when the OS is ready, Ansible is executed, and Ansible is what configures the whole machine. So we use Ansible as our configuration management, and it does everything we need. For those who don't know Ansible, it's just a bunch of tasks grouped together. They are easy to read, so it's very easy to understand what they are doing. They are separated into folders, which is very good, because you can find the file you are looking for, and there is a clean separation between the different files. So it's pretty good to use. It also has an inventory, so you can define all your servers either statically or dynamically: you can have a file with all your servers, or you can inject the output of another API as the list of servers you have, or even a database, et cetera. So it's pretty nice, it works very well. And it's based on playbooks; a playbook is just a bunch of roles linked to a bunch of targets.
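Just to make that dynamic inventory idea concrete before we carry on: an Ansible dynamic inventory is simply an executable that prints JSON when called with `--list`. Here is a minimal sketch; `fetch_servers()` standing in for a database or internal API query is my own assumption, not the real MWS code.

```python
#!/usr/bin/env python
# Hedged sketch of an Ansible dynamic inventory script.
# Ansible runs it with --list (and optionally --host <name>) and reads JSON from stdout.
import json
import sys

def fetch_servers():
    # Hypothetical: in practice this would query the panel database or an internal API.
    return ["mws-1001.example.org", "mws-1002.example.org"]

if __name__ == "__main__":
    if len(sys.argv) > 1 and sys.argv[1] == "--list":
        print(json.dumps({"webservers": {"hosts": fetch_servers()}}))
    else:
        # --host <name>: no per-host variables in this sketch.
        print(json.dumps({}))
```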
So you have a role, and the definition of a role is the set of things you want installed on the machines that have that role, and then you have targets, and you say: these target machines should get this role, for example a web server. A web server can be a role, and the web server role has a lot of tasks that install Apache, configure Apache, et cetera. This is a playbook: as I said before, you define the hosts where you want to install things, and then you define the roles that the machines in that list will have. For each role you have tasks, templates, which are Jinja2 templates, scripts, handlers and variables. You can also have global variables, or variables passed in at run time. And this is how a role looks: it's just a bunch of tasks inside the role. You can see that here we're installing packages. It's a YAML file, as you can see, and it's pretty easy to understand what it is doing, so if you're working with more people, it's easy to modify the file, change the configuration, et cetera. You can see here that templates can use variables; we use variables there, and therefore we can use the same templates for the configuration of all the machines. You also have handlers, which are basically callbacks: when some task in Ansible is executed, you get a callback afterwards, so, for example, if you have updated the Apache configuration or your Django app, the handler can restart Apache for you, which is cool.

So that is the VM part. We use this VM infrastructure, we use the APIs, we create the VM, everything is good. After that, Ansible configures the machine, and then we can offer the service to the user.

If we start from the top of the stack with the authentication part, we have our own authentication: we use Raven, which is our authentication service, so you can see that we have a lot of services interconnected through a lot of APIs. It is based on a WebAuth-style API, and we had to build a custom Django app for it, but this could be substituted by any authentication you want: you can use the Django one, you can link it to your own enterprise system, et cetera.

The second layer is authorisation. We have a kind of LDAP-ish service called Lookup, and what we have there is just a list of users and a list of groups. We can see, for each user, which institution they belong to and which groups they belong to, so the end user can configure the MWS server based on those lists: they can search for another user and authorise them as an administrator, authorise by group, et cetera. It's just a basic list. We use that instead of Django groups because it is more useful for us: people across the University already use this directory service, they create their groups there, and those groups are automatically updated if someone leaves, et cetera.

So once a user has authorised another user to access the machine or use the service, we still need to install that user on the machine. We have another service called Jackdaw, which provides more information about users; it's like user identity management. From Jackdaw we get a unique UID for the user. We need that because, if we install the same user on different machines, we still need to identify the files that belong to him or her on each of those machines. So they have the same unique UID, and we use that unique UID everywhere the user is installed.
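As a rough sketch of what "Ansible is executed" means in practice here, the background job only needs to invoke `ansible-playbook` against the freshly created VM. The playbook name and inventory path below are illustrative assumptions, not the real MWS layout; the CLI flags themselves are standard Ansible.

```python
# Hedged sketch: run a playbook limited to a single, newly created host.
import subprocess

def configure_vm(hostname: str) -> None:
    """Run the site playbook against one host; raises CalledProcessError on failure."""
    subprocess.run(
        [
            "ansible-playbook",
            "-i", "inventory/mws",   # static file or a dynamic inventory script
            "--limit", hostname,      # only touch the new VM
            "site.yml",               # illustrative playbook name
        ],
        check=True,
    )
```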
Users are installed using Ansible as well; a user is installed on all the VMs where they are authorised. And we do periodic refreshes of the Lookup groups that have been authorised, so if the group membership changes, the authorised people change too. We also allow people to upload their own SSH keys, and those SSH keys are installed in the user's configuration, so they can log in either with their password, which is checked against the directory server, or with the SSH key they have uploaded in the panel. So once they have access to the Django panel and the user is installed on the VM, they can access the machine; everything is configured for them, so they can start using it.

Before that, there is also another communication with the IP register API, which is at the bottom there. This is another external service; as you can see, we have the main service and then a lot of other services that we talk to. It provides the University's registration for cam.ac.uk domains. If you want to register a new cam.ac.uk domain, for example importantstudies.cam.ac.uk, we send an API request from the same Django panel, and we get the domain name alias for that site. Everything is configured automatically; the user doesn't have to worry about the internal processes. This API tells us whether the user is authorised for that domain name, whether the domain name is already in use, et cetera.

The same API also provides us with IP addresses. We pre-allocate some IP addresses together with the hostnames, so that when a user requests a new site they can access it directly using the domain name without having to wait for a DNS refresh. So what we do is pre-allocate some addresses, and then when the user gets their site, they can access it using the hostname without waiting for a DNS update. We use two IP addresses and two hostnames per machine: one for the host itself and another for the service. So if we want to move the service to another machine, we can do it without having to modify the hostname or address of the host; we separate what is the service and what is the host, and we can move the service IP. You will see why this is useful later.

Additionally, we have SSHFP records and DNSSEC. Does anyone know what SSHFP records are? No one? Good. They let you forget about this: I'm sure pretty much all of you have seen a screen like that, the unknown host key prompt. What you do is upload an SSHFP record into the DNS with the fingerprint of your public host key, and if you have DNSSEC activated, you don't have to check the host fingerprint manually, because the DNS does it for you: DNSSEC gives you the fingerprint, and when you connect, the client checks that the fingerprint matches the one in the DNS server, which is talking to you over a secured channel. Then you don't have to check manually whether the machine you are connecting to is the one it claims to be. That's pretty useful.

As I said, we have a lot of services in the same architecture. We also have an inventory there, which uses yet another API; it's based on a JSON API that we can poll and consume. We feed this service the data about all our servers, so we can use it as an external database as well: we know where all the VMs are, which IP addresses they have been allocated, et cetera. It can even be used as an inventory for Ansible, or for other purposes.
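Going back to the SSHFP records for a second, a hedged sketch of how the record data can be derived from a host public key: the fingerprint is just a digest of the base64-decoded key blob, which is why the records can be pre-generated before the machine exists. File paths and hostnames below are illustrative; `ssh-keygen -r` produces the same output.

```python
# Hedged sketch: build an SSHFP resource record (SHA-256 fingerprint, type 2)
# from an OpenSSH public key file.
import base64
import hashlib
import sys

# OpenSSH key types mapped to SSHFP algorithm numbers (RFC 4255 / 6594 / 7479).
ALGO = {"ssh-rsa": 1, "ssh-dss": 2, "ecdsa-sha2-nistp256": 3, "ssh-ed25519": 4}

def sshfp_record(hostname: str, pubkey_line: str) -> str:
    key_type, blob_b64 = pubkey_line.split()[:2]
    digest = hashlib.sha256(base64.b64decode(blob_b64)).hexdigest()
    return f"{hostname}. IN SSHFP {ALGO[key_type]} 2 {digest}"

if __name__ == "__main__":
    # e.g. python sshfp.py web1.example.org /etc/ssh/ssh_host_ed25519_key.pub
    host, keyfile = sys.argv[1], sys.argv[2]
    print(sshfp_record(host, open(keyfile).read()))
```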
So, as you saw, we have a lot of APIs and different ways of accessing them: SSH APIs, REST and non-REST, HTTPS, JSON and non-JSON. And we have to deal with a lot of them asynchronously, because we don't want the main thread of Django to be blocked by them. So we execute them as background processes, using either cron jobs, which is the easy way if the API call doesn't need to run immediately after the user has made the request, or, if you want more control over the execution and scheduling, Celery with a message broker, which is what we use. Celery is pretty good for us because it works very well with Django and it's very easy to configure: on top of the function, you just declare that it is a shared task. You can use different task templates, like this task with failure handling; you can define the number of retries, and in the template you can define what happens on failure, whether to log something or send you an email, et cetera. So it's pretty easy to configure and it works very well. You can also run cron-style jobs from Celery itself; that's called Celery Beat, which is a separate component, but the jobs are configured just as if they were cron jobs, so it's pretty useful for us. And this architecture, Celery, the APIs and these services all support the same flow, all of it Ansible-driven: changes are made in Django, Django stores those changes in the database, and then Ansible is executed, picks the changes up from the database, and applies them to the VMs.

So we had this service, and we went to the community in the University, we ran a workshop, and we said: we have this for you, we thought you would like it. And they said, hmm, we would like it, but what happens if the service fails? And we thought, well, we have backups, you can recover from the backup, it won't take too long, we recreate the VM, et cetera. And then they said, hmm, but I need an SLA if I want to switch to you. We didn't have an SLA, because we had backups and a plan, but we hadn't thought about what happens if 300 VMs fail at the same time, which would take a lot of time to recreate and restore from backups. So some people were saying: we are thinking of changing to MWS3, but only if you provide high availability.

Luckily for us, we designed the application so it can cope with different VM infrastructures, which is good because you don't have to worry about which VM infrastructure you are using: you create the VM through an API, which may be the one we are providing, or it could be an Amazon EC2 server, and then we do everything through Ansible, and Ansible only needs an SSH connection, so it's pretty easy. We just needed to replace this one component, the VM infrastructure. So we thought, okay, let's upgrade VMware to high availability. Then we saw that we would need a replicated vSphere and replicated storage, which we didn't have, and replicated storage for a lot of servers is very expensive to maintain because you need a huge file store shared between all the VMs. We had a lot of pieces missing from that architecture, so we thought it was pretty risky for the little time we had; better to take another approach, because it is also expensive to acquire all the hardware and software we would need. So we decided we didn't want to maintain a huge shared file system; instead, we replicate each VM's file system individually to another one.
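Before moving on to the storage replication, here is a hedged sketch of the Celery background-task pattern described a moment ago. Module, task and helper names are illustrative assumptions (including the broker choice), not the real MWS code; the decorator arguments and the Beat schedule are standard Celery.

```python
# Hedged sketch: push slow API/Ansible work out of Django's request thread.
from celery import Celery, shared_task

app = Celery("mws_panel", broker="redis://localhost:6379/0")  # broker choice is an assumption

def run_ansible_for_site(site_id):
    # Hypothetical helper: would call the VM API / ansible-playbook here.
    ...

@shared_task(bind=True, max_retries=3, default_retry_delay=60)
def apply_changes(self, site_id):
    """Apply pending panel changes to the VM; retried instead of blocking the web request."""
    try:
        run_ansible_for_site(site_id)
    except ConnectionError as exc:
        raise self.retry(exc=exc)  # try again later, e.g. if an upstream API is down

# Celery Beat: periodic jobs configured much like a cron entry.
app.conf.beat_schedule = {
    "refresh-lookup-groups": {
        "task": "mws_panel.tasks.refresh_groups",  # hypothetical periodic task
        "schedule": 3600.0,                        # every hour
    },
}
```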
So we thought we could still use the VMware infrastructure, with a Pacemaker/Corosync cluster, which is basically a cluster that checks that all the VMs are in contact with each other and can then move the service network configuration. This is why it is useful to have the separate service address: it can point to either of the two production VMs. So we have a replicated VM; the second column is just a VM that sits waiting, and if something fails it takes over and starts acting as the active VM of the cluster. And we replicate the storage individually for each of the VMs using DRBD, which is basically a driver that sends all the writes of one machine to the other VM, so the storage is replicated to the second one, and Pacemaker makes sure that, if one of the components fails, the switchover happens automatically, so we don't have to worry about it.

But then we thought: this means maintaining a lot of clusters. We would end up with one cluster for each of the VMs we are going to have, because each VM would need its own Pacemaker cluster, and that is very expensive to maintain and it can fail. If we need to run Ansible, we need to run it on both sides, on the two VMs, so they stay synchronised; that's a lot of work and it's going to break quite easily. So we thought, okay, let's start from scratch. We moved away from VMware and decided to use Xen. Xen can be configured very similarly to the VMware setup: you can see there are two Xen servers there, and they also run Pacemaker, but the difference is that we don't do clustering per VM, we do clustering per Xen server. The Xen servers have a lot of VMs inside, so if something happens to one of the servers, all the VMs inside that Xen server are automatically migrated, with live migration, to the second one, and you don't notice anything. Even open connections are kept; you don't notice that the switchover has happened. With the VMware solution you would have to wait until we restarted the VM, that kind of thing. With Xen you don't notice anything: underneath, your VM has moved from one Xen server to the other, but it's completely transparent to you.

Doing that is a bit more complicated, because this is the file system layout we had to use. It's a fairly complex layout: these disks are tied together in a physical volume, and then the column on the left is dom0, which is basically the operating system running on the Xen server, and each of the other columns is one of the independent domU Xen guests. All of their storage is replicated to the other Xen server, which is what makes the live migration possible.

So we had a happy transition. It's working well now, and the architecture is the same, because we designed it so we could change the VM provider just by changing the API. The API is an intermediate layer, so we only had to write an API for the Xen servers, and everything else stayed exactly the same. We are happy with that, and it seems people may be happy with it as well. We changed from the VMware solution to the Xen solution, which is a three-node cluster with the nodes in different locations. We can do live migration, so the user doesn't notice anything; it's still all driven by Ansible, and we also use Ansible to deploy more clusters.
So this is an example of the Xen server clusters. We can deploy many of them, and it is easy because it's Ansible: if we want to create more Xen server clusters, it's as simple as getting the physical machine, starting it, and launching Ansible.

Let's talk for a minute about security, because we like security. Well, I am not an expert in security, but we like to enforce security for our users so we don't end up with problems. We decided not to use root passwords when we create the guests, so we don't have to manage root passwords at all; it is difficult to keep a lot of root passwords secure in a database, and we don't want to manage them that way. So we only use keys: we connect to the machines using keys, Ansible connects to the machines using keys, et cetera. We have separation of privileges; for example, we pre-generate the host keys of the guests, because we need to upload the SSHFP records before the machine is even created, so we keep a pool of host keys that we can install on future machines. And we use userv, which provides a useful interface so that commands can be run as root, or as other more privileged users, on behalf of users, based on some filtering and templating.

We also provide a TLS certificate service, an additional one, because we want to follow the new initiatives from the EFF, Mozilla and Akamai: HTTPS Everywhere and Let's Encrypt. Let's Encrypt is an open CA that provides you with a free certificate for your web page; HTTPS Everywhere is the EFF trying to push everyone towards HTTPS web servers. Even the HTTP/2 specification doesn't enforce HTTPS, but a lot of people are saying that when we move to HTTP/2 everyone is going to be on HTTPS, and that is effectively true: the specification doesn't say it, but the implementations from Microsoft in Internet Explorer, Mozilla in Firefox and Google in Chrome only support HTTP/2 over HTTPS. And we like to test our servers; I encourage you to do the same. If you go to SSL Labs you can get a grade for how secure your web server is, which is pretty good because it gives you hints about whether you have any open vulnerabilities, which version of OpenSSL you are running, et cetera.

Apart from security, changing topic, we also use some metrics and logging systems, so we can give users information about how their host is doing. For example, we use a metrics service, which is basically statsd and collectd installed on each of the machines, also via Ansible; we have a cluster of message brokers that gathers this information from all the hosts, and then a Carbon/Graphite cluster that stores the information gathered from all the machines, so the user can see these graphs in their web panel and see how the machine is behaving. And we are now trying to implement Logstash, Elasticsearch and Kibana, which also give information about the host and the web server: how it behaves, where the visits come from, how it behaves over different periods of time, et cetera. You can have a lot of logs gathered by Logstash, stored in Elasticsearch, and then shown in Kibana.

So that's pretty much all I wanted to talk about. I hope you liked it. Thank you.

Thank you, Abraham, and we have a few minutes for questions. Any questions? Yes?

Why did you choose Xen instead of KVM, for instance? I mean, what made you
choose one thing and not the other?

This was a long discussion we had among the developers, which is basically three of us. We didn't have a strong reason to choose one or the other. Doing a bit of research, we saw that Xen worked a little better with DRBD, which is one of the main components we wanted to use, because we wanted to replicate the storage from one Xen server to the other, and we saw that it integrated well with the Xen server itself, so we decided to go that way. But we could have chosen KVM; it was on the list of products we had to research before deciding.

Hi, thank you for the talk, it was really fun. But I didn't really understand the subtleties of the last architecture, because you had several hard drives, and several Xen servers that seem to overlap on several drives, so were those virtual servers, or...?

Yes, so the whole picture: this is the picture of the VM architecture at the top, and this is more the view of the file storage. This is a single machine with a RAID of disks, and this is the file storage for a single machine. You have the physical volume, and then the first column is what is called dom0, which is the operating system that manages all the VMs; when you access the Xen server, you access this. It is also a VM, but it has direct access to the hardware, unlike the other VMs, which go through the hypervisor. All the other columns are each a Xen guest, one of these VMs, or domU as it is called in Xen, and each of them has a DRBD device, which is basically a virtual block device that is replicated to a matching DRBD device on the other server. So each of these DRBD devices sits in this list of virtual block devices, and they are replicated over the network in real time when they are written, only on writes, because they are synced to the secondary Xen server. It is done automatically over the network as it happens, and the point is that each VM has its own DRBD device.

Okay, thank you.

Okay, we have maybe one minute if anyone has a really quick question. Okay, please join me in thanking Abraham once again.