So, hello and welcome everyone to our presentation from Cloud and Heat: challenges and advantages of a highly distributed cloud that also heats homes. My name is Amir Feghi, responsible for Business Development Cloud at Cloud and Heat, and together with my colleague Stefan Schlott we are going to dive into our concept of a distributed cloud infrastructure. The outline of our presentation is: I'm going to give you a brief overview of what we do and who we are, then Stefan is going to present the general concept of our geographically distributed cloud infrastructure, also describing use cases, challenges and advantages, and at the end we will of course leave some time for your questions.

So, Cloud and Heat: what do we do, what's so special about us? Our business model unites two markets, on the one hand the traditional heating market, and on the other hand the fast-growing cloud computing market. Classic data centers use additional energy to cool their servers; in our case, we use the waste heat from the cloud servers to heat buildings and provide warm water. This way we offer an efficient green-tech alternative for both markets, making us the green cloud from Germany. This picture illustrates the typical cost structure of traditional data centers. There, servers are set up in dedicated buildings; in our case they are placed in existing buildings, so no building or rental costs are incurred. Our servers do not need to be cooled, so we also save on energy costs, and together this gives us a cost advantage of roughly 50% compared to traditional data centers. This also has an impact on the environment, and that's why we consider ourselves the green cloud. This is a picture of a multi-family dwelling: the heater emits hot water, which is stored in a buffer storage, and the buffer storage is then used for heating and for providing warm water. That was a brief introduction of who we are and what we do in a nutshell, and now I'm going to hand over to Stefan, who is going to present our geographically distributed cloud infrastructure.

Hello everybody, my name is Stefan Schlott, I'm team lead IT Architecture and Integrations at Cloud and Heat. I've been working with OpenStack since the Cactus release, and I will do the more technical part of the presentation. Because of our business model, our cloud consists of many more, smaller micro data centers compared to what you would normally get with a bigger public cloud provider like AWS or Rackspace, because the size of the data centers is basically determined by the size of the basements we can place the heaters, or rather our servers, in. Because of the different sizes of the deployments and different connection characteristics, we have many heterogeneous data centers, and right now they are deployed all across Germany. To the right you see a picture where the blue bubbles basically mark the internet connections and the rest are the data centers. Because we are a fairly small company compared to HP or IBM or companies like that, we have a very small team when it comes to development and DevOps. That's why we follow the integrator approach: we tend to look for open-source software first when we have a problem. As an integrator, we try to pick the best of breed, so pick the best solution which is out there.
If there is no open-source solution, we also go to proprietary solutions. The whole point of the integrator approach is that even though we are almost always trying to fix our own issues, we are probably not the first ones having a particular issue. So there's probably already something out there which we can use as it is, or with only minor adjustments. These minor adjustments are what we normally call the glue. So we pick the best of breed out there and just add the bits and pieces to make it work for us. As I said in the beginning, this whole approach is not really by choice; it's actually a necessity if you only have a small team. You can't really start from a green field very often, and you don't have that many developers to build something big on your own. And building everything yourself is also not a good way to do it, actually.

Here's an excerpt of the open-source tools we use at Cloud and Heat. When we choose open-source tools, we tend to focus on the tools being fault tolerant, so that they give us less pain when it comes to operations later on. They should normally also have features like scalability, because we have so many small data centers, so that we can easily switch between a local scenario, a single data center, and a distributed scenario. Okay, there's the list of tools we use. The tool we probably use the most is HAProxy. We not only use it for load balancing, we also use it for some security features and even for some small bug fixing; it's quite versatile for such use cases. Internally, for the central database, or for all the databases in OpenStack like the central database, we try to use solutions which have at least three replicas, so that if something goes wrong it's easier and quicker to come back from it. So we use a MariaDB Galera cluster for the normal backend database, and we use MongoDB with Ceilometer. We also use quite a lot of web servers, Apache, nginx and WSGI, not only for the OpenStack API endpoints but also for the internal services we provide. Our internal services we normally write using the Python Flask micro framework.

Yeah, that's on the next slide. Here is more detail on how we develop our glue, how we provide our glue. We tend to use a microservice approach. I know it's a buzzword, but it's actually a good description of what we do: we try to do as little as possible when we provide glue. The whole thing is a service-oriented architecture, like OpenStack itself. We try to keep our services or middlewares as small as possible. The services should be loosely coupled, meaning that a service only does one particular thing, and does that thing well. We use the Python Flask framework for it, and normally our internal services provide an easily consumable HTTP REST API. So by design these services are already prepared for later distribution: even if they initially run in a single local deployment, because of the design choices with the REST API and so on, they are already designed for later distribution, in case you want to distribute them across more than just the local deployment. This whole approach with the Python Flask framework gives you a lot of flexibility, unlike for example Django, which is a much bigger framework. With Django you get a lot of batteries included already, so a certain amount of choices are already made for you, what kind of backend you are going to be using and things like that. With Flask you only get a very small core and you can choose your backends, for example.
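As a minimal sketch of what such a Flask glue service can look like (the endpoint name and the data here are purely illustrative, not an actual Cloud and Heat middleware):

```python
# Illustrative sketch only: a tiny Flask "glue" microservice exposing a REST API.
# The resource name and the data are made up for this example.
from flask import Flask, jsonify, request

app = Flask(__name__)

# In-memory dict as a stand-in for whatever backend (SQL, MongoDB, ...) gets chosen.
DEPLOYMENTS = {"dd-01": {"status": "ok"}, "dd-02": {"status": "maintenance"}}

@app.route("/deployments", methods=["GET"])
def list_deployments():
    # Plain JSON over HTTP, so other services can consume it while staying loosely coupled.
    return jsonify(DEPLOYMENTS)

@app.route("/deployments/<name>", methods=["PUT"])
def update_deployment(name):
    # Accept a JSON body and store it under the given deployment name.
    DEPLOYMENTS[name] = request.get_json(force=True)
    return jsonify(DEPLOYMENTS[name]), 200

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

Because it is plain HTTP and JSON, a service like this can later sit behind HAProxy or be replicated into other deployments without its consumers noticing.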
We could use MongoDB instead of an SQL-based backend, for example, and things like that. It makes the services very slim and very flexible for our use cases. The whole approach of using microservices and Flask is also good not only for internal use cases but also for integrating with external partners. Last but not least, if you use this microservice approach you will probably have data, and when you have data you should take care to store it in a fault-tolerant, distributed data store, like for example MongoDB or a Galera cluster for larger data, or, if it's very important state data of limited size, then you could use something like ZooKeeper or etcd.

Some more details on the monitoring we use. At Cloud and Heat we use the Open Monitoring Distribution, specifically mostly Check_MK from it. The system is based on and compatible with Nagios, so you can use a lot of the Nagios checks already out there and you don't have to write new checks if there's already something available, for example a RabbitMQ check. There's no need to rewrite these checks, you can just use them, and the Open Monitoring Distribution also provides some other nice monitoring tools, like NagVis or PNP4Nagios for visualization, for example. One of the most important reasons we use Check_MK, and the whole Check_MK system, which by the way is also used by bigger German companies like, for example, the airport in Munich, is that it's already designed for distributed monitoring. In the picture you see the top layer, that's the centralized monitoring server, and the lower layers are three deployments. Every deployment itself has a monitoring dashboard and monitoring infrastructure. From every deployment to the centralized monitoring there's an encrypted channel, and via this channel you can get the Nagios state data on demand if you want to. You can even go to the centralized monitoring, use the centralized dashboard to control all of the other deployment dashboards, and get, for example, pictures of the state on demand via HTTP over this encrypted channel. That's one reason why we chose this; the other reason is the Check_MK BI module, which stands for business intelligence. This module helps you to aggregate one service check out of multiple smaller sub-checks. For example, if you have a complete LAMP stack with a lot of instances, you don't actually care about the individual states of the load balancer and every one of your web servers or the database; you only care whether there are still enough machines running for the whole system to be okay. You can do things like that, so that the operator only sees one value, the value that matters. If you see that value is broken, then of course you drill down, see what is actually broken, and fix it.

For authentication we use Keystone, a normal Keystone with the LDAP identity backend. In every deployment there are multiple Keystone servers running with an LDAP backend, and there are load balancers in the deployment. We use OpenLDAP, distributed across our deployments. In the actual deployments we only have read-only copies of the LDAP, so you can't really change user names and passwords and things like that there. For the actual management we have a centralized Keystone and OpenLDAP master. That is where the user management lives, where admins can disable users and things like that, and it's attached to our self-service portal, the web console and the registration process.
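Going back to the Check_MK BI idea for a moment: the aggregation essentially derives one service state from many sub-checks, for example whether enough web servers are still up. A small, purely illustrative sketch of that logic, not the actual BI rule syntax:

```python
# Illustrative only: aggregate many sub-check states into one service state,
# in the spirit of Check_MK BI ("are enough machines still running?").
OK, WARN, CRIT = 0, 1, 2

def aggregate(states, min_ok):
    """states: Nagios-style states (0/1/2) of the individual instances."""
    ok_count = sum(1 for s in states if s == OK)
    if ok_count >= min_ok:
        return OK      # enough members healthy, the operator sees one green value
    if ok_count > 0:
        return WARN    # degraded, but the stack is still serving
    return CRIT        # nothing left, time to drill down and fix it

# Example: a LAMP stack with five web servers, of which at least three must be up.
print(aggregate([OK, OK, CRIT, OK, WARN], min_ok=3))  # -> 0 (OK)
```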
Coming back to Keystone: what we want to do in the future is to add a more advanced, more fine-grained user management. So, for example, we could provide temporary test accounts in certain smaller deployments where you don't have to provide a credit card, something like a temporary test account, so that it's easier for customers to get a feel for our cloud without having to provide too many credentials. And we definitely want to add support for federation later this year; we are one of the 30 cloud providers who want to provide the Keystone federated identity backend which was mentioned in the keynotes on Monday.

Some more details on how we do metering, which of course leads to the bills in the end. Metering is based on deployment-local Ceilometer instances, so like a normal Ceilometer they basically just record and take the samples of the resource usage in that deployment. What we do on top is a hierarchical aggregation based on location, which would be the deployments, and on time frames. The time frames would be hourly for deployment utilization, daily for usage and billing projections, and monthly in the end for the bills, the monthly invoices. For this we wrote a little middleware which basically uses the Ceilometer API to extract the information. There's an asynchronous aggregation of the data; it produces an intermediate, aggregated state of the data which can be replayed into the central database, where it then goes up to our CRM system and the billing infrastructure. It has to be asynchronous because Ceilometer can take quite a long time if you have a lot of data in MongoDB, and because it's asynchronous you also get this possibility of replaying. The intermediate data we aggregate from the Ceilometer instances we store in multiple geographically distributed Swift deployments, also for further fault tolerance. We use Ceilometer not only for our billing purposes; we also use the meters in connection with Nova to get an overview of how our deployments are utilized. Basically, the check is nothing more than a static configuration file of how many resources you could actually use in a deployment, the maximum resources, compared against the current state. The checks are divided into static and dynamic checks. The static check would always say, okay, this is the number of instances I'm capable of launching in this deployment. The dynamic check would also take into account the data from Nova, for example if a host is down for some reason and can't be used for scheduling; it might be maintenance, it might be a temporary error or something. And this whole check is output as a normal Nagios check line, a Check_MK check, which we can then of course use with the centralized monitoring I showed a few slides ago, and aggregate the overall state of how our deployments are utilized or not.

Okay, now I'm moving more or less from the internal view to the customer view, where we'll show advantages and also challenges. So that would be one advantage: you get higher fault tolerance for your application if you run it on our cloud and distribute it across deployments. It can help you to avoid problems which you would have in one of the local deployments, like power outages, hardware failures, or performance degradation. In order to do this, your application of course needs to be cloud aware. The application needs to have some knowledge that it is actually distributed, that the deployment it is running in is not the same as the one another instance is running in.
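What that awareness can mean in practice: every deployment is its own region with its own endpoints, so an application can ask the Keystone service catalog which regions it can reach and choose deliberately. A minimal sketch against the Keystone v2.0 API; the auth URL, tenant and credentials are placeholders:

```python
# Illustrative sketch: discover the available regions (deployments) from the
# Keystone v2.0 service catalog. Auth URL and credentials are placeholders.
import requests

AUTH_URL = "https://identity.example.com:5000/v2.0/tokens"
payload = {"auth": {"tenantName": "demo",
                    "passwordCredentials": {"username": "demo",
                                            "password": "secret"}}}

resp = requests.post(AUTH_URL, json=payload)
resp.raise_for_status()
catalog = resp.json()["access"]["serviceCatalog"]

# Each deployment shows up as a separate region in the compute endpoints.
regions = sorted({ep["region"]
                  for service in catalog if service["type"] == "compute"
                  for ep in service["endpoints"]})
print("Deployments visible to this account:", regions)
```

An application built this way can then decide, per region, where to place its workers or where to fail over to.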
It at least helps a lot if you design your application from scratch with this in mind. A popular example of the kind of application which would make use of this is a portal or web application. You could think of a website which even needs state, has a shared state, and that shared state could live in a wide-area replicated MongoDB or Galera cluster, for example. That's the shared part, and the rest, the actual website rendering and so on, is stateless, and you can distribute it across multiple deployments. In front of it you would have something like round-robin DNS, or a failover system at the DNS level. So if one deployment goes down, the website is still online; it doesn't go down.

We are well aware that the companies producing websites or building things like this are not necessarily the companies who also want to take care of distributing their infrastructure themselves across different deployments. For that reason we also provide a product called App Elevator, which is basically a platform as a service that we automatically deploy across different distributed deployments. We are partnering with cloudControl from Berlin, who provide a white-label platform as a service; you might have heard of them because they are the guys who bought the dotCloud business from Docker and are currently running dotCloud. So we are basically combining the knowledge and the proven ease of use of the platform as a service, from a provider who knows what he is doing on that level, with the infrastructure as a service level that we provide. We connect the deployments with a virtual private network, so two or three distributed IaaS deployments, and on top the platform as a service is running. cloudControl, our partner, takes care of maintaining the platform as a service, so service updates and so on and so forth. They have quite a good level of automation, so if there is a new bug out there, it is usually fixed within the day. Also good is that they are compatible with the Heroku buildpacks, so they are compatible with the Heroku platform as a service, and they enforce a stateless model on the user. So when a user designs or provides a website, it has to be stateless, in the sense that he has to use one of the add-ons like MySQL or MongoDB to actually store state. The application itself is stateless, so they can always just kill the application and re-run it on another container host. And when a user uses MySQL or MongoDB, he doesn't have to take care of the replication, because in the end you just attach to an endpoint and the actual data is already replicated: the platform as a service provider, or we, already provide a replicated MongoDB or a replicated MariaDB cluster. So the one who designs the website only has to take care that it looks nice and can push his stuff to a Git repository. It also enforces a model like development, staging, production, things like that. There are some links down there.

Okay, now I'm coming to one of our biggest challenges, actually. I said at the beginning that we normally have much smaller deployments, or not that much smaller anymore, but smaller deployments than what you would get with a normal cloud provider like AWS or Rackspace, because they have huge buildings with a lot of servers and we are basically limited by the space we get in the basement. The basements are getting bigger, so the data centers are not that small anymore. For most customers, who only want to spin up a few instances, it's not a big issue.
At the time of registration we do a kind of static load balancing: a few customers go here and a few customers go there, and the deployments all have the same feature set, so they don't necessarily see where they are or what they're working with. For bigger customers, we give them awareness of which deployments they can work with, or they even ask us. If you have a bigger customer who, for example, wants to spin up a huge batch cluster, a lot of instances, then of course it's an issue, because all our deployments are separate regions. They all have their own Keystone catalog, all separate endpoints; apart from the authentication data, the LDAP in the backend, nothing is shared. So if you want to spin up a lot of instances, there are fewer resources addressable via a single API endpoint. But there are also solutions to that. Just as an example, for the batch cluster there are systems which are kind of built for this as well, so you can deploy a batch cluster across different deployments in a hierarchy, for example, so that you would have the batch masters talking to each other. That's because the whole batch system world in scientific computing comes from a global community which has data centers all across the world. Those data centers are bigger, of course, but it's quite similar to what we provide in terms of distributed data centers. Or if you are a normal web application developer, you could use tools like the App Elevator we provide.

Another issue might be that you want to have instances in different deployments and want to have local connectivity between them. There we can provide VPNs linking deployments, so you can actually use the local IP of an instance in the other deployment and it looks like local access. That's of course not great if you have a lot of data you want to push through, but for such use cases you would probably use a distributed data store like MongoDB or Galera across the deployments, or you would use some public API endpoints to interchange data, for example a Swift endpoint or a distributed message queue or something like that.

Okay, that's another challenge, or a question which often comes up with customers who want to run a bigger workload with us: data distribution. Data distribution means, for example, input data for a batch job, output data from a batch job, or distributing images you have prepared yourself. When this question comes up, I often ask if it's really necessary. The customer should have a look, assess or even reassess their workload and the architecture of the whole system, and check whether it's really necessary to distribute one and the same data across all deployments. Because quite often the workload is capable of parallelization: you could run different workers in different deployments and shard your data according to where the workers are running. And if you need to transfer data, it might be possible to have a look at the data and minimize it before you transfer it. So you could aggregate just what you really need and let the rest of the data stay in the deployment where it was computed, like, for example, what we do for the metering, where we only transfer what we really need up to the central metering server, the billing server.
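A toy sketch of that sharding-and-aggregating idea, with made-up region names: assign each work item to one deployment so that each deployment's workers only touch their own slice, and only small aggregates ever have to cross deployment boundaries:

```python
# Toy sketch: shard work items across deployments (regions) so each deployment's
# workers only process their own slice. Region names are made up for illustration.
import hashlib
from collections import defaultdict

REGIONS = ["dresden-01", "berlin-02", "frankfurt-03"]

def region_for(item_id):
    # Stable hash-based assignment of an item to one deployment.
    digest = hashlib.sha1(item_id.encode()).hexdigest()
    return REGIONS[int(digest, 16) % len(REGIONS)]

shards = defaultdict(list)
for item in ("job-%04d" % i for i in range(10000)):
    shards[region_for(item)].append(item)

# Each deployment would now fetch and process only its own shard locally;
# only the aggregated result needs to be transferred back, not the raw data.
for region, items in sorted(shards.items()):
    print(region, len(items))
```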
Also, one other thing: if possible, you do not want to distribute images, or instances for that matter. What you actually want to have distributed is the data you work on. The images or the instances are normally only the framework, like an Apache running or something like that. And for things like this, so that you always have the same environment running, you would normally use a limited image catalog, saying, okay, in every deployment you have the same Ubuntu cloud image or CentOS version or something like that, but a limited set. Then you use automation tools like cloud-init and Chef and Puppet to stand up your environment. So you have an environment, and the only thing you have to distribute is actually the data you want to work on. That data is then normally stored in a distributed data store, like MongoDB, which I already talked about a couple of times, or Galera; at least MongoDB is by design already well suited for wide area networks. Yeah, and the last sentence more or less already told you: you really have to take care about what you actually need to transfer, and think about it twice.

Okay, here is an advantage you could get; of course, you need an application that can work with it. You could parallelize your workloads across different deployments. You could get a much lower latency than with the bigger cloud providers, even if they sit right next to the big internet exchange points, because we sometimes have deployments which are connected to the local service provider ring, and if the actual customer is closer to that service provider ring, then the latency is lower. You could take it to the extreme with a private cloud or hybrid cloud approach, where you basically have your servers running in your own building; then of course the latency is like a local area network, right? What we can provide is a mixture of private cloud and public cloud. This is an example where you would have centralized billing across the deployments. At the top there's a centralized billing server, like for a normal public cloud. In the next layer there are different deployments. On the left there's a separate, dedicated deployment for a specific customer; we can do this because our deployments are usually smaller, or it's actually a deployment which is running in the same building as the customer. The normal deployments are attached to our standard authentication backend, and the separated deployment would be connected to the customer's own, I don't know, Active Directory or LDAP database, which could also be merged with the external public cloud database. If you want to have a mixed setup, for example external people working for you, then you can mix it with the public cloud database, or you could totally separate authentication with your own database. But in the end, with this hybrid approach, you can get your bills from the normal deployments and your separated one all in one bill at the end of the month.

Okay, because of the smaller deployments, we actually also have much higher flexibility. We can swap out the software backends, also thanks to OpenStack, for example for Cinder, for block storage, or for some of the other systems, and we can swap out the hardware. So we can tailor the whole deployment, hardware and software, much more to the workload a customer might run on it, and we can tailor performance versus cost effectiveness. So if it's a workload where it makes no sense to use SSDs, then we go for HDDs or something like that.
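To make the earlier point about a limited image catalog plus automation concrete, here is a hedged sketch of booting the same small environment in two regions, with cloud-init doing the setup. It uses the classic python-novaclient interface of that era; the regions, image ID, package choice and credentials are placeholders, not Cloud and Heat's actual setup:

```python
# Illustrative sketch: boot the same environment in two deployments (regions)
# from an identical base image, letting cloud-init do the per-instance setup.
# All names, IDs and credentials are placeholders.
from novaclient import client as nova_client

USER_DATA = """#cloud-config
packages:
  - apache2
"""

for region in ("region-a", "region-b"):          # one entry per deployment
    nova = nova_client.Client("2", "demo", "secret", "demo",
                              "https://identity.example.com:5000/v2.0",
                              region_name=region)
    nova.servers.create(name="web-%s" % region,
                        # same image ID kept identical in every deployment's catalog
                        image="0f1e2d3c-placeholder-image-id",
                        flavor=nova.flavors.find(name="m1.small"),
                        userdata=USER_DATA)
```

The only thing left to distribute is then the data the instances work on, not the images themselves.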
Okay, and here's a slide on the outlook, with a focus on the distributed environments, of course. We definitely want to use multi-region Swift, a geographically distributed Swift across a wide area network. We also want to look at whether we can use it as a Glance backend, so that it's easier to keep the image catalog up to date. We tried this multi-region Swift briefly in a lab about a year ago, but we didn't proceed with it, because we have a small team and many other things to do, and it wasn't so pressing at the time. What we definitely want to add is the Keystone federated identity backend. We have had a lot of customers asking us, why should I, again, always provide the same user name and password, right? So it's much simpler for the customer if they can use a more limited set of API endpoints or dashboards. We're also looking into containers and container orchestration, for internal use cases but also for Cloud and Heat products, so that customers could write their software and the dev environment is easily pushable to our cloud without too much adjustment and runs there. One other thing which is very interesting for us is orchestration, which again makes it easier for us and also for the customer to handle this amount of resources, and of course not only simple single-deployment orchestration, but also cross-deployment orchestration. Okay, that was the last slide. If you have any questions, we'd be happy to answer them. And there are our email addresses as well.

Just wondering: my boss actually saw something about this a couple of months ago, pointed it out and said I had to come and ask about you guys. What do you do in summer?

It really depends on the deployment. There's always the failsafe of just venting the heat out, right? But that would mean that your service... No, no, I'm saying it depends on the deployment. In the end, if everything else fails, the servers don't get overheated, we can just blow the heat out. But normally we have deployments where we provide just a baseline, so even summer is not a problem. We have buildings where we work together with local utilities and we only provide 10 or 20% of the baseline, so it's not really a problem in summer, because they can regulate the other 80%. That is one possibility. We also talk to people who have a pool or something, so that the re-cooling is already much better in place. So it's definitely an issue, but it's not an issue which really affects the servers in the end.

What would it take to deploy this elsewhere in the world, where large-scale networking is a much larger problem? Here in North America, for example, there is crappy internet for everybody.

Yeah. I guess it's more a problem of the power supply and things like that, and other problems, than the internet connection. Right now we are just in Germany. We are thinking about expanding, but then I guess we will start expanding in Europe, where the infrastructure level is closer to what we get in Germany.

The only place here I think it would probably work out is large apartment buildings.

Yeah, that's what we do. So we started with single houses; we actually started with a proof of concept in the house of one of the founders, which is a very energy-efficient house. Then we moved to bigger apartment buildings.
We actually have bigger apartment buildings, a freshly built one, where we have heaters in the basement, yes. That's the perfect thing for us, for office buildings or things like that. And then nobody complains about how loud the servers in their basement are.

Intriguing concept, I like it. Nice presentation, by the way. One question: how many sites, how many micro data centers do you currently have?

It's 40 to 50, I think, but the size varies a lot. We have four or five of the bigger data centers, and many of the smaller data centers we use for testing or for separate customers, like I explained. And it's expanding; actually, the demand for building new data centers is quite high, so we are at a level where we have to say, no, right now we're not building any new data centers, we have to wait for the other things to catch up.

Is it mainly private persons that approach you, or is it mostly organizations?

It started with private persons, yeah, like smaller buildings, but we are moving to bigger buildings like apartment complexes and things like that, yes. That's more how we want to do it, because yes, we can handle smaller deployments, but it's also better for us to have a bigger deployment, so apartment buildings or bigger office buildings are what we prefer right now. It's also more focused for us with a small team.

Okay, one final one. One of the things I see when I look at OpenStack is that it does not have a lot of infrastructure in place to understand the heterogeneous nature of a cloud. What are the main limitations that you have seen in that respect?

It could be better, definitely, yeah; there could be something like tagging to describe deployments much more effectively and to use that for scheduling. We actually wrote our own schedulers inside the deployments to have some awareness of where the cold parts and the hot parts are, things like that, but that's actually quite easy, at least in terms of the scheduler. So we didn't have that big an issue providing our glue. It could have been easier, but it's not really a problem, because of the lot of APIs you have and can work with. You don't necessarily need to fork anything and go down into the code; it's still at a level where you can just put our middlewares and our glue on top, yeah.

Okay, thanks. Just a couple of questions, I'm more curious than anything. The first one is, what do you use as a deployment mechanism? Are you using Ansible, Chef, Puppet or a combination of the above?

We're using Chef mostly, yes.

Okay, cool. Are you using the community cookbooks that Chef has for OpenStack, or have you got your own?

I think so, yeah, we use the Stackforge cookbooks as a basis, yeah. There are some adjustments we had to make.

Oh, I'm sure, yeah. We've got a ways to go to get those cookbooks right, but we're making progress. I'm intrigued by it, but I'm also thinking out loud a little bit here. I'm from IBM; I used to work on the whole Smarter Planet initiative, and my job was energy management. It was about monitoring the energy you're wasting in your data center and how you can control it. But that's one data center; if you improve things a little bit in your data center, you save thousands, hundreds of thousands, millions of dollars. What I'm curious about here, with your physical separation, if you will, of your smaller data centers: isn't there a cost for the on-site work? Somebody has to go there and fix these things when they break, so there's a cost for travel time and...
You're not going to hire the apartment manager to do it. And I'm also curious: do you then do remote monitoring of these facilities, not only for the billing, but also for the energy, making sure that things aren't melting?

Yeah, regarding the maintenance effort, yes, it is an issue. Of course these are smaller deployments, but compared to bigger data centers that doesn't necessarily mean the effort per data center is less; it's about the same, and in the end it's more, right? But when we build up our deployments, we basically make sure that we have at least three copies of the important things, so a lot of things can break up to a certain level before there's really a necessity to go there. So we let it break up to a certain level, and when that level is reached, we go there and react. You have to do it.

You also mentioned the apartment complex solution. One of the things we're doing at IBM is this thing called Smarter Cities. The idea with Smarter Cities is, if you have an apartment building with a thousand people living in it, let's say, and you have a server bank also providing cooling or heat, the question was: can you cooperatively go after the market of the folks in the building? For example, if everybody in the building is watching Netflix, wouldn't it be nice if they didn't have to go, you know...

That's actually one of our ideas: when the next football World Cup is coming, then you're much closer to the customers, yes, and the latency is much better. It's like a CDN, if you want to think of it that way.

Since 2012 officially; we were named AoTerra initially, if you have heard that name. Yeah, since 2012 officially, but we started before that, yeah.

So for networking, do you use whatever is currently coming into the building, or do you get special redundant... The networking hardware is not that much different from normal data centers. We have normal switches, and we use Vyatta from Brocade for the routers. So open-source software, normal hardware, we combine it, normal Ethernet.

Okay, and then your connectivity provider, is that... Again, sorry. Do you have business class? For our data centers, yes. It's also different for the smaller deployments, which are not necessarily connected to the big lines, right? But our bigger data centers are connected via fiber, and the fiber is really provided directly to our data centers. It's like normal, standard data center stuff.

Okay. Any more questions? No? Okay, then I think we're done. If you have any more questions, you can ask us offline. Thanks. Thank you.