All right, good morning, everybody. Thanks for joining us today. My name is Sahdev Zala. I'm an IBM software engineer in the United States, and I'm the project technical lead for two projects, TOSCA-Parser and Heat-Translator, both under the main Heat orchestration program. I have the pleasure of having Mathieu Velten, a software engineer at CERN in France, and Alvaro here, a computer scientist at CSIC, the Spanish National Research Council. It's the biggest research council in Spain, covering physics and many other scientific fields. We are missing Miguel Caballer; he could not make it because of project-related work. They're having some reviews related to this talk, so he's not here.

So yesterday we had a really good keynote in the morning. I think most of us were in the audience. It was about the interop challenge across different OpenStack clouds. It was a great show, right? They proved on stage that interoperability is very important and that it can be done. Now, there is no direct relationship between that interop challenge and what we are talking about today, but I mention it because interoperability is exactly what this talk is about. In this presentation we will talk about how we are handling the interoperability challenge across heterogeneous clouds, not just OpenStack clouds from different providers: clouds made of OpenStack, OpenNebula, and other proprietary clouds, and how you can deploy with a single modeling concept into all of them. We will demonstrate some of the work we are doing with CERN, CSIC, and other institutes on the INDIGO project. We'll talk about TOSCA and how it is used to deploy workloads, especially in OpenStack, where we are doing the integration work. Alvaro and Mathieu will give a deep dive into the INDIGO project, which is a large-scale project for heterogeneous clouds focused on the scientific community in Europe, and we will show a use case that models a full stack with TOSCA.

So TOSCA stands for Topology and Orchestration Specification for Cloud Applications. What is TOSCA? It's a very important open standard, and it enables interoperability, especially across heterogeneous clouds, like I just said. TOSCA is supported by a large number of companies around the world; I'll cover that in the next slides. TOSCA has its own domain-specific language, a DSL, which is in YAML right now, and it's not tied to any specific cloud. It is interoperable: you can use TOSCA for one cloud or another depending on the orchestration engine support. On the diagram, I have things like node types. TOSCA has various normative node types, and it allows easy extension with custom, non-normative types. All node types are related to each other in some way. Types typically offer capabilities and interfaces to provide the implementation for your workload, and there are policies that can be applied to a single node or a group of nodes.

Underneath, I have a small example of how you model a web application, a Node.js app. We have a real template out there, with a sample PayPal application, but this shows just the modeling concept: the Node.js application is hosted on a Node.js runtime, which is a logical node, and that in turn is hosted on the app server, the actual virtual machine. In the same way, the Node.js app connects to a database, MongoDB in this case, which is hosted on another logical node for the MongoDB DBMS, which again is hosted on the mongo server, another virtual machine.
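As a rough sketch of that topology in TOSCA Simple Profile YAML (the type and requirement names here are simplified and illustrative, not the exact nodejs/mongodb sample template from the tosca-parser repo):

```yaml
tosca_definitions_version: tosca_simple_yaml_1_0

description: Node.js web app connected to MongoDB (illustrative sketch)

topology_template:
  node_templates:
    nodejs_app:                       # the application itself
      type: tosca.nodes.WebApplication
      requirements:
        - host: nodejs_runtime        # hosted on the Node.js runtime
        - database_connection: mongo_db

    nodejs_runtime:                   # logical node: the runtime
      type: tosca.nodes.WebServer
      requirements:
        - host: app_server            # hosted on a virtual machine

    app_server:                       # the actual compute node
      type: tosca.nodes.Compute

    mongo_db:                         # logical node: the database
      type: tosca.nodes.Database
      requirements:
        - host: mongo_dbms

    mongo_dbms:                       # logical node: the DBMS
      type: tosca.nodes.DBMS
      requirements:
        - host: mongo_server

    mongo_server:                     # a second virtual machine
      type: tosca.nodes.Compute
```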
Here I will briefly cover the important milestones and some of the contributions from across the world. The TOSCA work is not new; it started several years back. In 2013 the XML format was published. But hey, that was 2013. We had made so much progress in the cloud world: AWS was progressing, the OpenStack Foundation had already been formed in 2012, and OpenStack was emerging as the open source standard in cloud. And what do you do with XML? Nobody supports XML; typically it's either JSON or YAML. So the decision was made that XML was just adding complexity, that the best way to go was either JSON or YAML, and the choice was YAML. It's more human-readable, and it's used across different projects in OpenStack. So work started the same year to create what we call the TOSCA Simple Profile, to make things a little simpler by taking out the complexities that were introduced in the XML format, and to do it in YAML.

Fast forward: the first version is out now. It's the OASIS-approved version, which came out in June 2016. Before that, we had several working drafts out there, and people were using them. Things are going a little faster now: version 1.1 is already under public review, in the final stage. There is another TOSCA profile effort going on for network function virtualization, the NFV profile. It simply extends the Simple Profile in YAML for NFV-specific workloads, and a stable draft is already out there. Company-wise, you can see there are many companies contributing to TOSCA in one way or another. There are many TOSCA technical committee members out there; some contribute modeling work, some review work, and there are several different groups for the Simple Profile focused on monitoring, on container work, and so on.

All right, TOSCA-Parser. Again, in 2013, when we decided to go with YAML and integrate with OpenStack, the best route we came up with was to parse templates defined in TOSCA and translate them into the Heat Orchestration Template language, HOT. So in early 2014, around February, we created a new StackForge project, which later became an official OpenStack project. Now we have two projects, TOSCA-Parser and Heat-Translator, and those are the two projects that help you deploy your TOSCA workloads into OpenStack using Heat. TOSCA-Parser is meant to be used as a library. It parses the Simple Profile, it parses the NFV profile, and it's a sub-project of the OpenStack Heat program. Our approach since last year, since the Tokyo Summit and the Mitaka cycle, is to have at least two point releases per cycle for each project, TOSCA-Parser and Heat-Translator. It's a rapid, agile development process: code against the spec and enable a lot of new features that users and consumers need.
So in this release, the Newton time frame, we had two releases, 0.5 and 0.6. 0.5 was released right after the Austin Summit. We added support for things like TOSCA policies, triggers, and load balancers, and complex data types like PortSpec, and there were some bug fixes, of course. 0.6, I would say, was a major release. By that time we had developed a good user base. The folks here use TOSCA-Parser, the OpenStack Tacker project consumes TOSCA-Parser, and the OPNFV project, the Linux Foundation project, uses TOSCA-Parser. There are several other companies using it as-is as well. And, as Alvaro will cover, this supports interoperability among heterogeneous clouds: if the target is not OpenStack but OpenNebula, say, you can take just this piece, do whatever you want with the parsed output, and deploy it. So 0.6 was a major release, which came out in August. We added Python 3.5 support; a new OpenStack gate job was created around that time for 3.5, and we were one of the first projects to enable it. We added support for other things like TOSCA repositories and some NFV-specific features. Backward compatibility: initially we weren't too serious about it, but since many projects and other companies use it, we are very serious now. Anything we do from here on is going to be backward compatible with 0.6. And down there, there's a link to the PyPI release.

Let me talk briefly about the Heat-Translator project. It's again a project, a translation layer over Heat, meant to deploy non-Heat templates. The project is focused on TOSCA right now, but it could be used for other formats with some development contribution. Right now, and this is how the INDIGO project uses it, you can translate a template and deploy it seamlessly with Heat. We again had two releases, 0.5 and 0.6; each comes out a few weeks after the corresponding TOSCA-Parser release. So Heat-Translator 0.5 came out in May 2016. Mathieu did a lot of work there on the Ansible and Puppet translation support. We are well integrated with the OpenStack projects. We use Glance, for example, to find the image based on the TOSCA constraints. If you're familiar with TOSCA, you just define what kind of distribution you want, what version of OS you want, as constraints, and we find the right image and set it in the Heat template. We do the same for flavors: you define your criteria as the number of CPUs, the memory you need, and so on, and we set that up in the Heat template. We also added automated deployment during the same time frame, so a template can be deployed directly, and again it's backward compatible. We actually added a new gate job in Heat-Translator: every time we update the code in the translator, it runs against the TOSCA-Parser gate to make sure nothing in TOSCA-Parser is breaking Heat-Translator, so we can fix any problem before a release. 0.6 came out just a month back, again with Python 3.5 support.
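To make the constraint matching concrete, here is a rough sketch, with made-up values, of a TOSCA input and the kind of HOT output the translator produces after consulting Glance and Nova; the exact flavor and image names depend on what the target cloud offers:

```yaml
# TOSCA input: describe what you need, not a concrete flavor or image
app_server:
  type: tosca.nodes.Compute
  capabilities:
    host:
      properties:
        num_cpus: 2
        mem_size: 4096 MB
        disk_size: 40 GB
    os:
      properties:
        type: Linux
        distribution: Ubuntu
        version: 16.04
```

```yaml
# HOT output (sketch): the translator resolves the properties against
# the Nova flavors and Glance images available on the target cloud
resources:
  app_server:
    type: OS::Nova::Server
    properties:
      flavor: m1.medium            # best match for 2 CPUs / 4 GB / 40 GB
      image: ubuntu-16.04-server   # Glance image matching the os properties
```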
One of the major things we did: in the earlier release, we were doing automated deployment using environment variables in the user's environment, things like OS_USERNAME and OS_PASSWORD. But we realized, and the INDIGO project in particular realized, that for real deployments we need to do proper authentication with Keystone. So a lot of work went into that, and we now authenticate using the keystoneauth library. One of the new and very important features is translation for auto-scaling. We now support translating scaling policies from TOSCA to HOT: we create the auto-scaling resources and some of the monitoring resources, like Aodh alarms. Initially we translated everything into one template, but now we translate to multiple templates, as Heat requires in many cases. We are also doing Senlin cluster support now, so the policies can be translated to a Senlin profile as well. The INDIGO project uses Ansible heavily, so we now even support Ansible roles; that support was added by Mathieu.

For future work, there is a lot going on, but one thing I would like to mention here is the work in progress on Heat-Translator as a service. We thought about it a few months back, and the patch is out there now. For folks like the INDIGO project, I think they will really benefit from calling APIs directly; even Tacker likes the idea. So we have work going on to offer the translator as a service, on top of the command-line tool we have as of today, and the goal is to land it during this release cycle. Again, the PyPI package is out there, so if you are interested, please feel free to use it.

A lot of other work is going on, but the more interesting part, of course, is how these things are getting used to handle the challenge of interoperability across multiple heterogeneous clouds. And that's where I will hand it over to Alvaro.

Thank you, Sahdev. Well, as he said, I work at the Spanish National Research Council, a multidisciplinary institution. I am in a group that provides support for scientific users to satisfy their computing needs, and in this context we are involved in several European projects. One of them is the INDIGO-DataCloud project, whose aim is to establish sustainable infrastructure as a service, platform as a service, and software as a service for both computing and data, focused on science and, more specifically, on scientific users. So we aim to satisfy the needs of science and scientific users. We are funded under the H2020 program, and we started in 2015. The project comprises 26 partners from 11 different countries, including some of the biggest research councils and institutions in Europe, as well as some industrial partners. We aim to deploy our middleware on heterogeneous infrastructures like OpenStack and OpenNebula, but also on commercial providers. And one key aspect of INDIGO is that we have driven our developments based on the requirements of the scientific users.
So we did an initial gathering of requirements from 11 multidisciplinary communities, like LifeWatch, for example, which is in biodiversity research, or WLCG in high-energy physics, plus astrophysics, humanities, and so on. As I said, we try to drive development based on the users' needs. We have defined our architecture around them, to satisfy their requirements, and then we have defined the software components that need to be developed or updated to be helpful to them. The idea is that we will develop open source software that fills the technology gaps we have found, things that are not covered in the current infrastructures but are required for science. We don't want to reinvent the wheel, so we will reuse as much as possible, develop the missing pieces of software where necessary, and exploit the experience from previous projects and previous infrastructures. As I said, we try to be multidisciplinary, so we do not focus on one science discipline only; we try to gather as many user communities as possible. And we try to be vendor-neutral as well, first through the adoption of standards and second through the promotion of standards. We try to be interoperable between clouds: as I said, we have OpenNebula clouds, OpenStack clouds, and commercial providers, and one key aspect is interoperability. Just so you know, these are our user communities: the 11 marked with the INDIGO logo, distributed all over Europe, plus several ESFRIs, the European strategic research infrastructures, which are circled in red and involved directly in the project.

So how are we leveraging TOSCA, and why do we need TOSCA? Well, first of all, because deploying these scientific applications requires orchestration; scientific applications are sometimes complex, so we need to manage the interactions between the different components, the workflows, and so on. Secondly, this has to be done on an interoperable infrastructure, so interoperability for us is a key aspect, targeting initially OpenStack and OpenNebula but also commercial providers. We did a first evaluation of the available orchestration options. We found, of course, OpenStack Heat and CloudFormation, but both are tied to specific implementations: Heat uses HOT, its own language, and CloudFormation is for AWS. TOSCA seemed to be a good option for us because we could define the topology of the infrastructure, but we could also define the user applications. Another aspect that makes TOSCA interesting and attractive for us is the existing code base: the TOSCA parser and the heat translator, two components that are separated so that we can reuse, for example, the TOSCA parser for translating into other languages if we need to. And the last point is that it is a standard. This is important: it is supported by different communities, it is backed by industry, and it is used by partners like IBM and others. So it is a standard that is actively being developed and actually being used.

So how do we leverage TOSCA? We have defined some custom types covering the scientific use cases. We have extended the TOSCA Simple Profile in YAML with our own custom types, which are in our GitHub repo.
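As an illustration of what such an extension looks like, here is a minimal, hypothetical custom type in the style of the INDIGO definitions; the real type definitions live in the indigo-dc GitHub repositories and are more elaborate:

```yaml
tosca_definitions_version: tosca_simple_yaml_1_0

node_types:
  # Hypothetical type derived from a normative base type; the actual
  # INDIGO custom types are published in the indigo-dc GitHub repos
  indigo.nodes.SlurmFrontEnd:
    derived_from: tosca.nodes.SoftwareComponent
    properties:
      wn_ips:
        type: list
        entry_schema:
          type: string
        description: Worker node IPs to register with the SLURM front end
    interfaces:
      Standard:
        configure:
          implementation: artifacts/slurm_front_end_configure.yml
          inputs:
            wn_ips: { get_property: [ SELF, wn_ips ] }
```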
And we use TOSCA to launch these use cases and applications for the users. Some examples, among others: the deployment of an elastic cluster for batch processing. Right now we support SLURM, we support Torque and HTCondor, but it can be extended to other batch systems like, for example, Sun Grid Engine, or rather Oracle Grid Engine now. We support the deployment of an Apache Mesos cluster, and of the Galaxy portal, which is a data-intensive application for biomedical research; we support deploying these applications through Galaxy. And for INDIGO-specific jobs, we deploy things like Mesos, Marathon, and Chronos for managing our infrastructure services. And with this, I hand over to Mathieu, who will go more into the details.

Hi there. So I will present the workflow for one of the use cases. Here we are talking at the infrastructure level, so when we say user, it's not directly the end user. The end user would mainly like to run, say, Python notebooks on a cluster or something like that. Here we are talking about INDIGO infrastructure users who want to deploy a stack so that an end user can use it afterwards. What happens is that we have integrators who create the TOSCA templates that define this kind of application stack. A user can then select one of these templates and parameterize it a bit to fit their needs: for example, the number of nodes, whether we want scaling, things like that. This request is sent to a multi-site orchestrator, which chooses a cloud underneath that has the available resources needed. It then sends the TOSCA template and the parameters to that underlying cloud. Here we have two examples of orchestration engines. On the left we have Infrastructure Manager, which is a more generic orchestrator, since it works with drivers and supports public clouds as well as private clouds like OpenNebula. On the other side we have the OpenStack Heat component, which follows the same principle but is specific to OpenStack. So after the site is chosen, if it's OpenStack, the template goes through the heat translator, which produces HOT templates that are sent to Heat and deployed for the user. The same kind of thing happens with Infrastructure Manager, except that it doesn't have its own specific language; it uses TOSCA directly as input, which is why we only need a translator at the OpenStack layer for now. One of the goals is to later try to integrate this translator directly into the Heat pipeline, so it can just plug in inside Heat and we don't have to go through this, let's say, manual translation.

Thank you. So I'm going to talk a bit about the work in progress I'm doing on the translator for the INDIGO project. Currently, as Sahdev said, we have support for scalable resources, but it's mainly at the server level. That means if you have software deployments defined in your template, to install or start software, for example, they will not be handled right now. So we wrote a generic framework that handles these dependency problems: for example, gathering the software deployments and putting them inside a specific stack together with the compute node they are defined on. This framework can then be mapped onto a Heat resource, which could be a resource group, an auto-scaling group, or a Senlin cluster, for example.
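The pattern looks roughly like this. A minimal sketch, with illustrative names, of a scalable TOSCA compute node carrying a software component, and of the resource-group output the translator generates (the nested template file name is made up):

```yaml
# TOSCA side: a scalable compute node with software installed on it
worker:
  type: tosca.nodes.Compute
  capabilities:
    scalable:
      properties:
        min_instances: 1
        max_instances: 5
        default_instances: 2

worker_setup:
  type: tosca.nodes.SoftwareComponent
  requirements:
    - host: worker
  interfaces:
    Standard:
      create:
        implementation: install.sh      # script taking a path as input
        inputs:
          install_path: /opt/app
```

```yaml
# HOT side (sketch): the server and its software config/deployments are
# gathered into a sub-template, referenced by a group with the same count
resources:
  worker_group:
    type: OS::Heat::ResourceGroup
    properties:
      count: 2
      resource_def:
        type: worker_nested.yaml        # nested template holding the server,
                                        # its SoftwareConfig and deployments
```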
Another improvement we are working on is support for endpoints. When you have a compute node, you can define an endpoints capability that drives the network-related part of the deployed server: mainly, for example, choosing the network where you want to deploy it and which ports you want to open to the outside.

So, a quick overview of this. Let's say we have a scalable compute node. On the left, we can see that it's scalable because we have the scalable capability with a count of two, and an Ansible install script that takes a path as input. On the output, we will have a resource group with the same count that references sub-templates, and you can see the sub-templates on the right. On the right, you can see that you indeed have the server, but you also have the software deployment components needed to launch the scripts with the install path that was passed to the template.

Let's go through the network part now. Say we have a compute node defined in TOSCA; you can define a network name where it will be provisioned. PUBLIC in uppercase is a keyword: if you put that, the translator should ask the underlying cloud what its current official public network is, and use it. That is what the translator does: it calls Neutron to find the current public network and provisions the instance on it. You also get a translation of the ports that you need opened, through a security group. And regarding the network name, if it's not a keyword like PUBLIC, you can also directly specify the name of the network to be used. For example, if you have an already defined private network, or if you define a private network inside the TOSCA template, which is possible, then those networks will be used directly, without trying to talk to Neutron.

Currently the translator works with several configuration management tools: scripts, Ansible, Puppet. You can also define artifacts on compute nodes that are mapped to an Ansible Galaxy role. So instead of having to embed your Galaxy role inside your image, you can just declare it there as an artifact, and it will be installed by the software config underneath. Then you can use those roles inside, for example, the install script that we have here. So that's all for the details.

Now, the current status of TOSCA support in INDIGO-DataCloud. The first use cases are working. As I said earlier, these include templates to deploy clusters; we currently support SLURM and Mesos officially, and some others are underway. We are good upstream contributors to the various projects, after IBM, of course. We try to reuse whatever we can, like the common OpenStack clients or the keystoneauth library where needed. As a backend, we currently support several clouds. For now, we can only deploy a full cluster in one cloud; the final goal would be to be able to launch and scale a cluster across several clouds. The main issue there is that we have to deal with networking and communication between the nodes and the master on different clouds, and if we don't provision directly on a public network it can be a bit difficult. Like I said, we are still working on the custom types: for example, Torque is coming and HTCondor is planned, and for each of these templates we need to check that it is well supported in the heat translator. The goal is also to deploy to more sites.
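As a sketch of the endpoint translation just described, with illustrative values; the PUBLIC keyword resolution and the security-group mapping are the parts the translator handles:

```yaml
# TOSCA side: endpoint capability choosing the network and open ports
server:
  type: tosca.nodes.Compute
  capabilities:
    endpoint:
      properties:
        network_name: PUBLIC      # keyword: ask the cloud for its
                                  # current public network via Neutron
        ports:
          http:
            protocol: tcp
            target: 80
```

```yaml
# HOT side (sketch): a port on the resolved network, guarded by a
# security group that opens the requested port
resources:
  server_security_group:
    type: OS::Neutron::SecurityGroup
    properties:
      rules:
        - protocol: tcp
          port_range_min: 80
          port_range_max: 80
  server_port:
    type: OS::Neutron::Port
    properties:
      network: public             # name resolved by querying Neutron
      security_groups:
        - { get_resource: server_security_group }
```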
Currently we are mainly dealing with OpenNebula and OpenStack, but we are aiming to try it on a public cloud too. We have a good collaboration with the upstream projects and with the PTL, Sahdev, so that's a nice thing. Another piece of work would be to provide the translator as a service endpoint, because currently it's a Python library and you can't really use it outside of the Python world; that would be a nice thing to have. Thank you. Let's go to the questions.

Hello, and thank you for the presentation. My understanding is that TOSCA was specified around the same time that Heat was created, so I'm wondering, now that TOSCA has become a standard, whether it makes sense to implement or support TOSCA directly in Heat rather than having a translator. And the other question I wanted to ask: if you want to support multiple clouds at the same time, you need node types and some other things, so I'm wondering whether there is any plan, or whether you already have a common location, to encourage code reuse.

Can you clarify the second question? Yes. If you want to support, like in your first slide, MongoDB, and maybe you want to support some other databases, you need compute nodes in one cloud and in a different cloud for all these node types. Do you have a common location I can refer to, to encourage reusability? Some other people who need the same node might implement it again instead of using the one you already have. Absolutely. If you go to the indigo-dc account on GitHub, you will see that we have a TOSCA template repository and a TOSCA types repository that cover all our work. You will find the templates for the clusters and also for some of our more data-science and user software. So yes, it's all publicly available; you can search for indigo-dc on GitHub. The URL is here. Perfect. Okay, great.

Now let me go back to the first question. Sorry, go ahead if you want. So, for the first question: you said that since we have an open standard now, why can't Heat use it directly instead of HOT, instead of translating? Well, the problem is, and that's how we started the translation project: TOSCA being a specification, the process is a little time-consuming, honestly; a lot of companies participate in the modeling work, and that takes a while. Heat has its own language, HOT, and if you compare TOSCA and HOT you will actually see some similarities. The other challenge was that some of the OpenStack components, like Glance or Nova, did not have enough support for certain constraint-based selection. Like I said, Heat uses a concrete image and flavor, for example, whereas TOSCA sits a little lower: you can specify constraints for the flavor and for the image. So the approach we came up with was to modify Heat natively, like you say, and do as little translation as possible. As a result, if you are familiar with Heat's different resources: the TOSCA team actually worked with the Heat folks, and we added new resources, like the software config and software deployment resources, which can be used for any application-level orchestration. They were added, honestly, partly to make the TOSCA translation easy. So that's the approach we took; it's not that easy to just move from HOT to TOSCA inside Heat.
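For reference, a minimal sketch of those two Heat resources used as a translation target; the script and names here are illustrative:

```yaml
# HOT resources added to Heat for application-level orchestration;
# the heat translator maps TOSCA interface operations onto these
resources:
  install_config:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      inputs:
        - name: install_path
      config: { get_file: install.sh }   # illustrative install script

  install_deployment:
    type: OS::Heat::SoftwareDeployment
    properties:
      config: { get_resource: install_config }
      server: { get_resource: app_server }
      input_values:
        install_path: /opt/app
```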
But I think this approach is working well: we continue adding code in Heat to provide as much support as possible for TOSCA types, or something close to them, so the translation can be done very easily, without much breakage or mapping problems.

Right, so even if you have two different languages, you get to a point where the semantics are similar, and the translation is easy, as you said. Exactly.

Second thing: like Mathieu said, some of the examples are available on the INDIGO sites, and I would recommend looking at the TOSCA-Parser GitHub repo. The MongoDB use case is there; we actually have a fairly complex use case that works well with Heat. And the Heat-Translator GitHub repo even has the translated HOT templates out there too, if somebody wants to use them. We have use cases a little more complicated than the example I showed. Beyond the MongoDB example, we actually have an ELK stack: we have Node.js, we have MongoDB, we have Elasticsearch, Logstash, and Kibana, five different virtual machines. The node app server has rsyslog, for example, modeled with the workload so that it can be monitored, and with the Kibana dashboard you can see what's going on. Those templates are out there, so like you said, instead of somebody writing them again, you can start from there. Thank you very much.

One thing that we wanted to do, and I think Mathieu already mentioned it, is to develop a new service that will consume the TOSCA templates directly, make the translation on the fly, and submit it directly to Heat, so that, just as there is in a way an API for CloudFormation, you can have an API for TOSCA. We plan to do this separately from Heat so that it can be developed more easily, but it's more or less the same idea. It's like EC2 support for Nova, for example: a separate service that does the translation, which makes it easier to reuse the translator outside the Python ecosystem. Thank you. Christian, please.

So, how do the artifacts that are specified in the TOSCA template make it onto the server that's installed, and is there any standardization around how that is handled by the server? Currently, when we talk about artifacts, it's mostly deployment artifacts. Data is of course another subject in the project, but it's not handled through TOSCA, which is more compute-related. We have some software called Onedata that is used to handle the cross-site data sharing; I don't have much more information on that. So let's say that currently artifacts are more implementation details of the deployment than data: configuration and things like that, as I said. I don't know if that answers your question. I was just wondering if there's any generic way that the things that are not infrastructure, the software that you're really trying to install, are being handled. More at the application level, you mean? Yeah. So you can specify that in the TOSCA template, with the interfaces. What do we do at deployment time with the translator? Again, Heat and TOSCA are two different things: Heat does not have interfaces as part of its modeling. Right. But again, to support TOSCA, we created some hooks with resources like software config and software deployment.
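A minimal sketch of the TOSCA side, a deployment artifact plus an interface operation, with illustrative names, which the translator maps onto those software config and software deployment resources:

```yaml
# TOSCA side: a software component with a deployment artifact and a
# lifecycle operation; both get mapped to SoftwareConfig/Deployment
web_app:
  type: tosca.nodes.SoftwareComponent
  requirements:
    - host: app_server
  artifacts:
    app_bundle:                        # deployment artifact (illustrative)
      file: artifacts/app.tar.gz
      type: tosca.artifacts.File
  interfaces:
    Standard:
      create:
        implementation: scripts/install.sh
        inputs:
          bundle: { get_artifact: [ SELF, app_bundle ] }
```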
You can actually specify those deployment artifacts in HOT, and that's what we are doing right now: anything that is mentioned as an implementation, any scripts or other artifacts. For the data itself, currently it would be a matter of passing a path and having something underneath handle the access to the data, if you mean data. Yeah, so essentially you'd have links available in the Nova metadata, for example? Eventually, yes. We go through Heat, right, and Heat is the one that would do exactly that. Yep. Thanks.

More questions? We have a little bit of time as well. Feel free to reach us: we are always on the heat-translator IRC channel, and reachable by email or Launchpad or anything. Well, thank you, guys. Thanks again so much for joining. Thank you. Thank you.