Okay, let's begin. Hello, I'm Alexander Tivelkov from Mirantis. And I'm Randall Burt, senior software engineer at Rackspace. Thank you all for coming today. We are here to give you a talk about our latest initiative in Glance, which is called the Glance Artifact Repository, and the main idea is that Glance is not just about images anymore, it's about artifacts. It's still about images, but it's not all about images anymore. Yeah, so here is the agenda we'll talk about today, and I'll pass the word to Randall, who will talk about the history. Yeah, so as we all know, or as most of us know, Glance started out as an image repository, primarily feeding Nova with the image data it needed to boot VMs and things like that. And that was great, and it did a pretty good job at it, and its mission statement obviously reflected that, in that its scope was limited to just that job. However, there was a slight issue: as Glance and OpenStack itself evolved, many more initiatives were coming up. Hey, we're also storing snapshots. That's similar, but it's a slight expansion of our mission. We also have different metadata for different image formats. What if we want to actually use an image specification rather than some binary? New requirements started coming in about indexing and format-specific metadata, where maybe the same image is stored in several different formats. And maybe images in the future, as the format of these things evolves, won't be just one binary file; maybe it's several. Maybe you compose those images from several different pieces stored in Glance. So Glance itself had already evolved to be a little more complex than it had initially been.
So as other projects like Heat and Murano and others started getting involved in OpenStack, there were also services that consume some other artifact that then results in resources deployed on OpenStack. So we have Heat templates, Murano application packages, Solum plan files, Tuskar packages, Mistral workbooks, and that list continues to grow. So as these things evolved, so too, we believe, the mission of Glance should evolve, and it is evolving. Part of the driver there: I started working for Rackspace. Rackspace had an internal orchestration tool, they had built a product on top of it, and that was going pretty well, and then Heat got incubated into OpenStack. So we started getting involved in that, and we came up with a transition plan to move from our internal orchestration tooling over to Heat. Now, one of the things we were missing in that transition was the storage and serving of, say, provider-blessed or supported Heat templates. So that instead of starting from your own, you could go to a catalog of templates, pull from them, and either use them directly, or modify them, or just tweak a few switches and deploy some supported or blessed or vetted architecture to do a thing like WordPress or a MySQL cluster. So we started that process saying, hey, let's go talk to the Heat community and say: Heat, let's expand Heat to store templates. Well, you know, that's not what Heat's job is. Heat's very good at orchestrating your infrastructure and configuration, but it's not a catalog of things. Well, as it turns out, in OpenStack we already have a catalog of things. So we started a mailing list discussion saying, hey, what about adding this capability to Glance? And of course there was a little resistance and a little discussion.
We started to attend Glance mid-cycle meetups and meetings, hashing out the details, and Alex and I started getting together once every few months to argue about artifact life cycles and what metadata really means. And so I guess here is the culmination of that, where we finally found common ground, and now we want to share it with everyone else. This eventually led to the discussion being brought to a wider audience. We at Murano saw this discussion and said, hey, we also need this catalog of stuff. We don't care about Heat templates, but we care about Murano packages. And some guys came and said, oh, we need to store our plan files as well. Glance didn't know anything about it. But then the Technical Committee came and said, okay, please coordinate your efforts and start working in a single place. And then we said, okay, but now we need a new mission statement for this catalog stuff. And here is the new Glance mission statement, which was approved by the Technical Committee during the Juno cycle. It now says that Glance is not just about images anymore: it should provide services for storing, uploading, and discovering data assets that are meant to be used by other OpenStack services. This is important. This is not just storage for pretty much everything; this is storage for something which is used by other OpenStack services. Not by end users, not by non-OpenStack services or some other third-party components, but a catalog of objects which are used by OpenStack. The same way Nova uses images, other services will come to use artifacts stored in the catalog. So these artifacts have some properties which identify them as being artifacts. The most important thing is that an artifact is binary data accompanied by metadata. The data has definitions, a description, and a fixed structure. It's not just blobs like you can put in Swift.
These are blobs of a particular type, accompanied by particular definitions and descriptions, with some constraints enforced. Another important point is that these objects are immutable. What we like Glance for, when storing Nova images, is that once we have uploaded an image there, it's there. It's fixed, and nobody can do anything bad with it. We have an ID which identifies this particular set of bytes, and you can trust that when acquiring the image by this ID, you get the proper approved image which was there at the moment it was uploaded. Nobody tampers with the store, nobody modifies the images, nobody tries to attack you, nobody tries to substitute the image with something wrong. That's what we want to retain for artifacts: artifacts remain immutable during their life cycle. I'll touch on that a little bit later. The most important thing is that users should know that they're immutable, and that once an artifact is published it stays fixed for its whole life cycle. Also, artifacts may have versions, which is good when you have a long life cycle for artifacts which may still need to change for some reason; at that moment, instead of modifying the existing artifact, we simply publish a new version, which is still discoverable, and the changes between different versions are discoverable and clearly understood by the end users. So this is the structure of what we call artifact metadata. The important thing here is that it's not just a number of key-value pairs, as we had for images. We could publish additional properties in Glance for images, but they were just key-value pairs with string values, stored without any constraints, without any expectations we may have about the image. Now we have a fixed structure which is defined by the project that is going to consume the artifacts. There are some common properties which are shared by all the artifacts.
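To make the split between common and type-specific metadata concrete, here is a minimal sketch. The field names are hypothetical, invented for illustration; they are not the actual Glance artifact schema.

```python
from dataclasses import dataclass, field

@dataclass
class Artifact:
    # Common properties shared by every artifact type
    name: str
    version: str
    description: str = ""
    # Type-specific properties, defined by the consuming service
    # (e.g. minimum RAM for Nova images, template format for Heat templates)
    type_properties: dict = field(default_factory=dict)

# An image artifact as one illustrative instance
image = Artifact(
    name="ubuntu",
    version="12.04",
    type_properties={"min_ram": 512, "disk_format": "qcow2"},
)
```

The point is that `type_properties` is not a free-form bag of strings: its keys, value types, and constraints are dictated by the artifact type, as the next section explains.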
There is regular stuff like name, version, description, authorship information and so on, but the most important part is the type-specific properties, which are defined by the artifact type, by the service which is going to consume the artifact and knows what to expect of it. For example, for regular images there is stuff like min RAM, max RAM, min CPU count, the data which is meaningful for Nova, and Nova knows how to treat it. For other kinds of artifacts these specific properties differ. For Heat, for example, this may be the template format: they have different template formats, and they know how to process a template based on its format, so they will probably want to store this property as part of the artifact metadata. For Murano we have similar stuff, like application categories and various kinds of specific keywords which identify the application in the catalog. These are not just string values. They may have different types, different constraints applied to the values, and various kinds of validation logic which may enforce specific rules on a particular artifact. The important thing is that the schema for these properties is discoverable, so any project which is going to consume a particular artifact type can fetch the schema, validate its own data (the data it receives from the user and the data it's going to publish into the catalog) against that schema, and make sure that it's going to publish a valid artifact. Which becomes a very important aspect of the next feature of artifacts: artifacts can, and oftentimes do, have dependencies on one another, and these dependencies can be on artifacts of the same type or artifacts of different types, and they can either be explicit, as in: hey, I am a Heat template, and I've written it in such a way that I depend on a very specific image being available.
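The schema-driven validation just described could be sketched like this. The schema format, property names, and error messages here are invented for illustration; the real plugin framework expresses constraints through special classes in the Glance library rather than plain dicts.

```python
# Hypothetical schema for two type-specific properties
SCHEMA = {
    "min_ram": {"type": int, "minimum": 0},
    "template_format": {"type": str, "choices": {"HOT", "CFN"}},
}

def validate(properties, schema):
    """Collect constraint violations instead of publishing a bad artifact."""
    errors = []
    for key, value in properties.items():
        rule = schema.get(key)
        if rule is None:
            errors.append("unknown property: %s" % key)
            continue
        if not isinstance(value, rule["type"]):
            errors.append("%s: expected %s" % (key, rule["type"].__name__))
        elif "minimum" in rule and value < rule["minimum"]:
            errors.append("%s: below minimum" % key)
        elif "choices" in rule and value not in rule["choices"]:
            errors.append("%s: invalid choice" % key)
    return errors
```

A client would run this kind of check against the discovered schema before attempting to publish, so invalid metadata is rejected locally rather than by the catalog.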
Or: I'm a Murano package and I depend on a very specific Heat template being available. The dependencies between those artifacts are explicit, so we can do things like pre-validation and guaranteeing that, hey, if you go and pull this artifact down, you know that the pieces it needs will also be available, and you can either pull and deal with that artifact on its own, or you can ask the artifact repository: give me this artifact and all the things it depends on. So that's very handy. But also, because of this known metadata, you can have what we call dynamically evaluated references: hey, I depend on you having a Heat template that expresses this particular type of architecture, or I depend on having a Murano package that will materialize MySQL or something like that. Not a specific one; it just has to be in the artifact repository somewhere. That becomes very valuable and powerful: I may need some other bits and pieces, but you can discover what will actually meet those needs at runtime. You can also have a dependency chain of relationships, where A depends on B depends on C, and you can guarantee that integrity is maintained, get them all at the same time, and deploy them so that they're available. Oh, you did it, sorry. So the other part, and this fed into some of the requirements that Glance was already thinking about or working on, is that artifacts aren't necessarily one thing. They're not a single blob of data. They can be multiple blobs of data that are composed when you grab the artifact, and what these are, how many you can have, and how they relate to one another are also described in the artifact definition itself. They are stored using the same backend storage mechanism that Glance already has.
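The "give me this artifact and all the things it depends on" behavior amounts to a transitive closure over explicit dependencies. A sketch, with artifact names and the flat dependency dict invented for illustration (the real repository would expose this through its API):

```python
# Hypothetical dependency graph: a Murano package depends on a Heat
# template, which in turn depends on a specific image.
DEPS = {
    "murano-app": ["heat-template"],
    "heat-template": ["ubuntu-image"],
    "ubuntu-image": [],
}

def with_dependencies(artifact, deps, seen=None):
    """Return the artifact followed by everything it transitively needs."""
    if seen is None:
        seen = []
    if artifact not in seen:
        seen.append(artifact)
        for dep in deps.get(artifact, []):
            with_dependencies(dep, deps, seen)
    return seen
```

Pulling `"murano-app"` this way yields the whole A-depends-on-B-depends-on-C chain in one request, which is the integrity guarantee described above.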
Some of the work is moving in the direction of having stores specific to a particular artifact type, optimized for that type. You may need something simpler for Heat templates, or optimized for a different streaming model than you would use for images or Murano packages or things like that. Each of these can still be used by other components of OpenStack as well. There is some confusion here sometimes. When we speak about this feature, some people confuse artifact dependencies with this artifact composition of multiple blobs. The difference is that an artifact composed of multiple blobs still remains a single object which is self-sufficient as an artifact. The partial blobs are not self-sufficient; they just complement each other, as different components of OpenStack may use them to pursue the same goal. For example, say you have an application published in the catalog, and there are some scripts which are going to be deployed and executed on virtual machines during the publishing of the application. These scripts may be separate blobs which are accessed from a virtual machine without the need to download everything else, which is going to be executed by the Heat orchestrator. At the same time, there may be screenshots or icons or some user-facing information that is going to be displayed in Horizon, or even in some other specific dashboard used for this particular application, for this particular type of service which deploys the application. It may be the Heat UI, what do you call it, the name of the Heat UI... Temper, I think? It's something Heat-related. Or it may be the Murano dashboard, or the Mistral dashboard: something which is integrated into Horizon but is user-facing.
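The composition idea, one self-sufficient artifact whose named blobs are fetched independently by different consumers, can be sketched as follows. The class name, method, and blob names are all hypothetical.

```python
class ComposedArtifact:
    """One artifact holding several independently downloadable blobs."""

    def __init__(self, name, blobs):
        self.name = name
        self._blobs = dict(blobs)   # blob name -> binary content

    def fetch_blob(self, blob_name):
        # A VM agent pulls only the deployment script; Horizon pulls
        # only the icon. Neither has to download the whole artifact.
        return self._blobs[blob_name]

app = ComposedArtifact("wordpress", {
    "deploy.sh": b"#!/bin/sh\necho deploying",
    "icon.png": b"\x89PNG",
})
```

Unlike a dependency, neither blob is an artifact in its own right: they share the single artifact's identity and life cycle.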
We don't need the service itself to download content which is intended to be displayed for end users in Horizon, and at the same time we don't want Horizon to download the images intended to be run in Nova. We have separated these binary objects and provide an API to download them independently. However, they all share the same life cycle as the artifact, while independent artifacts, which are just connected by relations, have independent life cycles and may live their own lives. Speaking about the artifact life cycle: it all starts when we create an artifact. Unlike images, which are quite atomic in Glance, artifacts, as they're composed of multiple blobs, may be created in a kind of iterative process. I call some Glance API saying I want to create an artifact. It creates an artifact draft, a kind of placeholder in the database: it doesn't have any blobs, it doesn't have any metadata associated, it's just in the creating state. Then I use the APIs to iteratively upload the data, set the properties (the type-specific ones), validate them, establish the relations to other artifacts, set tags and so on. Everything here is the land of the developer who creates or publishes the artifact (not necessarily a developer, but the owner of the artifact); it's not supposed to be used by others at this point. But then the artifact is published. Publishing means validating that everything is correct, and then fixing everything which is supposed to be immutable. After this point, immutable properties may not be modified. Everything you uploaded to the artifact as blobs is guaranteed to be fixed and never touched by anyone else. And the relations between artifacts are established at this point as well.
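The life cycle rules just described can be captured in a minimal state machine sketch: drafts are mutable, publishing freezes the blobs, and (as discussed next) deactivation hides the artifact from everyone but administrators. The state names and methods are illustrative assumptions, not the actual Glance API.

```python
class ArtifactLifecycle:
    def __init__(self):
        self.state = "creating"   # the draft/placeholder state
        self.blobs = {}

    def upload_blob(self, name, data):
        # Iterative uploads are only allowed while still a draft
        if self.state != "creating":
            raise RuntimeError("blobs are immutable after publishing")
        self.blobs[name] = data

    def publish(self):
        # Real publishing validates the artifact first, then freezes it
        self.state = "active"

    def deactivate(self):
        # Operator action: hide a suspicious artifact pending investigation
        self.state = "deactivated"

    def visible_to(self, role):
        # Deactivated artifacts are visible only to administrators
        return self.state == "active" or role == "admin"
```

Usage follows the talk's narrative: create a draft, upload blobs, publish; after publishing, any further `upload_blob` raises, which is exactly the immutability guarantee.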
So if I have a published artifact and there is a relation to this artifact, I am guaranteed that this relation will never change, that the artifact stays in its current state, and all other artifacts which depend on it may be sure that the dependency will not break. However, this iterative process of first creating the artifact step by step and then publishing it is not the only available option. There is a thing called import and export. These are defined by the plugins (I will talk about them a little bit later), and this process allows the developers of artifact types to define custom logic which specifies all the properties and uploads all the data in a single step. For example, there will be an API which allows you to upload a single zip file (or maybe not necessarily zip, but some archive) which contains all the blobs and some textual definition of the artifact properties, and to execute specific logic which will discover artifact dependencies inside the repository, establish the references to them, and then immediately validate and publish, so that with a single API call the artifact becomes published and activated. Then one more step in the artifact life cycle is important: activation. We don't want to allow users to spam others or to provide malicious artifacts which may contain viruses or some bad code. It's a catalog, after all: any user may upload artifacts, and there may be situations when they share these artifacts with others. If we find, or the operator finds, that something is wrong with an artifact, that it is reported as inappropriate or as containing something bad, the operator may deactivate the artifact and investigate what's going on with it, without the artifact being exposed to others. Deactivated artifacts are not accessible to anybody except administrators. Even the owner is not able to modify it or somehow change what's inside; even the mutable properties are not changeable at this
point. If there was a false alarm and the artifact is still fine, it may be reactivated, or it may be completely deleted if the administrator finds that there really is a problem with it. Now, something about defining a type. This is important because a lot of you are developers, and a lot of you work on different projects which are going to use artifacts at some point, probably because you need to store something in the catalog, and so you'll need to be able to define this metadata structure, define the life cycle of artifacts, define the import and export operations and so on. So this is for you. The idea is that we are implementing this in a plugin-based manner, so every artifact type is represented by a stevedore plugin (stevedore is a common library in OpenStack for defining plugin-based architectures), and it's actually a Pythonic type which allows you to define type-specific properties, blob kinds, and relations as Python attributes, with constraints enforced using special classes in the Glance library. As a result, all developers will be able to create their own artifact types by implementing specific interfaces and publish them as Glance plugins. Glance will come bundled with plugins for integrated OpenStack projects, such as Heat, and such as images, when images finally become a type of artifact as well. All other projects which are integrated with OpenStack and need to use the Glance artifact repository will have their plugins bundled with Glance itself. But the OpenStack world is not limited to integrated projects: there are StackForge projects, and there are other projects which are intended to be used with OpenStack but are not yet integrated into the primary release. So there will be a way to create artifact-type-specific plugins for those projects and publish them in a separate repository, the same way we publish StackForge applications under StackForge on GitHub. So there will be plugins for these projects; you as developers will create and maintain them, and when your project gets
incubated into OpenStack, those plugins will be incubated into Glance. This is a similar model to the one we use in the Heat project as well: we have a set of plugins that extend Heat functionality using a very similar mechanism, whereby non-integrated projects can still implement their own integration with Heat, and it's done in a pluggable manner so that you can stand up your own version of Heat that supports whatever it is you want. And this is a big, big strength of artifacts as well: you don't necessarily have to wait for integration, and you may not even be interested in integration, but you can still take advantage of this feature without having to be at a particular point along your integration path. Here is a high-level architecture diagram. There is nothing special here. The most important thing is that there are Python clients for the particular projects, like the Heat client or the Murano client or the Mistral client, which are going to use artifacts of a particular type, and they will just use the Glance client as a dependency: they will wrap the Glance client for fetching their particular artifact types from Glance storage. Everything else remains the same. We just integrate this plugin layer behind the primary Glance API, and all other components of Glance remain in the same place: we have the backend stores, we have the database, we have all the notification layers, policy enforcement and all that stuff. I think Randall will talk about it. And that was a big part of the motivation: once the discussion was had about having it in Glance, it didn't just fit from a mission perspective or a conceptual perspective; from an architectural perspective it made a lot of sense. A lot of things that Glance does made sense for artifacts, like the multiple backing stores, because again, once you can optimize for a particular artifact type, Glance can kind of already support that. glance_store, which is actually now Glance
itself uses (it's now an independent project, but Glance itself uses it), and you can use it to have direct access to your backing stores. Policies for access: we need the ability for you to upload a particular artifact of a particular type and limit access to just you, or share it with others. Those are all things that we wanted for artifacts, and it turns out Glance was either already working on them at the time or had plans to work on them, so it became a really good fit, not just because of the mission but also because of the way Glance worked at the time, or was going to work. So, when we started talking about this, the vision was that it would all be part and parcel of the same service, but we relatively quickly realized that there are some interactions between services and different artifacts, most notably, say, Nova and images, that really have to have a very optimized path. If Nova is trying to pull images from Glance and meanwhile 10,000 other users are also querying or uploading their Heat templates, that may impact actual booting of VMs, and we want to avoid that if we can, while still having the capability, if you want it, to stand up a service that's running both the artifact repository and the traditional image service. So we're doing the work now to make sure that we can deploy Glance in several configurations, and to make sure this is very modular, so that each individual instance of an artifact repository can be configured independently, with its backing stores optimized for the type of artifact you want to serve, and it doesn't cross the line where you're actually interrupting other services that are also trying to grab their own artifacts from the service. So this could result in you having many different Glance endpoints in your catalog that are very artifact-specific. And that goes into what Alex was just talking about: the different Python clients using a dependency
on Glance to discover their own artifacts. Because this will be in the catalog, and it will be explicit, it doesn't mean that if I'm going to run the Python Heat client, the cloud that I'm going against not only has to support Heat but also has to support the Heat template artifact type: because these are discoverable, we can fail, or support or not support that, in the clients very dynamically as well. So that is a big, big plus, and we basically don't make the Nova guys mad because Heat templates are making their VM boot times go through the roof. Yeah, so what's going on with this initiative now? The current development is in active progress. We actually wanted to land something in Juno, but unfortunately we couldn't due to the large amount of work; we expect the first working prototype to be here in a couple of weeks, and we definitely want and plan to land the basic functionality in Kilo (by basic I mean complete, but still with some possible improvements). In Kilo we are going to have this artifact repository with multiple backend stores, we are going to have the dependency relations between artifacts, we are going to have the Python Glance client supporting artifact fetching, and we are going to have the import and export actions available to artifact type developers. We'll have the framework for developing artifact types in a declarative manner, and we'll have the ways to define how particular project clients use the Python Glance client. At the same time, we are not going to make images artifacts at this point. The images API will remain intact for the whole Kilo cycle; it will live at the same endpoint as it lives now, in the v1 or v2 images API branches of Glance. By the way, we are going to deprecate v1 in Kilo, so it's probably time to start using v2 if you were using Glance v1 before. But images for now will remain just images, not artifacts. Everything else will become artifacts, and at some
point in the future, maybe in the next release cycle or the one after, we will have something like Glance API version 3, which will treat images as artifacts. Meanwhile, we have a design session today dedicated to artifact development. It's in the design summit area, in Meridian, at 4:40. If any of you are interested in participating in the active development of the artifacts feature, please feel free to stop by, ask questions, and participate in the design decisions. And actually, for now, I think that's all, and we are ready for questions about artifacts. There should be mics in the middle if you want to step up to those, if you have any questions. Nobody from Nova wants to yell at us? [Audience] Not me, actually. Something pretty simple: versioning of artifacts or images. You added metadata, but in your conception, how would you address this? Yeah, we do address this. The version of the artifact is part of the metadata fields, so we have the version as a basic property of each artifact, and we follow semantic versioning, which allows you to specify the version as three numeric fields (major, minor and revision) plus some labels, for the build, or for alpha, beta, release candidate, release and so on. We support this as a basic property for each and every artifact, and the combination of artifact name and artifact version is supposed to be unique within a particular artifact type. So if I have an image, let's say, with the name Ubuntu and version 12.04, then this is supposed to be unique, and no other combination of the same name and version may be uploaded. Maybe that's not the best example, because there may be different images with different formats and so on, but you get the idea; we can have that extra tagging information as well. So we actually have this concept of querying for the latest version of a particular artifact. As Randall has said, we have this dynamic referencing concept, where you're able to reference something not strictly by
specifying the identifier of an artifact, but as a kind of query which says: I want something which satisfies my condition. This condition may be: give me the latest version of the artifact having this name and this property and this tag, and so on; or nothing earlier than this version. Maybe: oh, I know that the library that I need wasn't at the right version in 12.04, so anything greater than 12.04 will work for me, and you can specify that in your dynamic dependency. The same as with Python requirements, you may specify inequality operators which identify the version range you are interested in, and it will return you the latest version which falls inside this range. The same goes for artifacts in general: as we are storing metadata not only as strings but as particular data types (as integers, as versions, as booleans, as floating point values), you can specify queries which set ranges for values, or greater-than, greater-than-or-equal, and so on. So version is just one special type of data, and as a result you may query artifacts based on a version range. At the same time, we version not only the artifacts themselves but the artifact types. When you change the schema, when you add extra properties or modify the constraints on some of the properties, you effectively create a new artifact type which still has the same name but a different version. It's like versioning the protocol, versioning the schema. At any moment in time, an artifact type may be represented by a number of versions, and each particular artifact is connected to a particular version of its type. I hope this is clear enough. [Audience] You mentioned dependencies: are you looking at TOSCA or some other way to describe dependencies outside of Glance? TOSCA as a way to describe dependencies... well, I understand what TOSCA is, but TOSCA is more for applications, and we don't want semantics as strong as TOSCA provides. TOSCA defines much more flexible ways; for particular TOSCA applications we
have TOSCA support in Heat already; I think it's already landed, or it's going to land soon. [Thomas] Maybe I can take that question. I'm working on the TOSCA specification in OASIS, I'm also contributing to Heat, and I'm working on the heat-translator project, where we're trying to consume TOSCA models: we're trying to translate them in a way that they can be deployed using Heat. We're also discussing with the Murano team how TOSCA could be used as one possible format for importing Murano application packages. So I think we will make use of the artifact repository. That's really great work, and it fits very well with what we have in TOSCA, because we also express dependencies not by pointing to a very concrete artifact, but through requirements and capabilities: we would rather say we need an image or a binary with specific characteristics. So I think we will be able to map the TOSCA dependency metadata to the artifact metadata. That's it. And in general, I think that initially we don't necessarily need the scope and depth of that sort of dependency relationship management, and at first it might be a little soft, as it were. Ideally, in the grand future, there will be active prevention of deleting something that other things depend on, across a dozen different repositories of different types; initially it's probably not going to be that stringent, and we don't need that level of dependency management. For projects like TOSCA, for Heat, for Murano, which also have their own inner concept of interdependency, this is the goal of the artifact type plugin, which defines the custom logic for maintaining the relationships and importing or exporting the artifact. It's their job to implement the logic which maps their inner concept onto the generic concept of inter-artifact dependency which is enforced by Glance. The idea is that we may depend on artifacts which have a different type: TOSCA applications may have interdependencies between them, or Murano applications have class referencing and class loading
between them. However, if we want to use a Glance image from a Heat template which is in turn used by a Murano package, then we may not rely on any of the inner concepts of the respective projects; we have to have some concept which is above all the particular projects and common to all of them, and so we use Glance dependency relations for this, and it's up to the particular projects to map this concept onto their own subset of functionality. Does this answer your question? Okay, any other questions? Okay, then we may probably wrap up. What I wanted to add, to finalize all this, is that OpenStack used to be a bunch of almost independent components: Nova was doing the virtualization stuff, Glance was storing images, Heat was doing orchestration, Murano was storing applications, and the users who interacted with all these almost independent projects very often didn't think about what each project does. I believe that now we have built quite a major ecosystem of components in OpenStack, and now it's time to make the next step: to integrate them more tightly. We actually have a single project, it's called OpenStack, and within this single project the experience of the end user should not depend on the particular component or particular sub-project; the user works with OpenStack. So I believe that it's time for tighter integration of the various projects within the OpenStack ecosystem, and I really think that Glance, as a catalog of stuff, should become one of the points of integration of the various projects within that ecosystem. So if you have a project, or you're thinking about running some project on OpenStack, think about how your project will exchange data with others, and when you think about it, don't forget about the Glance artifact repository. Thank you.
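[Editor's note] The version-range querying described in the Q&A can be sketched as follows. This assumes plain numeric major.minor.patch versions with no labels, and inclusive range bounds; both are simplifying assumptions for illustration, not the catalog's actual query semantics.

```python
def parse(version):
    """Turn "1.2.0" into (1, 2, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def latest_matching(versions, minimum=None, maximum=None):
    """Latest version inside an optional inclusive [minimum, maximum] range."""
    lo = parse(minimum) if minimum else None
    hi = parse(maximum) if maximum else None
    candidates = [v for v in versions
                  if (lo is None or parse(v) >= lo)
                  and (hi is None or parse(v) <= hi)]
    return max(candidates, key=parse) if candidates else None
```

For example, a dynamic dependency saying "anything greater than or equal to 1.1.0" would resolve against the published versions and pick the newest one in range, much like a Python requirements specifier.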