Hello everyone, I'm really excited to be here to give you this presentation. My name is Ina Panova and I work as a software engineer on the Pulp project, which is also the topic of this session. So tell me, how many of you have already had the chance to get familiar with this project? Okay, great. And how many of you have already seen a presentation about it? Cool. So in this session the first group will have the chance to learn about the project, and for those of you who already know something about Pulp, I will try to keep your interest with the features that we recently implemented and some other neat stuff. The plan is: I will go through some slides, give you the basic concepts of Pulp, show you some new features and implementation details, and at the end I will take questions. So let's start. What is Pulp? Pulp is a platform for managing repositories of content, of course. Imagine that on one hand you have a lot of content types, like RPM packages, ISOs, Docker images, Python packages, and many more. And on the other hand you have a lot of repositories. In case you want to manage these two concepts with one tool, Pulp is probably the right tool to be used. Pulp supports many content types, and it has such a flexible design that we can easily implement a new content type. We started with the RPM family, then we added support for Docker images and others. How do you add your own content type? I will show you later in the slides. I will also talk about the new lazy sync feature. We are completely open source: you can find all our code online, you can browse the source code, you can look at the pull requests, and you can submit an issue or a pull request yourself, and hopefully it gets accepted. And Pulp is a Python application. You don't necessarily get a graphical interface; instead you have a CLI and a dynamic interface based on the REST API.
So, let's see where Pulp is used. Pulp is used mainly by the company I work for, Red Hat. What does it have to do with Red Hat? Red Hat delivers all its content via the CDN, the content delivery network, and the content store behind the CDN is Pulp. Pulp is also in the Amazon cloud. How many of you know about Amazon Web Services? Cool. If you have ever spun up a Red Hat instance there, please be aware that its updates were pulled from RHUI, the Red Hat Update Infrastructure, and that content delivery is also backed by Pulp. Pulp is also used by Katello, the upstream community project for Satellite: they decided to use Pulp as the backend for content management. And obviously Pulp is used by a lot of communities. So, let's say you have just installed a brand new instance of Pulp. Where do you start? Usually you start with repository creation. When you create a repository, you need to tell Pulp what kind of content you're going to put inside, so Pulp can set up the right backend storage. And there are two ways you can get content into Pulp. The first way is to synchronize it from a remote source — sources like the CDN or CentOS mirrors — so you can mirror them locally. You can do this one time only, or on a schedule. The other way to get content into Pulp is to create an empty repository and upload your own content. So, once the content is inside, you can start to move it around. For example, you can make a copy of the content, and the interesting fact is that copies are cheap: there is only one copy of the actual bits, referenced in the database, and just a bunch of symlinks that point on the filesystem to the actual location of the content. You can also filter the content based on search criteria during the copying. So there are a lot of things you can do to mix and move the content around in Pulp.
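The cheap-copy idea just described can be sketched with a toy model — this is not Pulp's actual code, and all names are made up: the bits are stored once, and repositories hold only references to them.

```python
# Toy model of Pulp-style "cheap" copies: the bits live once in a
# content store; repositories only hold references to them.
class ContentStore:
    def __init__(self):
        self._units = {}          # checksum -> actual bits

    def add(self, checksum, data):
        self._units[checksum] = data
        return checksum

class Repository:
    def __init__(self, name):
        self.name = name
        self.refs = set()         # references, not copies of the data

    def link(self, checksum):
        self.refs.add(checksum)

store = ContentStore()
dev = Repository("dev")
prod = Repository("prod")

ref = store.add("abc123", b"rpm bits ...")
dev.link(ref)

# "Copying" dev -> prod duplicates only references, never the data.
for r in dev.refs:
    prod.link(r)

print(len(store._units))        # still one stored unit
print(prod.refs == dev.refs)    # both repos see the same content
```

On a real filesystem the references would be symlinks into the shared storage location, which is why a repository copy costs almost nothing.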
Once you have the repository ready and you want to make it available to your customers or clients, you publish it. Publishing makes the content available to others. You can publish the repository and host it as a web-based repository, served by the Pulp server itself. For example, when you publish an RPM repository, it becomes like any other regular yum repository, so yum clients can get the packages. Or you can export the content to an ISO. And there is a new feature that we implemented a couple of releases ago at customers' request: you can publish the content to a remote host, so the content is made available from that remote host. This is really beneficial when you want to make the content highly available, or available in multiple geographical locations at the same time.

Audience: Is that different from what we call inter-satellite sync? You know how you can have one Satellite server and a second one; this one goes to the CDN, and that one only goes to the first. Is that a Pulp function, or is that different?

Speaker: That's a different thing, because this is a new feature. Let me finish this part and then we can get to the answers. You can publish all the content of the repository, or only the part you choose. The actual file transfer is made via SSH and rsync. There is a special configuration where you provide the host, the method of authentication — your password or a key — and you also provide the directory where the content should be made available on the remote host. This feature is available for the RPM and ISO content types so far. So, as I said, Pulp is a multi-content platform, and in case your content type is not managed by Pulp yet, you can implement it as a plugin. Oh, yeah — as I mentioned, there is a CLI, a command-line interface; we call it pulp-admin. Let's take a look. So, you run pulp-admin.
You specify the type of the repository, you create it, and you provide the repository id and the feed URL from which to fetch the remote content; at this point the repository is created. Then you want to synchronize it: you run the sync, and off it goes. What happens during the synchronization itself? Basically, first we fetch the metadata. Then we analyze it and parse all the metadata in place. We figure out what kind of content we are going to synchronize — so we find out we have some RPMs, delta RPMs, some errata. Then we download the content and import it into the repository, and at this point the synchronization is done: the content is in Pulp. Now let's look at what steps are performed during the publish. During the publish we generate the metadata for all the content units which are in the repository, and we also create the symlinks that point to the actual storage location. So, as I mentioned, you can host the published repository as a web-based repository. To facilitate the browsing of such a repository, we decided to integrate the repoview functionality. Repoview is a very popular tool that generates static HTML pages from the repository data. This is how a regular web-based repository looks — just a list of the content — and this is how it looks with the repoview functionality enabled. As you can see, it's much more comfortable to browse, and the pages are built from templates provided by the tool; those same templates can be customized and adjusted to your needs. Do you have any questions at this point? Okay. So, now let's move to the lazy sync feature. In Pulp we have the concept of a download policy, and with the download policy you tell Pulp how you want the content downloaded. There is the immediate policy, which does basically what it says: the content is downloaded immediately, during the sync, onto the Pulp server.
And there is deferred downloading, which allows Pulp to serve content without actually having downloaded the packages first. There are two deferred policies: the on-demand policy and the background policy. What does the on-demand policy do? An on-demand sync skips the actual download of the files: it downloads just the metadata, parses it, and stores all the records in the database. Meanwhile the content — the actual bits — is not on the filesystem, but you can still make it available to clients. To put it simply, the content will not be downloaded until it is specifically requested by a client. This is beneficial because you can save a lot of storage, right?

Audience: So if the first time a system registers to the Pulp server and asks for an RPM is in that yum transaction, is it going to time out and fail if I'm on a slow link?

Speaker: In the next slides I'm going to explain how it works — you're thinking of the worst case.

Audience: I'm going to shut up now.

Speaker: So, the background policy. The background policy is almost the same as on-demand, but in addition to all the things performed by the on-demand policy, it creates a task in the background that downloads all the bits onto the filesystem. Meanwhile, while that task is running, the content is already available to clients, served lazily by Pulp itself. How does it work? A client requests some package; he talks to Pulp. The request goes to Pulp and is handled by Apache. What does Apache do? Apache resolves the symlink which leads to the actual location of the package. If the symlink points to an existing location — meaning the package is currently on the filesystem — the content is served right away. If the symlink points to a non-existing location, the request is forwarded to Squid. Squid? Yes, that Squid.
Squid caches the content. So we look into the cache, and if the content is present there, we serve it straight from the cache back to the client. If there is nothing in the cache, we forward the request to the Pulp streamer. What is the Pulp streamer? The streamer is a micro-service which is responsible for the actual download of the content from the upstream repository. So the streamer goes to the upstream repository, downloads the bits, and streams them back to the client. Meanwhile the content gets cached, and there is a task in Pulp that runs periodically: it goes to the cache and saves onto the Pulp filesystem all the content that landed in the cache. The benefit of this: the next time the same thing is requested by a client, we avoid going through all this machinery and serve it directly. So the benefit of this feature — we call it lazy sync — is that you download the content as it is requested. If you have repositories with tens of thousands of packages and you want to spare some storage, you can enable the on-demand policy. Any questions about the on-demand stuff? Yes?

Audience: I'm not sure if I should keep it until the end — whichever is more comfortable. My question is about signing packages. First the packages get built, then they get signed. Signing doesn't change the name, but it actually changes the content. How does Pulp deal with that? We have a similar setup, and we'd like to know how this is handled.

Speaker: You're talking about the checksum of the package?

Audience: Yeah — the package gets signed, so the content changes, but the name doesn't. Is that handled by Squid, or do you just try to avoid the situation?

Speaker: I don't think it's specially handled by Squid; from Squid's point of view, only the content changed.
Audience: But it changes the content without changing the name. So you're requesting a package, but it already has different content, and Squid doesn't know about that.

Speaker: It would serve you the old cached bits, yes. Well, there is a configuration where you can set a timeout, so the cache will expire entries after some time.

Audience: That's exactly the issue.

Speaker: I understand your point. You can set it to two hours, for example; I believe we set it to a week by default, so you would have a week. But let me tell you how we synchronize RPM repositories, because it's related: we look at the upstream metadata first, right?

Audience: Oh, I see. So you forcibly replace the package on your side.

Speaker: When you run the synchronization a second time, what we do is first look at the upstream metadata. We check whether the revision number changed — that tells us whether something changed — and if so, we synchronize it.

Audience: The revision is in the NEVRA?

Speaker: The revision number of the metadata. So if it changed, we sync it; if it didn't change, we don't.

Audience: That answers my question.

Speaker: Cool. So if a package is re-signed, the upstream metadata will change; the name of the package doesn't change, but the metadata revision does.

Audience: So what you're saying is: if you accidentally promote something before it's been signed, but then go back and re-sign it, the upstream metadata is going to get updated, the revision number of the metadata is going to change, and the next sync will pick it up. Okay. Thank you.

Audience: You said there's a mechanism for getting the content from Squid onto the actual filesystem? Because there is a timeout on the Squid side, which you can set, after which that content goes away. So the Apache path is going to be faster afterwards?
Speaker: The streamer, which is responsible for the actual download of the content, also feeds what it downloads to Squid, and there is a collection in the database: the streamer goes to this collection and makes a record there — "I'm noting that this package was downloaded and is sitting in the cache." Then there is a periodic task which is triggered. What does the task do? The task goes to this collection, gets the list of all the content that was downloaded and cached, and saves it to permanent storage.

Audience: So the list is on the streamer side?

Speaker: The list is in the database — there is a collection in the database. If you have more doubts, we can discuss the details a bit later; otherwise I'd like to use the time to talk about other stuff. So, these are the content types Pulp can manage. We have the RPM family, ISOs, Docker images, Python packages, Puppet modules, and a couple of others. So, with all this said — if you know your content well and you are really interested in it — let's look at the use cases. A popular, regular use case is dev, test, production. Let's say you're a developer: you create a package and put it into the development repository. Once it's ready, you promote it by copying it into the testing repository. You perform the testing, and then you promote it again into the production repository, so it gets available to the customers, right? And thanks to the cheap copies, promotion is just a metadata operation. Another use case is that you can mirror the Python Package Index: you can synchronize part or all of the packages from PyPI.
We manage source distributions, and we recently added support for wheels, which is pretty cool, because a wheel is a pre-built distribution format that is much faster to install than an sdist. You can also add your own custom packages by uploading them into the repository. And one more thing worth mentioning: Pulp can retain all versions. Imagine the situation where you've been building, I don't know, some framework which requires a specific version of a Python package. You come in to work, you go to PyPI, and you can't find it anymore — because that's how it can go with PyPI: when there is a new version, an old one can be removed, and that can break your workflow. If you want to take control of this, you can store as many versions of the Python packages as you need in Pulp. Another use case is that you can mirror Puppet Forge. You can synchronize Puppet modules just like Python packages, you can add your own custom modules, and you can also store different versions of them in the repository. How much time do I have left? Cool. So, another use case is that you can host and serve Docker content to your clients. As for the Docker content format, we fully support Docker v2. That means you can fetch content from Docker Hub into Pulp and compose and manage your own Docker repositories, and then make them visible to Docker clients. So, basically, if you want your clients to be able to docker pull, you need to publish those repositories, so Crane can use them and serve them to the Docker clients. What is Crane? Crane is a small Python application which provides just enough of a Docker-registry-like API to support docker pull for clients. How does it work? You have a Docker registry; you have Pulp. You synchronize the Docker content into Pulp — at this point Pulp holds the Docker content. Then there is a Docker client who says docker pull. What does Crane do? Crane serves the Docker metadata and responds with redirects which point to the location where the actual content is stored.
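The Crane flow just described — answering registry API requests with redirects to wherever Pulp published the content — can be sketched as a toy function. This is only an illustration of the idea, not Crane's real code; the URLs and names are invented.

```python
# Toy model of Crane's role: serve just enough registry API to answer
# "docker pull", redirecting blob requests to the published content
# location instead of serving the bytes itself.
PUBLISHED = {
    # digest -> where Pulp published the actual blob (made-up URL)
    "sha256:aaa": "https://pulp.example.com/docker/blobs/sha256:aaa",
}

def handle_blob_request(digest):
    """Return an HTTP-style (status, location) pair for a blob request."""
    if digest in PUBLISHED:
        return (302, PUBLISHED[digest])   # redirect to the real storage
    return (404, None)                    # unknown blob

print(handle_blob_request("sha256:aaa"))  # redirect
print(handle_blob_request("sha256:bbb"))  # not found
```

The design point is that Crane stays tiny: it holds only the metadata needed to answer the registry API, and the heavy bytes are served from wherever they were published.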
Why do we need to publish? Because Crane reads all its data from files which are created during the publish of the repository. So, I think I covered everything here. Any questions? No? Cool. So, Pulp is highly extensible. It has such a flexible design that you can add support for nearly any content type. The main core features of Pulp, like synchronization and publish, are implemented in a generic way, and they are extended by plugins, which are responsible for how content gets into Pulp and out of Pulp, right? When a plugin is installed, Pulp automatically discovers it. Pulp is also scalable, because you can scale the different components of Pulp depending on the pattern of usage. For example, if you have a lot of content types and different repositories, you will have a lot of metadata — we store all the metadata in the database — so in that case it will be much better to have dedicated hosts for the database itself. Does that make sense? Cool. Or, for example, if you are pretty sure that one web server won't be able to handle all the requests, it's better to have a couple of them. So, finally, a few words about plugins. When you want to define a new content type, you need to say what makes a unit of it unique. For example, what makes an RPM unique? The NEVRA: name, epoch, version, release, architecture. So you define the content type, and then there is the importer. This guy is responsible for talking to the remote server: going there, figuring out what to grab and how to grab it. Basically, it is responsible for getting the content into Pulp. Then there is the distributor. The distributor distributes the content — it does the opposite of the importer; it gets the content out. For example, for RPM there is the yum distributor: it takes the RPMs and makes a regular yum repository out of them.
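The plugin anatomy just outlined — a unit key that defines uniqueness (NEVRA for RPM), an importer that gets content in, and a distributor that gets it out — might be sketched like this. The class names and methods are invented for illustration; they are not Pulp's actual plugin API.

```python
# Sketch of the three plugin pieces: a content type (with its unit
# key), an importer (content in), and a distributor (content out).
from dataclasses import dataclass

@dataclass(frozen=True)
class RpmUnit:
    # NEVRA: the fields that make an RPM unit unique.
    name: str
    epoch: str
    version: str
    release: str
    arch: str

class Importer:
    """Gets content INTO the repository (e.g. from a remote feed)."""
    def sync(self, repo, remote_units):
        repo |= set(remote_units)

class Distributor:
    """Gets content OUT of the repository (e.g. lays out a yum repo)."""
    def publish(self, repo):
        return sorted(f"{u.name}-{u.version}-{u.release}.{u.arch}.rpm"
                      for u in repo)

repo = set()
unit = RpmUnit("bash", "0", "4.4", "19.el7", "x86_64")
Importer().sync(repo, [unit, unit])    # same NEVRA twice -> one unit
print(len(repo))                       # 1
print(Distributor().publish(repo))
```

Because the unit key is what defines identity, importing the "same" RPM twice never duplicates it — which is also what makes the cheap copies and filtered promotion work.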
Pulp is designed to be integrated with build pipelines and with continuous integration; the basic workflow is driven through the REST API. If you want to react, for example, to a successful publish, there is an event that gets published to a specific AMQP exchange, so you can subscribe to that exchange. If that is too much machinery, plain polling is also an option: you can set up a daily job or something like that, right? We also provide HTTP callbacks, so when there is, I don't know, a successful synchronization, a notification is sent about it. That's the last way to find out what happened, if you need to react with some actions. We also provide consumer applicability. This is another interesting feature: Pulp can help you track what is installed on each machine in your infrastructure, and which updates each machine needs from what has recently become available. So, we have a lot of asynchronous functionality going on. We have a distributed asynchronous architecture: there is the REST API, and there are all these long-running jobs like sync and publish. We separate those into worker processes. What do the workers do? They look at the queue for jobs. A worker takes a job from the queue — for example a sync — and when it is done, it goes back to the queue and takes another job, and another, and another. We use Celery for this: Celery is the project which is responsible for the asynchronous tasking. And I would like to mention our future plans, so you know what to expect from us and where you can aim your pull requests. We have started the planning for Pulp 3, and the big change is that we are moving away from MongoDB; people are pretty excited about that. This is interesting, because we are probably one of the most relational projects that uses a non-relational database — our data is a lot more relational than the place we store it in.
So, we decided to make the change and go to a proper relational database: PostgreSQL. We decided to use Postgres as the database, and to use Django and lean on the framework as much as possible — the generic Django migrations and so on. So, my presentation is coming to an end. I gave you the basic concepts of what Pulp is and where you can use it, so you probably have some ideas of where you could apply it. And I want to mention that we have very extensive documentation where you can find a lot of answers to your questions, and if you are still in trouble, we will be really happy to help you on the IRC channels, #pulp and #pulp-dev. So, welcome to our channels. Pulp was also officially accepted into Fedora, so you can install it from the official repositories. And another thing to mention is that Pulp is distributed not just as RPM packages but also as Docker images, so if you want to give it a try within a couple of seconds, you can use the Docker images. So: get excited, and make your first contribution. Do we have a couple of questions? Yes?

Audience: Let's say, hypothetically, you accidentally delete some symlink — is there a tool that makes Pulp go through and regenerate those symbolic links? A publish creates what seems like a million symlinks, and if you accidentally deleted the wrong one, is there a way to force it to regenerate them?

Speaker: Yes. Basically, we have the concept of a full and an incremental publish. With every full, forced publish we regenerate everything — all the symlinks get created again. With an incremental publish, the point is that if nothing changed — no new content and no configuration changes — there is no reason to redo all of that, so it does nothing. That's the concept of full versus incremental publish.
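The full-versus-incremental distinction from that answer boils down to a small decision: an incremental publish no-ops when nothing changed, while a forced full publish always regenerates the metadata and symlinks. A minimal sketch, with invented names:

```python
# Sketch: an incremental publish skips work when nothing changed;
# a forced full publish regenerates all artifacts (symlinks included).
def publish(repo_state, last_published_state, force_full=False):
    if not force_full and repo_state == last_published_state:
        return "skipped (nothing changed)"
    return "regenerated metadata and symlinks"

state = {"units": 3, "config": "v1"}
print(publish(state, state))                         # incremental no-op
print(publish(state, state, force_full=True))        # repairs deleted links
print(publish({"units": 4, "config": "v1"}, state))  # real change
```

So a deleted symlink is recovered simply by forcing a full publish, which rebuilds the whole published tree from the database.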
Audience: I've seen there is a sync operation, so content flows one way into Pulp. Isn't it possible to make it go the other way as well? For us — and I think for any other company in the same situation — we would like Pulp to be the single source of truth, but we would also like to push to the public registry, docker.io. So any time we create our images, we don't only publish them in our own box, we also want them on Docker Hub. Can Pulp also sync out to docker.io? I've seen the HTTP endpoints don't have that functionality.

Speaker: So you want to publish to an external Docker registry? What Pulp does today is the other direction: it syncs from a Docker registry, pulls the content out of it, and stores it in Pulp.

Audience: And we want the other way — one registry that we use when building, then we put the images into Pulp and publish, and then we want Pulp to push back to docker.io, so people can just docker pull from there.

Speaker: I don't think we have that — Pulp probably doesn't do that today. What we could do is use the event notifications to tell something else that new content arrived, and that something else would do the push afterwards. But if you wanted to add it, pushing to somewhere else wouldn't be that hard.

Audience: Okay, maybe we will set up a consumer which listens to those events, but we would like to avoid it. I guess since it's not possible yet, we'll work around it for now. One more thing: once a Docker image is put under the control of Pulp, what does Pulp actually know about it? It doesn't know what's installed in the Docker image, right? What does Pulp see in a Docker image?
Speaker: The blobs and the manifests — how the layers fit together — but it does not go below that, inside the layers.

Audience: So, say there is a Docker image which was created from RHEL 7 and it has a list of packages installed — Pulp does not know about that content itself?

Speaker: It can just store that alongside, as metadata. For all the layers it stays a very shallow view, yes.

Audience: I also have a question about the future plans — the Postgres migration. What's the reason for that? Is it performance? Is it new features?

Speaker: The reason for that is that, as I mentioned, over time it has become much more difficult to manage a lot of relational data with a non-relational database. We've been using MongoDB in ways that are cheating a bit — we have ended up treating parts of it as if it were SQL. The migration itself might require a couple of scripts, but that's not the main point.

Audience: So what's the reasoning to move to Postgres specifically?

Speaker: We've been weighing the migration from MongoDB to Postgres for some time, and I think the main driver is simply the fact that it's an RDBMS instead of a document store. One problem with MongoDB is the lack of transactions: if you're doing a long operation and it fails in the middle, that's a big problem.

Audience: And just from a commercial standpoint, having it as Postgres instead of Mongo means that, as this moves into Satellite, we've got one database — instead of Mongo over here and Postgres over here and bridging them together. That makes it nicer.

Audience: I have a question: you talked about how you can save all the old versions — is that configurable? How many are kept?

Speaker: Yes, there is an option called retain-old-count. By default it is set to zero, so basically only the newest version is retained; but if you set it to two, you keep two old versions, and so on, depending on what you need.
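The retention option from that answer can be illustrated with a small pruning function: keep the newest version of each package, plus N older ones. This is just the idea sketched in Python; the exact option name and semantics in Pulp may differ.

```python
# Sketch of retain-old-count semantics: always keep the newest
# version of a package, plus N older versions.
def prune(versions, retain_old_count=0):
    """versions: list of version tuples for one package name."""
    newest_first = sorted(versions, reverse=True)
    return newest_first[: 1 + retain_old_count]

vims = [(8, 0), (7, 4), (7, 3), (6, 0)]
print(prune(vims))                       # [(8, 0)]: newest only
print(prune(vims, retain_old_count=2))   # [(8, 0), (7, 4), (7, 3)]
```

With the default of zero, each sync leaves only the latest version in the repository; raising the count trades storage for the ability to pin older versions.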
Audience: How would you handle a completely new content format? Let's say you want to distribute, for example, SELinux policies. Basically selinux-policy is a bundle of policy modules, there is one for each product, and the products have different life cycles as well. The old way was: with each change of a policy required by some product, we would ship a whole policy package. But that's not necessary — we could ship just the one policy module which the product requires. Could Pulp somehow distribute those?

Speaker: You would use a plugin: you define what a content unit is for this type, and then the importer and the distributor take care of getting it in and out. We have documentation on how to do that, and with the plugin API we keep trying to make such changes easier. But answering your question: if you want to add support for a new content type, you will be surprised how approachable it is. Any other questions? Do we still have some time?

Audience: About the earlier question on inter-satellite sync — that feature seems to be changing; what it really does now is that the secondary server just syncs from the primary one.

Speaker: Cool. So, if there are no more questions — thank you for your time.