Are we ready? Welcome to the 10:40 talk on testing OpenShift on OpenShift. Take it away.

Yep. So, hello, everyone. My name is Samvaran Kashyap Rallabandhi, and I go by SK. I work on the Continuous Productization team, on a project called CI Pipeline, which is part of the CentOS PaaS SIG community. In this talk we're going to discuss the following: our use case of testing OpenShift on OpenShift, how it is feasible, how we worked on it, and why we need it; the basic terminology you need for the whole process — containers, libvirt, and OpenShift, plus privileged containers and the differences between them; why we need OpenShift on multiple clouds and how we deploy it there; and how the whole process is enabled by a tool called Linchpin. We'll have a short introduction to the CI Pipeline project, then a demo, and we'll conclude the presentation there.

Going ahead, our use case is to install and run end-to-end tests of OpenShift on a VM that is running inside an OpenShift container. It's a nested virtualization scenario: a virtual machine running inside a container, which itself is already running on OpenShift.

Why do we need it? Because of regular system updates, we found a need to test OpenShift on multiple distros — for example CentOS and Fedora. Fedora moves very fast: we have Fedora 26, 27, 28, and 29 is in Beta, and every time there is an update we need to know whether OpenShift works on it or not. That is what we are going to address. We also need to check how OpenShift works with multiple deployments, and why and how it fails on different distros. So we need to test OpenShift in a feasible manner — using OpenShift itself.

Before we go ahead with the talk, these are the things you need to know about. First, containers — most of the people attending this talk will already know about them. A container is nothing but an isolated user space: a process running on a shared kernel that can simulate an environment like a Fedora or CentOS server. Then we have privileged containers, which are containers that gain access to the host kernel; we'll discuss those in more detail. OpenShift is a container management platform based on Kubernetes, which recently changed its name to OKD, the Origin Kubernetes Distribution — so I should be using and promoting that term instead of always saying OpenShift. Finally, you should know about the libvirt daemon: libvirt is an open-source API used for managing different kinds of virtualization platforms and hypervisors, such as Xen or KVM.
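Since containers are the foundation of everything here, a minimal sketch — assuming podman (docker behaves the same) and a Fedora image — of what "isolated user space on a shared kernel" means in practice:

```bash
# Run a Fedora 28 user space as an isolated process; the image tag is
# just for illustration.
podman run --rm fedora:28 cat /etc/os-release   # reports Fedora inside

# Both of these print the SAME kernel version, because the container
# shares the host's kernel rather than booting its own:
uname -r
podman run --rm fedora:28 uname -r
```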
Going ahead, let's talk about a container versus a privileged container. A container is just a process — a command running inside an isolated user space. So why do we need a privileged container at all? Most of the time, containers are secured by a container engine, which runs on top of an operating system, uses the kernel, and runs on some infrastructure. In the case of privileged containers, however, the container bypasses this container engine and gets access to the operating system and the kernel devices directly — which we may or may not want. Unless the container actually needs to use a device shared by the kernel, we shouldn't use privileged containers, because there is a risk that the container can run a command like `rm -rf /` and remove the whole host at once. We don't want that to happen.

I have a perfect analogy for this. How many of you use hotels and Airbnb? You definitely have. So what do you think about Airbnb — is it secure enough? Which one do you prefer, staying in a hotel or staying in an Airbnb? Hotel? Yeah, if my company is sponsoring me, I would definitely prefer a hotel. But in an Airbnb you share resources — the kitchen, sometimes the washrooms. I've had pretty bad experiences with Airbnb, and great experiences too. An Airbnb acts like a privileged container: a guest can be a well-mannered person and make good use of the whole accommodation, or he can destroy the house — he can set your house on fire. That's what happens with privileged containers: if you're not careful with them, the whole infrastructure is on fire; people might delete devices, or things can go into an error state. Hotels, by contrast, are secured spaces with security mechanisms — security guards, patrols, access to the cops at any time — and each room is on its own and doesn't share resources. Hotels also have the best service, because hotel management is responsible for maintaining the rooms, whereas in an Airbnb the guests are only morally obligated to clean up when they leave.
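Coming back from the analogy to the actual mechanics — a minimal sketch, assuming docker and a Fedora image, of what "bypassing the engine" looks like on the command line:

```bash
# Ordinary container: the engine mediates access, so host devices such
# as /dev/kvm are simply not there.
docker run --rm fedora:28 ls -l /dev/kvm        # fails: No such file or directory

# Privileged container: host devices are exposed directly -- including
# /dev/kvm, which is exactly what lets us run VMs inside a container
# later in this talk.
docker run --rm --privileged fedora:28 ls -l /dev/kvm

# If you only need one device, passing through just that device is the
# safer middle ground:
docker run --rm --device /dev/kvm fedora:28 ls -l /dev/kvm
```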
So, going ahead: why do we need OpenShift on multiple clouds? Recently we have seen OpenShift popping up on every type of cloud provider — OpenShift on AWS, OpenShift on Azure in collaboration with Red Hat, OpenShift on Google Cloud Platform. There are many scenarios you might want; you might even run OpenShift on your local machine for your development environment. So there should be a better way to choose cloud providers — an easier way for these deployments to happen.

The advantage is that with multiple cloud providers, you can choose what you want. In some cases Amazon is costlier — this is just an example, it might not be true in real scenarios — and Google cuts its prices every three months to compete with Amazon, so I might want to run my machines on Google Cloud. But maybe Amazon is more efficient in terms of storage, so I want to keep the storage on Amazon and connect the two together. That is very difficult these days, because each cloud provider has its own API, and the person using those APIs needs intense domain knowledge of both cloud providers at all times. And when you deploy your whole infrastructure across multiple clouds, I'm pretty sure Google Cloud and Amazon won't both be down at the same time, so there is less downtime, and there may be lower latencies depending on the regions each one offers. Wouldn't it be dreamy if we had a lightweight tool that does these multi-cloud deployments? That's where we get the tool called Linchpin. Linchpin is a collection of Ansible playbooks, modules, and simple Python scripts that enable cross-cloud and multi-cloud deployments.

Going ahead, Linchpin has its own terminology that you need to understand before using it. Linchpin has workspaces, which are nothing but a collection of files generated to manage your cloud deployments; it isn't difficult to create one — a single command, linchpin init, creates your workspace magically. There is a PinFile, which is the entry point Linchpin grabs its details from. There are topologies and layouts: each topology constitutes the resource definitions for multiple clouds, which you'll see in the upcoming slides, and layouts generate multi-cloud inventories automatically from the data fetched back from the different cloud providers. And the best part of Linchpin is the hooks: scripts that run in pre-provision and post-provision states, which can enable things like OpenShift installations.

This is the Linchpin flow. If you see Linchpin as a black box, it takes a topology and a layout as input, connects to all the cloud providers — AWS, OpenStack, GCE; there are six or seven providers we support right now — and provisions the instances. It gives you an output file, plus an Ansible inventory if a layout was given; all the outputs are gathered from the APIs provided by the cloud providers. Linchpin hooks then work on the generated inventories: once the resources are up and running they run Ansible playbooks, Python scripts, or Node.js or Ruby scripts against the generated inventory, and create a deployment using some magic. There is no real magic in between — it uses Ansible, which I call magic because it's a pretty good tool built on SSH; I always wonder how SSH can be such an integral part of deployment. So we use Ansible in between to deploy onto the inventories.
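Before we look at a full workspace, here's roughly what I mean by a PinFile — a sketch only; the target name and file names are made up, and the exact schema depends on your Linchpin version:

```yaml
# PinFile (sketch): maps a target name to a topology and an optional
# layout. Running "linchpin up <target>" reads this to know what to
# provision and how to build the inventory.
openshift-on-libvirt:            # hypothetical target name
  topology: openshift-topo.yml   # resource definitions: what to create
  layout: openshift-layout.yml   # how to arrange hosts into an inventory
```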
And this is what a typical workspace looks like. This particular workspace installs MySQL on a given topology, and it contains the credentials, stored as YAML — you can use whatever credentials you want; here there are three standardized credential files for OpenStack, AWS, and Google Cloud. We also have hooks: in this case we created an Ansible hook that installs a DB server from an external role. Beyond that, there are folders to store the different kinds of files: inventories, layouts, topologies, and resources.

This is how a PinFile looks: each PinFile is a collection of key-value pairs where you just give references to a topology and a layout. In our case we can use a layout for, say, an OpenShift 3.0 or 4.0 cluster that creates the Ansible inventory we need to deploy the whole OpenShift environment.

These are examples of topologies, which consist of resource groups and resource definitions, and there can be any number of definitions. The biggest deployment I have made with Linchpin — accidentally — was a 20-node deployment on my AWS account, which cost me about $200 over two days. That was crazy, but I learned a lesson: be very careful with the count attribute in these topologies. Linchpin works with bigger deployments too; we haven't tested 20-node clusters in production environments, but it does provision them.

This is the basic structure of a Linchpin topology: resource groups, each consisting of different resource definitions, and each resource group can carry its own metadata, which can be parsed. The best part is that topologies support Jinja templating, so you can dynamically render the whole topology template, enabling ad hoc provisioning like other provisioning tools offer.

And this is the inventory layout, which is cloud-agnostic in nature: it doesn't make you choose a cloud provider. It intelligently goes to the provisioned instances and pulls out resources based on the count attribute. To map the example layout to the inventory, there are mainly three sections: one is the vars section, which translates roughly to the all-vars inside the Ansible inventory; at the same time, each host has its own host group, with its own metadata, via the layout. This is what a successfully generated inventory for an app server and a DB server would look like. As you can see, an AWS instance and a Google Cloud instance can work together, or there can be a private cloud instance too, all connected — as long as the network permits it.
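To make the topology-to-inventory mapping concrete — a sketch only; field names vary between Linchpin versions, and every name, flavor, and count here is made up for illustration:

```yaml
# --- topology file (sketch): one resource group with a count attribute.
# This "count" is exactly the field that accidentally cost me $200.
resource_groups:
  - resource_group_name: app-tier
    resource_group_type: aws
    resource_definitions:
      - name: app
        flavor: t2.micro
        count: 2                  # two instances, not twenty!

# --- layout file (sketch): vars become the all-vars of the generated
# Ansible inventory, and host_groups decide which inventory groups each
# provisioned host lands in, regardless of which cloud it came from.
inventory_layout:
  vars:
    ansible_user: fedora
  hosts:
    app-server:
      count: 2
      host_groups:
        - app
```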
Coming to the Linchpin hooks, which are the core of our use case of testing OpenShift on OpenShift: hooks are context-aware scripts that run around the provisioning of the instances. There are five types of hooks — they can be written in Ansible, Python, shell, Ruby, or JavaScript — and four states in which you can initiate a hook: pre-up, before the provisioning happens; post-up, after the provisioning has happened; pre-destroy, if you want to run your own custom cleanup scripts against the external cloud providers before teardown; and post-destroy, which is helpful if you want to be certain the resources were actually destroyed properly.

This is an example hook that installs a DB server. Basically, we don't do much work when creating hooks, because if we use a playbook that refers to an existing role, the playbook just points to a role that's externally available on Ansible Galaxy — you needn't write roles that are already on Galaxy.

And this is a quick one-on-one of how Linchpin is installed and how we create instances. You install it via PyPI; after that you create a workspace using linchpin init; then, giving it the credentials path, you run the linchpin up command, just like you do with vagrant up. It creates all the instances, and if a layout is specified it creates the Ansible inventory too. Finally, if you want to tear things down, linchpin destroy won't delete the inventory file, but it destroys all the provisioned resources.

Finally, to make OpenShift on OpenShift possible, we used a container that specifically installs Linchpin and the other dependencies and also runs libvirt inside the container. We borrowed the Dockerfile from Alec Brenton-Baud — one of the best examples we found for running libvirt inside a container. On top of it, we just had to install some dependencies, like libvirt-devel, rpm-build, and the bash completions necessary for creating libvirt instances. And since we are using privileged containers here, we run the whole libvirt stack by overriding the container's existing setup with the host machine's KVM device. We got to the point where we had created an inception inside an inception: a Linchpin container running on OpenShift, a miniature VM running inside that container, and OpenShift Origin running on top of that VM, with Linchpin driving the end-to-end tests. This is a simple workspace; I'd like to show how it looks, and you can access it in this particular repository.
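Before I get to the pipeline itself, let me make two of the pieces from this section concrete. First, roughly what a hooks section looks like, covering the states I walked through — a sketch only; the schema details and playbook names are assumptions:

```yaml
# Hooks (sketch): one key per state; each state holds a list of actions.
hooks:
  postup:                          # runs after provisioning succeeds
    - name: install-dbserver
      type: ansible
      actions:
        - playbook: install-dbserver.yml   # can just wrap a Galaxy role
  predestroy:                      # custom cleanup before teardown
    - name: deregister-hosts
      type: shell
      actions:
        - deregister.sh
```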
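And second, the quick one-on-one written out as commands — the credentials flag is from memory, so treat it as an assumption and check linchpin --help on your version:

```bash
pip install linchpin                   # install from PyPI
linchpin init my-workspace             # scaffold a workspace (PinFile + dirs)
cd my-workspace
linchpin --creds-path credentials up   # provision; also writes the Ansible
                                       # inventory if a layout is specified
linchpin destroy                       # tears down the resources (the
                                       # generated inventory file is kept)
```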
Before I conclude and show the demo, I'd like to talk a bit more about our CI Pipeline project. This whole testing-OpenShift-on-OpenShift effort is one stage of the CI Pipeline project, where we are building an automation framework that uses different types of containers and tools to make your CI process easier. In the example project that testing OpenShift on OpenShift is a stage of, we take the packages of Fedora Atomic Host: every commit made to Fedora Atomic Host runs through the pipeline, which triggers a package build, runs the functional tests, and composes an OSTree. Further, integration tests are run on the composed images. Finally, once the image is generated, it is fed to the Linchpin libvirt container, where the OpenShift cluster is booted inside a container and the end-to-end tests are run.

So this is how our pipeline looks. As I said, whenever there is a dist-git commit, it goes through all the stages, and this is the part where Linchpin works inside a container to run the OpenShift tests. Oh, sorry — coming to the demo. I hope I can play this. This is an OpenShift environment running locally, and we have the Linchpin libvirt containers that were already built as part of the CI Pipeline process. This is our Jenkins environment, where our actual pipeline runs. For this particular demo I isolated all the other stages and used the Linchpin libvirt container directly to run our OpenShift end-to-end tests. Here it has started provisioning the instances using the libvirt provider. As a quick walkthrough: this is a privileged container that uses the host machine's libvirt daemon; the virtual machine actually runs inside the container, and the Linchpin hooks install OpenShift and run the end-to-end tests on it. It took about 38 minutes last time, and it would take a while now, but the whole demo is within two minutes — I took the liberty of editing it. Now it's downloading the image, which is a Fedora Atomic image, and it uses Linchpin to boot that image and run the OpenShift tests on it. Let me forward it a little. Now it's generating the outputs and the OpenShift inventory, with the post-provisioning hook, and it has started installing the OpenShift environment onto the virtual machine. In this particular experiment we ran a single-node environment, because running a full-blown OpenShift deployment inside a container — we tried it — crashed our environment multiple times, so we only wanted to check single-node environments.

Coming back to the presentation: I thank my whole team, the Continuous Productization team, for all the support they have given me and for the opportunity to work on this particular project. Feel free to fork the CI Pipeline repository; we are looking for contributors. On Freenode we are continuous-infra, and we have a mailing list, continuousinfra.com, if you have any doubts. So, any questions? Yeah, sure.

[Audience] What is the performance like, comparing Linchpin to just using an Ansible playbook with the AWS modules or the OpenStack modules for provisioning?

[SK] Performance-wise, the underlying playbooks of Linchpin actually use os_server and the other modules, so if you're just talking about provisioning instances, it hardly differs — milliseconds — because everything is instantiated through the Ansible API, and the Ansible API calls the playbooks where the Ansible OpenStack modules, like os_server, are referenced. So there shouldn't be much of a difference.
But when you talk about a whole run, from creating an instance to generating the inventory — which would need multiple playbooks — Linchpin gets the advantage by simplifying the process rather than on raw performance. Linchpin also has a component called RunDB, where we use a database to store the existing topologies and successful runs, and you can repeat a run again and again based on a transaction ID. So, thinking about it, it could essentially act as an external cloud service provider: if you wrote a REST wrapper around it, it could be a full-blown provisioning-as-a-service kind of thing. But currently it's just a small, lightweight tool that does provisioning across multiple clouds. Any more questions?

[Audience] Have you played with KubeVirt instead of managing libvirt manually?

[SK] Not yet, but a couple of people on my team have started working with KubeVirt, and I've heard pretty cool stuff about it, so we are going to implement KubeVirt soon. Any other questions? All right.

One more small announcement: I have a talk called CI Pipeline for Dummies — it might have been better to give that talk first, because it covers all the basics of how the pipeline works, how OpenShift works, and what containers are, starting from what the software is and how it works. So feel free to attend that. And that's it — thank you.