Hello, everybody. Thanks for coming to my talk. I'm Miro, I work as a Senior Principal Quality Engineer, even though by now I'm really a full-stack developer on Testing Farm, so QE is just a part of it. We also do all the development of Testing Farm itself, and we test our own stuff. So yeah, welcome. A small agenda of what we will be speaking about today: I will give you an introduction to what Testing Farm is, though maybe everybody knows it already. I will also tell you what TMT is and why it is important, and why Testing Farm and TMT together matter for this talk. I'm going to go through the anatomy of a Testing Farm request; it's just an introduction, since I think there might be people who don't know this stuff, and that's always worth covering. Then I will dive into some features that I find interesting, and we'll look at some use cases of how Testing Farm is being used: that's on GitLab, GitHub, Fedora CI, CentOS Stream CI, and how you can use it otherwise. I will show you some future plans. The content is packed, but I speak fast, so I hope we will manage. I will also do a demo, but the demo takes a little bit of time, so I will take some questions while it runs. So, what is Testing Farm? Testing Farm is an open source testing system as a service. Our code is actually open, even our infrastructure code, though it's not super easy to contribute to yet; we are working on that, and sooner or later it will be, but people have contributed even from outside Red Hat. It's basically a flavor of software as a service, and we focus on executing automated tests. We run these automated tests mostly against VMs and bare metal machines, but also containers. And we are really a backend for CI systems; if you think about it from a high level, that's what we do. CI systems call us to do the dirty work: provision the infrastructure, run the tests, and return the results.
Testing Farm itself is used quite a lot. We are, and this is maybe a bad word, something that sits in a lot of places: in Red Hat infrastructure, but also in Fedora infrastructure, CentOS Stream infrastructure, GitLab, GitHub. In all those places we do the same job: running tests. We do it in hybrid cloud, and that's important, because we have one single public API that everybody uses, but one worker deployment inside Red Hat and one public one. So we run tests inside Red Hat, but also for the public stuff. This single public HTTP endpoint is important: we wanted to be really open source, and we didn't want multiple endpoints. We wanted exactly one, and to sort out all the problems that come with that. It's reachable here; if you go to this link, it actually leads you to our documentation and you can see what it looks like. It's fairly easy. It's still a 0.1 version, we are slow, but it has worked well for the three years we have now been in production. A Testing Farm request is really simple: you specify what you want to test, you specify the environment on which you want those tests to run, and Testing Farm returns you the results. That's the high-level view. Then there are the worker deployments I was telling you about; the API is public. What is a worker deployment? We call it a ranch, because we are a farm. The horse has nothing to do with anything, I just find it nice. So the public ranch runs everything I will be speaking about today. The Red Hat ranch is for internal stuff and an internal audience, because those tests run inside the Red Hat network and provide results only to Red Hat employees. I think it's about half and half: half of our requests go to the public ranch, half to the Red Hat ranch. As you can see, we support a lot of infrastructure. We actually run a lot of containers.
The majority of workloads runs against containers, because we run some generic tests there. But mostly we do AWS, downstream too; I mean also inside the Red Hat network, but also upstream. We have AWS machines that are actually connected to the internal network. Internally we do OpenStack, internally Beaker. We have this nice new REST API to which you can connect any other provisioning system very quickly. We have an Azure preview, and we are planning Google Cloud and IBM Cloud, but that last one is used just for one special device-testing use case now. So I will be speaking only about this public part; I will not speak about the Red Hat stuff, because I'm not sure if everybody here is a Red Hat employee (a lot of you are), but I will tell you about that at QCamp. So what is TMT, and why is it important? TMT, in my own words, and maybe not everybody likes this, is basically open source test management reimagined in Git. We did this for RHEL because we had, and still have, a lot of legacy systems inside Red Hat, and we wanted something modern, so that the folks working on RHEL can easily open source tests to Fedora or CentOS Stream and share them. It was funny, because a test could be executed on Fedora, but there was no infrastructure actually running it, and there was no system where you could easily consume the test. It should be nicely polished so that you can share tests between Fedora and CentOS Stream, combine them together, and so on. That's why TMT was created. It's open source, there are already like 50 to 60 people contributing to it, so it's getting larger. Check it out, it can do test management for your project. I don't know who of you works in QE, but there are test case management and test management systems out there, usually something you pay for. This is in Git: you store the metadata in Git, and you do the test management in it.
It's pretty cool, we like it. And then we connect with those systems, like Polarion and TCMS, which are internal deployments of test case management systems; we are creating export plugins so we play nicely with them. But we love that we work with tests and test metadata in Git, because that's what everybody does, right? It's Git: you can create a merge request against your test cases, that's cool, that's the default. Otherwise, TMT is also a CLI tool and a test executor, and that's where it relates to Testing Farm: Testing Farm uses TMT to execute tests on its workers. For this test management we of course have a specification, and in that specification there are a few attributes I'll be speaking about; there are a few levels of test case metadata. TMT is really focused on not making you repeat yourself. It's basically a YAML format on steroids: you store your metadata in a hierarchy, there is inheritance and so on, which makes it really easy to keep your test configuration clean. Test metadata, DRY, don't repeat yourself, right? Otherwise, if you go to some project which uses TMT, you install TMT from whatever operating system you use (if it's RHEL-like, just install the tmt package) and you run tmt; it will discover the tests and show them to you. We love it, because you go to a repository and just discover what tests are there. You can have multiple unit test frameworks and multiple integration test frameworks, and TMT gives you one nice way to discover all the tests you have, regardless of the framework.
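To make this concrete, here is a tiny sketch of what such metadata can look like; all the names here are invented for illustration, this is not from a real repo:

```yaml
# main.fmf -- illustrative TMT test metadata (names invented)
summary: Smoke test for the example service
test: ./smoke.sh            # any executable works: shell, a pytest wrapper, ...
duration: 10m
require:
  - curl                    # packages to install before the test runs
tag:
  - smoke
```

Because the metadata lives in a hierarchy, attributes like `duration` or `require` set at a higher level are inherited by everything below; that is the don't-repeat-yourself part.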
TMT is, for me, test framework agnostic; you can connect it with anything, really. It also can execute tests itself and has some preliminary support for frameworks we use in Red Hat, but people run pytest via it, Avocado, a lot of stuff: anything in Ruby, JUnit, you name it, Ginkgo. Okay, so TMT has, and I'm not going to go into the details much here, I will tell you in the next slide why, four levels of test metadata. One is the core attributes, which apply to all the other levels. Then there are tests; think of a test as one test case, something that tests one thing. Then you have plans, which are collections of tests. Where it shines is that you can nicely select tests and fine-tune your test plans so that something only runs on CentOS Stream, something only on Fedora, something only on RHEL, and so on. This comes from use cases we had in RHEL; the selection of tests is really important. But you don't have to use it, you can use TMT as a very simple thing that runs one test, whatever. Stories, that's interesting: you can link and create user stories in TMT. That goes a little beyond test management, maybe into project management, but people find it useful: when you go to a TMT project, you can look up the stories and check which stories are covered by which code and which tests. So it's interesting, check it out. I don't have too much time; tomorrow we have workshops, so come there and try it out yourself, and we'll give you more information there. It will be really brief here because we don't have time. So how does a Testing Farm request look, what's its anatomy? As I said, it's really simple: first you define the tests, and you define them by pointing Testing Farm to a git repo. Oh, there's too much on this slide.
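As a sketch of the plan level I just described, a plan that selects tests and fine-tunes them per distro could look like this (hypothetical names, assuming standard tmt keys):

```yaml
# plans/smoke.fmf -- illustrative TMT plan: a collection of tests
summary: Run smoke tests on every supported distro
discover:
  how: fmf                  # discover tests from this metadata tree
  filter: "tag: smoke"      # select only the tests tagged 'smoke'
execute:
  how: tmt
adjust:
  - enabled: false
    when: distro == centos-stream-8
    because: the tested service is not available there
```

The `adjust` rule is the fine-tuning mechanism: here the whole plan is disabled when the context says the distro is CentOS Stream 8.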
Right, so this is our API. We actually support two test types, TMT and STI, and if you are using STI, then please migrate to TMT; I will tell you later why. TMT is the main one, and we don't support anything else, but you can see that Testing Farm could actually be agnostic: TMT is just one way of executing tests. We were planning to add some generic running of a script or whatnot, but that was not really needed; TMT can now do all that stuff. In theory we could run other, larger frameworks and integrate them at various levels, but today there is only TMT. You give it a git URL and a git ref, because the test metadata lives in git; you point it at that git repo, where it finds the tests. Then we have some other cool, TMT-specific features: you can select tests directly in the API, you can say filter me these plans, these tests, and you can change the TMT root directory, the directory where TMT looks for its root. It's something like git: git has that .git directory somewhere, and TMT also has a file that marks the start of the metadata tree. It's usually in the root of the git repository, but you can change it.
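As a rough sketch of what such a request body might look like, here is a small helper that assembles one. The exact field names are my assumptions for illustration, not an authoritative schema; check the Testing Farm API documentation for the real one.

```python
# Sketch of assembling a Testing Farm-style request payload.
# Field names are illustrative assumptions, not the authoritative schema.

def build_request(api_key, git_url, git_ref="main", plan_filter=None,
                  arch="x86_64", compose="Fedora-Rawhide"):
    """Build a request: what to test (a git repo with TMT metadata)
    plus the environment(s) to run it on."""
    test = {"fmf": {"url": git_url, "ref": git_ref}}
    if plan_filter:
        test["fmf"]["name"] = plan_filter   # select only matching plans
    return {
        "api_key": api_key,
        "test": test,
        "environments": [
            {
                "arch": arch,
                "os": {"compose": compose},
                "variables": {"DEBUG": "1"},  # plain environment variables
                # secrets, artifacts, hardware would go here too
            }
        ],
    }

payload = build_request("my-token", "https://example.com/my/tests",
                        plan_filter="/plans/smoke")
print(payload["environments"][0]["arch"])  # → x86_64
```

The point is just the anatomy: one `test` part pointing at git, and an array of `environments` to run it on.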
Nothing so interesting here on the test side. Then, and that's what I wanted to say, we are planning to finally deprecate STI, and there will be a Fedora change proposal, maybe Fedora 40, I don't know, we'll see. I spoke with the folks working on this, and we think that TMT can now fully replace STI; we are past it functionality-wise. We hope to get rid of it, because we don't want to maintain two things doing the same job; there is no reason when we have the better one, and then we can spend more time making TMT better. So I'm asking you: if you could slowly migrate, I would be glad. There is a nice migration guide on TMT, it's linked here. Okay, so the environment specification. Testing Farm takes the tests and can run them on multiple environments. Think about running on multiple architectures: in public we support x86_64 and ARM, so you can run the same tests on two architectures. But it gets more complicated, because each plan runs in a separate environment, so if you have 100 plans, each will run in a separate VM. Testing Farm parallelizes this; I think we now run 5 in parallel, on 5 VMs, and we crunch through all the plans and all the environments. It can get messy: people are running hundreds of plans, thousands of tests. If you run a million it will break, and then we will have to fix it; nobody has yet. We had some people running really large stuff, but currently it seems okay. So that's the environment. In the environment you can specify a lot of stuff; you can see it's an array of environments where you specify the architecture, as I said, and you choose what operating system to run on via the OS compose. Then you have the pool. One thing I didn't mention, and I will have it in the features: we are trying to abstract away you having to know which specific infrastructure you will run on. Our users usually want a certain amount of CPUs, memory, disk
size, two disks, two NICs, TPM support, UEFI support. Those are generic properties that we abstract away from the users, and we choose the infrastructure for you: maybe UEFI is provided by Azure but also by AWS, so we pick the infrastructure; you don't care, you just want the property. But if you really want AWS, then with the pool you can say so. Then, as a good testing system, we let you pass environment variables to the test execution, any environment variable, but you can also pass secret environment variables, the kind of thing we hide for you. That can be useful if you need to deal with secrets, for example uploading VM images to AWS, or uploading the container image that you build in Testing Farm to Quay; so we can have secrets there, and it's being used. One of the features we had to implement is installing arbitrary artifacts before the test run: if you are testing a Koji build, most probably you want to install it first, to test the real thing. Currently that's how it's done: you can ask for installing an array of Koji builds, Copr builds, or for adding a repository file to the environment; at this time that should be fine. Then you can specify the hardware, as I was saying, and some other stuff. So that's roughly the test environment. The environment can be influenced via this API, where you say what you want, but some parts of the environment can also come from the plan: maybe in your plan you already know that the tests in this plan need 8 gigs of memory to run well, so you can say that in the plan, and the two can be combined. Then, as I mentioned, the test selection: there is an adjust attribute in TMT where you can really fine-tune the execution of your tests so that something runs only in a specific environment; something only runs on s390x (that's a bit of an internal use case), something only on ARM, something only on x86_64. I think TMT is very good
in this test selection part. Testing Farm then runs the stuff, and if you are integrating with a service, we have a webhook mechanism: Packit's API, for example, is hit when we change the state from new to queued, running, complete, so they don't need to poll us; they are informed that something changed. Otherwise you can poll the API and get the results. Maybe you don't even know that you are using Testing Farm; you just get some link like this, and you land on our results viewer, which was contributed and is really easy and nice. These are all the plans, this one is actually from TMT itself, and you can see a lot of tests being run there. There is a nice reproducer; did I mention that TMT is also a command line tool for locally debugging tests? If not, then I'm sorry. This reproducer looks a little larger now, it's a slightly older picture, but you can just install TMT, paste this on your localhost, and it should do mostly the same thing as CI does. It's not completely the same; we are on the path to making it the same, but it's complicated, because on your localhost you probably don't have AWS machines, you have libvirt and so on. Then we have some stuff here about the environment preparation and installing the builds; we have playbooks that run before the artifact installation and after the artifact installation. You can also see links to the API request, where you can look up its details. If something fails, then it looks something like this: by default we show only the failed stuff, because when you come to a run you usually want to see the failures, that's what's interesting. Passed? That's cool, right?
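The state machine I just mentioned (new, queued, running, complete) is easy to model on the consumer side; here is a minimal sketch, with the state names as I described them and "error" assumed as an additional terminal state for illustration:

```python
# Consumer-side handling of Testing Farm request state changes.
# States follow the talk: new -> queued -> running -> complete;
# "error" is an assumed extra terminal state for this sketch.

TERMINAL_STATES = {"complete", "error"}

def is_finished(state: str) -> bool:
    """A request is done once it reaches a terminal state."""
    return state in TERMINAL_STATES

def on_webhook(payload: dict) -> str:
    """React to a state-change notification instead of polling."""
    req_id = payload.get("id", "?")
    state = payload.get("state", "unknown")
    verb = "finished" if is_finished(state) else "progressing"
    return f"request {req_id} {verb}: {state}"

print(on_webhook({"id": "abc123", "state": "running"}))
# → request abc123 progressing: running
print(on_webhook({"id": "abc123", "state": "complete"}))
# → request abc123 finished: complete
```

That is the whole benefit of the webhook: the CI system reacts to these transitions instead of polling our API in a loop.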
So here, otherwise, we show only the failed stuff, and then you can look up in the log exactly what failed; what you would expect from a testing system: show me the failures, give me a reproducer, and you sort out the details. When something errors out, we try to be reasonable; we are not always reasonable. Maybe one day we will feed this into ChatGPT and it will explain to you what is wrong, or we will fix the code so it's more reasonable. Here we pointed Testing Farm at a repo with no TMT metadata, so it looks like this: it didn't find any plans, and it tries to give you a hint: run this command in your repository and you will most probably see what is not right there. There is also some context, something that is passed from the CI systems and used in the adjust rules. This context is not something we auto-detect; it passes us details about what architecture is being tested, the distribution, what the trigger of this commit is, and then you can adjust your tests according to this context. So that is the selection stuff. Okay, let's move on. With TMT, when you want to try something on your localhost, you don't need to care about Testing Farm at all: install TMT, clone a repo, or use the tmt init command to play around with it. TMT is really for these local use cases, local stuff. We also now have a testing-farm CLI tool, which will be used for onboarding and interacting with Testing Farm if you have a token, and we will be blending TMT and Testing Farm somehow in the future, we will see. As I said, Testing Farm already uses TMT, and eventually TMT will use Testing Farm. If you are an automated system, you are most likely using Testing Farm to run testing at scale: Packit uses Testing Farm to run tests against Copr builds, or without the Copr build installation, just to run tests. Fedora CI has a Jenkins instance; I actually have some details about this here, but I will just move on. And yeah, if you
are a user, just use TMT, and once you get to the point where you want to run this in CI, you will somehow interact with Testing Farm. You may not even need to know that it's Testing Farm, because you will use one of the CI systems we integrate with; sometimes you don't even realize you are using Testing Farm until you see the results viewer. So what features are there now in Testing Farm? These infrastructure-agnostic hardware requirements: that seems easy, but we have this Beaker system inside Red Hat with a super variety of bare metal machines that have different network cards, different CPUs, and we want to get to that level of detail. The public side is all fun, but in reality, for RHEL, we have a system providing a super wide inventory of bare metal machines, so it gets funny. We are trying to expose this in an infrastructure-agnostic way, so you can define it and say, for example, I want only this CPU; maybe it's available on AWS, so we will choose AWS for you. We can run multiple environments, as I mentioned, with parallelization up to 5. The reproducer steps I showed you. We have a testing-farm CLI tool, which I have for the demo; we can now request testing via it, restart some tests if you have the token, of course, and run an arbitrary command via Testing Farm. Somebody asked me: is SELinux enabled on CentOS Stream in Testing Farm? And I told him: what should it be, if not enabled? Of course it is. But with this run command, anybody can just run testing-farm run on CentOS Stream 9 with sestatus and see for himself; answering this simple question should be possible. We also now have a reserve command, so you can reserve a machine as a community member: if you onboard to Testing Farm and give me your public IP, I will make it available to you; later on that will not be needed, of course, we will add it automatically, but for now it is like that. That is what I am going to be demoing today, and what I am really glad about:
you can reserve a machine according to hardware requirements and have fun on it. We are not a toy service like some pet project: we have SLAs and SLOs, we do the DevOps stuff, the monitoring stuff. Our SLOs: error rate less than 5%, API uptime more than 99%, and queue time, that's the new metric, under one hour, though in public we are now actually under a few minutes; we have good infrastructure. Just to show I am not lying, this is actually an internal dashboard, I am sorry for that, we don't have it in public yet. This is where we track the SLOs, together with other teams who run services inside Red Hat, and we will definitely open source it somehow, because the metrics stack is actually open source; it's nice Grafana dashboards. Then we have the webhook notifications I mentioned, we can do secret variables in test execution, and we can integrate with other instances that deal with results: actually, ReportPortal is coming to Fedora, that is a system for storing results, and we will integrate with it, so you can have a history of testing results; Testing Farm itself really has just this simple view. Variables in the TMT environment, I think I already covered that, filtering plans; the queue time is missing here. So this is for the last 28 days, our error rate. Maybe I should slowly start worrying about error budget mode, because if we reach 5% we drop into error budget mode: we drop all work and just go fix the stuff. But the API uptime, I think that's even unreal, but it works; we have metrics and it's not just made-up data, and I was surprised: it seems we have been very stable in the last 28 days, at least for the API uptime. Queue time, currently, on the public ranch, is 14.2 seconds; that's how long it takes until a Testing Farm request goes from queued to running. Here we need to spare money, so it's a little slower; we are still not doing the scaling as well as we should. Okay, 5 minutes, and then I will do the demo and questions. Just the scale I promised you: we are now running 700,000 test requests a year, so it's
like, there is a stats page for Testing Farm, we can look it up, and it's there. It also shows how we were growing from the time we got to production. Will it load? Yeah: in 2020 we started with 56,000 requests, and we are running maybe 700,000 this year, slowly but surely, as more people onboard. The main use cases of our service: basically a public service using our API to run tests publicly, and then the Red Hat systems which run tests whose results are available only internally. That part is quite locked down, but basically with Testing Farm, Red Hat teams can validate their merge requests on GitLab against unreleased RHEL or any other unreleased products. I think that's cool, because it was very hard to do before. It's locked down and it needs to be very safe, because it's a weird scenario in those terms, but it makes sure Red Hat employees can validate their products very early; shifting left as much as possible is a big selling point for Red Hat employees, I believe. Internally, Fedora CI, Packit, then we have some Zuul integration, GitHub Actions, and the CLI tool which you can use to interact with the service. Only if you are using the CLI or the HTTP API directly do you need a token; with GitHub Actions you also need a token, but with Fedora CI, Packit and Zuul you don't even need one, because those services take care of it, you just drop some files into your git repos. Yeah, not much time left, I will do this very quickly. Packit itself is a GitHub application that can test your pull request: it can create a Copr build, we can validate that the Copr build installs on the system, and then we can run tests against it, easy as that, and report back to the pull request. I will just show a few links here, because we don't have time; I was not as quick as I wanted today, but maybe that's good. Here is Packit running, and you can see that it's testing a lot of versions, I think we see CentOS Stream 8 and 9 and Fedora, and it's running
tests against that. It's actually directly TMT; of course we are dogfooding as hell. Then Packit can also test without the Copr build: you can skip it, just drop some files into the GitHub repo, enable Packit, and run tests against that GitHub repo. So Packit is the easiest way to run tests via Testing Farm: enable Packit, add TMT metadata to the git repo, no Testing Farm token needed. GitLab is a little bit harder to set up, but Packit can do that too. You can configure a number of jobs in Packit: Packit can run multiple test jobs, and those test jobs multiple environments and multiple plans, so it can get large, but it's possible. You can have multiple jobs configured differently; it has very good configurability, because Packit implemented a special field through which they can patch our API, so you can do a lot of stuff with that, and you can even run tests against internal infrastructure with Packit, so if you are a Red Hat employee, it's really easy. No secret support is the only limitation; otherwise I would say Packit is perfect, and that's all for GitHub. But no secrets: there is a problem with sharing the GitHub secrets with Packit, so we need to sort it out, most probably with HashiCorp Vault, but it needs more time. Fedora CI has a Jenkins instance that reacts to Koji builds and runs tests; it uses the webhook mechanism in Jenkins, and they report the results. If you go to Bodhi, you will see some automated checks; everything Fedora CI starts is run via us. Fedora CI also runs some generic tests via us: installability, rpminspect, rpmdeplint, and they even run in containers, because not every test needs a VM; if you don't need one, we can run against containers. Zuul is the next CI system, for testing dist-git PRs on Pagure, and we actually have it in two flavors: for CentOS Stream dist-git contributions, but also Fedora dist-git contributions. There we integrate with Zuul via a playbook, and you can integrate it wherever you want, and
Zuul is providing the results; maybe you know Zuul, check it out, very easy onboarding, but this is not as configurable as it should be. And I will go to the demo. Then we have a GitHub Actions workflow; you can read what the benefits are, it's a good fit if you don't need secrets. And if you really need to integrate by calling the API directly, you can use our HTTP API, or you can use the testing-farm CLI tool; we also have a small image you can use. In the future we will have multi-host tests, easy onboarding via Fedora and Red Hat single sign-on, and more infrastructure, more infrastructure, because we are hungry. And now, demo time. This is the newest thing I wanted to show you. It's available: if you are willing to share your public IP with Testing Farm, I can make it available to you and it will just work, but it will take us a little more time to share the public IP automatically, so that you don't need to care. If you want early onboarding, let me know; you can find the contacts on the last slide. So, what do I want to reserve? CentOS Stream? Oh, okay, I want just Fedora Rawhide, I will just click it. Fedora Rawhide, Intel architecture, reserved for 30 minutes. And now I am open for your questions; in these 3 minutes I will get root and can do whatever I want. So that's my demo: reservations on Testing Farm. If you have a token, you can reserve a machine and do an investigation. This gets you the exact same environment parameters, to make it the same as your Testing Farm request was; maybe we will adjust it so that it reads them out automatically. This is very cool, I am really happy about it. I am ready for questions. Questions? Was it understandable? Yes, so the question is whether I know how many people use this via Packit. I think Packit has a nice slide with those large components; I can query this, but it will take a little bit of time, so after the talk. We are testing hundreds of
Fedora packages, 900 was the last number I checked, but for Packit I would need to check; is somebody from Packit here? They didn't come, so we will need to find out. We don't have good statistics for this, and we should have, but I can query the database after the talk, no problem. Sorry, no problem. So, the question is about specific hardware requirements: because we own the infrastructure, I own the AWS account, I know Beaker; if you have a Beaker account or an AWS account, I think it should be possible. We currently do only AWS, but it works internally, so if we have the accounts in Testing Farm, it is possible to provision these other clouds. We have some Azure support; others are not implemented, so if you need, for example, GCP, it needs to be implemented. You can contribute, our code is open, or you can file an issue with us and we will try to get to it; we are driven by user requests, so if more users ask for GCP, we will implement GCP. We are also planning to make it possible for you to somehow give us your credentials, and then we can use your credentials, because cloud costs land on our accounts, if that is a problem. Internally we have a cloud cost dashboard; I can tell your team how many dollars it spent on AWS, if that works for you. But I am free to speak after the talk and we can sort out the details. Any other questions? Was it digestible, or not? That's an improvement: last time they said slow down the recording if you cannot understand. It was better, I promise. It will get through, just let's wait a few seconds. How much?
I am over time by 2 seconds, 3 seconds. Now it's actually preparing the environment; we hacked it a little bit so it shows the status. We saw that it was running the setup, and now it's preparing the environment; at the end it should log in. It takes this long because our guest setup playbooks are slow and they update the whole system, so it may take a while, but it works. I wanted a plain machine, but I could have provisioned one with 32 gigabytes of RAM and more disk space. This works, and if you want it, it's available; we actually have this tool on PyPI. If you have a token you can already use it; just let me know where we can meet, because I will need your public IP and we will give you access, since you need to somehow reach these AWS resources. We cannot just open it for everybody, because somebody in a test sets the password to fedora, then a bot comes in and does something, so we need to make sure this is safe. Inside Red Hat there is no problem, because it's an internal network, we are already on VPN. But if you want this, it's available: catch me. Thank you for your time.