But now it's time for cloud to change its face: it's no longer only a business enabler, it is going to be a humanity enabler, and I have a strong reason for saying that. There are so many things happening in the technology area, in the cloud area. We all know 5G is also on the plate, and cloud in collaboration with 5G is creating a strong platform that is going to be helpful, given what we saw over the past two years due to corona. In the healthcare sector, imagine: we used to see telephone booths at the corner of the street some time back, right? In the same manner there will be medical booths. If anyone is feeling any problem, any medical condition, they just go over there, plug in their hand or follow whatever the procedure is, and based on the analysis the respective shot, the respective drugs, will be injected immediately, without a doctor, in a perfect manner. Imagine that kind of thing; obviously that kind of thing is coming. So that's where cloud is doing a strong collaboration with 5G. And we cannot forget the cyber security area: the more we are exposed to the internet, the more we are exposed to the world, and there are so many threats, known and unknown, increasing every day.
So of course in 2023 a major focus area is going to be cyber security, and integration is another aspect which needs to be covered. We know what we used to call legacy systems; I think legacy system integration is not going to be as challenging anymore, because integration layers, integration services, and integration platforms are coming into the picture as a service. We have Rubik and other services that are going to change the whole integration-challenge paradigm and make it a plug-and-play kind of thing, so legacy will no longer be a big challenge for anyone and everyone. Use cases for Kubernetes are going to increase, and of course skill matters: a skilled person, a skilled team, is needed to be successful with Kubernetes and to actually leverage its full capability, so in our organizations the existing team skills, resources, and capabilities are going to change. And we cannot forget DevOps.
DevOps is the parallel stream, and in DevOps so far we just use CI/CD, continuous integration and continuous deployment, but we haven't thought about extending its capability further. So imagine, going forward, what the possible use cases for DevOps could be. In DevOps we have CI/CD automation, but not complete automation: humans still validate the test cases for business-critical services, because we think we cannot yet rely on complete automation. Considering that situation, automation will be integrated with AI. So far there are very limited use cases of AI, because there is a fear factor involved; we cannot trust AI 100%, right? But going forward the AI models are going to mature, and a few of those use cases will get implemented in 2023 itself. We will adopt them very easily and they will become part of our day-to-day life. We have seen IoT and so many other things come into our day-to-day life; some of the use cases we don't even know we are already using. This is how technology is improving and becoming part of our daily life. That's why we don't talk about five-year plans anymore. I remember in my organization, some time back, when we talked about long-term plans we used to plan for five years. Now there's no five-year plan; let's talk about now, about this year only, because for sure whatever you plan for five years will not be relevant even in your second or third year. So there are so many things which are going to change our life. On that note, enjoy your learning today. With that, I invite Mr.
Ahmed Safanaj to talk about CI/CD; please give him a big round of applause. Good morning, Singapore! I'm going to talk about how you can level up your CI/CD workflows using GitHub Actions. I prepared this session to be beginner friendly. I don't know how many of you are students, so can you raise your hands if you are a student? Okay, we have a couple of students, fine. Before I come to the session, let me do a quick self-introduction. My name is Safanaj, I come all the way from Sri Lanka, and currently I'm working as a software engineer at Cisco Labs; a small advertisement here. These are some other communities that I am part of, so if you want to get connected with me or if you have any queries, you can reach out on any of my social handles. That's all about me. The agenda is simple: we're going to talk about what CI/CD is, since this is going to be beginner friendly, why we need to use CI/CD, what GitHub Actions actually is, the benefits of using GitHub Actions, and finally a demo if time allows. Let me remind you of the hashtags for today's event: if you are posting something on social media or sharing feedback about my talk, make sure to use these hashtags and my handles. Now let me tell you the story of a typical developer. As typical developers, we used to develop applications, do some manual testing locally on our PCs, and then deploy, and then we used to pray for the deployment to be successful. We would again do some testing in that environment, and if it didn't work we had to go back to the develop stage; if it worked, we had to pray even more to keep the application running successfully. This is the saddest life cycle of a typical developer, where CI/CD never existed, and that's where DevOps came into the picture. Back in the 2000s, departments like developers, QA, the IT team, and security each worked individually, like separate machines, to achieve something in the IT industry, right? So
basically developers used to code and QA would be testing; you guys know the stuff, right? So, CI/CD: can someone tell me what CI/CD stands for? Specifically the students. Yes, continuous deployment or continuous delivery; that's correct. I will give out some GitHub stickers, make sure to find me. CI/CD stands for continuous integration and continuous delivery or deployment. Let me start with continuous integration. It is a phase and a practice in the software development cycle where developers work together on a single project and regularly merge code. Here we run some basic checks such as unit tests and integration tests, and maybe some code quality checks, within a short-lived environment. What does a short-lived environment mean? You just spin up an instance when you need it, so it doesn't need to run all the time. You can set up your repository so that every time someone pushes code, the CI triggers and runs, and if any build or test fails it notifies the relevant team, so you can have a look at what happened in the CI pipeline. Typically the CI pipeline involves the following tasks: it detects changes in the source code; it builds and generates the artifacts that we need to deploy; it performs basic tests such as unit tests and integration tests; and it produces a status report, so if you have a coverage checker, there will be a report of your coverage percentage. If the build or test steps fail, as I said earlier, it will notify the relevant team so they can have a look and fix the problem. There are some benefits of using CI. The first one is developer productivity: you don't need to build the application manually, which is time-consuming, so you save some time, and you can find bugs and address them earlier, before they reach production and grow into big ones
later. And you can deliver a bit faster: you just push the code, the CI runs, and you don't need to spend time on manual work. Now let's talk about what CD is. CD is a software engineering approach where code changes are automatically deployed into a configurable environment, which could be a testing or production environment. In practice, in simple terms, a developer's code changes can go live within minutes of writing them. If we discuss the benefits of continuous delivery: it streamlines the software release process, so you don't need to build the artifacts and carry them to the server or whatever platform you prefer; again, developer productivity is high; you can find and address bugs earlier; and you can deliver updates fast. Typically a CI/CD pipeline contains two phases: CI, where you build, test, and generate the artifacts for your pipeline, and CD, where you grab the artifacts and deploy them to a server or similar. Now let's talk about GitHub Actions, which takes your workflow from idea to production. We saw those eight phases in the DevOps cycle, right? You can automate all eight phases using GitHub Actions, so you don't need to spend time on manual work in any phase. When it comes to achieving a DevOps culture in your company or your projects, automation is a crucial component, and GitHub Actions allows us to automate, customize, and execute software development workflows right from your repo. You can build, test, and deploy from your repository: you don't need to spin up a server, you don't need to connect your repository with webhooks; you just have the configuration, meaning the workflow file, in your repository, and it's as simple as that, it will work. So why do we need GitHub Actions? The first point: it supports multiple operating systems, including Linux, Windows, and macOS, plus containers. Containers are something we need to
highlight here: if you are coming from a Docker background, you don't need to go with a VM, you can simply run your CI/CD inside containers. Matrix builds mean that if you want to test your application on multiple operating systems, such as Linux and Windows, in parallel, GitHub Actions allows you to do that, so you can save more time. Any language: it supports most programming languages, such as JavaScript and, as you can see on the slide, Go and others. Live logs: right after your workflow triggers, you can see logs of what's happening in your CI/CD pipeline, so if something goes wrong you can look at the logs and fix it early. The next point is the built-in secret store. As a good developer, it is a best practice not to hard-code secrets and tokens in your source code or pipeline, and GitHub has its own secret store where you can have organization-level or repository-level secrets, so you don't need to hard-code anything. Multi-container testing: again, if you want to run CI/CD against your containers, you can have multiple containers, and it makes that easier. And again, it's simple: you don't need to be a super-duper expert on DevOps. If you are a developer who knows the basics of how to run a CI and how to generate artifacts, that's it; you don't need wider expertise to be good at GitHub Actions. There are five main components in GitHub Actions: workflows, events, jobs (made up of steps), actions, and runners. Let's discuss them one by one. Altogether we describe them in a workflow file, which needs to be written in YAML. A workflow is simply a configurable automated process that will run one or more jobs in your repository, and it needs to be defined in YAML ("YAML Ain't Markup Language"). A single repository can have multiple workflows: you can have a workflow to run the CI, another workflow to do some testing, one more workflow, any
number of workflows in your repository to perform different tasks. They need to be defined in the directory `.github/workflows`, otherwise they won't work; there are some rules for that. A workflow should contain one or more events that can trigger it, since it needs to be triggered somehow; it can have one or more jobs that execute on runner machines; and in each step you can either run a script or use action templates. This is how a typical workflow file looks: a simple Node CI pipeline. Here you can see the workflow listens for the push event on the master branch and for pull requests against the same master branch, so when someone pushes directly to master or opens a pull request against master, the workflow starts. You can also see variables defined there, which you can reuse later. And you don't need to always write your own actions using shell scripts; you can use templates. This is the checkout template: once your CI/CD runs on an Ubuntu runner, you need the repository source code on that machine, right? So you just use the checkout action from the actions repository. Then we define the Node.js version and so on. The next step is a shell script: once the machine has Node.js installed, you can run these commands. That is it; this is how a simple CI workflow looks. Additionally, we have one more step where we upload the generated artifacts to a particular location. And this is a typical CD workflow: the step above builds the artifacts, and the deploy steps download the artifacts from the directory we uploaded to, and here we deploy to AWS. We have secrets and so on here; these are not hard-coded, they come from the GitHub Actions built-in secret store. Events: there are hundreds of events that can occur in a repository, these are some of them, and there's a link here where you can see more.
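Putting the pieces just described together, a workflow file along these lines might look as follows. This is a minimal sketch, not the exact slide from the talk; the file name, Node version, and artifact path are illustrative assumptions.

```yaml
# .github/workflows/ci.yml — minimal Node CI sketch (names are illustrative)
name: Node CI
on:
  push:
    branches: [master]        # trigger on direct pushes to master
  pull_request:
    branches: [master]        # ...and on pull requests against master
jobs:
  build:
    runs-on: ubuntu-latest    # short-lived runner spun up per run
    steps:
      - uses: actions/checkout@v3       # get the repo source onto the runner
      - uses: actions/setup-node@v3
        with:
          node-version: 18
      - run: npm ci                     # shell-script step
      - run: npm test
      - uses: actions/upload-artifact@v3
        with:
          name: build-output
          path: dist/                   # upload generated artifacts
```

A CD workflow would then download `build-output` in a separate deploy job and push it to the target environment, with credentials pulled from the secret store rather than hard-coded.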
At that link you can see all the events that can trigger workflows in a GitHub repository: you can have a workflow that triggers when someone pushes or forks, or you can even run cron jobs. There are two types of event configuration. One is a single event: that workflow is only triggered when, say, someone opens an issue in the repository; otherwise it won't run. And you can have multi-event workflows: that workflow runs when someone pushes or when someone forks the repository. Jobs are basically a combination of steps, like shell scripts, that you can reuse later, so you don't always need to rewrite the same shell script; it's as simple as reusing a piece of code. And actions: as I said, you can use actions from different places, such as the marketplace; I will show you how it looks in the marketplace. It's as simple as: an event triggers the workflow, and the workflow runs some actions, so you can automate your workflows. Here you can see some very famous templates, such as deploying to Azure Web Apps and publishing Docker containers, so you don't need to write everything from scratch. If you want to publish a Docker container right after you merge a pull request, you can configure this template, which is available in the GitHub Marketplace; that's it. We have more than 11,000 GitHub Actions templates available in the marketplace, so you can have a look. Now, how can you use GitHub Actions? I will tell you some interesting stuff. First, as we discussed, you can create CI/CD pipelines. Second, you can auto-label your pull requests: you can have some configuration for which label gets added if the PR contains certain keywords. And you can generate Lighthouse reports. How many of you are frontend developers? Only one? Okay, sad. If you are coming from a frontend development background,
you have to check your site scores using Lighthouse, which is available in Chrome. Every time, we used to generate the report manually, which takes a lot of time, at least five to seven minutes. You can configure that in GitHub Actions, so right after you push the code the Lighthouse report is generated for you and you can check the score there. You can have a code quality checker: if you are working in a strict company like mine, there will be a code quality checker, so you can have SonarQube configured there, and if your unit tests and code quality are above the threshold your PR passes, otherwise it fails. As we saw, you can build containers and publish Docker images, so you don't need to remember difficult Docker commands or do it manually: you just push the code and the Docker image is published to your configured environment. Security analysis is an inbuilt feature in GitHub: if you have hard-coded any secrets or have any vulnerabilities in your source code, it will flag them; you can even have an action for that. Automated infrastructure creation: with Terraform, you can create infrastructure using GitHub Actions. Unit testing and linting: if you are coming from a JavaScript background you know about linting, right? You can lint your code automatically. You can deploy applications to any cloud platform; there are more than 11,000 GitHub Actions covering all the cloud vendors, so you don't need to write everything yourself, you just use a template and go. You can send messages or notifications to people: you can configure templates so that if the CI passes, notify these people, and if the CI fails, notify this other set of people. And finally, you can even order pizzas using GitHub Actions. I have seen a company where, every sprint, if the release is successful, the
GitHub Actions will automatically order a number of pizzas for the team; that's the condition they configured. Now let's discuss pricing. There is pricing based on the OS and the number of minutes you run. At GitHub they love open source a lot, so if your project is a public one, GitHub Actions is free for you; you can use any number of minutes for free. I have a demo, but I think I'm running out of time, so let me show you the repo where you can see the workflows. This is the demo I had: a simple React application that I need to deploy to Firebase. When someone merges code, it needs to be deployed to the live environment; here you can have a look at how to do the same. This is the live-channel workflow, and the second one deploys previews: I have configured it so that if someone makes a pull request to the master branch, then, as a frontend developer, I don't need to check out that particular branch to review the changes; it deploys to a preview channel in Firebase Hosting. So when someone makes a pull request to master, there will be a channel for it, and once it deploys there will be a URL generated in the PR description, so I can have a look at the UI changes there. That's the demo; sorry for not building the demo from scratch, I'm running out of time. A few takeaways before we go: if you are a student, please go for GSoC and create your own GitHub Actions. If you are from a Linux background, you can create your own GitHub Actions and publish them on the GitHub Marketplace so someone can make use of them. Go open source, and keep contributing to the community. If you have any questions, now is the time. Yeah, we have time for a couple of questions if anybody has some. Thank you for your talk. Yes, please go ahead. It depends on the OS and the size of the application: if you are going with a free plan, there's a limit on the CPU power, right?
So based on the free plan and your application it will take some time. What language, Python or something? Yeah, for Python, with the free plan, you can probably get it done within four minutes. All right, any more questions? Yes? And yes, we can share those slides. Also, if you have your own VM running on AWS or Azure, you don't need to pay GitHub: you can configure the stuff to run on that VM, so the CI/CD pipeline runs on your own machine. Sorry? Yes, it's scriptable, and yeah, we need to install the dependencies, otherwise the code will break, right? If someone has vulnerable dependencies, the CI will fail, so that's why we always need to do an npm install in the CI pipeline. Any more questions? Thanks. Thanks a lot again, Ahmed. I think these guys need the applause; I'm nothing yet, all I've done is turn up. Great, thanks very much. I'm Paul Brebner, an open source technology evangelist; I work for Instaclustr in Canberra. We've recently been acquired by NetApp, by their cloud part, which is called Spot, which does stuff with spot instances on cloud providers. The inspiration for this talk came from where I live, which is Canberra. It's called the bush capital of Australia; it's very unlike Singapore, very low density, and I live right next to the bush. About a year and a half ago, one of Google's subsidiaries started doing drone delivery trials over the top of our house, which initially I thought was cool: we could get drone deliveries like our neighbours, who were getting, I don't know, donuts, coffee and stuff at all hours of the day and night, with these drones buzzing over our property. So I downloaded the app and put in our address, and we've got too many power lines and trees and things, which is really annoying. So we had to put up with the buzzing drones while our neighbours got the, well, not free, but exciting drone deliveries. At that point I
thought, okay, I can come up with a more interesting story around drone deliveries for a demo application; that's more or less the origin of this story. I don't think it would work in Singapore; the density of living here might be a challenge. Instaclustr provides a managed platform for big data open source technologies. We've got a whole bunch of technologies around storage, streaming, analysis, and search, and the latest one, which I'm talking about today, is orchestration: a workflow orchestration technology, Uber's open source Cadence. It fits in really nicely with the previous talk around workflows as well, though this is focused a bit more at the enterprise level, high-scalability workflows. Workflow orchestration is basically pretty simple: it's just about task ordering. You want to start something, perform a task, perform another task, and then end; it's relatively straightforward at that level. So, doing that with Cadence. Pedalling at a high cadence, in the bicycle world, is called spinning; mashing or grinding is slow and bad, so you want to spin your workflows at a high cadence. Cadence is fault tolerant: workflows can and will fail, and at that point you want to be able to recover from them in a sensible way. Cadence is horizontally scalable: essentially the number of workflows you can run at the same time is unlimited. And Cadence workflows are just code; that's one of the nice things about Cadence. Here's an example of a couple of tasks being performed in sequence: a workflow implementation of two tasks executed sequentially. In terms of the architecture, it's a bit complicated, which is why we provide it as a managed service. At the back end there are actually some databases; Cassandra is the one we prefer, though Postgres works as well. If you want to be able to do what they call
enhanced visibility, so you can see what's going on, you also need Kafka and OpenSearch, but they're optional and I won't talk about them in this talk. So Instaclustr provides the managed Cadence part of the system and the databases. You, as a developer, have to write the workflow code and run some workers as well, and that runs on the client side, so you need something like EC2 instances running on Amazon to actually get that to work; that just connects via APIs to the managed Cadence. The workers run the actual workflow logic; Java and Go clients are currently the two supported client types. Looking under the hood a bit: how does Cadence fault tolerance work? It's essentially using event sourcing, which has been around for a while but is a continued hot topic. This basically means there's a history mechanism and replaying of the history. As the workflow executes, the history is written to the database (that's where the database side of the story comes in; that's the persistence mechanism), and then you get workflow state recovery by replaying the history. If there's a failure at some point between task 1 and task 2 in this example, the failure causes the complete history to replay, and the replay recreates the workflow state and the restart point to continue the workflow from, so it all continues as if nothing bad had happened. Cadence activities are quite central to Cadence. Things can fail: one thing Cadence is used heavily for is calling external systems, and those remote calls can fail. So what you do is wrap them in activities; they can contain any arbitrary code you like, and activities are executed at most once and can be automatically retried on failure. There are a couple of restrictions.
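The replay idea just described can be sketched in a few lines. This is a toy illustration, not the actual Cadence implementation; all names here are made up, and the "database" is just a Python list standing in for the persisted history.

```python
# Toy sketch of Cadence-style fault tolerance via event sourcing.
# Real Cadence persists history in Cassandra and replays it in the worker.

history = []  # persisted event log (stands in for the database)

def run_workflow(tasks, history):
    """Execute tasks in order, replaying any results already recorded."""
    results = []
    for i, task in enumerate(tasks):
        if i < len(history):            # replay: reuse the recorded result,
            results.append(history[i])  # do NOT re-run the activity
        else:
            result = task()             # first execution: run and record
            history.append(result)
            results.append(result)
    return results

calls = []
def task1():
    calls.append("task1")
    return "artifact-1"

def task2():
    calls.append("task2")
    return "artifact-2"

# First run "crashes" after task1: only task1's result reaches the history.
run_workflow([task1], history)
# Recovery: replay rebuilds state from history, then continues with task2.
out = run_workflow([task1, task2], history)
print(out)    # ['artifact-1', 'artifact-2']
print(calls)  # ['task1', 'task2'] — task1 was not re-executed on replay
```

The key property this illustrates is the one the talk states next: replaying must not produce new history events, which is why a completed activity's result is fixed and reused rather than recomputed.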
Around activities: replaying must not produce new history events, and activities are executed at most once; this ensures that once they succeed, the result is basically fixed, or immutable. The other restriction is around workflows, the actual top level of the workflow itself: replaying, again, must not produce new history events. Due to replaying, workflow code is executed multiple times, and that's by design, so it must be deterministic. You actually have to use special built-in methods for things like time, sleep, random number generation, and anything that has side effects; otherwise something will blow up at that point. This example here shows the use of the inbuilt time function. So what are some good use cases for Cadence? Essentially, when you've got a very large number of running workflows. Things that are long-running processes with sleeps and scheduling are also good candidates. Anything that requires stateful, fault-tolerant application execution; complex workflows are a good example; integration with unreliable external systems; integration with streaming and event-based systems, for example Kafka. The other constraint, I guess, is that it works best for workflows with hundreds of steps rather than very long workflows with very large numbers of steps, because when the history is replayed on failure, a long history can take a long time to replay; that's perhaps the one case where you don't want to be using Cadence. What are some good example domains? Finance, retail, and, as I mentioned at the start, delivery; hence spinning your drones. A drone delivery demo application is what I came up with. Drone delivery is sort of complicated: it's got a lot of steps. You have some drones starting at a base, ready to go; then an order is put into the system by a customer, who wants some donuts delivered or something; so the
drone has to get that request, go and physically pick up the order, fly to the location where the delivery is to be made, deliver the order, then fly back to base, get recharged, and repeat, ready for the next sequence of steps. So it's reasonably complicated. The actual flight for the drone is quite complicated too. Here's an example of a drone delivery flight on an imaginary desert island: basically there's a drone base, and the drone has to fly to the order location, fly to the delivery location, and then fly back to the base again. That all requires fairly complex calculations; remember, this is all simulated, I'm not actually flying drones, but I've done everything that would be required for real drone flight. The drone flight path is computed inside an activity, and that's where all the complex numerical calculation happens, because it can fail, and you want to know where your drone is at all times. It uses location, distance, bearing, speed, and remaining charge, and it does this every 10 seconds, so it moves more or less in real time. If one of the activities fails, the drone won't just suddenly crash and fall out of the sky; it'll continue flying from its last known location. Now, having a bit more of a look at the overall system, I'm using some swim-lane diagrams here to explain the different parts of the system and the steps involved. There are actually two Cadence workflows; this is the first one, the drone workflow, and it basically does the stateful sequencing of the drone: starting, being ready, waiting for an order, doing all the movement activities, recharging, and repeating again. That's relatively straightforward. There's actually a second workflow as well: orders themselves are stateful, so I've turned those into a Cadence workflow too.
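The per-tick flight calculation described a moment ago (location, distance, bearing, speed, remaining charge, updated every 10 seconds) can be sketched in a much-simplified, flat-plane form. Real drone flight would use great-circle distance and bearing over latitude/longitude; all constants and names here are illustrative.

```python
import math

# Flat-plane simplification of the per-tick drone movement: every tick
# (10 s in the talk) the drone moves toward a target at a fixed speed
# and spends charge. All constants are illustrative.

def step(pos, target, speed, charge, dt=10.0, charge_per_sec=0.01):
    """Advance the drone one tick; return (new_pos, new_charge, arrived)."""
    dx, dy = target[0] - pos[0], target[1] - pos[1]
    dist = math.hypot(dx, dy)
    bearing = math.degrees(math.atan2(dx, dy)) % 360  # could be reported
    travel = speed * dt
    if travel >= dist:                      # close enough: arrive this tick
        return target, charge - dt * charge_per_sec, True
    frac = travel / dist
    new_pos = (pos[0] + dx * frac, pos[1] + dy * frac)
    return new_pos, charge - dt * charge_per_sec, False

pos, charge, arrived = (0.0, 0.0), 1.0, False
while not arrived and charge > 0:
    pos, charge, arrived = step(pos, (0.0, 100.0), speed=3.0, charge=charge)
print(pos, round(charge, 2), arrived)  # (0.0, 100.0) 0.6 True
```

Because each tick starts from the last recorded position, a failed tick can simply be retried from the last known location, which is exactly the "drone keeps flying" behaviour the activity retry gives you.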
For the orders: the order starts, generates the location it needs to be picked up from, gets ready for the drone to come and pick it up, updates the location continuously based on the drone's location, and then, if it's delivered correctly, it ends successfully. Because we've got two workflows, we need some sort of coordination between them, and this is an example using one of the primary mechanisms in Cadence, which is signalling: it provides a built-in mechanism for asynchronous signalling between workflows, which works really well when you've got multiple workflows that need coordination like this. Okay, just to make things a bit more fun, and perhaps more realistic, and because I like Kafka, I introduced Kafka into the architecture: Cadence meets Kafka. Kafka, for those who may not have come across it (it's pretty common now), is a distributed pub/sub system for stream processing. It allows distributed producers to send messages to distributed consumers via a Kafka cluster, and it uses topics to loosely couple the producers and consumers. I integrated the Cadence architecture with Kafka, and this basically adds a Kafka cluster, three Kafka topics, three Kafka producers, and two Kafka consumers. What you can do, essentially, is start a workflow from Kafka (that's the second star up there): we're actually using Kafka to initiate the Cadence order workflow, and we're also using, essentially, a Kafka microservice to coordinate the drone and order workflows. This is where you can make sure that a drone picks up only one order and that all orders are actually picked up, so that was quite a good use case for Kafka as well. Now, to show more or less how the system works in its entirety, I'm just going to go through a few steps. First of all, the customer places an order with some sort of an app, and the order is sent to Kafka, to the new-orders topic. Step two: the Kafka consumer gets the
The consumer then starts the new order workflow using a Cadence client. Step three: the order is ready for drone pickup; this is an activity that sends an order-ready message to the orders-ready topic. Step four is back in the drone workflow: an activity sends a drone-ready message to the drones-ready topic, and the Kafka microservice gets an order ID and sends a signal back to the drone workflow to start the order pickup. Step five is in the drone workflow: in an activity, it flies to the order location and picks the order up. Step six, also an activity in the drone workflow: it flies to the delivery location, drops the order, and continuously sends location updates to the order workflow, so that the order itself knows what's going on. Customers can potentially get an ETA for the delivery and check where on the map their delivery is, and the shop that supplied the order knows what's going on as well. Step seven: the drone flies back to the base, recharges once it's back, and is ready to start a new drone workflow. In fact the drone workflows are never-ending; they use a particular mechanism in Cadence (continue-as-new) which essentially allows them just to repeat, so they're more or less permanent workflows, whereas the order workflows do actually end. Step eight, back on the order workflow side: it receives the state and location signals from the drone workflow, updates its state, and finally, if the order is successfully delivered, it ends the workflow. And that's it as far as the architecture goes.

One thing I'm really interested in is performance and scalability, and that's one reason you'd use Cadence.
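A compressed sketch of the order-workflow side of step eight, receiving state and location signals and ending once delivered, might look like this. The real thing is a Cadence workflow; this plain-Python class, its signal names, and its states are all invented for illustration:

```python
class OrderWorkflow:
    """Sketch of the order workflow's signal handling: the drone workflow
    sends pickup, location, and delivered signals; the order ends on delivery."""

    def __init__(self, order_id):
        self.order_id = order_id
        self.state = "created"
        self.location = None      # last reported position, feeds the ETA/map view
        self.ended = False        # unlike drone workflows, order workflows end

    def on_signal(self, name, payload=None):
        if self.ended:
            return                              # ignore signals after completion
        if name == "pickup" and self.state == "created":
            self.state = "picked_up"
        elif name == "location" and self.state == "picked_up":
            self.location = payload             # continuous updates from the drone
        elif name == "delivered" and self.state == "picked_up":
            self.state = "delivered"
            self.ended = True                   # workflow ends successfully
```

The guard conditions on each branch are the point: signals can arrive asynchronously, so the workflow only accepts transitions that are legal from its current state.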
You'd need lots of workflows, so I did an experiment to see how many simulated drones we could actually fly using this particular cluster. It's not a particularly big one, and you can scale it out horizontally in any direction you like. We had six virtual CPU cores for the Cadence part of the system, in addition to Cassandra's (you can see Cassandra is actually the part of the system that needs the most resourcing), and on the client-code side, running on some EC2 instances, another eight virtual CPUs, giving a total of 32 virtual cores for that system. With some load testing we managed to get 2,000 drones running concurrently, which also means 2,000 concurrent orders, giving a total of 4,000 workflows running on that system. That's not a particularly big system; you can easily scale up to hundreds of thousands.

If you want to find out a bit more about Cadence and other open source technologies, there's my URL and my blogs; I've written about 100 blogs in the last six years on open source technologies, and this is just the latest in a series of interesting technologies I've had to learn and demonstrate. And with a bit of luck, I've got an example here. This is where I live, in Canberra; I've assumed my backyard is the drone delivery headquarters, the base where all the drones start from. I'm in luck, I'll be able to set it going. The orange icons are where the orders are being picked up from; the moving darker icons are the drones. They go and pick up the orders from the shops, and once they've done that they head off in a different direction to the red icons, which are the delivery locations. I think I've only used 20 drones in this simulation, and it was done using JavaScript in the browser, subscribing to a Kafka topic that had all the drone location data and other information
needed to generate this interactive map. As the drones pick up the orders, the orange icons disappear, they're not interesting anymore, and the drones head off; that one's about to deliver an order to a particular location, and once it's done that, the color of its icon changes and it heads back to the base again. That'll run for another couple of minutes, so I'll take any questions at this point while it's still going in the background.

Thanks a lot for the talk. I'd just like to remind the next speaker, I think Jiang Xu, to start getting your laptop hooked up for the next talk, because there was a little bit of delay this time. Please go ahead, any questions at all?

That must explain everything you need to know about Cadence, which is good. There's a lot more to Cadence, but it's a great tool for developers, so if you're interested in building complex workflows that are scalable, have a look at it; it's all open source and very developer-focused.

I have a question. I know you have Java and Go support; do you have any plans for Python support?

Look, I'm not one of the actual developers of it, but we do work with the development team quite closely, and from memory I think Python is one of the supported languages, though not really officially supported; I'm not sure if there are plans for any other languages at this stage. This is sped up, by the way; it actually takes about half an hour for each delivery, and the drones go at about 100 kilometers an hour. They're really annoying... well, no, I am sort of the drone lord at this point, I've got the base in my backyard. Right, I think that's all from me.

Yep. Next we have Jinxia Lu, a principal engineer with openEuler.

Thanks for your time. Today I will share this topic, and first I'll give you some keywords. The first keyword is Rubik, a new open source project,
and the second is QoS guarantee for containers. Okay, let's continue. Before my sharing I want to introduce openEuler and myself, because I think they're important background. First, openEuler: it's an innovative open source OS that covers all scenarios; that's our vision. It has many innovations in its kernel, and it's an OS for all infrastructure. Currently it is incubated and operated by the OpenAtom Foundation. That's openEuler, an OS. Second, myself: Jinxia Lu, currently the project leader of the Rubik project; my focus is on container infrastructure and Linux. So we have a connection through the OS, and I will try to explain this topic, CPU utilization, from the perspective of the OS.

Why does resource utilization in data centers need to be improved? As you know, our data centers are growing rapidly, with many new data centers built every year. The CPU allocation already sold to customers is very high in the data center, but if you check the monitoring system, you'll find that the actual CPU utilization is low, maybe 10 or 20 percent on average. So in industry we're trying many ways to improve CPU utilization.

Currently, in the container world, we're trying co-located deployment to improve it. Before, we were in a siloed architecture: many different kinds of workloads, AI applications, big data applications, deployed in different resource pools with different software stacks. These years we have Kubernetes, most of our applications are already containerized and can be scheduled by Kubernetes, and we have a unified cloud infrastructure. That's the basis of today's topic. We define different stages for this evolution: the first one, level zero, independent deployment, that's the
siloed stage from the last page. Levels one to four are the further stages of this evolution, and here is what differs between them. Level one we define as shared deployment, and level two is hybrid deployment. The difference: at both levels we're already on the same software stack and share the same resource pools, but at level one there's only one service type in the resource pool, while in hybrid deployment there are multiple service types. At level two we carefully select the different service types that go into the same cluster, but at level three we hope all workloads can be deployed freely; from the OS perspective this is the black-box stage, which means we have to detect what kind of workload each one is, treating it as a black box. Level four we see as full multi-cloud deployment: we have the technology to schedule all kinds of workloads across all clouds to improve overall CPU utilization. Those are the stages.

Based on that, I have listed many capabilities needed from the OS; the highlighted parts are from our operating system: multi-level resource isolation for level one to level two, QoS scheduling, QoS quantification, and interference control for level two to level three, and for level four the capability to normalize computing power. Those are some key functions, and I'll try to introduce a few of them today.

We have this architecture: the upper part is Kubernetes, and openEuler, as a worker node, is in the lower part, built on the Linux kernel as I mentioned before. Rubik runs as a DaemonSet in this Kubernetes cluster and provides the different kinds of capabilities shown here. That's our architecture; you can see that in
this timeline we have finished all of them, and we'll try to build more next year. That's our architecture.

Then, from level one to level two, as I mentioned, we're building multi-level resource isolation. Why is this needed? Once we build hybrid deployment for containers, it must be based on the assumption that our workloads are deployed with multi-level priorities. To simplify the model, today we divide them into two: online services and offline services. Online services are sensitive to QoS and latency, and are maybe short-running; offline services are the opposite. So there are real differences between these services, and we've divided up the different kinds of resources and built isolation capabilities for these workloads. As you can see in this table, we have CPU QoS, memory QoS, cache QoS, network QoS, and so on. We've built these technologies and they're all already open-sourced in openEuler. I won't share all of them today; I'll introduce three of them, our key features.

The first one: as I mentioned, we have online and offline workloads, and the online service needs CPU resources immediately. In the Linux kernel we use the CFS scheduling algorithm, and CFS is fair scheduling, so it cannot be biased between online and offline services. So we added multi-level preemption scheduling to CFS: once an online service wants CPU resources, we can suppress the offline tasks to obtain the CPU for the online service. That's the first feature. The second is multi-level load-balancing scheduling: when an online service is running, we may have many idle CPU cores, and at that time we can load-balance those workloads onto the other idle CPU cores.
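As an aside, the suppression idea behind that first feature can be sketched as a toy priority assignment. This is nothing like the real kernel scheduler (which works inside CFS with preemption, not a one-shot assignment); the function and its two priority labels are invented purely to illustrate "online always wins, offline gets what's left":

```python
def assign_cores(tasks, cores):
    """Toy multi-level priority assignment: online tasks always get cores
    first; offline tasks are suppressed onto whatever cores remain.

    tasks: list of (name, priority) pairs, priority in {"online", "offline"}.
    Returns (running, suppressed) lists of task names."""
    online = [name for name, prio in tasks if prio == "online"]
    offline = [name for name, prio in tasks if prio == "offline"]
    running = online[:cores]                          # online first
    running += offline[:max(0, cores - len(running))]  # offline fills leftovers
    suppressed = [name for name, _ in tasks if name not in running]
    return running, suppressed
```

The key property: an arriving online task displaces an offline one, never another online one, which is the bias that plain fair scheduling cannot express.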
That load balancing decreases the number of task context switches and preemptions. The first and second features are both based on a hybrid of online and offline services; what about a hybrid deployment of online services with online services? Based on the characteristics of online services, they normally apply for ample resources, which means low CPU utilization most of the time. So we built a dynamic core affinity feature to overcommit those CPU resources. Those are the features for level one to level two; now welcome to level three.

At level two, as I said before, the customer selects workloads to deploy in the same cluster carefully. At level three our mission is that the customer can mix workloads freely, as a black box from our OS point of view. So we built a method with offline training to get a model of a specific online workload, and we can use these models for prediction on the live network. For the offline training, we run the online workload in a test environment, with different kinds of metric collection from the Linux kernel and the hardware. We run the online workload under high pressure while at the same time applying different kinds of pressure to it, and we capture all the interference it suffers. After the testing we analyze that interference and the QoS of the online workload, and we get a model. After many, many tests we have different kinds of models for later online prediction. After that, we have this library, as shown here; then, in a hybrid deployment with online and offline services, Rubik, running as a DaemonSet, collects all kinds of metrics in the background, does many kinds of QoS quantification, and analyzes interference for online services all the time.
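A minimal sketch of that online-prediction step, comparing a live QoS metric against an envelope learned offline, might look like this. This is not the actual Rubik implementation; the class, the single-latency-metric model, the threshold factors, and the action names are all invented for illustration:

```python
class InterferenceDetector:
    """Compare a live QoS metric against the envelope learned offline
    for this workload, and pick a mitigation for the offline tenants."""

    def __init__(self, baseline_latency_ms, tolerance=0.2):
        self.baseline = baseline_latency_ms   # from the offline-trained model
        self.tolerance = tolerance            # allowed drift before we react

    def interfered(self, live_latency_ms):
        return live_latency_ms > self.baseline * (1 + self.tolerance)

    def action(self, live_latency_ms, severe_factor=2.0):
        # Mild interference: suppress offline CPU; severe: evict offline
        # services to another node.
        if not self.interfered(live_latency_ms):
            return "none"
        if live_latency_ms > self.baseline * severe_factor:
            return "evict_offline"
        return "suppress_offline_cpu"
```

In reality the "baseline" would be a multi-metric model built from many test runs, but the shape of the decision, detect drift, then suppress or evict the offline tenants, is the same.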
Once any interference is detected, Rubik tries to locate the source; normally it comes from an offline service, and if we locate the interference there, we can control it by evicting those offline services to another node, or by suppressing their CPU resources. That's online prediction.

And level four: remember that at level three we deploy within one cluster, one cloud, but at level four we hope we can use multiple clouds or multiple data centers. Here the customer may face one scenario: he has a pod requesting some number of CPU cores, but he may get different computing capability from different cores, which increases his maintenance costs because he has to maintain different configurations for different cores. Why does this happen? I've listed the possible reasons: different generations of processors, different processor architectures, or the same processor with different configurations, such as hyper-threading settings or different frequencies. So we proposed this idea: can we unify the computing capability of a processor, or of a physical server? In our latest kernel in openEuler we try to normalize the computing capability, and in Rubik we convert this capability into a unified NCU, a normalized computing unit, for easy management of the cluster by the scheduler.

Those are all the key functions we've defined for utilization. For easy use, we recommend Rubik: it's a bridge connecting the Kubernetes cluster and openEuler, and it runs as a DaemonSet in the Kubernetes cluster. On the worker node you can quickly try it: first fetch its YAML, then apply it on the node, and you can enjoy it. For more details, please visit our code repo.

Our first case: we already use these technologies in our internal cloud platform, and in the sample cluster we have already reached 70 percent CPU utilization.
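Going back to the NCU idea for a moment, the normalization arithmetic can be sketched in a few lines. This is illustrative only, not openEuler's or Rubik's actual formula; the function names, the per-node benchmark score, and the reference score of 100 are all assumptions:

```python
def to_ncu(cores, node_score, reference_score=100.0):
    """Convert raw cores on one node into normalized computing units (NCU),
    given a per-node benchmark score relative to a reference machine."""
    return cores * node_score / reference_score

def cores_for_ncu(ncu, node_score, reference_score=100.0):
    """Raw cores needed on a given node to deliver the requested NCU."""
    return ncu * reference_score / node_score
```

The point is that a pod requesting, say, 8 NCU gets the same effective computing capability everywhere: a scheduler would hand it 8 cores on a reference-speed node but 10 cores on a node that benchmarks at 80, so the customer keeps one configuration instead of one per processor generation.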
The internal cloud platform was case one. Case two: our community friend Sina Weibo already uses our features in their live network in production, and they've reached more than 60 percent CPU utilization. Those are our cases. What's more, we recommend you use Volcano and Karmada alongside this: for CPU utilization we're talking about the whole data center or multi-cloud, but openEuler and Rubik operate on a single node, so if you have cluster-level or multi-cloud requirements, we recommend Volcano and Karmada. And welcome to join us; we're moving to cloud native, and iSulad in openEuler is ready, a lightweight container engine we develop in openEuler. Welcome to engage with us. Thank you, that's all for today.

Yep, this is a good question. The first thing: we're still developing this feature, maybe for late this year or next year. What we're trying to do, and actually we're only focusing on CPU resources now, is this: first we run micro-benchmarks on Linux to test the capability of the CPU, and we get a score for that CPU; then we unify it in Rubik. After that we get a result, maybe from zero to 100, a score for the CPU, which we then propose to the custom scheduler. It may introduce some confusion for the customer at first, because normally in daily use we configure apps with CPUs, maybe 10 CPUs for a container pod; with this approach it's a number rather than the CPU cores you configure, but with this number you can get the same computing capability on different cores. Yes, we are trying to define a small benchmark to test the capability of the physical CPU.

Sorry, do you mean this one? I didn't prepare slides for that feature today. We're
trying to use EDT and eBPF in Linux, with a small config in user space to control the network bandwidth. It's hard to describe briefly; maybe I can show you something later.

This is also still from the OS point of view: what we can see is different kinds of resources, such as CPU, cache, and memory, but different kinds of workloads behave differently. Maybe I can take nginx as an example: most of the time it mainly uses CPU, so its model shows high CPU usage but no sensitivity to other resources; that's a CPU-intensive workload model. We can define many different kinds of models like that, maybe a network-intensive model, and so on.

Thanks again for your talk. Our next talk is going to be at 11:15, so we have a little bit of time. Heath, are you here? Sorry, is Heath here? He's not... hello, yes. We're getting ready to start our 11:15 session. Just a reminder that we are streaming, on Facebook and several other platforms, so this is going live. Our next speaker is Heath Newburn, field CTO from PagerDuty, and his talk is "Safely delegate your cloud operations with self-service automation"; it's a 25-minute talk.

Oh gosh, that's loud, sorry. So, as Ryan said, my name is Heath Newburn. You may be wondering why PagerDuty is talking about Rundeck. Anybody wondering why? Because we own it. That's one of the things a lot of people will know about; any Rundeck users here today? Cool, we're going to make you Rundeck users before this is over. Rundeck is available for download as open source, and PagerDuty has acquired it as part of our core capabilities; we've really built our entire operational stack around it now, and we're pretty happy with it. So that's what I do. As field CTO, my role is really to work with
clients around the world on how you take the existing investments you have and build out your automation capabilities: your AIOps capabilities, your incident response, your automated incident response. What I really want to spend the time on today is some of the problems we're seeing in this space, in particular around cloud. When we think about cloud automation and what it actually means for us, what are we really trying to accomplish? We've all heard and understand the easy wins of cloud: very easy scalability, and the removal of those easy barriers to entry. If I started a company right now, I would never be racking that first server; of course it'd be cloud. I'm assuming that's true for everybody here, everybody's on cloud, we're not out in the wilds. And the whole idea behind that was that it was supposed to lead to increased business agility. So here's a hint: when you're talking to your bosses about your next set of investments, don't talk to them about speed, don't talk about the cool tech; talk to them about how it's going to deliver faster, how it's going to make money flow faster. I love our business folks, I really do. We have an amazing job, we get to work on cool problems and do cool stuff, but we really suck at telling our own stories. How do we begin to do that? As you look at these processes and the things you want to do, think about them in terms of what they mean for business agility: how is this going to allow you to deliver faster, whether you want to get into the business of creating disruption, whether you want to be smart about the things you do, whether you want to create efficiencies. This is the whole promise of cloud again. But where does it lead us? Well, it also turns out
that when you make things really, really easy to use... has anybody ever gotten their bill at the end of the month from Grab and realized they spent way too much, because it's gotten so easy now? You don't want to leave your room, don't want to leave your desk, and next thing you know you're down 800 bucks on Grab. It's easy to do. We make these things so simple, so easy to consume, that things like surprise billing turn out to be a real problem for us. Fair? We also end up, in making our cloud pieces work, putting in a lot of architectural complexity, because no matter where we are, whether in Azure or AWS, think about all the components I'm going to have around RDS and Route 53 and all the other pieces required for me to build these complex applications; I'm putting Docker in, I'm putting Kubernetes in. The things we're asking people to do are becoming more and more complex. Anybody here part of an SRE team? So we finally got SRE, rock on my friend. We see a lot of organizations moving to SRE, but here's one of the things about SRE: we demand an awful lot from those teams. The expectation is that you know the whole stack, down to the infrastructure level, and all of those things require a lot. So what do we do? We end up wanting to figure out the right tool for the right job, and guess what, there's an awful lot of right tools for an awful lot of right jobs. This is one of the things I always hear people talk about: cost and these other kinds of complexity when it comes to cloud management. We don't talk enough about tool complexity, about how hard these things start to look when we gather all of these additional pieces together. We stand these up, and our database people want one set of
management stuff, our network people want another set of management stuff, and our core infrastructure capabilities, our OS, our virtualization, our Kubernetes, our Docker, are all demanding other pieces of that. So what we end up with is massive tool sprawl, and we have to start thinking: is there any possible way we can address this, and what's next? Of course, because we look at this as a purely technical problem, we want technical solutions, and we're all developers, so in this case you can see some CloudFormation stuff down there for my friends who love CloudFormation. It doesn't matter; it could be Harness, Pulumi, Terraform. The whole idea is we're going to deliver things as code. But here's the thing that's happened with this promise of cloud. The expectation was that anybody can use it, anybody can deliver value from it, anybody can create with it, yet we've created this temple of high priests, you in this room, who get access to things. Who else does? We demand this level of familiarity, down to the code level, in how you look at and create these values. So with this temple of high priests, what happens any time you want something done? You have to go to the high priests. That means if I'm a business person out here with a great new idea, a new client willing to pay us a lot of money, and I need you to stand up the next instance of this database that's going to be part of the billing system, how many requests am I going to have to put in? I'm putting in some core infrastructure requests, I need somebody to stand up some additional billing, I need somebody to instantiate the piece between them, I need virtualization, and it goes on and on and on. Has anybody
ever had to submit those kinds of requests and then wait for somebody to come back to you with the thing you need? You see it all the time, and here you are: the whole idea was that cloud was going to make it easy, make things go fast. I was actually working with a client on the west coast of the US; they put in a request to simply extend the monitoring in their environment. Four months. Four months! Digital business moving at digital speed, and four months is too long. Why? Because so few people understand all the pieces end to end, or how we drive this and what we want to do. Then there's the other side: if you're on the other end of that request, the one receiving all these requests, it's awfully hard to get work done because of the constant interruption. We call it unplanned work. There's a fantastic book, by the way, if you want to better communicate with your senior leadership about why you can't get your job done the way you want, called Making Work Visible. It talks about how you surface all of this unplanned work: where do all these interruptions come from, why can't you do the things you're supposed to be doing? Because I'm running around putting out fires all day, constantly standing up individual pieces of stuff; that's the toil sucking up my day, and I need to get a lot smarter about how I do that. So I want you to think about that interrupt-driven piece of it: what is it costing? By the way, worldwide, unplanned work cost 1.1 trillion dollars to businesses across the globe in 2021, so it affects a lot of us. So how do we begin to rethink this, to get the overall promise of cloud that we were all supposed to have, make it easy on us, and really,
bless them, reduce the burden on the business users who are asking these things of us? What does that look like? We need an easy button: something that makes it very, very simple for people to consume the services I want to give them. The cloud was supposed to let us just snap these things into place, put it up, ready to go. So how do we do that? Well, Rundeck. That's my presentation, thank you. Sorry. Here's the idea behind it: I want you to think about self-service operations. What does that look like? How do I stand up easily accessible, readily addressable, secure role-based access controls, so that anyone in my organization can very simply get access to the services I want to deliver, use those cloud resources in a way that makes sense, and allow the experts in the organization, all of you, to design things and say: here are the things I want to easily expose, here's where I'm spending too much time, here are the pieces making me lose focus on the capabilities I want to build. If I can make it very easy to self-serve, what happens? Number one, I enable everyone in the organization to do what I want them to be able to do, and I now have a secure methodology: role-based access control. My dear friend Keo over here, raise your hand, Keo, one of the smartest people I know at PagerDuty, a really smart dude who knows a lot about this stuff; you can find him at the booth later if you have more questions. I want Keo to be able to do anything; Keo gets it all. My dear friend Adelle here, also at our booth later, is really, really good at networking; she gets all the networking access she wants, but she will not get near a
database in my organization, I swear to God. Can't have any of that. So it becomes very, very easy to segregate the kinds of requests I allow, at the component level: which requests I permit, how I tie things together, how I drive pieces through. And here's where it gets really interesting: this use case becomes generic very quickly. I could think about holistic service request management as well, because I've created a methodology for how I secure access to things. I need vaults to contain secrets and passwords and keys; I need a way to do fine-grained role-based access control across the entire organization, via SSO or whatever else; I need a way to catalog and segregate various disparate pieces of things, with tagging and the like; and I want to make it all very, very easy to run. Those are the pieces that are in place once you've deployed Rundeck, which is an OSS image you can grab and download today (we have fully licensable versions as well). So you get it, great, you unpack it, you stand up your cluster; now what? Well, first: what are your most manual and repetitive tasks? I know that sounds a little silly, but bear with me for a second; I want you to stop and think. Because everybody's already got automation. If I were to ask whether you have automation, anybody who doesn't raise their hand is kidding themselves, because you've got it already; it's just that today it exists on a share drive, on someone's laptop, on that local server over there. It's there, but we haven't found a way to bring it together. So go find that stuff. When I send you a request to stand up a database, what commands are you running? What scripts do you already have in place? What Python have you
already written? Let's go reuse that, because this is what's key when I find the automation: it isn't that anybody believes we don't want to automate. Everybody wants to automate; that's not the issue. It's a question of making the time and the commitment to do so. Are you able to articulate a return on investment for that automation, thinking about the things you do in terms of dollars, not just technology? So start with those things. I want to take all the existing pieces; I don't want to replace anything. The second you start a conversation with the organization of "hey, I've got to rip that out of your PowerShell", you're done, because people have those investments and don't want to lose them or rewrite everything. So how do I do that? I talked about this operational framework: how do I solidify what that looks like, what level of fine-grained access control I actually want, and then how do I design the various user experiences? Maybe they're developer peers of mine who don't have access; maybe they're on-prem but need access down to cloud resources for hybrid environments; maybe they're AWS engineers but I'm doing something in Azure; or maybe they're straight-up business users, where you need to give a product manager in the organization a way to communicate their needs to you for the next set of capabilities. Because here's the really interesting part, folks: we have a credibility gap in IT because we have a visibility gap. We are the moles and trolls in the basement. Think about this for a second: if you do your job every day dead solid perfect, no mistakes, no errors, 100% availability, does anybody know you exist? You're completely invisible. So this is a way that you can,
now, for one of the very first times, show: hey, here's what I can do for you, here are the things I can provide, here's the value I'm driving for the organization, here are the dollars I'm pumping into this value stream around the things that you do. So begin to think about how you want to go to the stakeholders in your organization and surface to them all the hard work you do, because we don't get enough credit; we get the blame, right? The last outage? Everybody could find you then. The last time you were up for 30 days straight? Nobody paid attention. So begin to think about what your journey for automation looks like. Obviously all of us start on the manual piece, and that's pretty straightforward. That doesn't necessarily mean anything bad; it just means there's probably a lot of toil we could begin to think about removing from the organization. We've kind of kept automation to the high priests, very limited; we want to allow a lot more people to use the things that are there. So I want you to think about what that first move from reactive to responsive looks like, where I'm doing some fairly straightforward things: I'm beginning to make how I do things in the system a little bit easier, and I'm beginning to set up automation from experts as repeatable sets of capabilities around the things that I have. Then this is where it gets a lot more interesting, when I start to get more responsive. This is where we're getting to standardization, and that's really the big key to automation: standardization. How do I make it repeatable? How do I create templates that I can hand you, whether they're Terraform, whether they're Ansible or Puppet or something else, so I can say: if you want to be successful, if you want to work with me, here's the easiest way to do it. And then proactive. This is where
things get much more interesting, because automation today in your environment is driven in two ways. It's driven by a human initiating it, which just means you still need a person like you that's smart enough to know which particular script to go run, or it's run by a schedule. What if instead it's event-driven? What if now I get a request from a product manager saying, hey, we need to spin up this new piece? It drops that into a pub/sub, you yank that off, and your automation says, here's the next thing I've got to go fire in order to expand these sets of capabilities, and it begins to deploy. Now automation is an extension of my observability pipeline, an extension of my CI/CD pipeline, an extension of my service requests. Now we're doing something really interesting, right? And this is not rocket science. I know, if you're just coming to this now, it seems way, way far away. Turns out it's not; it's actually pretty straightforward, and my friends down at the PagerDuty booth can show you. So this is what we're really trying to get to. We're talking today primarily about what you can do to automate these service requests, because my whole goal is to get you out of your home office and onto your back deck maybe an hour earlier this week. It's great that we can do things for the company, but that's not what I'm looking for right now. This is what I want you to think about: the investment in automation isn't so much about creating more value for the company, which is great; I want to create more value for us, because again, we don't get enough credit for the things that we do. And think about what it is now: we're talking about basic cloud capabilities, but guess what, we've got to run those clouds. Same kind of thing. Hey, who's actually on call right now? Rock on, my friend, thanks for
being here anyway, I appreciate you. On call is pain. On call will always suck; there's nothing good about being on call. So if you've got friends that are on call, give them an extra hug, they need it; buy them a beer later, they absolutely need it. So how do we begin to think about what automation can drive for on call? If automation becomes the first virtual responder, guess what: my dear friend in the back maybe gets 10 fewer calls this week. That's a pretty good deal. So begin to think about what this system of automation can drive for us. It not only gets us our own hours back; it can give us the capability to actually increase overall availability. So think about what that journey to event-driven automation looks like. Think about how you begin to peel off the pieces to get you from here to there, where you've got happy, smiling folks, where it's a whole lot easier to do business with you, people know what you do, you're saving time, you're saving money, you're delivering faster, and you're getting that garbage out of IT. There's a lot of great stuff in the things that we do. Did anybody get into IT to sit at a keyboard and run commands? I didn't. We got here to solve cool problems, to do interesting things. So let's figure out where we're wasting our time, where we're not doing interesting things, and go automate those things. So that's it for us. We've got the PagerDuty booth downstairs at the very back; make the trek back there, come and see us. I'm sure we've got some swag and other things, and we can walk you through some more examples of this and talk about some specific use cases that would be useful for your organization. And I guess I have a couple minutes for questions. Everybody got all of that? Yes, sir. So the question is: can you use it toward, like, an L3 capability, where it would actually try to resolve the issue first, as part of the overall process automation?
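The "first virtual responder" idea from the on-call discussion above (automation that diagnoses and attempts a fix before a human is paged) could be sketched roughly as follows. This is a toy illustration, not Rundeck's or PagerDuty's API; every function name, alert field, and retry count here is hypothetical:

```python
# Hypothetical sketch: automation as the first virtual responder.
# Capture diagnostics first, then attempt remediation a bounded number
# of times; only page a human if the automated fix fails.

def capture_diagnostics(alert):
    # A real system would run jobs gathering logs, metrics, process
    # lists, etc. Here we just record what we would collect.
    return {"service": alert["service"], "collected": ["logs", "metrics"]}

def attempt_fix(alert, attempt):
    # Placeholder remediation: pretend a restart succeeds on try 2.
    return attempt >= 2

def first_responder(alert, max_attempts=3):
    diagnostics = capture_diagnostics(alert)   # always diagnose first
    for attempt in range(1, max_attempts + 1):
        if attempt_fix(alert, attempt):
            return {"resolved": True, "attempts": attempt,
                    "diagnostics": diagnostics, "paged_human": False}
    # Automation failed: escalate, but hand the human the diagnostics.
    return {"resolved": False, "attempts": max_attempts,
            "diagnostics": diagnostics, "paged_human": True}

result = first_responder({"service": "web", "severity": "high"})
print(result["resolved"], result["attempts"])
```

The point of the shape, echoed in the answer that follows, is that diagnostics are captured unconditionally before any remediation attempt, so even a failed automated fix leaves the on-call engineer with context instead of a bare page.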
Correct. So let me answer that in two ways. There isn't an inherent ability inside the Rundeck OSS to drive events associated with it; however, as part of the whole PagerDuty platform, there is. So the idea is that Datadog, New Relic, AppDynamics, Zabbix, your favorite monitoring tool, raises an alert: a threshold has been violated, or a transaction is now trending off. We take that alert, normalize it, and send it into Rundeck. Now, the only thing I'll caution you on: yes, if there's an available, what I like to call, horseshoes-and-hand-grenades fix, can I pump that back in and make it work? Yeah, absolutely. But what I really want you to think about is: what if I actually focused on doing automated diagnostics first? Right, got you. So you can do that basic reflex action, like a webhook out of a Datadog or something else that'll make those attempts. But when you actually drive into this, what we do is build more complex workflows that say: go ahead and capture the diagnostics first before you attempt remediation. Because, to your point, if I've tried it four times and it's still failing, I would really like to know why. And even if I do get it back into a working state, I would like to know the failure conditions. So our suggestion is: inside Rundeck, you define the automated diagnostics first, then begin to attempt remediation, like you said, inside of a retry loop, being able to identify that we've had multiple failures. But think about the kinds of remediation you could do. For example, by leveraging a more complex automation instead of a webhook, I could say: let's say I want to fail over a cluster. I could actually check the health of that cluster first, and if it's a four-way cluster and I've already lost two of them, I don't want to fail over, right? Because now I'm down to a single point. So I
can begin to think about remediation in much more complex ways. So we start with your point, failing over a cluster; then we go to: capture the diagnostics for it, then attempt to fail over only under certain conditions. But yeah, that's exactly the direction we're headed and what we're trying to help with. And that's best-in-breed: I've got clients in financial services in the U.S. that have driven the MTTR for their NGINX clusters and their web server clusters down to 60 or 90 seconds because of these kinds of complex methodologies in Rundeck. Exactly. Other questions? Like I said, I'll be around the rest of the day with my friends at the PagerDuty booth. Look forward to seeing you all. Thank you guys for your time and attention; appreciate y'all.

This is a 20-minute talk, and just a reminder that we're streaming today.

Good afternoon, everyone. I know the next thing is you're going for lunch, but I'll make this more interesting, don't worry about it. I'm Boominathan, and thank you for the quick introduction. So currently I'm at Cisco as a regional multi-cloud architect, helping with Cisco solutions on hyperscalers like AWS, Azure, and GCP. What is the experience I'm bringing here? I have nearly two decades of experience. Maybe twelve years back I was already doing cloud, meaning back when AWS was announcing Auto Scaling and EC2, at that time itself I did a project; that's one of my biggest advantages. I've done a lot of cloud migrations, cloud modernization, and app modernization. Apart from my current role, I also do a lot of organizing in the community, and I speak at a lot of global conferences; for example, yesterday I was delivering a talk as part of APD, and also for the Kubernetes user community. One thing I love about myself
is that I'm curious to learn and to share knowledge. That's about me. Apart from the technology, I also do mentoring; if you guys need any mentoring, you can connect with me, and this is my LinkedIn, you can connect with me anytime. And I'm certified in multi-cloud. Well, that's it. Okay, now let's come to the actual topic. As I told you, I want this to be more interactive: how many of you guys have already architected your applications in the cloud? Awesome. So maybe I want to ask Ryan: what is the application you architected, and what are the high-level key pillars you considered? Awesome, that's great. So what are the key parameters, the key pillars, you considered while doing the architecture? Traffic? Perfect. Okay, because it's a machine learning, compute-intensive workload, you need to look at the network, whether high bandwidth is available or not. Maybe I'll take one more question. What kind of application did you architect? Is it a data center migration or a specific application? Any key things you considered while doing the architecture? What was the push; what I'm asking is, what was the business push and what was the technical push while doing the architecture? Visibility of the infrastructure? Reproducible? Okay, got it. Immutable? Okay, awesome. So I'll just summarize: Ryan focused more on performance efficiency; that was his key thing while doing the architecture. Whereas, may I know your name? Sorry, Hurdy? Okay, Hurdy focused more on operational excellence, how it can be reproducible. So for the sake of understanding I'll take only AWS; how you do the architecture is common across Azure or GCP or Oracle, anything, but for easy understanding let's take AWS. So there are five key things we need to make sure of while doing any
architecture. The first thing, as Ryan mentioned, is performance efficiency: how our workload is going to deploy in the cloud and operate in the cloud. Because at the end of the day we are architecting for the applications, and applications are nothing but the business outcome, right? The business needs it to run, and it has to perform efficiently; that's why you have to select the proper compute, proper network, proper storage. If you remember Ryan's words, he told us it's a compute-intensive application with a network-intensive requirement, like high-gigabit bandwidth, those kinds of things. But Hurdy told us it needs to be elastic; that's where operational excellence comes into the picture. Most of you come from DevOps, platform engineering, or SRE; that's where operational excellence lives. Okay, the third one: maybe instead of me saying it, I want to hear from you. We've talked about two things, operational excellence and performance efficiency. Other than those two key pillars, does anyone want to share any other key pillar you used while architecting? Anyone? Yeah, okay, got it. So Sachin's point is cost optimization. See, we are always doing the architecture for the business push, and the business always asks: what's the cost benefit, what's the cost optimization I can do? So this is the third pillar, cost optimization. How do we do cost optimization? One example: where our infrastructure is running mostly on compute instances, we should use the latest generation. Then we should use more of the latest technologies, like cloud-native containers or serverless, those kinds of things. So you guys have already said three key pillars: performance efficiency, operational excellence, cost optimization. I told you there are five key pillars in architecting anything, and I want to be more interactive: there are two remaining. Anyone want to touch on them?
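As an aside, the cost-optimization point just made (move to the latest compute generation) often comes down to simple arithmetic like the sketch below. The hourly prices and instance counts are invented for illustration; they are not real cloud list prices:

```python
# Hypothetical sketch of the cost-optimization pillar: comparing an
# older compute generation against a newer one. All numbers are
# made-up illustrative figures, not real cloud pricing.

HOURS_PER_MONTH = 730  # common cloud-billing approximation

def monthly_cost(hourly_price, instance_count):
    return hourly_price * instance_count * HOURS_PER_MONTH

# Suppose the newer generation is both cheaper per hour and fast
# enough that we need fewer instances for the same workload.
old_gen = monthly_cost(hourly_price=0.20, instance_count=10)
new_gen = monthly_cost(hourly_price=0.17, instance_count=8)

savings = old_gen - new_gen
print(f"old: ${old_gen:.2f}, new: ${new_gen:.2f}, saved: ${savings:.2f}")
```

Framing the pillar this way, as dollars per month rather than instance specs, is what makes it communicable to the business side the speaker keeps referring to.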
Just think from your own experience: what does it matter for your organization, what does it matter for your customer, while architecting any application? Exactly, that's the fourth thing: security. Because if the application we are architecting is not secured end to end, how can our customer deploy it? How could we deploy, say, BFSI or banking applications on the cloud? So security is a key pillar; this is the fourth one. And the last one is reliability. What is reliability? See, right now we are connecting in all the ways, in person and also online; that's reliable, right? If anything is missed here, people can catch it from the online stream, and it's available as a recording also. Reliability comes from resiliency: it has to be available all the time. And remember the famous quote from the AWS CTO: everything fails, all the time. So let me summarize the five things: reliability, performance efficiency, operational excellence, cost optimization, and security. You got it, right? Now let's come back to the architecture. I'm going to introduce five patterns, but let's quickly take pattern number one. Pattern number one focuses on, let's say, a normal web application. When you are designing the web application, if you think about it, all five key pillars I mentioned will be there. So we are choosing the compute, and here we are in one of the availability zones. What is an availability zone in AWS? It's a group of data centers in a specific location. And what is a data center? That's where the applications are hosted; the compute, network, storage, and security infrastructure, everything virtualized, is there. So in an availability zone there could be multiple data
centers. And in AWS there is a concept called a region. A region is nothing but a group of availability zones, maybe one, two, three availability zones. And in one availability zone, how many data centers will there be? A minimum of two to four. Okay, so if you see here, we have an EC2 instance in one availability zone, and we have an EC2 instance in another availability zone; that's where our application is residing, and on top of that we have a load balancer. With this architecture, we are deploying a compute-intensive application, and if anything happens to AZ1 (when I say AZ1, it's availability zone one), we have the traffic distributed to AZ2 also. It could be active-passive or load-sharing; both are available. Let's see pattern two. In pattern one it's more like a single instance: one EC2 instance in one availability zone and another EC2 instance in another availability zone, more like a monolithic application. How many of you are already working on microservices architecture? It's becoming quite common now, right? So let's consider those kinds of microservices, cloud-native compute. If you see here, we have a group of instances. What is the benefit of cloud-native instances? As Hurdy correctly mentioned: elasticity. We can easily reproduce them, and within seconds we can rebuild them. So with that we have a group of instances in one availability zone; if it fails, we can shift to another one, and it keeps on doing that. The third one is where the region comes into the picture. If you see, the user is there; they have a web application in one region and a mobile app in another region. What does that mean? If you take a three-tier architecture, the web tier can be in one region, and the app tier can be in another region also; it
depends on the requirements. But what is the key outcome here? You can distribute the traffic accordingly, most of the time globally; that's why, with AWS's CDN and Route 53, we can manage the global network traffic and route each kind of traffic to the right layer. Consider an example: if you're accessing amazon.com.sg, that traffic lands in something like a Singapore data center; if you're opening amazon.com.in, it goes to an India data center. Those are the kinds of distinctions. And, this is what I want to drive home, the customer data is also located based on the region, driven by the application requirements. Okay, this is the fourth one, and I know I'm going a little fast because we're running short of time, but what I want to make sure of is the five key pillars you need to remember: operational excellence, performance efficiency, cost optimization, security, and reliability. And whatever we're talking about here is focused mostly on reliability and scalability. So let's take our last one. I explained about regions; seeing that, what is the one thing that comes to your mind related to the application? Do you see that this application is active-active? As Ryan mentioned, we'll be able to achieve the SLA because it's active-active. You have a group of applications, consider core banking, in availability zone one, and the same thing available in availability zone two, and across regions also. What is a region? Singapore is considered one region; under one region we have multiple availability zones, and under one availability zone we have multiple data centers. Besides Singapore we can have Sydney; Sydney is another region; and we can have one more region in India, Bangalore or Hyderabad; that's another region. That's how we can scale the architecture. So, in the interest of time, I want to take a couple more
minutes to take your questions. Yeah, thanks a lot; we have about three minutes for questions, so maybe two questions. And I'll be around if you have more; I'll be roaming here, happy to explain. So now, you guys, yeah, please go ahead. Okay, so whatever I talked about is more focused on scalability, and that's part of reliability, right? But reliability and security bleed into everything. One example is defense in depth. One example here is focused on the compute: whatever data resides in the compute has to be encrypted; data at rest and data in transit have to be encrypted. Along with that, we need to include the zero trust model also: who is accessing, what is the authorization, and what is the authentication, everything. That's a key thing. We also need to do defense in depth toward the cloud, inside the cloud and outside the cloud. So that was more focused on the reliability pillar. Any other question? Thank you for asking, appreciate it. So you've seen how we scale step by step: from one availability zone, one data center, one instance, then to compute instances, cloud-native, and multiple regions; that's how we scale things. And I think I'm running out of time; next is lunch, right? Okay, we have one more speaker; okay, perfect. So I'll be around if you have any questions related to this; we can connect. This is my LinkedIn; you can connect and we can exchange things. One more minute, right? Okay, maybe I can wait for one more minute until the next speaker comes; I'd love to hear any questions. Please. Okay, resiliency; you already answered the question. Yeah, that's a very good question; maybe let me summarize. What is the need for multi-cloud in terms of resiliency, given that resiliency can be achieved with a single cloud provider with the help of multiple regions?
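That single-provider, multi-region resiliency just mentioned can be sketched as a tiny route-to-nearest-healthy-region example. The user labels, region names, and latency numbers below are all made up for illustration:

```python
# Hypothetical sketch of region-level resiliency in a single cloud:
# route each user to the nearest region, and fail over to the next
# nearest healthy region when the nearest one is down.

REGION_LATENCY_MS = {
    "user-sg": {"singapore": 5, "sydney": 95, "mumbai": 60},
    "user-in": {"singapore": 60, "sydney": 150, "mumbai": 8},
}

def route(user, healthy_regions):
    # Prefer the lowest-latency region that is currently healthy.
    candidates = sorted(REGION_LATENCY_MS[user].items(), key=lambda kv: kv[1])
    for region, _latency in candidates:
        if region in healthy_regions:
            return region
    raise RuntimeError("no healthy region available")

# Normal operation: everyone lands in their nearest region.
print(route("user-sg", {"singapore", "sydney", "mumbai"}))

# Singapore outage: Singapore users fail over to the next nearest.
print(route("user-sg", {"sydney", "mumbai"}))
```

This is the same latency-based, health-aware behavior a DNS service like Route 53 provides with geolocation or latency routing policies plus health checks; the sketch just makes the decision rule explicit.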
That's correct, right? Okay, but the multi-cloud push doesn't only come from the application; it always comes from the organization. Why? Because there are business benefits. One business benefit: consider I'm a customer, you are cloud A and you are cloud B; I can negotiate with both of them, okay? And sometimes my developers are split: this one is skilled on cloud A and that one is skilled on cloud B, so I'm providing flexibility, a kind of democracy, to the developers. Those are the two things apart from resiliency. Perfect, sorry. When it comes to business, that's the first one, but it's also strategically important, because sometimes, take one of the government organizations in Malaysia: they may say, I want to store my data in a specific cloud whose data centers are already in Malaysia. Cloud A may not be there but cloud B is; or take Indonesia, where Google is there, there's a Google data center. So it depends on business compliance and also the business application requirements. Those are the three things; and along with that, in a single cloud we achieve resiliency as part of the architecture. I hope I answered all the questions; please do reach out to me anytime. Thank you so much.

The next talk is multi-cluster, multi-region, and multi-cloud deployment with Kubernetes. Dan is an instructor at Learnk8s, and just a reminder that we are streaming today.

So thank you very much for having me today. My name is Dan, and today we're going to talk a little bit about Kubernetes. I've got 20 minutes; the original talk was 40 minutes to an hour, with plenty of demos and stuff, so I had to cram everything into 20 minutes, which means I have one demo, and hopefully it works, but it's a much shorter version than what I usually run. Okay, who am I? My name is Daniele, very hard to pronounce, so people call me Dan, Daniele, whatever, and
people make fun of me because I do quite a lot of Kubernetes. I've spoken at KubeCon twice, and I'm a certified Kubernetes administrator (oops, the slides are moving very quickly, not sure what's going on). So I spend quite a lot of time just researching and playing with Kubernetes; that's also part of my job. I work for a company called Learnk8s; they basically teach Kubernetes, and that's why I spend a lot of time with it. The talk today is basically: we have these Kubernetes clusters, how do we connect them together if they are in different regions, in different cloud providers? Now, before we tackle this challenge, it's worth doing a very short recap of how Kubernetes works. Generally with Kubernetes, what we do as developers, or just as anyone interested in deploying, is treat it as a single unit and say: hey Kubernetes, please deploy some workloads for me. That's basically how it works; we don't really care what's happening under the hood, at least most of the time. But in reality, Kubernetes figures out where these deployments need to be placed inside your infrastructure. We don't care, but Kubernetes does, because we still have to rely on servers, on machines, to run these workloads. Okay, all good so far; what's next? The way Kubernetes decides how to place these workloads is determined by what we call the control plane, and the control plane is basically a collection of different controllers that work in tandem. Most of the time we talk about four big blocks (there are way more than four, but we're interested in these four): the API server, which ingests your requests; a database called etcd; the controller manager, which is the brain of the operation; and the scheduler, which
basically assigns workloads to nodes. That's the basics. So what happens when you deploy an application in Kubernetes? Well, that request, "I want to deploy something," goes into the API server, goes through several steps of authorization and authentication, possibly also mutating the request on the fly, and then eventually it gets stored in the database, in etcd. Once it's inside etcd, it's picked up by what we call the controller manager. The controller manager is one of these systems that just syncs state: if you ask for something, it checks that you've got what you asked for. In this case you asked for a deployment; the controller manager notices there are three missing parts, so it creates them. These parts are created as pending inside the database, written down to disk, and since they're pending, the scheduler is notified of the change, picks them up, and tries to allocate those workloads to particular nodes. That's basically the end of the scheduling of this workload. What's important to notice is that at this point in time, all you've done is change values inside the database; there is nothing actually going on inside your infrastructure. So who creates this workload inside your servers? Well, we need something that's going to do the work, and that something is called the kubelet. The kubelet is basically an agent that lives inside your nodes, and what it does is pull down all sorts of information about the node and try to reconcile the state of the node with the control plane. If there is a workload assigned, it creates it; if there is a request to delete that workload, it deletes it. Okay, so we've got a kubelet, and we know how the control
plane works. One other thing that's important to mention is that the traffic does not flow through the control plane. When you deploy applications and those applications are serving web traffic, production traffic, that's not the path we are looking at (sorry, my clicker is misbehaving). What actually happens is that most of the time the traffic flows directly through the nodes, and these nodes are fronted by a load balancer. What that basically means is that the control plane could go down and we can still serve traffic, which is quite convenient for us. And now that we know the basics, let's have a look at how they apply when we scale from a single region, a single cluster, to multiple clusters. So the first idea: I've got an application, an e-commerce website or a financial application, that needs to be deployed into different regions, for availability for example. The first idea you might have is: okay, I'm going to have a single control plane and then several nodes spread across regions. That's perfectly fine; all of the traffic will of course be ingested by the nodes, which is great. But the reality is that some of the communication between the control plane and the rest of the nodes will be slow, because of course they're in different regions. You might say, okay, I can bear the cost, I can bear the trade-offs, let's do this. Then there's a similar problem you will face when you deploy your applications, because Kubernetes treats those nodes as equals. You're going to deploy these microservices, and then they start talking to each other, right? So
you might have cases where some of the traffic distributed inside the cluster is a little bit slower, because Kubernetes essentially doesn't know that this traffic is going to be redirected somewhere else. Now, there are changes in the Kubernetes API to make that a little bit better, but by default this is what you get, which is: sometimes everything works fine, and then suddenly everything is super slow. Okay, so how do we fix this? What if we use one of the Kubernetes features to fix it? Instead of having a single control plane, Kubernetes can be deployed with multiple control planes. So the idea is: if I can have multiple control planes, and each control plane has got a database, and all these databases synchronize with each other, who is stopping me from deploying multiple control planes, eventually one for each region that I'm using in my infrastructure? No one is stopping me; I can do that. Does it solve the problem we had before? Well, sort of, yes: the nodes can talk directly to the control plane in the same region, so that sort of communication should be solved. However, this introduces another issue, which is replication. We talked about how this database needs to be replicated between instances, and the issue is that the database will commit a write only with a quorum, that is, only if the majority of the nodes agree on a value. In this particular case it will take ages to agree on a value, because the nodes are in different regions and latency is very high. So this design is even worse than what we had before. So: we know we can't really do a single cluster with nodes in multiple regions, and we know it's very tricky to do a single cluster with multiple control planes and nodes in multiple regions. What can we do? Well, the reality is that we can have a single cluster per region, and that's
generally the better approach you might take when designing this sort of thing. However, when we do this, there are a few issues. The first issue is: how do we decide where to place these workloads, in which cluster? That's issue number one. The second issue is: if I route the traffic to the US cluster, or the Asia-Pacific cluster, and there is no app there, or it's overloaded, how do I route the traffic to the other cluster? And the third one is: if I've got storage in one cluster and that application needs to move, do I need to move the storage to the other cluster as well? I don't know. So when I played with these sorts of examples, I found a quite interesting tool called Karmada, and basically the way it works is as an orchestrator of clusters. You install this software in one cluster, which becomes the manager, and the rest of your clusters become workers. The way Karmada works is: you just send the request to the manager, and the manager distributes these workloads across the entire fleet. Why is it good? It's quite clever in a way: it basically creates a mirror of your control plane, but this control plane is cluster-aware, so it's aware that there are several other clusters that need to be orchestrated. And there's an agent, which is similar to the kubelet in a way, and it's the agent that sends the commands to the individual clusters. So you have a single cluster which sends all of these commands to the agents, and the agents relay the commands to the clusters themselves. That's basically how it works. But the interesting thing is that Karmada, which is basically a glorified control plane for clusters, has something called policies. So you can say: I want to create a deployment, and I want it to go to just the US, the US
deployment the u.s cluster so if you do that and you submit the policy we'll basically have a deployment just in the u.s but go back to my policy and i change it to europe then this pod will be moved to europe right and there are also many other ways to describe how you want the state of the cluster to be in this particular case i say i want to deploy two pods and if i say duplicated then kermada will deploy two for each cluster who got in my fleet or i could say something like divided you know i could i could put different weights or i can say aggregated so filling one cluster first and then move on to the next one and then there are way more ways to to configure however you want this to be to look like this is a good and when you deploy kermada the networking for kermada is isolated to the to the cluster itself which basically means that if the traffic is rooted inside um surface your Pacific so in Singapore then it stays in Singapore there is no way to share this traffic with the rest of the clusters so what do we do i want i want to be able to share this data i want to be able to share this traffic with the rest of the fleet um i can do that if if somehow i find a way to share this ip address and this information about workloads that have been deployed in my infrastructure and the way we do this is basically by intercepting the traffic proxying it to the other cluster and then rewriting it when we are on the other side and to do that we use something called service matches so one one way to solve it it's not the only way but one way to solve it is to use something like service matches so you might have this name so Istio which is what i use in in the tool as well so what are these service matches and how they work so essentially you have services you have deployments inside your cluster and then they all talk to each other so when you install a service match what happens is each of these applications will be fronted by a proxy so all incoming and outgoing 
traffic will be filtered by the proxy. But this is not just a regular NGINX proxy; it's actually a programmable proxy that can be reconfigured on the fly. The way it works is we've got another control plane, the Istio control plane, which on the fly will reconfigure the routes and the endpoints for this proxy. That gives us quite a lot of flexibility for saying 'oh, I changed my mind, I don't want these two services talking to each other anymore', or something like 'hey, when you do the load balancing, can you please do a different split than 50/50, or round robin?'. All this information is something we send to the control plane, and the control plane reconfigures the proxies on the fly.

So what can we do with this? Well, if we can reconfigure on the fly, we could also say: we have multiple copies in multiple clusters, so route this traffic somewhere else. And that somewhere else is basically how the service mesh multi-cluster setup works. We've got cluster one and cluster two, and when we install Istio you can see the proxy there: the programmable proxy becomes part of your application, as a single unit. Then we install what we call the gateway, which is basically a way to connect these two clusters together, and Istio will start sharing and discovering endpoints from the other clusters. This is where we share 'this is my workload, this is yours', and we keep this information in the cluster. Now when the traffic comes in and it goes to the proxy, we have the ability to say: actually, you know what, I will send this through the gateway to the other side. And that's basically how we share the traffic.

So what does it look like? We've got a deployment; we know we can spread its pods across the clusters; we have policies that decide how to spread it in our fleet; and now we have this service mesh which is also able to decide where the traffic should be placed. We've done something basic here: we just installed the service mesh and connected the clusters. But you can also do clever things such as 'always send the traffic to the local cluster, and if it's overloaded, go somewhere else'. In this particular case it's just a basic configuration, but at the end we are able to ingest traffic in one cluster and then, if we want, forward it to the others.

Do you believe me? Someone's saying no. Anyone else? Okay, I've got four minutes to make it work, let's see. So what I have, hopefully it works, is a small script, and what it does is basically send requests to the cluster and check the responses. Now hopefully the Wi-Fi works; I have no clue what's going on or how slow it is, so maybe we'll wait a little bit longer. Okay, so this is a dashboard I built. I deployed three clusters today before I came, which is why I was late. This dashboard is basically just pulling the data from the fleet of clusters I've deployed. I've got some radio buttons at the top where I can select where the traffic is going, so I make a request to the London cluster or to the US cluster, and then I have a small application that replies, let me see if the script works, no, of course not, that replies with a flag, with the region where the cluster is deployed. You can see here that when I send the traffic to the US it takes ages, but most of the time it goes to the US, and now it's going to London. So I send the request to the cluster in the US, it travels all the way to London, and then comes back. And if I switch to Singapore, then it goes to Singapore, then to London, and then it comes back.
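The "stay local, fail over when overloaded" behaviour described above can be expressed with Istio's locality load balancing. A minimal sketch, assuming an illustrative service host and region names (locality failover also requires outlier detection to be configured, or it never triggers):

```yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: hello-locality
spec:
  host: hello.default.svc.cluster.local   # illustrative service name
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        enabled: true
        failover:
          - from: europe-west    # prefer endpoints in the local region...
            to: us-east          # ...and fail over here when they are unhealthy
    outlierDetection:            # needed so Istio can mark endpoints unhealthy
      consecutive5xxErrors: 3
      interval: 30s
      baseEjectionTime: 60s
```

With this in place, the sidecar proxies keep traffic in the caller's own cluster and only send it through the east-west gateway when the local endpoints are ejected.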
So basically this is Istio deciding that some of the endpoints we've got inside a cluster need to be routed elsewhere before the response goes back. Here we go, whoa, okay, got it.

So, excuse me, let me wrap up what we discussed today. We had a look at how Kubernetes scheduling works; that was important so that we understand how scheduling works in Karmada, and how we manage to have policies that can spread workloads across clusters. We had a look at the different options we have to deploy Kubernetes: we could have a single control plane with nodes in multiple regions, but we sort of discovered that the better, easier way to do it is to have a single cluster per region. We had a look at Karmada; it's not the only tool that does this, and some of you might have used Argo CD to do deployments to multiple clusters, which is a perfectly valid option, I was just interested in this one for my research. We had a look at how Istio shares IP addresses between clusters and how the traffic is rerouted in the network using gateways. And we also looked at the demo, which actually worked this time, which is excellent news. So I just want to thank everyone for joining me today. This is me; I don't know how much time I've got, maybe 30 seconds for questions.

[On other service meshes:] One is Linkerd, which is the second most popular service mesh, and then the third one is called Kuma, from the Kong guys. I think those are the most popular, but there are many more options.

So the question is: why do I need multiple control planes? The reason is that Kubernetes stores what we call endpoints. An endpoint is basically the IP address of a pod, and that IP address is used by several components in the cluster: CoreDNS, so the DNS system, the ingress, and others. What they do is make an API request to the control plane saying 'can I have a list of endpoints with their IP addresses, please?', and with this list of IP addresses they reconfigure the DNS, the ingress and several other components on the fly to make the cluster work. So when you grab a control plane and shut it down, if you don't do anything at all, your cluster still works. But as soon as you have a new workload, first of all I don't know how you're going to schedule that workload; and assuming one of your existing workloads goes down and its IP address is not available anymore, there is no way to tell the control plane 'hey, this IP address is gone', so your DNS and your ingress will still have a stale list of IP addresses and will still route traffic to those, even if you don't want to. Now, this is not great, but your cluster still works, so you don't have a hundred percent downtime; depending on what's going on you will have degraded service, but it's not a hundred percent downtime. So yes, we want to have multiple control planes, because we want to make sure that these endpoints are propagated correctly. But again, it's trade-offs: it goes back to how etcd works and how replication in databases works, so the more control plane replicas you have, the slower it gets. And there are other consequences to that. This is the last talk before lunch, yeah, okay.

[On Google Anthos:] I think Google Anthos is a collection of tools. I'm not an expert in Anthos by any means, but I think Anthos is a more unified collection of tools that helps you to deploy Kubernetes on-prem or in the cloud. You can have Istio in Anthos, that's for sure. I don't know how you would do multi-cluster there, but I'm pretty sure Google has got a load balancer
that can distribute traffic across multiple clusters, that's for sure. So I think they are quite different: a different sort of aim for the project. Cool, any other question? I guess the question is 'where is lunch'.

[On how Karmada is installed:] It's basically just a collection of tools: in the same way you would install Prometheus, or the same way you would install Istio, you install this in one cluster, and that cluster becomes the cluster manager, this entity that can manage other clusters. And then it's a little bit weird, because the same cluster will have two kubectl contexts: the normal one, where if you do kubectl get pods you will see Karmada itself, and then there is the Karmada context, the same kubectl with a different context, where if you do kubectl get deployments it shows you the deployments across clusters, like a control plane for the entire fleet you've got. Basically what Karmada does is aggregate all of the state coming from all of the other clusters, which is quite cool: you see a unified view of whatever is going on in your fleet. Yeah, it sits on top; it's turtles all the way down.

Oh yeah, all of the code for the demo is open source and it's online, so I can share the GitHub repo, not an issue at all. But this sort of stuff breaks quite easily, so I spent this morning, as you can imagine, with so many moving parts, going 'okay, can I do it? no; can I do it? maybe', and eventually I figured out how to fix it.

Cool. I don't think that's the case, because the service mesh will basically do that work for you. I think there are several ways you can connect multiple clusters together, and some of those will be similar to what we do in AWS with VPC peering, where you need to be careful about how you assign IP addresses. An example of that, I think, although I never used it, is a product called Cilium Service Mesh. That would be lower level, probably layer 2, for connecting clusters, whereas Istio sits at the very top, most of the time layer 7, so in that particular case we don't need any precautions when it comes to networking. There are consequences to running a lot of proxies in your cluster when it comes to managing their resources: it is not free, for sure, but it is an option if you've got this kind of problem. Cool, no one is asking where lunch is, excellent. Cool, thank you very much.
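As a concrete sketch of the Karmada propagation policies described earlier (pinning to clusters, Duplicated, Divided with weights, Aggregated), a policy might look like this; the cluster and deployment names are illustrative:

```yaml
apiVersion: policy.karmada.io/v1alpha1
kind: PropagationPolicy
metadata:
  name: hello-propagation
spec:
  resourceSelectors:
    - apiVersion: apps/v1
      kind: Deployment
      name: hello            # illustrative deployment name
  placement:
    clusterAffinity:
      clusterNames:          # only schedule onto these member clusters
        - us
        - europe
    replicaScheduling:
      replicaSchedulingType: Divided   # or Duplicated: full replica count per cluster
      replicaDivisionPreference: Weighted   # or Aggregated: fill one cluster first
      weightPreference:
        staticWeightList:    # e.g. two thirds of the replicas in us, one third in europe
          - targetCluster:
              clusterNames: [us]
            weight: 2
          - targetCluster:
              clusterNames: [europe]
            weight: 1
```

Editing `clusterNames` and reapplying the policy is what moves the workload between regions, as in the US-to-Europe example in the talk.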