Hi, everyone. Thank you for joining us and welcome to OpenInfra Live. OpenInfra Live is an interactive show sharing production case studies, open source demos, industry conversations, and the latest updates from the global open infrastructure community. This show is made possible through the support of our valued members, so thanks again to them. My name is Kristen Barantos and I will be your host for today's show. We are streaming live on YouTube and LinkedIn and we will be sharing questions throughout the show, so feel free to drop them into the comment section and we'll answer as many as we can. Some of the most popular episodes of OpenInfra Live are the large-scale OpenStack shows, where operators of large-scale OpenStack deployments come and discuss operational challenges and solutions. Today, the large-scale OpenStack show is back for a deep dive with Thailand's largest OpenStack cloud operator, NIPA Cloud. Joining today's discussion, we have Felix Huettner, Thierry Carrez, Arnaud Morin, and our NIPA Cloud guest, Dr. Abhisak Chulya. To get it started, I'll hand it off to you, Thierry. Thanks, Kristen. I'm very happy to welcome NIPA Cloud to the large-scale show. It's always great to have perspectives from deployers of OpenStack all around the world. I wanted to start off with: could you give us a quick introduction about yourself, your organization, and your OpenStack deployment? Hi, thank you for inviting us to this episode. My name is Abhisak Chulya, I'm the CEO, and I'm joined by Thierry Chansin, our chief innovation officer here at NIPA Cloud. Just to give you a quick introduction, NIPA was established in 1996. Now we have about 170 employees, with three availability sites, and we have a 24x7 NOC. As you can see, we are partnered with Juniper and AMD EPYC. Next slide, please. Here is a brief view of our products. We have a public cloud, which we call NIPA Cloud Space, and we have a hybrid-and-edge cloud, which we name Cybege; the reason we call it hybrid is that with this version of our private cloud you can go between public and private, and we can set it up at any edge location we want. And the third product is the object storage. Okay, next one, please. Here is the topology of our cloud. As you can see, we have the core site, which is our BKK site, number one, then the NON site is number two, and then number three, which we are going to set up by the first quarter of next year. The one on the edge is KKN. These sites are spread around Thailand, so we have sites all around the country, and what we have done is put fabric networking on top of all of this. We are using a spine-leaf topology, and we have about 400 gigabits per second of backbone, okay. So the whole thing is our OpenStack, plus Ceph storage, and we add the fabric networking on top of that. Great introduction; it shows how central the network interconnect between your sites is to your deployment. Could you give us a bit more detail on the components of OpenStack that you're using? I suspect you're using Nova and Neutron, but which other components have you deployed in your OpenStack deployment? Okay. We are using KVM as the hypervisor for Nova, and Ceph storage for Cinder volumes and for object storage. We are also using a load balancer, Octavia with the amphora backend. And the last one is Tungsten Fabric, which we use instead of Neutron, yeah.
So Neutron, basically, we use it as a proxy. So your core component for networking is Tungsten Fabric, right? Yes. Okay. You said in the previous slides you were building your network fabric on top of OpenStack. You mean in which order we built it? No, no, when I say on top, I mean the network is covered by the fabric networking; it just covers all the locations. That's what I meant by on top. Okay, but, oh, sorry. So your OpenStack is using this fabric that you have as the core network infrastructure in your data centers, right? Something like that, yeah. We implemented the new network, okay, using VXLAN and BGP EVPN, right? Yeah, and when you talk about each site, each site is its own zone. When you deploy an application on each zone, they can connect over the private network through our fabric networking. And how do you configure the physical equipment of your fabric? Is it done by OpenStack, or do you have some other external SDN controller or something like this? Oh, no. We have two layers. For the underlay networking, we are using EVPN and VXLAN for the leaf-spine, and MP-BGP between each site; and for the overlay networking, we use Tungsten Fabric. Tungsten Fabric can connect to the routers and share BGP routes. Okay, so the physical equipment is already configured by your network team with EVPN and so on, and OpenStack is running on top of this network infrastructure, right? Yes, of course. Makes sense, makes sense. And are you then running one large OpenStack across all these environments, or is it individual OpenStacks? One large one. It's only one region, right? Yes, correct. So I imagine the region is quite big. Are you willing to share numbers, like how many compute nodes you have, or things like that? Yeah, we have that; we prepared it for a later question. Do you want me to talk about it now? Yeah, okay, all right, we can mention the current scale of our OpenStack. Right now we have about 2,000 VMs on the new one, okay, just one year since we started. And we have about 40 compute nodes and 18 Ceph nodes, okay? In the Ceph nodes we have about 300 terabytes of NVMe usable, and the spinning drives that we use for the object storage have about two petabytes usable, okay? Across two sites, okay? One in BKK and the other one at the NON site for now. So two sites, one petabyte usable on each site. You said you were using Ceph as well for object storage, right? Yeah, yes. So you're not relying on Swift at all? Do you deploy Swift, or is it not part of your OpenStack? No, we didn't use Swift, because we don't want to maintain the storage with Swift; we were not experienced with Swift when we were going to production. Okay. And can you share which versions you're... go ahead, Felix. You mentioned that this is the new environment, if I got that correctly. Is there also an OpenStack cluster that you're running in parallel? Let me give you the history first. Our first OpenStack, okay, was on Ocata, and we used Open vSwitch, right, OVN, right? That's the old one, and we learned a lot from it; I already presented those experiences at the Vancouver Summit. From that one, we knew there were some problems with the first deployment. So this one, which we call NIPA Cloud Space, is the second one, and we can deploy it in a way that lets us expand to many locations.
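Going back to the scale figures quoted above (about 2,000 VMs on roughly 40 compute nodes), a minimal sketch of how such numbers can be pulled from the OpenStack APIs with the openstacksdk Python client might look like the following; the cloud name "nipa" and the admin scope are assumptions for illustration, not details from the show:

```python
# Minimal sketch: count VMs and compute nodes with openstacksdk.
# Assumes a clouds.yaml entry named "nipa" with admin credentials;
# the cloud name is illustrative, not from the show.
import openstack

conn = openstack.connect(cloud="nipa")

# All servers across projects (requires an admin-scoped token).
servers = list(conn.compute.servers(all_projects=True))
print(f"VMs: {len(servers)}")

# Hypervisors roughly correspond to the compute nodes mentioned above.
hypervisors = list(conn.compute.hypervisors())
print(f"Compute nodes: {len(hypervisors)}")
```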
So does it mean the previous one is completely shut down now, or are you still managing it? No, no, we still use it. It's still working okay, but there are some problems with the automatic backup, and some scaling problems when you scale it: if you scale beyond some number of VMs, then problems start happening. We don't want to upgrade it; we just wanted to rebuild a new one with the new fabric networking. So the previous one is running OpenStack Ocata, right? Which version are you running on the new one? Victoria and Yoga, the current one, yeah. We started with Victoria and then we added some components from Yoga into it. Okay. Do you plan to upgrade the previous Ocata cloud to the new architecture, or what's your strategy with your Ocata region? Are you going to upgrade it, or is it maybe going to die by itself because it's too old? No, we like to upgrade every two years to a new release, a major upgrade. The minor ones are security fixes; we do those every three months. Like AMD Zenbleed, which showed up a few months ago: our operations team needed to upgrade the operating system and the CPU microcode for security. That is the kind of minor security update we need to do. So we plan to upgrade this big Victoria and Yoga deployment to the newer one, Antelope. When is that? Next year, in quarter two. Quarter two, yeah. We need to replicate the production cluster on our staging cluster and test for the impact on our customers. Yeah, maybe a topic for a video then. Mm-hmm. Do you already have your... sorry, Felix, go on. If you didn't have such security issues, what do you do to maintain these hosts? Do you make them empty by live migration, or do you have an alternative? Yeah, for our maintenance we need to live migrate the VMs on a compute node to another one, and then upgrade it. When we do maintenance, we have to follow things strictly, because we have ISO and CSA STAR change management. So every time we are going to do maintenance, we follow those standards. Just to let you know first: when you do that kind of maintenance and upgrade, you have to analyze the risk and impact of the change with a review committee, and it may have an impact on our SLA, so we have to do that, and we have to notify our customers as well. Once we have all of that done, we do the live migration, and then we do the maintenance. Yes. So we got a question from the audience, and they ask: are you using Keystone Federation between OpenStack clusters? No, we are not using Keystone Federation. The Keystone is on the main site, yeah, just the core site. So you have two Keystones running, one for the Ocata region and one for the Yoga and Victoria region? We have only one region. Yeah, this is what you see in the topology we showed before: it's all one region, but with availability zones, AZ one, AZ two. So you have one AZ running Ocata? No, this is the new one, this is NIPA Cloud Space. So it's one region, but with many availability zones, and the whole thing is all under Yoga. Okay, Yoga, okay.
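For the maintenance workflow described above, emptying a compute node by live migration before patching it, a minimal sketch of a helper along those lines could look like this with the openstacksdk; the cloud name and hostname are hypothetical, the scheduler is left to pick the destination, and this is not NIPA Cloud's actual script:

```python
# Minimal sketch: live-migrate every VM off a compute node before maintenance.
# The cloud name and hostname are hypothetical; requires admin credentials.
import time
import openstack

conn = openstack.connect(cloud="nipa")
source_host = "compute-01"  # hypothetical compute node going into maintenance

def vms_on(host):
    # compute_host is the OS-EXT-SRV-ATTR:host field, visible to admins.
    return [s for s in conn.compute.servers(all_projects=True)
            if s.compute_host == host]

for server in vms_on(source_host):
    print(f"Live-migrating {server.name} ({server.id}) off {source_host}")
    # host=None lets the Nova scheduler choose the destination node.
    conn.compute.live_migrate_server(server, host=None)

# Wait until the node is empty before starting the actual maintenance.
while vms_on(source_host):
    time.sleep(10)
print(f"{source_host} is empty, safe to start maintenance.")
```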
Can you share a little about what your customers are using your cloud for? Is that something you know about, or is it basically customer workload that you don't have much insight into? Which one was that? Can you repeat your question? Do you have interesting or unexpected use cases from your customers? Do you know what your customers are running, or using NIPA Cloud for? Yeah. You know, we have many use cases since we implemented Centrapec. One thing we get is that we have many locations, and no matter which location you are in, we have a private network that links them together. That makes it easier for customers to deploy applications on both sites, or on as many sites as we have; that's the beauty of it. And we are not only doing that: we now have shared and dedicated core flavors, so users can run a variety of applications that suit their needs, and if you want to scale, you can resize within that instance. This is our self-service. We really have a variety; once we open up Horizon for them, you see a lot of applications coming in. The other good use case, in Thailand at least, I don't know about elsewhere, is a lot of first-time cloud migration from on-prem to cloud. Many customers send their hard drives to us, to our data center, and our engineers support them by converting the images, VMDK to qcow2, to run on OpenStack. Those are the use cases I think we have with our Thai customers. Okay, so maybe taking a step back at the history: you've said your organization was founded in 1996, and I suspect you did not start out doing cloud. How did you come to choose OpenStack? How did you discover it, what led you to it, what's the history there? Okay, that history I can explain easily. At the time, we built a data center here in Thailand, I think in 1996 or 97, something like that. What we wanted to do at the time was a search engine, and we built it, but later on Google came along and we couldn't really fight with them, because Google's search engine was pretty big at the time. So we had to pull back, sit down, and say, well, what can we do with our data center? And that's when we knew we should jump on the cloud bandwagon. But how are you going to do that? How are you going to start? That's when I discovered OpenStack. I went to the Summit so many times to understand OpenStack, until we saw that, okay, this is the one. We didn't want to build from scratch, and OpenStack gave us the opportunity to build a cloud infrastructure of our own, okay? Not only that, we found that OpenStack has a huge community that could support us if we had some problem. But again, in order to make use of it, you have to put some work into OpenStack, because it's not an easy one, it's complicated; everyone who uses it has to put some work into it. And as you can see now, there is really not much choice out there: it's really a good cloud infrastructure that you can build to compete with the hyperscalers, and in terms of private cloud, you can use it to compete with VMware. I think there is no question about that; even though at the beginning we didn't know it, I think now we can compete with them in both arenas with OpenStack. Nice. So you said you've been involved in OpenStack; you started with which version? You said Ocata previously, but maybe you started with an even earlier release before that? Yeah, we started with an older version in our staging lab and we tested it, and the last one we tested was the containerized one, yeah. Okay, so we did a lot of testing at the time.
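On the on-prem migration use case mentioned earlier in this answer, converting customer VMDK disks to qcow2 and importing them into OpenStack, a minimal sketch of that generic workflow is shown below; the file names and image name are hypothetical, and this is the plain qemu-img plus Glance path, not NIPA Cloud's internal tooling:

```python
# Minimal sketch: convert a customer VMDK to qcow2 and upload it to Glance.
# File names and image name are hypothetical; assumes qemu-img and the
# openstack CLI are installed and OS_* credentials are already sourced.
import subprocess

src = "customer-disk.vmdk"   # hypothetical input from the customer
dst = "customer-disk.qcow2"  # converted image to upload

# Convert the VMware disk format to qcow2 for KVM/Nova.
subprocess.run(["qemu-img", "convert", "-f", "vmdk", "-O", "qcow2", src, dst],
               check=True)

# Register the converted disk as a Glance image.
subprocess.run(["openstack", "image", "create",
                "--disk-format", "qcow2",
                "--container-format", "bare",
                "--file", dst,
                "customer-migrated-image"],
               check=True)
```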
The key, let me put it this way, the key is that once you jump all in and have the experience of running the whole thing... I think our first Ocata version was really a good one for our team to learn on, okay? Even though it may not be perfect, I think even now we can still make good money out of the older version. But it may not give you the 99.99 SLA that you would like. Okay, any other questions on the history of NIPA Cloud? Felix, do you have any? No, but I think it was the most interesting path to OpenStack that we have heard. Yeah, it's always good to understand how people get to OpenStack, and one of the key reasons we started the project was to make sure that technology would be available all around the world for everyone. So seeing it used in Thailand to solve your business use cases is really great. I think we have a new question, Kristen. Yes: what deployment and lifecycle tools are you using, for example Kolla Ansible, Bifrost, et cetera? Okay, we are using Kolla Ansible, and we do some customization to support Tungsten Fabric. Okay, so you're using Kolla Ansible now on your new cloud; which tool were you using before? On the old version, we used OpenStack-Ansible for our OpenStack. OpenStack-Ansible; you're quite used to Ansible with all of this, okay. Right, and for the storage, we used ceph-ansible with Jewel on the old one. For the new one, we use Kolla Ansible for Victoria and Yoga, and for Ceph we use ceph-ansible with Octopus. And for Tungsten Fabric, we use in-house development to deploy it. For what? In-house development, for the Tungsten Fabric. Okay, Tungsten, yeah, okay. Makes sense. Do you then have some additional tooling for, let's say, more operational tasks, like that live migration topic you mentioned earlier? Is there some additional tooling you're using? Yeah, it's just a script to do the live migration, and we have some monitoring tools to make sure the VMs are not going down. And actually, we have two more: MAAS, Metal as a Service from Ubuntu, which we use for the automated installation of the hardware, and Ansible, which is just used for scripts to automate the software. And we have another question from the audience: what kind of customizations did you need to do for Kolla Ansible? It's just used to deploy the Tungsten vRouter onto the compute nodes, yeah, that's it. Because the stock Kolla Ansible does not support deploying Tungsten Fabric, but our architecture needs that. I think Kolla Ansible is really, really good; we really try to stick to Kolla Ansible. Does that mean you're also relying on the Kolla Docker images to deploy your infrastructure, or are you building your own images on your side? We also make some changes to the OpenStack source code, so we need to build our own Kolla images. So what's your process for maintaining your downstream changes? Have you forked every OpenStack repo inside your company, or how do you bring your changes, or upstream changes, back into your cloud? We don't want to make changes to the source code, but sometimes we have to. We have to keep it updated, keep track of upstream, and we use the Kolla configuration to patch our source code when we build. So we have to keep track of that every time we upgrade.
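As a rough illustration of what keeping track of such downstream patches across upgrades can involve, here is a minimal sketch that checks whether a set of local patch files still applies cleanly to a newer upstream branch; the repository, branch name, and patch directory are hypothetical examples, not NIPA Cloud's actual process:

```python
# Minimal sketch: verify downstream patches still apply to a newer branch.
# Repo, branch, and patch paths are hypothetical examples.
import pathlib
import subprocess

repo = "neutron"                      # assume the repo is already cloned locally
target_branch = "stable/yoga"         # example branch we plan to upgrade to
patch_dir = pathlib.Path("downstream-patches")

subprocess.run(["git", "-C", repo, "checkout", target_branch], check=True)

for patch in sorted(patch_dir.glob("*.patch")):
    # --check only tests whether the patch applies; it changes nothing.
    result = subprocess.run(
        ["git", "-C", repo, "apply", "--check", str(patch.resolve())],
        capture_output=True, text=True)
    status = "ok" if result.returncode == 0 else "needs rework"
    print(f"{patch.name}: {status}")
```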
Yeah, that's the question behind it: how hard is it to keep track of this? Because I know from experience that it can be very painful to maintain downstream changes, and the pain is even bigger when the changes are not small. And I imagine, for example, that since you are deploying Tungsten Fabric, the networking stack is completely different compared to what is usually deployed using Kolla. I don't know how much code or how much work it is to maintain that, but if you have to bring a patch back from upstream, is it complex on your side to maintain? Yeah, but we have to upgrade every two years, and we use git diff to see the source code changes from the Victoria version to the latest version. And we have to try the deployment on our staging, because with Kolla it is a Docker image, right? We can use that image, that deployment, on our staging and then do the same on production. Yeah, but we have to try it on staging first, make sure everything is working fine, and then we deploy it on production. And our engineers need to know every option that changed in Kolla from the Victoria version to the new version. How do you manage testing your OpenStack on staging? Do you have a bunch of tests you run there? Are you relying on upstream test frameworks like Tempest or something like that, or do you have custom tests made in your company for this? Testing, yeah. We have many staging customers. We want to use exactly the same configuration as production on our staging, and the same hardware too, but at a small size, yeah. Actually, our staging is about two racks, and we have about four customers in there. Then, what's your feeling about actually testing these changes in the staging environment? How representative do you feel it is of your production environment? I know, at least for us, that you can test things in the staging environment, but production is never 100% the same. Okay, we have some testing tools: for example, we deploy a testing VM in our staging and it runs some benchmarking, like file I/O benchmarks on the VM, and sends the metrics back to our monitoring system. And we have the same hardware as in production, yeah. We test it before it moves to production. We have another question from the audience: I'm curious about the way you handle cluster configuration on the admin side. Is it done via CLI by operators, or do you have automation in place? First of all, on the admin side we connect Keystone to an LDAP server, yeah. And we rewrote, customized, the policies of Neutron, Nova, Cinder, and Octavia for our support team. We use Horizon and the CLI as well, we have some scripts, and we also have an admin portal to manage things. Maybe a few words about monitoring: you said you have custom monitoring tools on your side. How do you monitor your OpenStack deployment? You said you have something to monitor that instances are running well, for example. Yeah, we use something to centralize the monitoring, and we monitor many layers: hardware monitoring, OS monitoring, hypervisor monitoring, and the OpenStack APIs. And the latest one we implemented is a monitor that shows the machine health, the VM health, by calling the QEMU guest agent ping from the hypervisor to the VM, yeah. Because the last production outage we had was with the Ceph storage: it made our system stuck for about five minutes, and we didn't know which VMs needed to be rebooted. That is why we needed to implement this monitor.
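For the VM health check just described, a guest-agent ping from the hypervisor into each VM, a minimal sketch of the idea could look like the following; it assumes the qemu-guest-agent is running inside the guests and simply shells out to virsh on the hypervisor, which is one common way to do this rather than NIPA Cloud's exact implementation:

```python
# Minimal sketch: check VM health from the hypervisor via the QEMU guest agent.
# Assumes qemu-guest-agent runs inside each guest and virsh is available.
import json
import subprocess

def list_running_domains():
    # "virsh list --name" prints one running domain name per line.
    out = subprocess.run(["virsh", "list", "--name"],
                         capture_output=True, text=True, check=True).stdout
    return [name for name in out.splitlines() if name.strip()]

def guest_ping(domain):
    # guest-ping succeeds (exit code 0) when the agent in the VM answers.
    cmd = ["virsh", "qemu-agent-command", domain,
           json.dumps({"execute": "guest-ping"})]
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == 0

for domain in list_running_domains():
    status = "healthy" if guest_ping(domain) else "NOT responding"
    print(f"{domain}: {status}")
```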
Yeah, to talk about the outage: you know, we had an incident, an outage as well, and it was about the Ceph storage. As you know, in our cloud we use volumes, okay: when you build a VM, the compute node only provides CPU and RAM, right, and then it attaches a volume from the Ceph storage. So if there is some problem with the Ceph storage, as we know, we have a problem with the VMs as well. What happened with the Ceph storage at the time was that we had a hardware problem. I think the root cause was the NIC card; it was not stable. It is an OCP 2.0 form factor, and that caused a problem with the 25 gig NIC card, okay: it created more heat than it should. And the heat sink in the hardware, instead of being open to the airflow, was closed off from the airflow because it was flipped to the other side. I think it was just a problem with the manufacturer, that they did not test it well enough. We found out it was a problem because the heat was so high, and we fixed it by changing from OCP 2.0 to a PCI card, okay. And then everything came back, but it was strictly a hardware problem that we found behind that outage of our cloud. So the whole Ceph cluster stopped because of this? The Ceph cluster went weird because of the network, in fact: the problem was the network card, and the network card was flapping, so our Ceph system did not know which OSDs were alive or not. Yeah, so it was flapping. You mentioned earlier that... oh, sorry, go ahead. It took a while. Once the problem happened, you have to find out what happened, and luckily we have two sites and it did not happen on one of them. And you know why it did not happen there? Because that site is not used much. The site that is used a lot, with heavy usage, creates a lot of heat. So those are the things we learned, and we have now replaced all the OCP 2.0 NIC cards with PCI cards here. Go ahead. That also seems like one of the things you can't easily find with testing. Yeah, very strange. You mentioned earlier that you upgraded from Victoria to Yoga. Did you have some specific procedure for that? Did you use some specific tooling besides Kolla Ansible, and some specific testing? Yeah. The first reason is that Victoria is now unmaintained; Victoria is just not supported anymore, right? So the components will not be supported, so we have to move to Yoga, right? Yeah. And then Antelope, the newer one. Antelope has a feature we like, the power saving feature on the CPU, right? Yeah. Okay, it's power saving and that's what we like about it; that's what gives us the motivation to upgrade to Antelope. And for doing the upgrades, you basically rely on Kolla Ansible for that? Rely on what? On Kolla Ansible, to do the upgrade procedure. Yes, yes. Okay. Ah, sorry, Kristen. No, you're good. We have another question: one of the pain points that Ashish finds in Kolla Ansible is managing the inventory and keeping it reflecting the correct state. How do you manage failed hardware, or hardware in maintenance mode, in the inventory? Is this complex? In our system, we don't use just one Kolla Ansible. We defined two layers in our cluster. The first one is the global infrastructure; the global infrastructure uses one Kolla configuration and is managed with it. And then we split off some controller and compute nodes into what my team calls the edge deployment, and this is another Kolla Ansible configuration.
When we have more sites, we will duplicate the Kolla Ansible configuration for each site. Yeah. I'm not sure, does that answer the question? I don't know. I think so. Yeah. So this is the Large Scale SIG, so we like to discuss scaling issues. And it's interesting, because it's the first time we have, I would say, a deployment that is earlier in the scaling process, where you don't have tens of regions and tens of thousands of nodes. So I'm actually interested in learning more about the scaling issues you have at that stage. My first question would be: are you growing your deployment steadily, or is it mostly a fixed size? Are you planning to continue to grow your deployment, and how fast does it grow? Like I mentioned earlier, in just this one year we got to how many VMs? 2,000 VMs, okay. Since we started, I think the main infrastructure, the core, is already set up. All we have to do now is add compute nodes and storage nodes, right? And we use leaf-spine, so we can expand it: just add leaf and spine switches and you can just grow. I think this is the beauty of the scaling we have. And the same with Ceph storage, right? You just scale by adding a storage node. So I think we are really ready for the scaling, and we don't know yet how the Thai market will accept this, but I think it is showing good signs at this early stage. Again, you understand that we have to compete with the hyperscalers, okay? You have to compete with Huawei, you have to compete with Tencent coming into the Thailand market, but as a local cloud, we are the only local cloud here that can provide a similar kind of infrastructure and stability to our customers. And then, because it is local, and when we expand to the edge with our Tungsten framework, I think all these things make it viable for us to grow, to expand. It feels like that is your specificity: the fact that you have multiple of those sites, at least this network presence all around at a relatively reasonable scale, is a very big difference from the big cloud providers that will just have one big data center in, I guess, a single place. I think we have a question from the audience. Yes, Braham said that you mentioned that because of some of the scaling issues you decided to do a new OpenStack install; can you tell us a little bit more about what kind of issues? Okay, so let me talk about the old one, right? On the old cluster, we found there was a network outage due to a broadcast issue that made the underlay network loop, okay. Let me explain a bit more: a general OpenStack deployment just stretches VLANs for the storage, tenant, and controller networks, right? And when we scale, those VLANs need to be extended to another leaf, yeah. When our engineer got something wrong, it made a network loop in the cluster, and that brought our cluster down. That is the problem we had before. And on the OpenStack side, we had a problem where, when we did snapshots on a compute node, it took more memory during the snapshot. In Thailand, our customers also want something like an automatic backup every day, yeah. So it means every VM on each hypervisor, each compute node, needs to take a snapshot at night, and that filled up the memory and VMs went down. That is why we moved to volumes, yeah: when it takes a snapshot, it is made on the Ceph side.
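Since moving to volume-backed VMs means the nightly backup becomes a Cinder snapshot taken on the Ceph side rather than on the hypervisor, a minimal sketch of such a nightly job could look like the following; the cloud name and naming scheme are hypothetical, and this uses only the generic Cinder snapshot API, not NIPA Cloud's backup tooling:

```python
# Minimal sketch: nightly snapshot of every attached volume via Cinder,
# so the copy is made on the Ceph side instead of on the hypervisor.
# Cloud name and snapshot naming are hypothetical; requires admin credentials.
from datetime import date
import openstack

conn = openstack.connect(cloud="nipa")
today = date.today().isoformat()

for volume in conn.block_storage.volumes(details=True, all_projects=True):
    if volume.status != "in-use":
        continue  # only snapshot volumes attached to running VMs
    snap_name = f"{volume.name or volume.id}-backup-{today}"
    # force=True is required to snapshot a volume that is currently attached.
    conn.block_storage.create_snapshot(volume_id=volume.id,
                                       name=snap_name,
                                       force=True)
    print(f"Created snapshot {snap_name}")
```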
We do have another question: what are the bottlenecks that you find specifically on the networking side, especially on the L3 agent side? It is not about the L3 agent, but about how to scale the compute nodes and the storage, right? We have compute nodes and Ceph storage, and the question is how to scale the underlay network when compute needs to talk to storage. Yeah, that is the main problem; the L3 agent is very good. But the reason we use Tungsten Fabric is that Tungsten Fabric uses the BGP protocol, so we can announce BGP from the overlay to the router, and the router can control the rules, which direction the network traffic can go. Like we have two AZs and so on, right? And we also have two floating networks, yeah, so VMs can go to the internet directly on each site. Does Tungsten Fabric have such an L3 agent at all, or is it just distributed as BGP information everywhere? It's like the Google cloud network, it's all layer 3, yeah. Everything in Tungsten Fabric is routing; it knows where things can go. I guess that helps with a bunch of issues; you do see some problems there, there are a lot of problems. When you scaled up your environment from the start to the compute nodes you have now, did you see any kind of issues because of the scaling, or did it all work extremely smoothly? It's pretty smooth, right. I think, so far, since we moved a lot of things up to the router, right, so instead of doing everything on the software side we moved it up to the router, it makes it a lot easier. And I don't see any problem right now with scaling. Like I said, we can scale up as much as we want and then pop up availability zones as we like. I think one of the products I mentioned earlier is Cybege. Cybege is the edge cloud that we can set up on-prem at any of our customers. They can use the portal that we have, in whatever way they want; we can put their logo in there, and they can burst into our public cloud if there are more resources they need. And some features that are available in our public cloud, they can also come in and use as well. So I think it makes it more flexible for us and for our customers. Is your edge cloud plugging into your whole infrastructure just like a new AZ, or is it a different OpenStack deployment? It's like a new AZ. Is that it? Yeah, it feels like you're in a good spot from the scaling perspective, because you're not at a stage where we usually encounter a lot of scaling issues. Arnaud or Felix, maybe you can comment: we usually encounter scaling limits within a given region once we start reaching 500 to 1,000 compute nodes, but they should be safe scaling up to that. What would be your expert advice on when they should expect the next scaling issues? Usually, in my experience, it mostly depends on the Neutron side, but as you're using different Neutron drivers than we do, it could be very different on your side. It really depends on how the networking stack can scale in your data center, but I think you are good to go with at least 500 computes. That's a target I think you could reach very easily. We hope so. Five minutes. Yeah. That's part of the fun of the OpenStack journey. Yeah, you know, we learned a lot from the old cluster, and I think we encountered some problems that we needed to solve, but the whole thing comes from learning on real stuff in real situations. You have to take the risk, you know, in terms of moving to an area or something that people have not tried before.
I think moving up to Tungsten gave us the opportunity to do that, but at the time we started, just like when we started with OpenStack, we did not know what was going to happen, whether it would be suitable if we wanted to use it for a public cloud. Okay, so we took a lot of risk, and now we have taken another risk on Tungsten for the public cloud, and I think it has paid off nicely. So, I know Nova is very much AZ-aware, and I imagine Neutron with your Tungsten Fabric is AZ-aware as well. What about other services like Glance and Cinder? Are you deploying them with one Cinder per AZ, or is it a global Cinder for the whole region? Cinder or Glance, for volumes and images? Cinder. Yeah, Cinder. We deployed the Cinder volume service for each region, because we... you mean per AZ, right? Yeah, per AZ, yeah. AZ, yeah. Because the Cinder volume service needs to do some conversions, and we want to do the conversion on each site. Okay, so you have one Ceph cluster per AZ? Yeah, correct. Okay, and what about images, for Glance? For Glance, we are using object storage as a backend, and the object storage is on the Ceph storage on the two sites, yeah. So it's the same: you duplicate the images on the two AZs? Replication across the two AZs, yeah. But the API itself, the Glance API, is above the AZs, right? The Glance API has three interfaces. The first one is when users connect to the public endpoint; they can connect to the two AZs. And when the Cinder volume service needs to connect to Glance, it is configured to connect directly within its AZ, yeah, because when users try to upload a volume to an image, or convert an image to a volume, it needs to be on that site; we do not want to transfer across sites. Okay. Oh, and what we have done here we have also documented, and it's on our website, I mean, not a website, a portal site, right? Yeah, but we wrote it in Thai; we have not translated it yet. Okay. And we also talked about it at Vancouver, right? Oh, no, in Berlin, yeah. I spoke about this at Berlin as well, with a lot of diagrams in that one too. But we will translate our documentation into English, yeah. So besides the issue you had with the OCP cards, did you have any other interesting outage you want to talk about? The DDoS attack, yeah. Okay, we got some DDoS attacks, yeah. And what happened to the vRouter? The vRouter memory filled up, and all network connectivity of that compute node went down. And even the ones that did not get attacked, right, were also affected. How did you solve it? Oh, okay, all right. So what happened is that we are working with our partners, like Imperva, right, Radware, Cloudflare, you know, to handle these DDoS attacks, basically DDoS and IP protection, so you need to work with those providers. But there are some issues with that. We did a lot of research on DDoS attacks here, and I think with Imperva, Radware, and Cloudflare, we need them to be local as well. Otherwise, if you are going to use always-on protection, all the traffic has to go out of the country, and that creates a lot of latency, and we do not want that. Most of the DDoS attacks come from outside, so these are the things we are trying to work on with the providers, since the attacks usually come from abroad. What we need from them is just to have a local presence, so that when something has to be resolved, it can be done here.
But even for attacks from outside, they can use their nodes around the world to protect us and then send only the good traffic back to Thailand. Okay. We're close to the end of the show, but I wanted to ask one more question, because overall you seem very happy with your OpenStack deployment, and usually on the show we have large-scale deployers who are suffering from some personal pain point. So I was wondering if you have one specific pain point with OpenStack that you would like to share, some specific issue that you always encounter and wish weren't there, or some other personal issue that you have with OpenStack. Okay, a pain point. Oh, okay, I have one. It's about the Octavia load balancer. Yeah. When we deploy across multiple AZs, right, the amphora instances need to be in each AZ, yeah. And Octavia has health monitoring, yeah. The health monitoring cannot specify which AZ should do the monitoring. So the problem is that when the network between the AZs goes down, every amphora VM gets failed over. Yeah, that is our problem. So you end up in a failover loop because of the network issue between your AZs? No, it's just a link down for two or three minutes, and the health monitor detects the VMs as down, the amphorae as down. Okay. Also, you know, we talked about the old cluster, right? We had some problems with the old cluster, which used OpenStack-Ansible, right, and there we found a problem with the LXC. Yeah: we could not reproduce exactly the same setup from our production to staging for testing, because with LXC, when you deploy, it's just a direct install, apt install, yum install, on the host. When we moved to Docker containers with the Kolla environment, we could duplicate production onto staging, yeah, so it makes life easy. Okay. I think we may have time for one or two other questions before we wrap up, Kristen. Yeah, let me see. So somebody asked: how do you design a VPC network with the multi-zone model? I mean, can one subnet and one VPC be extended across multiple zones? First of all, the underlay network of the compute nodes needs to be connected to each other, right. And the second thing is that when we use Tungsten Fabric, or any Neutron plugin, they have an overlay network like VXLAN, right? Tungsten Fabric uses VXLAN and MPLS over UDP, yeah, so the private networks can connect together. You just need the underlay network first. So it's also because you have your experience with VXLAN that you are able to do that. Yes. I know there is the question that if you don't use the fabric, then maybe it is something you would have to work around, and that is going to be a little bit harder. Well, we're almost out of time, and I wanted to thank all of our awesome speakers today. Thank you, Dr. Abhisak Chulya and the NIPA Cloud team, for joining us, and thank you, NIPA Cloud, for being an OpenInfra Foundation Gold Member. And a big thank you to our audience for asking a lot of great questions during the show. As a reminder, if you have an idea for a future episode, we want to hear from you; you can submit your ideas at ideas.openinfra.live, and maybe we'll see you on a future show. Thanks again to today's guests, and we'll see you on the next OpenInfra Live. Thank you. Bye. Thank you.