I think we can start the recording now. Yes, sorry, recording started. Welcome, everybody. I'd like to thank everyone for joining us today for another webinar from the Cloud Native Computing Foundation. The title, as you can see, is Feeding the Kubernetes Beast: Bringing Data Locality Back to Data Workloads. My name is Alessandro Vozza. I'm a Principal Software Engineer at Microsoft and a Cloud Native Ambassador, and I'll be moderating today's webinar. I will collect your Q&A and interact with the speaker on your behalf, and I hope everybody is going to have a great time. We would like to welcome our presenter today, Adit Madan. He's a Product Manager at Alluxio. Before we get started, just a few housekeeping items. There's a Q&A box in Zoom at the bottom of your screen; that's where you can write your questions for after the webinar, and we will collect them. The webinar will last for 30 to 35 minutes, and then we will have time to answer your questions. This is an official webinar of the Cloud Native Computing Foundation, so please, everybody, be aware that there is a code of conduct, and we will not take any questions that violate it. Now, if Adit is ready, he may start the presentation.

Thanks, Alessandro. Welcome, everyone, to the webinar. Today, like Alessandro mentioned, I will be talking about feeding the Kubernetes beast with Alluxio. I'm a lead engineer at Alluxio, and I've been here for about three years now. I hope we all learn something new today.

Here's the agenda for today. I'll start with a quick introduction to Alluxio for those who don't know about it. Then I'll move on to describing some fundamental Kubernetes concepts, which I'll use in the rest of the presentation. Then I'll talk about different deployment options and use cases for Alluxio on Kubernetes, and I'll end with a demo of Spark and Alluxio running in Kubernetes.

Here we go. The Alluxio project began as a research project in 2013 at the AMPLab at UC Berkeley. The company has now been established for about four years and is well funded, and the goal we have at Alluxio is to orchestrate data at memory speed for the cloud. What does data orchestration even mean? That's something I'll get into in some detail in the rest of the talk, so stay tuned. We have a fast-growing open-source community, with contributors from both industry and academia spread across different parts of the world. Our GitHub repository is extremely active, and in case you want to learn more about the project, feel free to get in touch with us on our community Slack channel; the link is there on the screen.

Okay, so to give you some context on why you need a project like Alluxio, and on the evolution of the big data ecosystem, let's start with what the first iteration of that ecosystem looked like. When we started, we had only one compute framework, Hadoop MapReduce, co-located with one storage system, the Hadoop Distributed File System. Your data and compute resided on the same cluster, and you grew compute and storage together. The whole premise of running compute on the nodes that hold the storage was that you obtained data locality, and data locality is critical for performance and for gaining timely insights from the data you have.
But if you look at the big data ecosystem today, we have a proliferation of compute frameworks, including Presto, Spark, MapReduce, and Flink, and also a variety of available storage systems. These include storage systems on-premise, such as EMC ECS and the Hadoop Distributed File System, and in the cloud, including Amazon S3, Microsoft Azure, Google Cloud Storage, and so on. Each of these compute frameworks serves a specific purpose and is good at a certain kind of workload, but they all still need to interface with the variety of storage systems listed at the bottom.

Like I mentioned, initially in our journey we had a co-located compute and storage cluster. In this picture, I have MapReduce as the compute framework running on HDFS. Typically, what happened in big enterprises, and also in the cloud, was that people observed most of their clusters were compute bound, but in order to add compute capacity they also had to add storage capacity, since storage and compute were co-located. You couldn't grow compute and storage independently. Then we moved on to a world with disaggregated compute and storage, in which you can add capacity based on your needs: if you need only storage, you add nodes to your storage cluster, and if you need only compute, you add more compute capacity. That is more economical, but what you lose in the process is data locality. And remember, data locality was the whole premise on which MapReduce and the Hadoop ecosystem started, by co-locating compute and storage.

Also, when people had big HDFS clusters on-premise and wanted to grow their compute, they could either add more nodes to their on-premise clusters or provision compute in the cloud. So you could say: I want additional compute capacity, and I want to use it in Google's cloud, Amazon's cloud, or Microsoft's cloud. And like we mentioned before, with the proliferation of compute frameworks, people now want to use many more frameworks in addition to just Hive or MapReduce, whichever are more suitable for their workloads. The other thing that happened with the move to the cloud is that cheap object storage solutions became available: storing your data in object storage is much cheaper than provisioning a physical node and storing it in HDFS. And in many deployments, this entire stack now runs on top of Kubernetes.

Okay, so this is exactly where Alluxio comes in. With different compute frameworks talking to different storage systems, Alluxio acts as a virtual abstraction layer that sits between the compute frameworks and the storage systems. Alluxio exposes different APIs, including the native Java file system API, a Hadoop-compatible API, which we have labeled as the HDFS interface in this picture, and other interfaces such as the POSIX interface, which allows applications to access storage systems, including object stores, through the same familiar interface they are used to. So end applications do not need to make any code changes, but they can still work with any of the storage systems labeled below. Some of the key innovations of Alluxio are the three pillars I have on the screen. The first one is data locality.
Like I mentioned, as you move from a world of co-located compute and storage into a world of disaggregated compute and storage, data locality is not easily preserved. By having a layer of Alluxio in the middle, and I'll talk about the solution in the following slides, we are able to get data locality without making additional copies or migrating data to a compute cluster that may not be the same as your storage cluster. The second pillar is labeled data accessibility: you can continue to use the popular APIs your frameworks are familiar with to access your storage, which could be located anywhere. It could be in the cloud, in a cloud in a different region, or spread across geographies. The third pillar is data elasticity. What this means is that within the single file system namespace that Alluxio provides, you have access to data spread across storage systems. Imagine you have a Hive table stored across different storage systems: some partitions in HDFS, some partitions in S3, and some partitions in yet another storage system.

Here are some examples of how an end application interacts with Alluxio. If you're familiar with Spark, you can see that accessing data through Alluxio is very similar to accessing data from HDFS. The only difference is that instead of the hdfs scheme, you use the alluxio scheme, pointing at the Alluxio master. So with only configuration changes, and no code changes to your end application, you can switch from HDFS to Alluxio, which enables you to access data across storage systems and on object stores. Similarly for Presto, if you're familiar with Presto, it looks exactly the same; the only change is moving from the hdfs scheme to the alluxio scheme. The third bullet I have over there is the POSIX API: you can interact with Alluxio through a familiar POSIX interface, simply issuing calls as you would to a local file system. This is an interface we've seen become increasingly popular for machine learning applications such as TensorFlow running on Kubernetes. The most flexible way of interacting with Alluxio, which gives you the most control, is our native Java API, our Java client library. In case you want to build fresh applications running directly on Alluxio, you could choose to use that.
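To make the scheme change concrete, here is a minimal sketch. The hostnames, port, paths, and mount point below are illustrative assumptions, not values from the talk.

```bash
# Before: a Spark or Presto job reads directly from HDFS, for example
#   hdfs://namenode:9000/data
# After: the same job reads through Alluxio; only the URI changes
# (19998 is Alluxio's default master RPC port):
#   alluxio://alluxio-master:19998/data

# With the POSIX (FUSE) interface, the same namespace shows up as local files,
# assuming a hypothetical mount point of /mnt/alluxio:
ls /mnt/alluxio/data
cat /mnt/alluxio/data/part-00000
```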
Okay, since I'll be talking about the deployment of Alluxio on Kubernetes, I just want to give you a high-level view of the Alluxio architecture. Alluxio is a distributed file system with master and worker components. The master stores the metadata for the distributed file system, and the workers store the actual data cached by Alluxio. In this picture, I have two applications, Presto and Spark, accessing a single Alluxio cluster, and the data being accessed resides in the two pillars on the right: an object store and HDFS. So if we're talking about a cloud deployment, in which Alluxio and the compute applications are deployed in a cloud, we could have a situation in which HDFS resides on-premise and you are accessing that on-premise data from the cloud, while at the same time you have additional data stored in the object store provided by that cloud. What happens is that when you access data through Alluxio, Alluxio caches it, wherever the source may be, and if you're accessing HDFS data, you can set policies that control how Alluxio stores the data you're accessing.

For high availability, we have a couple of options. For on-premise deployments of Alluxio, we have the option of using ZooKeeper for quorum consensus, and for environments like Kubernetes, we have an embedded quorum consensus algorithm between the Alluxio masters, which ensures that Alluxio clusters are highly available. I'm sure this picture reminds you of the HDFS architecture: Alluxio masters are similar to Hadoop NameNodes, and Alluxio workers are similar to Hadoop DataNodes.

Okay, next I'll talk about a few Kubernetes concepts that I'll mention in the rest of the talk; these are components we use to deploy Alluxio on Kubernetes, and for people who need a quick refresher, here we go. Like most of you know, Kubernetes is a system for deploying and managing containerized applications. This includes applications like Spark and Presto, and also stateful applications like Alluxio. In the next couple of slides, we'll cover some basics, talk about different options for deploying Alluxio on Kubernetes, and, like I mentioned before, we'll have a demo of Spark on Alluxio accessing data from Amazon S3, in a Kubernetes cluster deployed on Amazon EC2 instances.

So here are some of the concepts Alluxio uses from Kubernetes as the container orchestration platform. Kubernetes abstracts away the physical infrastructure, so you can run containers on different physical hosts, and this makes the deployment of applications like Alluxio repeatable, regardless of your physical hosts' operating system or infrastructure. To make it easy for applications to connect to each other on Kubernetes, especially when containers are launched across different hosts, there is a mechanism called service discovery. The other thing Kubernetes provides, and Alluxio uses, is a self-healing capability: let's say an Alluxio master goes down, Kubernetes gives you the ability to relaunch pods so the desired number of pods and containers is maintained on the cluster. Secrets are a way of managing credentials: when Alluxio connects to storage systems such as Amazon S3, sensitive credentials such as the access key and secret key can be stored in a Secret, so that sensitive data is not readily visible to anyone.
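As a hedged sketch of that pattern, S3 credentials might be provisioned like this; the Secret name and key names are illustrative assumptions, not values mandated by Alluxio:

```bash
# Hypothetical Secret holding S3 credentials; name and keys are illustrative.
kubectl create secret generic alluxio-s3-credentials \
  --from-literal=AWS_ACCESS_KEY_ID=<your-access-key> \
  --from-literal=AWS_SECRET_ACCESS_KEY=<your-secret-key>
```

Pods can then reference the Secret as environment variables or mounted files instead of embedding the keys in plain-text configuration.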
Kubernetes also has different options for storage management, such as persistent volumes, and this is something we use to store the Alluxio journal in Kubernetes.

Okay, so all these terms should look familiar. Containers are basically a Docker image with a lightweight operating system and the application's execution environment; once an image is in a running state, we call it a container. Pods are the basic schedulable unit in Kubernetes; multiple containers can be combined into the same pod, and this is something we do for Alluxio as well. Controllers are a way to specify a desired state, and Kubernetes ensures that, regardless of failure scenarios, the desired state of the pods is maintained on the cluster. Persistent volumes we already talked about: Alluxio uses them to store the journal, or any state that should be maintained even if a pod restarts or fails.

Okay, so there are different ways of deploying and managing an application on Kubernetes. The most basic way is using declarative YAML files, which specify the controller, the container image, which containers are combined into a pod, and the different persistent volumes and resources used by an application like Alluxio. Typically we have a set of YAML files that we need to modify to deploy an application like Alluxio, and there is a lot of redundancy within these files; for example, the container image and tag are duplicated across several of them. Helm is a thin wrapper over the declarative specifications. It reduces this complexity by specifying any redundant values in a single configuration file, which effectively compiles into the multiple declarative YAMLs from the previous step, so when you make a configuration change you have a single location to modify, and all the necessary YAMLs are updated.

Another abstraction over the declarative specifications, and something that's increasingly used to deploy applications on Kubernetes, is the Kubernetes operator. Any domain knowledge specific to the application can be built into an operator, and this gives you the most flexible and convenient way of deploying an application on top of Kubernetes. For example, if you are doing upgrades and want to improve troubleshooting during the upgrade process, that kind of domain knowledge can be built into your operator, making day-to-day DevOps operations easy for your admins.
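To make the Helm option concrete, here is a minimal sketch; the repository URL, chart name, and values file are assumptions rather than the exact published chart, so consult the Alluxio docs for the real ones:

```bash
# Hypothetical Helm flow: one values file replaces the redundant fields
# that would otherwise be duplicated across many YAMLs.
helm repo add alluxio <alluxio-chart-repo-url>
helm install alluxio-demo alluxio/alluxio -f my-values.yaml
```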
Okay, in the next few slides I'll go into some more detail on what the Alluxio-on-Kubernetes solution looks like. When you have a Kubernetes cluster that is segregated from the data store, the original source of the data, one way of making the data accessible on your Kubernetes cluster is to copy it over. But that means you need to set up an ETL job and make multiple copies of the data. And for the ETL to work, you need some kind of stateful storage system on Kubernetes, and running that kind of storage system on Kubernetes can be very hard. To tolerate elasticity, when you scale your Kubernetes cluster up or down, you may have to migrate or rebalance your data: if you scale up, you might want to rebalance for an even distribution of data so you get the performance you want, and if you scale down, you need to migrate the data from the larger cluster onto the smaller one so that none of it is lost in the process. And changing applications to target a new storage system deployed on Kubernetes can be very hard as well: if the storage system does not provide a familiar API, your applications will need to change, and beyond the modifications themselves, tuning your applications for performance can be equally challenging.

So this is why you need a solution like Alluxio on Kubernetes. When your compute cluster runs with Kubernetes and accesses data that is not present on the Kubernetes cluster, Alluxio brings a few useful features, described on this slide. The first is that Alluxio gives you data locality even though your storage is not deployed on Kubernetes. In the picture I have on the right, Spark and Alluxio are running on a Kubernetes cluster, and the data being accessed is in an object store like Amazon S3, or it could be in a different object store such as GCS. When you access data from any of these object stores, what happens on the first access is that the data is fetched from your object storage onto the Kubernetes cluster, into Alluxio. Alluxio shares this data across different jobs, and for any subsequent accesses your application is scheduled against Alluxio with data locality.

Okay, let's talk about another important use case for Alluxio with Kubernetes. What we've seen increasingly is that big enterprises that traditionally store their data in HDFS want to burst their compute clusters into the cloud. Like we mentioned before, many on-premise clusters are typically compute bound, and what these enterprises want is to provision additional compute in the cloud while accessing their on-premise data. One option we talked about previously is to set up an ETL pipeline, copy all of the data you need onto the Kubernetes cluster, and only then start running your applications. The solution with Alluxio provides you with a couple of useful things instead. First, the data is accessible immediately: as soon as you spin up your compute cluster in the cloud with Kubernetes, your data can be accessed from your on-premise cluster without setting up any ETL pipeline. Second, data is fetched on access: even if the storage capacity on your Kubernetes cluster is not high, you can still access the data, and only what you actually access is cached in Alluxio. So you don't need to guess or predict what data an analyst will want at some point in the future and migrate all of it up front; fetching data on access is another reason to use a solution like Alluxio. We call this zero-copy bursting, because you are not making any persistent copies on your Kubernetes cluster and you don't need to set up an ETL pipeline.
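As a rough sketch of what this looks like from the Alluxio CLI, with a hypothetical on-premise NameNode address and mount point:

```bash
# Attach an on-premise HDFS path into the Alluxio namespace; no data is copied yet.
alluxio fs mount /onprem hdfs://onprem-namenode:8020/warehouse

# Listing fetches only metadata; file contents are pulled and cached on first read.
alluxio fs ls /onprem
```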
Okay, getting closer to the demo, I'll talk a little bit about the architecture of Alluxio on Kubernetes. In this picture, I have two physical hosts running different containers: the host on the left is running the Alluxio master container, and the host on the right is running an Alluxio worker container. Like I mentioned, the Alluxio master is the daemon that stores the journal and the metadata for the distributed file system. An Alluxio Service is used by clients to identify and connect to the Alluxio master: even if the master moves from the host on the left to the host on the right, a client can still connect through the Service, which provides a DNS hostname regardless of the physical location of the master container. To store the persistent metadata, Alluxio can be configured to work with persistent volumes, such as Amazon EBS if you're running in Amazon, or any other persistent volume type depending on the cloud you're working with. When you run Spark on Alluxio on Kubernetes, you have a Spark driver deployed on some node in the cluster, plus a set of Spark executors. The Spark driver and executors both have the Alluxio client jar embedded in them; they connect to the Alluxio masters to identify the location of blocks, and then access the data from the Alluxio workers.

For deploying Alluxio on Kubernetes, we have different options. You can choose the declarative YAML specifications and deploy Alluxio with the default configuration using a set of commands I've listed on this slide. Recently, we've also added support for the Alluxio Helm chart, which, like I mentioned, gives you a single location for specifying any redundant configuration and makes it much easier to install Alluxio on your Kubernetes cluster. On the slide, the Helm repository we're using is the Alluxio repo, which should be reachable from your Kubernetes cluster to install Alluxio with Helm; this will be in the stable Helm repository starting with Alluxio 2.1, which is scheduled for later this month.

Some of the ongoing Kubernetes-related work at Alluxio includes support for large production deployments, by improving the high availability solution we use in the absence of ZooKeeper. Like I mentioned before, Alluxio masters have an embedded quorum consensus algorithm that can be used for HA without ZooKeeper. The other major thing we have validated recently is the off-heap metadata layer, which allows an Alluxio deployment in Kubernetes to store metadata for files that could number in the billions. You could have files in HDFS and S3, and for Alluxio to handle data across multiple storage systems, its metadata layer actually needs to be much more scalable than the metadata layers of the storage systems it's accessing; this is something we've added recently. The Helm charts, like I already mentioned, are a more convenient way of deploying Alluxio, and the Helm chart now has parity with an Alluxio deployment in a non-containerized environment. We also have a CSI driver for Alluxio coming soon, which makes it easy to access Alluxio through the POSIX API: applications like TensorFlow, or any other machine learning applications, can simply mount a persistent volume of type Alluxio and start using Alluxio without the need to distribute the Alluxio client jar into the accessing application.

Okay, I have prepared a demo of running Alluxio and Spark in Kubernetes, and I'll jump into that now. Like I mentioned before, the setup for the demo is a four-node Kubernetes cluster deployed on Amazon EC2. Let's just make sure nothing is running on the cluster as of now. Okay, the way we'll deploy Alluxio on Kubernetes right now is using the declarative YAML specifications.
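As a preview of the steps that follow, the flow looks roughly like this; the file names are hypothetical stand-ins for the templates that ship with Alluxio:

```bash
# Hypothetical file names; the real templates come with the Alluxio distribution.
kubectl create -f alluxio-configmap.yaml       # cluster configuration (ConfigMap)
kubectl create -f alluxio-journal-volume.yaml  # persistent volume for the master journal
kubectl create -f alluxio-master.yaml          # master pod, two containers
kubectl create -f alluxio-worker.yaml          # worker DaemonSet, one worker per node
kubectl get pods                               # verify everything is Running
```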
The first thing we need to do is deploy the Alluxio configuration, which is the Alluxio ConfigMap. If you look at the ConfigMap, it's the set of configuration properties for the Alluxio cluster: it specifies the under storage system, which is an Amazon S3 bucket, and different parameters needed for Alluxio to run on Amazon EC2. Once the ConfigMap is deployed, the next thing we do is create the journal volume. Let me just delete the old one first. The journal volume is used by the Alluxio masters to store any persistent state, such as the metadata for the file system, so that it survives regardless of whether the Alluxio master pod restarts.

Now that the configuration and the volume have been deployed, we create the Alluxio master. If you look at the state of the deployment, we see an Alluxio master pod running, with two containers inside. Once the masters are up, we can also launch the Alluxio workers. In this case, we use a DaemonSet for the workers, which means an Alluxio worker is launched on every single node in the cluster. With the master and workers running, we exec into the Alluxio master container, and now we can access the Alluxio cluster using the CLI. If I type alluxio fs, we can see the contents of the Alluxio cluster: default_tests_files was already present in the S3 bucket mounted at the root of the Alluxio file system namespace.

In addition to that, in the demo we'll be accessing a two-gigabyte file in a bucket that I just mounted at /s3a. What this means is that in the Alluxio file system tree, at the location /s3a, we have access to the bucket I specified on the highlighted line; any contents of that S3 bucket, the demo-public bucket, are now accessible in the Alluxio namespace under /s3a. And at /s3a/data, as you can see, there is a two-gigabyte file. The annotation PERSISTED means the data is present only in Amazon S3 and not inside Alluxio at the moment, and 0% likewise means that none of the data is cached in Alluxio yet.

Okay, in the tab I have open on the right, we'll run Spark in Kubernetes. We have a Docker image for running Spark, which we deploy on this cluster; the image contains the Alluxio client jar, which the Spark driver and executors use to interface with Alluxio. Running Spark on Alluxio is as simple as a spark-submit job. When we run the job, we specify some configuration needed to access Alluxio: as you can see, we point it at the Alluxio master, and we access the data at /s3a/data, which is the two-gigabyte file. Actually, let me just make a quick modification, there you go, to specify the correct service name. And once the container finishes, we should be able to see the logs too.
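For reference, the submit step looks roughly like the sketch below; the API server address, image, example class, jar path, and service name are assumptions, not the exact demo command:

```bash
# Hedged sketch of submitting Spark against Alluxio on Kubernetes;
# 19998 is Alluxio's default master RPC port.
spark-submit \
  --master k8s://https://<kubernetes-api-server>:6443 \
  --deploy-mode cluster \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.container.image=<spark-image-with-alluxio-client> \
  --class org.apache.spark.examples.JavaWordCount \
  local:///opt/spark/examples/jars/spark-examples.jar \
  alluxio://<alluxio-master-service>:19998/s3a/data
```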
Okay, it looks like we are a little short on time, and I'm also running into some issues with the demo. Since we have limited time remaining, I'd like to wrap up the rest of the presentation, and I'll just walk you through what would have happened in the demo if it had worked. What we saw so far was Alluxio deployed on the cluster: a single Alluxio master pod and a set of Alluxio workers, where the workers store the actual data and the master stores the metadata. We mounted an S3 bucket, which we attempted to access through Spark. The first access through Spark would have cached the data, and any subsequent accesses would show a performance boost, because the data would then be available locally on your Kubernetes cluster. In the interest of time, let me wrap up, and we can come back to the demo if we have any time remaining toward the end.

So in this talk, we gave an overview of what data orchestration is. Alluxio acts as an abstraction layer for accessing data from multiple storage systems, such as Amazon S3 or HDFS, and it enables access to data from the Kubernetes cluster regardless of the location of the data. We ran through a guide to deploying and managing Alluxio in Kubernetes using the declarative YAML specifications, and we also showed you a demo of running Spark on Alluxio in Kubernetes. In case you have any questions left after the session, feel free to reach us on the community Slack at alluxio.io/slack. Feel free to find me; my name is Adit Madan. I'll open the floor to any questions now.

Awesome, Adit, thank you very much for the presentation. The demo was cool, that's all right. There were a couple of questions in the Q&A window, and some of them have been answered already. Of course, Alluxio is open-source software, and it is under an Apache 2.0 license. And Farzad asks a question that I hope you can answer: does Alluxio remove the storage space assigned to a container once the container is not running or removed? That is, does it clean up the storage after the container is gone?

So, if you're talking about the storage space that Alluxio uses to cache data: once the Alluxio container is gone, Alluxio cleans up after itself and any storage is removed. But we also have the option of running Alluxio with persistent volumes, and in case you want to preserve the data, that is also a viable alternative.

Okay, I think that's a good answer. Just now another question popped up, from Vishnu: the master and slave architecture looks similar to Ceph. So the question is about the similarity to Ceph, and whether this architecture has any issue in rebuilding the master specifically. How does it compare to Ceph, is there a similarity there?

So the architectures of Alluxio and Ceph are a little different. The way we rebuild the Alluxio master is by depending on persistent volumes. In the configuration we have, any metadata for the Alluxio cluster is stored in persistent volumes, and once the master goes down and is brought up on a different node, or once a secondary master is started, the state of the Alluxio cluster can be rebuilt from the persistent volume. So there are no issues in rebuilding the state of the Alluxio cluster. This is something we have worked on extremely hard recently, and it definitely does not have any issues that I'm aware of. In case there is any specific kind of issue you have in mind, I would love to hear from you; please get in touch with me on the Slack channel with a follow-up question.

I see. So you're saying that, of course, the Alluxio masters are protected by the same mechanisms; are they StatefulSets or Deployments under Kubernetes? That's what you're trying to say? Yes.

Okay, awesome. Another question, also from Farzad: if the container starts again after long hours of downtime, would the storage data be preserved somewhere? Containers can stop and restart.
So if the container is stopped and then restarted, is the data preserved? I suppose this follows from the answer to the previous question. Yeah, like I mentioned, for the data itself there are various ways you can preserve it across restarts of the worker pods. For the Alluxio workers, the data could be stored in volumes of type emptyDir, which are lost on restart, but Alluxio would still be able to access the data from the underlying storage system. Let's say Alluxio was acting as an abstraction layer between your compute applications and an HDFS cluster: even if you restart the Alluxio workers, and you used emptyDir volumes, which are cleared on restart, Alluxio would still be able to fetch the data from your HDFS cluster. If that is not an acceptable alternative, the Alluxio workers can be provisioned with persistent volumes, and regardless of how long your cluster was stopped for, the persistent volumes can be used to recover the data. Alluxio can store data in different tiers: memory is just one tier, which is lost on restart, but Alluxio can also manage data on SSDs and HDDs, or any other persistent storage available on your Kubernetes cluster, and that will be preserved across restarts of the Alluxio processes.

That's interesting. Okay, I think that answers it. I have just one last question of my own: is it possible to run Alluxio in a multi-cloud environment, so to say, having the storage in S3 but providing that storage to a cluster running in Azure or Google Cloud?

Yes, that is definitely possible, and Alluxio brings the value of not locking you into a specific cloud provider. If you have data available in Amazon S3 and you're running your compute on a Kubernetes cluster on Amazon resources today, you can just as easily migrate your Kubernetes cluster to a Google cluster and still access your data in Amazon S3. The first access would have higher latency, since Alluxio has to fetch the data from a cluster that is not on the same resources, but any subsequent accesses would be served from the Alluxio cache. So you can run Kubernetes on a Google cluster and still access your data from Amazon.

Interesting, very interesting. Let's see, we have three minutes left, if you can take one very last question... oh, this seems to have been answered already. Well then, in that case, I will thank you very much, Adit. As you can see, there's a Data Orchestration Summit in Mountain View in November; you can register, and we also posted the link in the chat. I would like to thank Alluxio and Adit for the great presentation, and I hope you all had a good time. Thank you for joining us. The recording of the webinar will be shared online later today. We look forward to more Cloud Native Computing Foundation webinars. Everybody, have a great day. Thank you.