 down the open source initiatives. I'm going to kick off the mini-summit by giving you a flavor of what we are doing at the board level of the SODA Foundation, and then we'll jump into more technical details. Rakesh from IBM is going to cover SODA from a use case and higher-level framework perspective. Then we have Sanil from Huawei, who is going to cover the architecture click-down. After that we'll get into the member projects: Yusuf from LINBIT is going to cover LINSTOR, Kiran from MayaData is going to cover OpenEBS, and Stefano from Scality is going to cover Zenko. We'll wrap up with Larry from Huawei covering the SODA outreach committee, and then we'll do the Q&A. Hopefully we'll have enough time for it; currently we are planning around 10 minutes or so. Do post your questions while we are presenting, and we'll consolidate them and do the Q&A toward the end. With that, let's go to the next slide. Do I need to click on the slide, or? OK, perfect. Let's go to the next one. One more. All right, OK. So, a little bit of background on the SODA Foundation, which is part of the Linux Foundation. We started this effort back in 2016. It was essentially a few storage vendors getting together, plus Intel on my side, to figure out the best way to consolidate the integration touch points for the storage control plane. We were primarily looking for the best way to create a control plane that works across different orchestration stacks. This includes commercial orchestration stacks like VMware, Microsoft, and so on, as well as the open source flavors, including OpenStack and Kubernetes, which are essentially virtualization and container abstraction layers. We wanted to find the best way to create one common control plane that unifies the different orchestration stacks. 
If you look at what happened in 2016, most of our focus was on consolidating the orchestration layer. Kubernetes was at a fairly early stage, and there was a lot of discussion going on about the Kubernetes Container Storage Interface and the best way to abstract it. But over the course of time, rather than staying focused on just an integration touch point for the orchestration stacks, the whole ecosystem around storage has changed quite a bit from 2016 onwards. More and more data is anchored toward AI types of workloads, and there has been a lot of focus on the edge and autonomous driving. So the community focus has evolved from "let's create a common control plane for different orchestration stacks" to "let's create a common data plane that addresses the emerging data-centric workloads." Let's go to the next slide. The SODA Foundation is a directed fund project in the Linux Foundation, where the funding comes from premier members and general members. We have China Unicom, Fujitsu, Huawei, NTT, and Toyota as the premier members, and there are several general members as well. A few of them are going to present today: Scality, LINBIT, and MayaData are the featured member projects. The goal of the SODA Foundation is to create an open source data and storage control plane and orchestration layer, where different projects out there can become part of the foundation through a common abstraction layer. Let's go to the next one. OK. What you see here is the rationale behind the SODA project. Very early on, our rationale was mostly control plane consolidation. Then, as we started looking at the workloads, we had a different set of problems that we wanted to address, and there was no one common framework that could do this. 
There are lots of implementations out there where you have siloed data management with different types of interfaces. They're not uniform, they're mostly manual, and they're not easy to integrate. So there has been a lot of focus on the best way to interoperate across these discrete, siloed point solutions, and the best way to do that using a uniform framework. That was the reasoning behind kicking off the SODA Foundation project. It's anchored toward a streamlined data orchestration platform. Next slide. OK. If you look at the left-hand side, data management, you can bring different data management projects under the umbrella of SODA. Our goal is not to do everything organically within the SODA Foundation project, but rather to bring the existing open source initiatives, as well as the commercial initiatives, under the umbrella of one unified data management framework. That's the intent of having the SODA Foundation. The way we see it, the foundation is going to look at this from a framework perspective. The intent is that you need to be able to move data between on-prem and cloud. You need to be able to take advantage of standard interfaces. There should be an interoperability focus as well. And we need to be able to deliver at scale and take advantage of rapid time-to-market solutions. That's the underpinning of what we are trying to achieve with the SODA Foundation. OK. I won't go into the details of the framework; we have a couple of talks on that. If you look at the highest level, on the right-hand side, the focus is that it has to be open source, whether it is data management related or storage management related. We want to deliver an open source framework. It has to be standardized, so that it's easy to plug in different vertical point solutions. 
And we need to be able to deliver that via ecosystem partnerships, on both the hardware platforms and the software, and stitch them together to build solutions for different use cases. It could be AI. It could be edge. It could be on-prem. It could be moving data back and forth between public and private clouds, and so on. Over the course of time, we would also like to get into the certification aspect, so that as a user consuming the solution, you know exactly what quality and stability to expect based on the certification process. Next slide. Here are the different programs. Our goal is to bring in and support the incubation aspect. If you have a project related to data and storage that you want to incubate and get early support for, we do that through the SODA incubation program. For the developer ecosystem, we have good camps for vendors. We will be able to support the vendor ecosystem through SODA Foundry. Users will be able to use the lab infrastructure to do certain POCs and get familiar with the SODA framework and the solution that they are looking at. The community has a way to reach out for any support that you need. And we look at events as a way to broaden the visibility of what SODA can do and how it can help address end user pain points. Next slide. I'll spend a couple of minutes on governance, just to give you a flavor of how we are managing it in SODA. Next slide. At the top level, we have the board. As you can imagine, the board focuses on managing the budget and the allocation of money for different types of work, whether it is development, outreach, POCs and customer solutions, or events. 
That's really part and parcel of the governing board's responsibility. Then we have a technical steering committee, with different working groups focusing on different aspects of the technical domain. It could be architecture, it could be use cases, and it could be POCs. That is part and parcel of the technical steering committee's scope. Rakesh is going to cover that. Then we have the end user advisory committee. This is something that we consciously wanted to promote, primarily to understand the use cases and the pain points, and how we address them methodically via well-defined standard interfaces, developing a framework that evolves to address those solutions and pain points as opposed to taking a purely technical view. And then there is the outreach committee, to broaden usage as well as raise awareness in the industry. So that, in a nutshell, is the governance: the board, the technical steering committee, the end user focus, and the outreach focus. Next slide. Here are the members from different companies. As you can imagine, the members are from most of the big companies that you might have come across. Our goal is to expand this over the course of time with broader representation, and we'll continue to work on that. Next slide. Next one. All right. I will hand it over to Rakesh. Rakesh is going to cover the technical steering committee aspect, as well as a few of the projects that are currently in flight, which will give you a flavor of what's happening at a technical level. Rakesh.
All right. Thank you, Reddy. So I'm actually going to talk not so much about the TSC right now, but about the SODA projects, the use cases, that is, what SODA can be used for, and the SODA roadmap. Before I start, something about me. 
I work for IBM Research in San Jose and am involved with this project. The predecessor of this project is OpenSDS, which has its roots in storage management, and now we are getting into the data management aspect as well. I'm part of the TSC and a board member as well. So let's start with the SODA projects first. We have to have the architecture diagram, so this is what the architecture of SODA looks like. I won't go into the details right now; Sanil is going to cover the architecture in more detail. What I want to mention here is that all these boxes you see here in the middle are actually independent projects which make up the SODA framework. That's what we call SODA projects, but that's not all. Just to explain it a little: the SODA Foundation is a foundation where multiple projects are hosted. That is similar to other foundations, but our goal here is also to offer something which is kind of a product, an open source product that you can install and use. That's the whole idea behind how we integrate the different projects in the SODA Foundation, which are more or less all in storage and data management. There are multiple types of projects, or you can call them categories of projects. First are the core projects. That's basically what makes up the software; that's the very core of it. In that we have multiple projects like API, controller, dock, multi-cloud, telemetry, and so on. These make up the core. You have to have them for SODA to be an installable, working product. This is the work that the base SODA community does. These projects originated in SODA, they are maintained by SODA, and they are part of the release cycle. 
Then there is a set of projects which get donated to the SODA Foundation. For example, YIG from China Unicom, and you're going to hear about LINSTOR today, and Zenko. Even though these started separately and have their own release cycles, they are donated to the SODA Foundation and are part of it. They are considered member projects. There are some other categories, like eco projects and ecosystem projects, which we are still working on. So that is, at a high level, what we have for the projects. Next, I want to go into the use cases. What are you going to do with SODA? There are a few use cases I will cover. There is actually a broad range of them, but I will cover at a high level what we are working on. As already mentioned, we have the end user advisory committee, and that is our source of requirements. They tell us what the pain points are and what they want as end users. For example, this use case is for cloud native storage. It's coming from KPN, Yahoo Japan, China Construction Bank, and so on, and it applies to a lot of the companies in our end user community. So what is the use case here? Basically, you have heterogeneous storage systems and devices, and you want to provision storage for Kubernetes. The problem is that you don't get a unified view of the storage. So what does SODA bring to that? SODA has its own CSI driver. SODA doesn't have any storage of its own; it is a CSI driver. And if you look at the bottom of this slide, SODA interacts with a lot of other storage systems through Cinder or Manila or Swordfish-based APIs, and with some of them directly. 
As well as that, there's a new workstream going on so that it can interact with the CSI drivers of those devices too. The advantage here is that once you have SODA in your framework, in your infrastructure, you get the ability to provision storage on all these different devices. That's the main idea. Let's first talk about the CSI part: you get to provision storage for your Kubernetes environment. Then we have something called SODA multi-cloud, where you can connect to different clouds, basically to object storage, to put your data in. What SODA CSI and its orchestration do is let you take a snapshot directly into any of the clouds you want. This is a cool feature, and you get it irrespective of what storage device you use. This slide shows just Kubernetes, but SODA supports OpenStack as well. The work for VMware is in the plan but has not yet started. So northbound, we have Kubernetes, OpenStack, and so on, especially the open source ones. Southbound, there are all the different devices. And then we have multi-cloud. On top of that, there is telemetry and intelligence and all that, which is a separate set of projects in progress. So this is the core idea for that particular use case, and the main thing is that you get a unified view and one interface today. OK, the next use case is about data lifecycle. This is coming from another end user, Toyota, and it is an interesting use case. They have tons of data coming from their cars as edge devices. IoT data has high value when it is fresh; as it gets a little older, the value goes down. So what they want to do is keep the data in different types of storage based on the age of the data. As it gets older, it gets moved to secondary storage, or to another storage, right? 
Or archived for compliance reasons and whatnot. That's the whole idea here, and this is where we get into the data lifecycle. The solution is that, using a policy, you can manage how you want to transfer data to different storage devices or different types of storage. All right, so that's the data lifecycle use case. Then there is another use case coming from China Unicom, and this is about the data lake. I think there was another talk about this. It has to do with YIG, a project from China Unicom, and it is a collaboration between YIG and SODA multi-cloud. The use case here is that I want to have a data lake. I will dump all kinds of data into it and do all kinds of analysis, whether it is reporting or machine learning or whatever. This is an interesting project. As you know, YIG has become part of the SODA projects, and we are working very closely with them on this particular use case. All right, this next one I touched upon a little bit: multi-cloud, coming from NTT. This is simple: you want to store your data in different clouds. We had a problem with storage heterogeneity in the past; now we have a problem with cloud heterogeneity. This is the use case which addresses that. I think we will hear about Zenko later on; Stefano will talk about it, so he will cover more details on how the multi-cloud controller and all that is going to work. All right, so those were some use cases, not all of them, and we know people will come up with more creative ways when they use SODA. So what is the plan for 2020? As you know, we just announced the SODA Foundation earlier this week. There are a bunch of companies who have joined the foundation; in fact, this is not a complete list. 
In terms of collaboration, we are working closely with CNCF and other open source as well as industry partners. In fact, OpenEBS is one of them; Kiran is going to talk about it today. It's about container storage. The same goes for LINBIT, which Yusuf will talk about. Development is continuing. It started with OpenSDS and it is going on; we have now moved everything to SODA. Sanil will talk more about that, since he's leading the development part. We also do a lot of meetups, forums, and events like this one. We used to go to different conferences, but now we do it virtually. So that's what SODA is. In fact, when other open source projects join the SODA Foundation, they get quite a bit of benefit, because it's a combined effort, a lot of other projects are there, and you get publicity as part of the foundation. That's all I had. I think we'll do questions and answers at the end, and I will hand over to Sanil for the next one.
Yeah, thanks, Reddy and Rakesh. OK, so we have seen the introduction, the vision, and also a glimpse of the projects, as well as the use cases and the high-level roadmap for 2020. For any open source project, the project ecosystem and the architecture are very important. Before we go into the details of my session: good afternoon if you're in the CDT time zone, or good morning or good evening. Here it's just 1 AM at night, so thanks for joining us. In this session we'll mainly look at the projects and the architecture, and if some developers are in the audience, I think you can get to know how to join and contribute. I'm Sanil. I work as a TSC member, driving the architecture work group. You can find me in Slack anytime; that's the place where we work. OK, so let's get started. Can we go to the next slide? OK. Thank you. So, as Reddy and Rakesh mentioned, this is the overall framework architecture. 
We will bore you with this architecture diagram in most of the sessions. Just to add to what Rakesh and Reddy mentioned: this is unified, because if you see SODA anywhere, we say "one data network, infinite possibilities." That's our dream, and how far we have traveled toward that dream we can see through some of the projects in this session. Basically, we want to have unified and open standardization. There are two more key aspects. On the north side, you can see most of the application platforms, and on the south side, the storages. The key aspect is that we want to be application platform agnostic, so in our style we say "any platform." On the south side, we want to be vendor neutral, which means we want to support any storage. That is why we say that we support any platform and any storage, and why we call it a single data network. Why data network? If you saw OpenSDS in the past, you might have seen "data management." We have removed that word "management." In fact, we have not removed management; we have added more things to it. We support management and also the data plane. Our vision is now refined to cover more scope, based on inputs from our end user committee, so we support the control plane as well as the data plane. This framework provides a high-level view of how the SODA framework envisions solving those data silos and providing a unified framework. You can see different layers here, and we want to support edge, core, and cloud. When I say core, I mean on-prem. On-prem and cloud are very common, and edge is coming up, so we want to support this unified framework even at the edge. As Rakesh mentioned, the core projects are the projects maintained and developed by us for some of the key features. So let us see some of the core projects and the overall architecture, and how we are shaping this direction. 
OK, so let's go to the next slide. This slide shows some of the key focus areas, like data mobility, heterogeneous storage, data energy, cloud native storage, and so on. Why focus areas? We wanted to explore what collaboration is possible and what projects are possible in each area. Should we unify these focus areas under one project or multiple projects? Can we have ecosystem projects from our partners to solve these problems? Our idea is to bring a unified framework rather than build everything from scratch. OK, we can go to the next slide. Sorry, once again an architecture diagram. Don't worry, in all my slides until the end, these architecture diagrams will keep coming. Before we go to this particular simplified diagram, I just want to say that in the architecture... OK, maybe we'll come back to that. Let's look at a very simple view of the overall architecture. In this view, on the top you can see the application platforms, then the SODA framework, and then the storage. That's what Rakesh was explaining with the different use cases. In a simplistic sense, we just connect the application platforms to the storage where the data lives. Now, you may have different platforms and different storages. The storages can be at the edge, they can be on-prem with different vendors, and they can also be multi-cloud storage backends. And as for platforms, you have already seen in the higher-level framework diagram that it can be big data, Kubernetes, OpenStack, VMware, and so on. The key issue today is that Kubernetes has its own specific way to connect to storage. OpenStack has its own way. VMware has its own way. 
That's where the SODA framework becomes handy: we connect them irrespective of the platform and irrespective of the southbound storages. On top of that, as Rakesh was saying, there are different features like data mobility, data management, data lifecycle, and data protection, and we want to provide those key data features inside this unified framework. That way the application framework can focus on the application business logic and the storage can simply focus on storage. We connect between them and provide a unified interface. For that unified interface, we are moving in the direction of standardization, collaborating with our partners and standards organizations. So this is the simplified view. Now we can go to the next slide. If you want to see what work we are doing, you can just go to github.com/sodafoundation. We have multiple projects there. You will see API, controller, dock, and NBP. NBP is nothing but northbound plugin; we will see the details. Then multi-cloud, and the newly introduced project, which is called Delfin. If you go to the SODA Foundation GitHub, you will see Delfin. That is the SODA Infrastructure Manager; we'll talk about it. There are other projects like installer, documentation, and examples, and also some experimental projects like orchestration and anomaly detection. We will see how these projects map to the overall architecture and how they are positioned, so that when you go to GitHub and see a project, you will be able to tell how it fits into our overall architecture. Let's look at a slightly more complex view, but we will try to discuss it in a simple way. Can we go to the next slide? It looks complex, but it is not. The blue shaded part is our set of key core projects. At the top of that blue shaded area you can see the northbound plugins, then API, then controller, and below that dock. 
On the left-hand side, you can see the SODA Infrastructure Manager, and on the right-hand side, multi-cloud. If you go to GitHub, you will clearly see NBP as a project; that is the northbound plugin. Then you will find API as a separate project, controller as a project, dock as a project, and SIM, the SODA Infrastructure Manager, which you will see as Delfin. Delfin is nothing but "dolphin" in Spanish; thanks to the community for suggesting that name. You may feel that API, controller, and so on are boring names, so we are in the process of changing them to more interesting names like Delfin. At the bottom, you can see the storages. I have put them in two boxes: one is on-prem or edge native storages, and on the right-hand side are the cloud storages. At the topmost part, you can see the application platforms and also a dashboard, which is in a different color. You can also see the dashboard as a separate project on GitHub, because the dashboard is our client that provides the complete experience of all the SODA core projects. When you have a SODA release, you basically experience all the projects through this dashboard. On the left-hand side, you see a big vertical box; though it is drawn large, it is a small project: the installer. The installer and the dashboard are the two projects that give you an integrated view of all the SODA projects currently. If you take our latest release, you install all the projects that are configured through the installer, and once it is successfully installed, you experience them in the dashboard. We have a live demo in the cloud based on our latest release; we'll probably share that link at the end of the session. On the left-hand side again, there are examples and documentation. Documentation is the overall documentation at docs.sodafoundation.io; all the documentation is available there. 
Examples are like use cases; in the last release we provided a use case for streaming using our multi-cloud. These are all experimental use cases to try out how our projects can connect together and provide a solution for a particular use case. Then you will see anomaly detection and orchestration, which we will talk about a little later. Now we come back to the main portion. We will spend some time on this slide, because in the subsequent slides I'm again going to bore you with the architecture diagram alongside the information for each project. So we'll spend some time understanding it here, and then we can go faster through those slides. Here you see the application platforms. For example, it can be Kubernetes, it can be OpenStack, it can be VMware. These are three platforms which we already support. Now, if Kubernetes is to be supported with SODA, what does that mean? It means that Kubernetes can connect to all the southbound storages supported in the SODA Foundation projects. So how do we connect? You know that Kubernetes connects to storage through CSI, the Container Storage Interface. It can connect to any storage that has a specific CSI driver. We also have a SODA CSI driver. Its specialty is that this single CSI driver, or plugin, can connect to all the SODA-supported storages on the south side. That's the beauty of it. Now, how do we connect? We connect through our unified API in the API block, which provides a unified API for data management. CSI doesn't understand this API, so what do we do? We need a connector. Our dream is that tomorrow, when SODA has an API standard, all these platforms will come and comply with that standard. Then these plugins will not be required, because the platforms can work with it directly. 
Our dashboard works directly because it uses the SODA API. But until then, we cannot say that Kubernetes is not supported or OpenStack is not supported, because then nobody would use this platform. So we have a simple way to connect: we write plugins. A plugin to connect Kubernetes, a plugin to connect OpenStack, a plugin to connect VMware. Tomorrow, if you want OpenShift or any other platform, you just need to write the plugin. As soon as you have the plugin to connect to the SODA core, all the supported storages become available, and you don't need to worry about the storage interfaces. That's how the northbound plugin helps to connect any platform. We already have the Kubernetes plugin, that is, the SODA CSI plugin, then OpenStack Cinder, and for VMware, NGC-type plugins are available, and we are adding more and more plugins. Now, connect to what? The API layer. API, controller, dock, and NBP together give you heterogeneous storage and platform connectivity on premises; they currently support file and block storage. So you can perform file operations and block operations on the on-prem heterogeneous storages through this interface. API provides a standard, consolidated, unified API interface. Controller is just bookkeeping: the local database, metadata, and that kind of management happen in the controller. And dock is another very important project, where any vendor's storage driver can be plugged in. In the same way that we connect an application to the API, we connect an adapter layer, a driver, for any storage vendor. Say, for example, you have NetApp or EMC or any other storage; you can connect it to the dock with a very thin layer. In about a week's time or so, you can develop and deploy it, and it connects. As soon as you connect these storages, the dock is enriched with all the supported storage drivers. 
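The plugin-and-driver arrangement described above can be illustrated with a minimal Python sketch. It is not the actual SODA code (the real projects live under github.com/sodafoundation); every class and method name here, such as `StorageDriver`, `Dock`, `UnifiedApi`, and `KubernetesPlugin`, is a hypothetical stand-in, and the request shape is only loosely modeled on a CSI create-volume call.

```python
from abc import ABC, abstractmethod

GIB = 2 ** 30


class StorageDriver(ABC):
    """Hypothetical southbound driver interface (a thin adapter per vendor)."""

    @abstractmethod
    def create_volume(self, name: str, size_gb: int) -> dict:
        ...


class LvmDriver(StorageDriver):
    def create_volume(self, name: str, size_gb: int) -> dict:
        # A real driver would call LVM here; the sketch only records the request.
        return {"backend": "lvm", "name": name, "size_gb": size_gb}


class CephDriver(StorageDriver):
    def create_volume(self, name: str, size_gb: int) -> dict:
        return {"backend": "ceph", "name": name, "size_gb": size_gb}


class Dock:
    """Holds the registered drivers and dispatches to the chosen backend."""

    def __init__(self) -> None:
        self._drivers = {}

    def register(self, backend: str, driver: StorageDriver) -> None:
        self._drivers[backend] = driver

    def create_volume(self, backend: str, name: str, size_gb: int) -> dict:
        if backend not in self._drivers:
            raise KeyError(f"no driver registered for backend {backend!r}")
        return self._drivers[backend].create_volume(name, size_gb)


class UnifiedApi:
    """One northbound-facing interface, regardless of the backend in use."""

    def __init__(self, dock: Dock) -> None:
        self._dock = dock

    def create_volume(self, backend: str, name: str, size_gb: int) -> dict:
        return self._dock.create_volume(backend, name, size_gb)


class KubernetesPlugin:
    """Northbound plugin: translates a CSI-like request into a unified API call."""

    def __init__(self, api: UnifiedApi) -> None:
        self._api = api

    def handle_create_volume(self, request: dict) -> dict:
        # Round the requested byte count up to whole GiB.
        size_gb = -(-request["capacity_bytes"] // GIB)
        backend = request["parameters"]["backend"]
        return self._api.create_volume(backend, request["name"], size_gb)


dock = Dock()
dock.register("lvm", LvmDriver())
dock.register("ceph", CephDriver())
plugin = KubernetesPlugin(UnifiedApi(dock))

result = plugin.handle_create_volume(
    {"name": "pvc-1", "capacity_bytes": 10 * GIB, "parameters": {"backend": "ceph"}}
)
print(result)
```

The point of the sketch is the shape: adding a new platform means writing one more plugin class, and adding a new storage means registering one more driver, while the unified API in the middle stays unchanged.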
We already support IBM, NetApp, Fujitsu, Huawei, LVM, and Ceph; those kinds of interfaces are already supported in dock. That is where the drivers are connected. Now we come back to the SODA Infrastructure Manager. This is a very important project, newly introduced in our latest release, the major release 1.0 called Faroe, which we released along with our launch. So what does the SODA Infrastructure Manager do? Basically, if you want to manage the storages directly at the disk level, you need more control; you need control at the pool level. Earlier, with the API, controller, and dock line, we supported things at the volume level; the volume was the input. Now, if you want to manage your storage, increase or decrease it, or check the I/O, you can do all of that through this resource management. We integrated resource management into the unified framework, so it can support heterogeneous storages. That's the beauty of it: any storage behind it which is supported in the dock can automatically be supported with this interface for resource management. And not only resource management. We support alarms and notifications, and also monitoring of telemetry and performance data, using this interface. So now you get the notifications, you get the telemetry data, and you can also manage the storages directly from this framework, and that too for heterogeneous storages. This telemetry data is very important for any application framework, because you need to analyze your storages: their health, how they are performing, whether there are any errors, whether you need to take an action. That kind of information can be derived from this interface. And we provide a scalable interface called exporters, so that, using the exporter interface, you can write an exporter for, say, Prometheus. 
Or if you want to write an exporter for Kafka, you can. And once you have an exporter for Kafka, it can easily connect to anomaly detection. Anomaly detection is an experimental project from our side to prove that we can connect these things and do some prediction. As of now, anomaly detection is a very small project. We have only one algorithm, based on DBSCAN, and the algorithms can be plug and play. If you're interested in anomaly detection or further AI/ML kinds of work, you're welcome to come and talk to this project. One project we have not covered so far is orchestration. As you know, in storage each workflow has a varied set of operations, and for each service the operations keep changing. How do you manage that situation? We thought we could provide an automation and orchestration framework in which you can do workflow automation. Don't confuse this with data orchestration; this is workflow orchestration. This is also an experimental project. You can have custom workflows and things like that, and you can try it through our dashboard. If you take the latest release, Faroe, you can try all the projects we have just discussed here. Okay, so maybe we can move to the next slide. Actually, the session is almost covered. Okay, so these are just notes about the projects I have already mentioned: some are core projects, some are projects that help you experience the platform, and some are just documentation, use cases and that kind of thing. Just a note to understand better. We can go to the next slide. So in the further slides, I keep my word that on every slide I will bore you with the architecture diagram.
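For readers unfamiliar with DBSCAN, mentioned above as the basis of the anomaly detection project, here is a small self-contained sketch of the algorithm applied to one-dimensional telemetry values. It is a textbook version written for this note, not the project's code, and the latency numbers, `eps` and `min_samples` are made up. Points that DBSCAN labels as noise are treated as anomalies.

```python
def dbscan_1d(values, eps, min_samples):
    """Label each value with a cluster id, or -1 for noise (treated as anomaly)."""
    n = len(values)
    labels = [None] * n  # None means "not visited yet"

    def neighbours(i):
        # All points within eps of point i, including i itself.
        return [j for j in range(n) if abs(values[i] - values[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_samples:
                queue.extend(nbrs)  # j is a core point, so keep expanding
    return labels


# Hypothetical read-latency samples in milliseconds; one obvious outlier.
latencies_ms = [5.0, 5.2, 4.9, 5.1, 5.3, 48.0, 5.0, 4.8]
labels = dbscan_1d(latencies_ms, eps=0.5, min_samples=3)
anomalies = [v for v, label in zip(latencies_ms, labels) if label == -1]
```

Because the algorithm only needs a distance and two parameters, swapping it for another detector behind the same "values in, labels out" shape is one way the plug-and-play idea could work.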
So every slide has short information about one project: what we support currently and what we are planning to do. You can probably skip the notes, because for each specific project you have a README, and we have a repository called design spec. You can get most of the information there or at docs.sodafoundation.io, and in case you can't find it, please ping us in Slack. So, the API: right now we support file and block, and in the latest release we have already released an API specification draft that supports file and block on-prem, then multi-cloud, which I will talk about later, and then the Soda infrastructure manager interface. So three types of API specification are already released in the current one, and our idea is to unify this and take this API project into a standard API specification project across the board. That's our vision. We can go to the next slide. This is the controller; mostly it does the local database and metadata management, so I'll skip the slide. Yeah, we can go to the next slide, because we have already discussed this, so I'll only touch on some key points. The dock: right now we support NetApp, IBM, Fujitsu and Huawei storages, LVM and Ceph, and as soon as you support Ceph, there are many other Ceph-supported storages you can use. So there is a variety of storages supported on the southbound side as of now through dock, but our vision is to have any storage vendor's driver supported here. We welcome any vendors or interested people to write drivers for the existing storage models, so that we can exponentially grow the set of storages supported for any platform. We can go to the next slide, please. This is the northbound plugin, which we have already discussed. Right now we have Kubernetes, OpenStack and VMware, and we plan to support more plugins for VMware because, as you know, VMware needs different plugins for different operations.
So we are trying to support more and more there. And just one more input: Rakesh also mentioned the CSI driver. We are exploring whether we can support CSI drivers as plug and play. Kubernetes is the platform today as everything moves towards cloud, so most of the vendors already have a CSI driver. If we can support CSI plug and play, then we can exponentially increase the support for storage backends in our platform. We are just exploring. It's not very easy; it's tricky. We have a prototype working now, which we mentioned in the release, and we are working on it. If anybody is interested in CSI and has Kubernetes expertise, please contact us in Slack. I think we need your help. And we can go to the next slide, please. Okay, so this is the Soda infrastructure manager. This is one of the key projects we are putting a lot of focus on, because it can give a lot of storage-level information to the application framework, information that is currently scattered and fragmented across storages. If you look at any vendor solution, it will support that vendor's own storage management. But here we can support heterogeneous storage, because the framework already supports that. So we can exploit that feature and provide resource management, alarms and telemetry. And telemetry is, I think, one of the important aspects for prediction and better management of the storage. So we can go to the next one. Just one more point: we discussed different projects here, and there is no language barrier. If you know Go, Python, Java, scripting, JavaScript or web technology, if you have any of this expertise, you are welcome, because we have a lot of projects in different languages. And our architecture is based on microservices.
So we are not really language bound. Coming to the contribution side, we welcome contributions because we are building our community. We are working with students and different community members to enable them. As Rakesh already mentioned about the bootcamp, we also have a mentoring program called Soda Bootcamp, which we have already started. If you're interested, please contact us. And if you want to contribute straight away, you can just go to GitHub and search for the "start my contribution" (SMC) label or the help-needed label. You can start contributing, or you can simply test and raise issues. Or, and it's a secret, don't quote me, if you go to docs.sodafoundation.io there are a lot of issues, because we have just migrated from OpenSDS to Soda Foundation. So you can find issues and help us improve the documentation. These are some of the ways you can directly contribute on the engineering side. At the same time, every year we do quarterly releases, and the roadmap is already publicly available in the release repo, so you can see that. If you would like to suggest any new features, please feel free to do that. The best place to do that is Slack. Either Slack, or there is a Bitly link for the Soda global community meeting; if you go there, we have a bi-weekly meeting. You can join us and give us your suggestions. We will definitely consider them in our roadmap. So this covers most of the core projects and how you can contribute to them. Now we will listen to the eco and native projects from Yusuf, Stefano and Kiran. Thank you for listening.
Thank you, Sanil, for this detailed architecture of the project, and it has been a pleasure to work with you on Soda as always. Hello, everyone. My name is Yusuf and I work as a solution architect at LINBIT, the company behind DRBD and LINSTOR.
LINBIT is a software company that has devoted its last 20 years to block disk replication. And LINSTOR is the software that this company has developed in recent years to provide disk replication and automation. In my session, I will give you some detailed information about LINSTOR and its architecture, and then I will explain why we chose to contribute LINSTOR to Soda Foundation. Okay, so LINSTOR is basically an orchestrator for Linux building blocks, including DRBD itself, and you can easily provision, migrate and delete volumes within LINSTOR with some simple commands and with the help of the great API. LINSTOR's main focus is SDS consumers, let's say cloud native orchestrators, which can be Kubernetes, OpenStack, OpenNebula, et cetera. But it is also used in bare metal storage solutions, or maybe hyperconverged infrastructures, or big ecosystems like, let's say, Intel's RSD, et cetera. In LINSTOR you can have multiple tenants, and different scenarios are possible. You can easily deploy new nodes into the system and also decommission old nodes from the system easily. We have a product called DRBD, and this is the core product LINBIT developed, and I bet you know it already. You don't need to use DRBD with LINSTOR, it's not mandatory, but if you use it, it comes with block replication at the kernel level. LINSTOR can also use LVM, ZFS and these kinds of volume management systems natively. And it would be bad if I didn't mention that it's an open source project, like everything we do, under the GPL of course. Here I'm showing you the LINSTOR architecture. LINSTOR has two parts, the satellite and the controller. The satellite is responsible for disk management: creating disks, deleting disks, gathering information from the node. The controller is responsible for managing the satellites on the nodes. And each part of LINSTOR is independent of the others.
So there will be no downtime if you reboot, upgrade or just shut down the LINSTOR controller or the satellite. Satellites and controllers are stateless, and this gives us plenty of room to play in our IT designs. Obviously, to implement LINSTOR in any ecosystem, we have a great API and drivers for such systems, and our team of experts and developers is continuously maintaining those and keeping them up to date with the current versions. And here is the summary of the LINSTOR ecosystem. At the bottom, we have the hardware level. It could be HDDs, SSDs or new tech like, let's say, NVMe or PMEM that you want to manage. One step up is the volume management part, which can be LVM or ZFS, or maybe nothing. Then comes the block storage part, which can be DRBD or something else. If you choose DRBD, that means you can replicate your volumes to other machines, virtual machines, physical machines, physical storages, at the kernel level. And then we use transport protocols like iSCSI, NVMe over Fabrics or DRBD diskless for attaching these disks to the orchestrators. LINSTOR manages this whole stack from the controller with some simple CLI commands. It gives you easy, solid and robust software for your disk automation in the block storage area. That's pretty much it. I would like to explain a lot more about LINSTOR, but we are running out of time, so I'm keeping it short. So here comes the best part of this slide deck, which is why LINSTOR is a good fit for Soda integration. Soda is an open, single framework connecting these separate solutions into seamless, end-to-end solutions. My colleagues in the Soda Foundation explained that to you in the earlier sessions, and thank you for that. I was invited to the Tokyo Soda Summit in December, actually, and since then I have been really excited about the whole project.
The Soda data framework is everyone's dream in IT, because people don't like being stuck with one solution or vendor or ecosystem or storage. And LINSTOR is open source software-defined storage that can work on any hardware, any platform or any ecosystem. Thanks to this flexibility, we believe LINSTOR will give Soda serious momentum on block storage automation and management. And we believe this is a really good fit with Soda's philosophy. With the LINSTOR integration, you can easily combine and manage your cloud, on-premise or hybrid workloads in your infrastructure. Thank you for listening to me. I give the microphone to the next speaker.
Thanks, Yusuf, and all the speakers that came before me, for a great introduction to the various projects that are happening in Soda. Hello, everyone. I'm very excited to be part of this virtual event, and I hope for the next few minutes the internet gods will be in my favor. I'm joining in from India, just like Sanil. About me: I'm Kiran Mova, the co-founder and chief architect at MayaData. MayaData is focused on building products and solutions that help infrastructure and platform engineers build data management platforms. Specifically, we are focused on Kubernetes, using Kubernetes itself to build the data management platforms. To that effect, we have built several products and solutions. A couple of them are part of the CNCF, under the Linux Foundation: OpenEBS and Litmus. Today I'll be speaking more about OpenEBS, and just to introduce Litmus, it became a sandbox project just this week, at the beginning of this week. It helps you build chaos engineering into your CI/CD pipelines, especially if you're pushing your code into Kubernetes. It's a completely Kubernetes-native thing. So, to speak about Soda, I'm really honored to be a TSC member of the Soda Foundation.
The mission and the values that the Soda Foundation brings are close to MayaData as well as to me personally. The foundation's mission, as much as it is about building an open source framework, architecture and reference implementations, is also about building a really strong open source community. And I'm pretty sure we can pull this off, because most of the people who have spoken already, and also the members who are not on the call today, are experienced in building open source communities as part of Apache, or the Linux Foundation, or CNCF. So that's my introduction, and we'll be talking mostly about eco projects. We heard a lot about the core and the native projects, but there is definitely a transformative change happening in data management platforms. There are so many different ways in which you can build that stack, and the packages that you use are often already part of, let's say, CNCF, which is primarily leading the Kubernetes efforts and all the container-related projects. But as you'll see in this presentation, that's not sufficient to build a fully featured data management platform. You have to work with those projects as well. OpenEBS is one such project in CNCF, and Kubernetes obviously is the driver for many of these. We'll see how that synergy is getting built. Just to recap, why do we really need to rethink data management platforms? Some of us have been around since three-tier architectures were the coolest thing to happen and SAN was pretty hot. But since then, cloud native companies, or cloud companies that started with zero infrastructure of their own, have come up, and we see enormous growth in the scale at which these applications have to perform and the amount of data that they have to store, and there has also been a lot of innovation in the storage media itself.
We speak about NVMe-oF as a fabric for transporting storage, and iSCSI has been around for a long time, but we are also talking about SPDK and DPDK, where you can run your storage controllers in user space to get much more performance and agility. The agility part is the key aspect. I think the storage industry and its architectures have evolved, but how the software itself is built has not changed much. The applications have changed: they have shifted towards microservices, become more distributed in nature, become more feature rich and more innovative, and the speed at which they get delivered is insane, right? But when it comes to storage, I think we are still seeing long release cycles and all that. At the core of this is really open source and Kubernetes, which is powering the re-architecting of applications. But one might ask, is Kubernetes right for data, or for stateful applications? It used to be the case that Kubernetes was built for stateless applications, but if you look at the recent enhancements and the kinds of applications that are running in Kubernetes, you will be surprised at how many databases are actually running in Kubernetes now. In fact, if people are using Kubernetes and say that they're only running stateless applications, and you probe a little bit and ask them whether they are running their CI/CD or, say, observability there, they end up saying yes, we are running Prometheus, Cortex and so on in Kubernetes. These things have become really popular, and they are stateful applications running within Kubernetes. The shift that we are seeing now is also powered by the advancements that Kubernetes itself has made in the storage area, CSI being the major one. While CSI has enabled outside storages to get connected to Kubernetes, I'm a little hesitant to call them completely container native.
Those storages were actually purpose built for running with VMs and bare metal servers. And we saw a few use cases early on where, with Kubernetes, the amount of compute power you have is so large that no single SAN can really support it, and you need some way of distributing the storage layer. Connecting different CSI drivers to SANs will end up creating silos within your Kubernetes clusters, and you can't easily move workloads across different clusters, or even within a cluster you can't scale up and down the way you really want to, right? So this is where, instead of calling that container native storage, the term that in my opinion is better is container attached storage. The history of container attached storage started when we were thinking about OpenEBS, which we tried to architect for Kubernetes-native environments. It's actually hyperconverged storage that runs within Kubernetes. We started seeing a lot of users come up and say that they really want to write applications where they don't need replication and distribution capabilities from storage. They really want storage that is local. Then you get to day two operations and all that: maybe I need replication for high availability, or maybe I want to do a backup to a remote store, but not on a regular basis. The application is capable of taking care of a lot of the storage features that traditionally came from the SAN. So we needed an architecture that was somewhere in between DAS, direct attached storage, and what was in the SAN, and that's what we call container attached storage. With container attached storage, a few things to think about are: what is the release cycle like, or what is the time taken to perform some of the day two operations?
If it's months or even weeks, then you're really not in container attached storage or even in the Kubernetes ecosystem, and you're not reaping the benefits of Kubernetes. Any kind of operation that you do within Kubernetes has to be almost instantaneous. GitOps is catching on, and with Soda and projects like OpenEBS, we are trying to bring that kind of agility to data management platforms. If you want to learn more about container attached storage, there are a few blogs published on the CNCF site that you can look at. The key aspects I would take away about container attached storage are that these are storage controllers built with microservices patterns, they are delivered as containers, and they're orchestrated by Kubernetes itself. A lot of open source and commercial options are available in the container attached storage category today. The other thing that defines them is that they have a declarative API, which is the cornerstone of Kubernetes. So, a few benefits of using a microservices-based container attached storage architecture. Of course, it's all open source, Apache licensed, and you don't have to get locked into the on-premise or cloud service vendors that provide the storage. You can use the data mobility features, let's say with Soda, to take the workloads along with the storage to some other platform. It reduces cost from a business point of view, not just by giving you the option to move but also by reducing operational cost, because most of the APIs are Kubernetes-based, and if your operations engineers know how to run Kubernetes, they also know how to run this new kind of storage engine. There are also a lot of commercial products available using these open source technologies.
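The declarative API mentioned above can be pictured as a reconcile step: the user states the desired volumes, and a controller works out what to change. The sketch below is a toy model of that idea using plain dictionaries; the function names and the volume names are invented for illustration and it does not use the Kubernetes or OpenEBS APIs.

```python
def reconcile(desired: dict, actual: dict) -> list:
    """Return the actions that move `actual` towards `desired`.

    Both arguments map a volume name to its size in GB. Shrinking is
    deliberately not handled in this sketch.
    """
    actions = []
    for name, size in sorted(desired.items()):
        if name not in actual:
            actions.append(("create", name, size))
        elif actual[name] < size:
            actions.append(("expand", name, size))
    for name in sorted(actual):
        if name not in desired:
            actions.append(("delete", name, actual[name]))
    return actions


def apply_actions(actions, actual: dict) -> dict:
    """Apply the actions to a copy of the current state and return the result."""
    state = dict(actual)
    for verb, name, size in actions:
        if verb == "delete":
            del state[name]
        else:  # "create" and "expand" both set the requested size
            state[name] = size
    return state


desired = {"pg-data": 20, "logs": 5}    # what the user declared
actual = {"pg-data": 10, "scratch": 1}  # what currently exists

actions = reconcile(desired, actual)
new_state = apply_actions(actions, actual)
```

Running the same reconcile step again on the new state yields no further actions, which is the property that makes declarative operations safe to repeat.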
For example, Konvoy from Mesosphere, or Lokomotive from Kinvolk, and MayaData also has a new product called Kubera, which is along the same lines. In terms of the use case, I want to present a slightly different one from what we have been talking about so far. This is what I got from talking to a person who is trying to architect a solution for OLAP. He's currently an Oracle user, and typically he has a 100 TB volume provisioned, but now, moving to Kubernetes, he wants to split that OLAP server into 100 different parts, with each part getting a 1 TB volume. So you're basically distributing the data per user into a pod associated with its own volume, plus a set of metadata servers that route the data in a highly available way and are themselves stateless. This allows the architecture to scale up and down. Scaling up, though difficult, is doable with the traditional approaches; with this solution it can be much faster, because you can hook into the Kubernetes autoscaling logic. Scaling down was always a challenge with the older solutions. Let's say you start with a 100 TB volume; you can never go down unless you schedule downtime and maintenance and all that kind of stuff. But with this architecture, based on microservices and Kubernetes, you can get that benefit. Also, using these approaches, you can plug the compliance-related requirements that you get, like GDPR, into the architecture very easily. If you want to build such an architecture, you have three pillars. One of the pillars is the cluster lifecycle. Kubernetes does not just start by itself, so you need somebody to put together the required storage, compute and network, build the nodes and give them to a Kubernetes cluster.
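The OLAP example above, 100 parts of 1 TB each with stateless servers routing per-user data, can be sketched as a simple hash-based router. The naming scheme `olap-data-NNN` and the choice of SHA-256 with modulo are assumptions for illustration only; the talk does not say how the routing is actually implemented.

```python
import hashlib

SHARD_COUNT = 100   # number of pods, each with its own volume
SHARD_SIZE_TB = 1   # size of each per-pod volume


def shard_for(user_id: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a user to a shard using a stable hash, so any router replica agrees."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def volume_name(shard: int) -> str:
    """Hypothetical naming scheme for the volume attached to a shard's pod."""
    return f"olap-data-{shard:03d}"


# The router keeps no state of its own: the mapping is recomputed from the user id.
shard = shard_for("alice")
target = volume_name(shard)
total_capacity_tb = SHARD_COUNT * SHARD_SIZE_TB
```

One caveat with plain modulo hashing is that changing the shard count remaps most users; a production design that scales up and down frequently would more likely use consistent hashing or an explicit placement table.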
Once you have a Kubernetes cluster, the second pillar is running the stateful applications in an automated way within the cluster without a lot of dependency on how the nodes are built. There has to be complete isolation of responsibility in that aspect. These two things are already happening today. But there are a lot of day two operations that will come up as more and more stateful workloads move into Kubernetes, and they require data lifecycle operations on these architectures. That's where Soda Foundation comes in, with a unified way of handling the lifecycle of volumes from different types of storage as well as supporting data mobility use cases. There are various projects underway for each of these pillars in Soda Foundation. I'm helping with the CSI plug and play. In fact, if your storage today has a CSI driver and, let's say, you have implemented the volume attachment interfaces via the Kubernetes CRs that are exposed, then we can take your storage and attach it to the Kubernetes nodes. We have an API, currently in prototype, that we can test with your CSI drivers. Just ping us on Slack if you want to get your storage integrated and run in a Kubernetes-native way, taking advantage of the agility that Kubernetes promises; we would like to hear from you. Thanks for the time. I will hand it off to Stefano to talk about Zenko from Scality. Thank you.
Okay. Let's see if I can get the slides to number 68. Can you guys see which slide we're on? Because I'm lagging behind. In any case, I'll start talking and we can ignore the slides. I'm Stefano Maffulli. I work at Scality, a company that builds a market-leading software-defined file and object storage platform.
For the past four years we've been working on Zenko, a project that provides a single API endpoint to store data in any storage location and also offers a global metadata namespace and policy-based data management, and it's a fully open source project. For the sake of time, and despite the fact that I have a huge echo in my feedback, I will scroll rapidly through the deck, maybe with someone scrolling along with me, so that we can talk a little bit about Zenko. One of the major features, and the main reason why Scality started investing in this project about three or four years ago, is that we wanted customers to be able to move their data across multiple clouds and between on-premises and off-premises storage. Zenko is designed to cover a lot of use cases, from disaster recovery and HA, high availability for data, to cloud media workflows and things like synchronization and movement of data between edge and core or edge and cloud. I'd say let's skip to where we talk a little bit about the technology, given the audience, and I'll give you a high-level view of the architecture, since it's the most interesting part. One of the reasons for Zenko is to offer best-of-breed technology that gives you, as a user, knowledge about your data and an intelligent way of managing it across different clouds and different destinations. Through Zenko you have a unified interface across all of these clouds, and you can search data across all of them through one endpoint. We have also been working on a policy-based data management engine that allows you to move data to different locations based on the capabilities or characteristics of those files. It can be deployed anywhere on a Kubernetes cluster. At Scality we also maintain a project for bare metal Kubernetes called MetalK8s, which is worth looking at if you're into running Kubernetes on bare metal.
At a high level, the Kubernetes deployment comes with monitoring through Prometheus, and Zenko offers one API endpoint with two compatibilities, Amazon S3 and Azure Blob. So you can write data into the system using applications that support either of those APIs. These interfaces are offered through one of the components of Zenko, also open source, called CloudServer. The shuffling of data is done through a workflow engine called Backbeat. And MongoDB keeps the metadata for the objects and also manages the destinations, which can be Azure, Amazon, Google Cloud Storage and others, including file systems. The interesting feature that we released a couple of weeks ago in Zenko 1.2 is that you can also ingest data out of band, as we call it. You can connect storage locations directly to Zenko, and Zenko, without moving the data, will ingest the metadata from existing S3 buckets. The new part is that you can also ingest from existing NFS mount points, crossing the boundary between objects and files. This allows you to connect your NAS, for example, to a Zenko cluster and let Zenko learn what kind of data you have and ingest the metadata. So you immediately have metadata search across all of your destinations, including the NAS, and you can do policy-based data management. You can set up workflows like one-to-one replication, where everything inside this folder owned by this user is replicated to S3, for example, or one-to-many replication to multiple destinations. You can do other things too, such as lifecycle and expiration across files and objects. It's a super interesting set of capabilities. So, without taking more time from the conversation, I'll share the slides afterwards for those who are interested. The future for us is bright. We do want to get more capabilities into Zenko.
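The replication workflows described above (everything in a given folder owned by a given user goes to one or more destinations) can be modeled as rules evaluated against ingested metadata. The sketch below illustrates that matching logic only; the `ObjectMeta` and `ReplicationRule` types and the location names are hypothetical and are not Zenko's or Backbeat's actual data model.

```python
from dataclasses import dataclass


@dataclass
class ObjectMeta:
    """Metadata ingested for one file or object; the data itself is not moved."""
    key: str
    owner: str
    source: str  # where the data lives, e.g. a NAS mount or an S3 bucket


@dataclass
class ReplicationRule:
    """Replicate everything under `prefix` owned by `owner` to `destinations`."""
    prefix: str
    owner: str
    destinations: list

    def matches(self, obj: ObjectMeta) -> bool:
        return obj.key.startswith(self.prefix) and obj.owner == self.owner


def plan_replication(objects, rules):
    """Return (key, destination) pairs; one-to-many rules yield several pairs."""
    plan = []
    for obj in objects:
        for rule in rules:
            if rule.matches(obj):
                for destination in rule.destinations:
                    plan.append((obj.key, destination))
    return plan


catalog = [
    ObjectMeta("projects/video/cut1.mov", "maria", "nas"),
    ObjectMeta("projects/video/cut2.mov", "li", "nas"),
    ObjectMeta("archive/2019/report.pdf", "maria", "s3"),
]
rules = [ReplicationRule("projects/video/", "maria", ["aws-s3", "azure-blob"])]

plan = plan_replication(catalog, rules)
```

Because the rules operate on metadata alone, the same plan can be computed for files on a NAS and objects in a bucket, which is the cross-boundary behaviour the talk highlights.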
We want to enable metadata ingestion also for Samba shares and SMB file systems, for Google Cloud Platform, obviously, and for other object storage platforms, since things like Backblaze, for example, have started to support the S3 API. So that's another project we have started looking at. We're also working on expanding the capabilities of the data workflow system and adding function-based capabilities, basically enabling something like Lambda functions inside the Zenko platform so that you can run more complex scenarios and more complex workflows. We're looking at Soda Foundation also for the standardization of our APIs towards the data management APIs that Soda is working on. And obviously, we're in conversations about incubation into the Soda Foundation. So if you want to get involved with Zenko, the project is open source; I'll share the links to the GitHub projects and the website in the chat. And with that, I pass to the next speaker, who I think is Larry.
Hello. Hi, can you hear me? Hi, thanks, Stefano. This is Larry Lai. I'm the head of the Soda outreach community. Let's go to the next slide. You've probably seen this slide before, in Reddy's talk. Soda Foundation is the open source project aimed at fostering an ecosystem of open source data and storage software for data autonomy. So the ecosystem is super important for this project. We try to provide an open, collaborative and neutral environment for all the community members to work together to build the platform and build up the ecosystem. Next slide, please. Here's the organization chart for Soda. The outreach community is in these green boxes. Our mission is to evangelize, to reach out to industry and to various other open source projects, and to expand and grow our community. Next slide, please. We have various regional committees around the world.
We are in North America, Japan, China, India and Europe as well. We have regional activities, so we try to grow local communities. We have also started other efforts like an ambassador program, to engage other open source community members to help us expand, spread the word and promote the Soda brand and the Soda project. We have a global community that has been around for two or three years; it started as the OpenSDS project, and we have been pretty active around the world. We have participated in many events, and we also have regular meetups within the different regions. Next slide. There are a few slides that we'll go through very quickly. We organized various events in the past year. We did a soft launch of Soda in Tokyo back in December last year, and we have also held various events in China and in India, like meetups and open source events. For the sake of time, let's go to slide 90, please. Here's a picture of events in the US. Next slide, please, one before that. Okay, so I want to quickly explain how you can join if you're interested in Soda. We are focusing on real-world use cases, and the project is all about working on a complete solution. We also provide an environment where users and members can influence the roadmap and the project, and a networking environment as well. And this is a great platform for collaboration. These are the reasons why many organizations have joined Soda. Next slide, please. We mainly have two types of members. Some are end users, like Toyota, Yahoo Japan and China Unicom; those are large end users.
So for them, the benefit is that they can participate in the EUAC, which is the End User Advisory Committee. If they're a premier member, they can also send a representative to the governing board, and they can also be part of the TSC, the Technical Steering Committee. We have a monthly meeting for the different groups, and we also have discussions about the needs and requirements of end users. The other type of members are vendors. Vendors can also be part of the TSC, the Technical Steering Committee. They can also gain insight into the strategy and be able to influence the direction and roadmap of the technical development. Next slide. So this is a Linux Foundation project, and it's funded by members. There are two types of paid members: premier members and general members. And there are also supporters. The main difference between a premier member and a general member is that a premier member gets a board seat automatically once they become a premier member. Here's a list of events we plan to do for the rest of this year. Because of the pandemic situation, some of the events may change, and they may become virtual. So right now this is the current plan, but we'll see what happens. If you're interested, you are also welcome to participate in some of the events we plan to take part in for the rest of this year. Next slide, please. Oh, so that's the end of it. So thank you very much, all, for joining our mini-summit today. If you need more information, you can always look on the website or check the Soda Foundation GitHub, and you can find more information there. So that's the end of my slides.
I don't know how much time we have left. I want to thank you again for joining us today. I hope that you had a chance to learn a little here about Soda, about what we're trying to do and what Soda is about. And so now maybe we can have all the speakers on the screen and they can answer some questions.