Good morning. I'm Steven Tan, Chair of the SODA Foundation, from Futurewei. With me today, I'm honored to have Tomoko Kondo from SoftBank and Kei Kusunoki from NTT Communications, who is also the TOC Co-Chair. Today we'll be sharing what the SODA Foundation is about and how we help organizations with their data and storage challenges.

The SODA Foundation is a project chartered under the Linux Foundation. It focuses on data and storage management. The project was launched in June of last year, and there are three missions that we are trying to achieve: to foster an ecosystem of open source data and storage software for open data autonomy, to offer a neutral forum for cross-project collaboration and integration, and to provide end users with quality end-to-end solutions.

The organizations you see on the screen are supporting the foundation, and you can see that among the members and supporters there are many Japanese companies. Besides that, we also have partnerships with various other organizations like SNIA, the Storage Networking Industry Association, the Open Infrastructure Foundation, and so on. And within Japan itself, we are also honored to have a partnership with the Japan Data Storage Forum.

The SODA Foundation is end-user focused: end users drive the SODA roadmap and provide guidance on direction and opportunities. The SODA projects are designed for end users, and the SODA end users represent the largest organizations and most innovative companies in the world. Together they guide SODA by providing real-world use cases and requirements, and SODA provides a neutral forum for them to address data and storage challenges together.

So I'm going to discuss two key challenges that organizations face today. The first is that infrastructure is complex. What do I mean by that? Infrastructure often spans multiple data centers, multiple clouds, and sometimes the edge.
And this creates challenges in monitoring and control of data and storage. The challenges include capacity optimization, where redundant or obsolete data takes up storage that is not optimized; storing data in the right place for application performance; and identifying critical data that needs to be protected or secured. These are just a few of the challenges, and there are many more that we could go on about.

We believe the solution lies in the Open Data Framework that we have built. It provides a centralized view and control, connecting storage to containers, VMs, and other platforms. This framework provides data services such as block, file, and object storage, backup and recovery, security and compliance, and so on. So the idea is to unify data and storage across the core, the cloud, and the edge with a single framework that provides a centralized view and control.

This diagram shows the Open Data Framework architecture. The architecture provides an open API for integration with platforms and applications. It offers plugins for seamless integration with Kubernetes, OpenStack, and VMware, and uses storage profiles for policy-based storage provisioning, data protection, data lifecycle, and other data management functions. Storage connects to the Dock through CSI, Swordfish, and the OpenStack Manila and Cinder interfaces, and performance can be monitored together using the SODA dashboard. The multi-cloud controller offers access to the various cloud services through a single S3 interface, enabling operations such as cloud tiering, cloud backup, and so on. With the Open Data Framework, users can easily build end-to-end solutions with any storage, any platform, and any cloud.

The Open Data Framework connects with the other projects in the SODA ecosystem that you see here to provide end-to-end solutions. Let's go briefly through what each of these projects is about. YIG is a project developed by China Unicom.
It is a massively scalable object storage that can scale to exabyte level using Ceph clusters on its backend. This project is in production use at China Unicom, storing a few petabytes of data at the moment. Next is the DAOS project, developed by Intel. DAOS stands for Distributed Asynchronous Object Storage. It leverages NVM technology to provide high-bandwidth and high-IOPS storage to containers and applications. LINSTOR is a project developed by LINBIT. LINSTOR makes building, running, and controlling block storage simple, with support for Kubernetes, OpenNebula, OpenStack, and OpenShift. OpenEBS is container attached storage providing dynamic persistent volumes, designed for cloud-native environments. OpenEBS is a CNCF sandbox project developed by MayaData, which has recently been acquired by DataCore. Zenko is a project developed by Scality. It is open-source infrastructure software designed to control data in multi-cloud environments using a single S3 interface. CORTX is an open-source distributed object storage designed for great efficiency, massive capacity, and high HDD utilization. It supports HDD, SSD, and NVM. CORTX is designed by Seagate. For all of these projects that I've just introduced in the SODA ecosystem, you can go to the SODA Foundation website to find out more about them.

The second challenge many organizations are facing is the explosion of data growth. What do I mean by that? Back in 2010, the whole world created two zettabytes of data. Now, in 2020 and 2021, we are at around 50 zettabytes, which is 25x from 10 years ago. And within the next few years, this is expected to more than triple, to 175 zettabytes. So how is this data generated? Who creates it? How fast is it growing? You can see from this chart, which shows what's happening every minute of every day.
Most of these are popular applications that we use every day. Organizations may not necessarily be dealing with this particular data, but they have their own data to deal with from their AI and machine learning applications, their big data applications, their enterprise software, and so on.

So where does all this data sit? The reality is that with so much data, data is scattered everywhere. Some may be stored in Google Cloud or AWS, and some may be in a data center in London, San Francisco, and so on. This creates a bunch of problems. Google Cloud and AWS use different interfaces, and it is different again if you need to connect to a data center. The data in each of these locations forms a data silo, so it's hard to extract value from or analyze data across all these silos. There are a lot of unnecessary data transfers, because with data in different locations you may have to gather it somewhere central. And it's difficult to secure and govern the data across different clouds and locations. The list goes on.

So how do you solve these problems? The solution is a virtual data lake. The benefit of a virtual data lake is that it allows all these applications to connect to the data using a single common interface. It provides a unified view of the data from all the different sources. Data stays in place, and it allows better security and governance because everything goes through a single common interface. With the virtual data lake, integration and deployment of applications will be faster, because you don't need to connect to different mechanisms every time you want to deploy something. And there's going to be better performance because of shorter latencies, and it's going to be highly versatile and more scalable.
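As an editorial illustration of the virtual data lake idea described above (one common interface, a unified view, data staying in place), here is a minimal Python sketch. It is not SODA code: the class names `Backend` and `VirtualDataLake`, the backend names, and the object keys are all hypothetical, and in-memory dictionaries stand in for real clouds and data centers.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Backend:
    """Hypothetical stand-in for one storage location (a cloud or a data center)."""
    name: str
    objects: dict[str, bytes] = field(default_factory=dict)


class VirtualDataLake:
    """Illustrative facade: one interface over many backends, data stays in place."""

    def __init__(self) -> None:
        self._backends: dict[str, Backend] = {}
        # Global metadata: object key -> name of the backend that holds it.
        self._catalog: dict[str, str] = {}

    def register(self, backend: Backend) -> None:
        """Index a backend's objects without copying any data."""
        self._backends[backend.name] = backend
        for key in backend.objects:
            self._catalog[key] = backend.name

    def list_objects(self) -> list[tuple[str, str]]:
        """Unified view: every object with its location, sorted by key."""
        return sorted(self._catalog.items())

    def get(self, key: str) -> bytes:
        """Read through the common interface; the caller never names a backend."""
        return self._backends[self._catalog[key]].objects[key]


lake = VirtualDataLake()
lake.register(Backend("gcp", {"logs/app.log": b"gcp log"}))
lake.register(Backend("aws", {"sales/2021.csv": b"aws csv"}))
lake.register(Backend("london-dc", {"hr/staff.json": b"{}"}))
```

An application would call `lake.get("sales/2021.csv")` without knowing that the object lives in the `aws` backend, which is the property the talk attributes to a single common interface.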
So we have started a SODA Lake project discussion in the SODA community, and this is a high-level blueprint of what SODA Lake is about. It provides a common S3 interface for the different applications. There's a global metadata store to hold the information about the data from the different clouds. There's going to be a search engine to provide data search and discovery, a cache engine to speed up performance, identity and IAM for access control and compliance, and metrics for audit and monitoring. The backend is going to be highly scalable, supporting heterogeneous storage and providing connectivity to any cloud. If you're interested in participating in this project, please join our discussion on the SODA Foundation Slack, in the SODA Lake channel.

With this, this is the end of my talk. Next, we'll have Kondo-san introducing the SODA data lake use case, and after that, Kusunoki-san will introduce another project use case and also some of the activities in the Japan community. Thank you.

Hello, everyone. Now, let us begin. Thank you for your time today. I'm Tomoko Kondo from SoftBank. I'd like to talk about a virtual data lake using SODA Lake. Let's go to my slide.

First, let me introduce myself. I work in the cloud engineering division and the enterprise product and business division. I'm in charge of developing solutions and promoting open source strategy as a cloud engineer, and of developing solution patterns for each industry's challenges as a business planner. My first encounter with open source was GNU/Linux, and Linux has been my favorite operating system since the moment I met it. I'm looking forward to joining the Linux Foundation and participating in the SODA project.

Second, I'd like to introduce SoftBank. We have three basic strengths: the first is network, the second is data centers, and the third is access devices. SmartVPN is our main closed network. SmartVPN connects to partner clouds through direct access.
SmartVPN connects to SoftBank data centers. SmartVPN has connecting interfaces to the internet backbone and the mobile network backbone. And SmartVPN connects to customer data centers by using access devices. We can easily connect to the cloud world. We can connect everything with the network. We can connect users to various channels and services through the network. And of course, we can access a lot of data in the world.

Our mission is to solve business and social challenges with digital transformation. We support achieving digital transformation and provide the infrastructure required for digitalization. The first area is communication, the second is digital automation, the third is marketing, and the last is security. We solve social challenges through co-creation with businesses and public agencies.

There are several use cases. Today I'd like to talk about smart buildings. Our headquarters is a model smart building, with over 1,000 sensors installed in the building. Based on the collected data, cleaning robots operate, applications work together, and the environment is automatically adjusted to make it easier for people to spend time there. When I want to eat sandwiches while working overtime, I order them at a convenience store, and then a cute delivery robot brings me the sandwiches. I am enjoying it every day. We continuously collect a lot of data from every place. Data gives us a comfortable environment and evolution.

Now I'd like to explain our motivation for SODA. How data is used is the key to all evolution. Everyone wants to be able to easily access scattered data from anywhere using the same interface while keeping it secure. I believe SODA is the architecture with the highest expectations for achieving this.

Let's go to the next slide. This is a very important slide. I'd like to talk about the virtual data lake use case. This use case has two points. One is the SODA Lake controller and the SODA Lake agent. The other is using the carrier's network and the carrier's data center.
Customers can take advantage of a customer-specific virtual data lake from anywhere by simply installing a SODA Lake agent into their application and connecting to a SODA Lake controller in the carrier data center. Let's look at the diagram on this slide. We supply a SODA Lake controller for each customer in the carrier data center. Users can download the SODA Lake agent from their SODA Lake controller. For example, customer A has object storage in their satellite office, AWS, and Azure. The SODA Lake controller manages these storages as a single virtual data lake. The customer's users can use their own virtual data lake from any place by simply installing the SODA Lake agent into their application. In addition to SODA Lake's IAM and access control features, carrier data centers can connect cloud storage and customer data centers with a closed network. This means that users can take advantage of the virtual data lake in a private network. So we can implement a secure virtual data lake by using SODA Lake together with the carrier's network and the carrier's data center. I believe this will become a very useful use case.

This is the summary page. By providing virtual data lake services implemented with SODA Lake by carriers, we believe we can further accelerate large-scale data transformation. There are many similar services, so we need to have a deep discussion on whether this hypothesis is really a good use case for SODA Lake. We also need to work out technical specifications for actual implementation from various viewpoints, for example latency, speed, a comfortable user interface, and so on. First, we want to validate the usefulness of this hypothesis, and our goal is to be able to start a PoC together as soon as possible. Thank you very much.

Okay, let's start. Hello, Open Source Summit Japan participants. I'm Kei Kusunoki, Technical Oversight Committee co-chair at the SODA Foundation. I also work at NTT Communications as an infrastructure engineer.
Today I'd like to talk briefly about the Japan community. So let's go to my slide. First of all, let me introduce myself. I'm Kei Kusunoki. I joined the SODA Foundation in 2020, and I'm contributing to SODA through many technical discussions, such as architecture and basic technical requirements from the service provider's perspective. I'm also leading the Japan community, and that's why I'd like to show our activity in Japan today. I'm also an engineer, so I work on R&D matters related to storage and cloud, and I developed the private cloud storage services at NTT Communications. So this is me.

Before the Japan community introduction, let me introduce NTT Communications' motivation for SODA. NTT Communications provides the Smart Data Platform. It's a very complicated platform, a kind of microservice architecture. As you can see in the diagram, we offer private cloud, and public clouds through our partners such as Google and Azure. We offer storage services. And basically NTT is a network service provider, so of course we provide many network services through the data center network and the closed network, the wired network, and also the internet and the mobile network. Based on this infrastructure, we offer many solutions and applications in our Smart Data Platform. Our end-user customers can use many solutions through these applications, such as ID service, management services, subscription services, AI services, voice services, DX, and so on. Besides that, we offer orchestration services, managed services, and security services.

In this platform, we need open data and storage management by SODA. Currently, our data and storage management is an original one: we wrote this closed data management software in-house.
So of course, the API is a closed one, and our end users cannot use it. This closed API also cannot be used from other platforms such as VMware, OpenStack, and Kubernetes. That is a really huge disadvantage in the current cloud era, so we really want this kind of open data and storage management. If we can use SODA in our data platform, SODA has good capabilities for these environments, such as VMware and OpenStack. SODA can also use multiple storage services: iSCSI, NFS, and object. And SODA has, to some extent, interoperability with traditional storage appliances. We are using storage appliances such as NetApp. So we need this kind of open storage controller in our platform.

The image is this: if we can use SODA beside our storage here, our API and our applications can use SODA through the open API, and our storage orchestrator, our provisioning orchestrator, can also use SODA through the open API. This enables us to accelerate our system development, so it's really helpful for reducing our development costs. And end users can enjoy our storage service through the SODA API. It's a really good benefit for us and our end users. So this is our motivation.

Next, I'd like to show the Japan community. As you can see in the photos, the Japan community has many official members: Toyota, Yahoo Japan, Fujitsu, Sony, Seagate, and us, NTT Communications. And we also have IBM and SoftBank. We are discussing many actual use cases with these members. Before COVID-19, we had many meetup events in Japan, and in those events we had many technical discussions about concrete use cases and technical requirements from the end-user perspective. Through these online and offline discussions, we aggregated technical requirements, and we provide these requirements, use cases, scenarios, and solutions to the SODA community.
So that is a brief explanation of our Japan community. Here is a memorial photo from the event two years ago. In 2019, just before the COVID-19 pandemic, we had a local meetup event in Japan, with many participants from around the globe. And next year we are going to have an upcoming event in Japan, so I'm really looking forward to seeing you next year.

Up to here, I showed a brief explanation of our Japan community. Now I'd like to show a typical use case from Yahoo Japan. Yahoo Japan is a large service provider in Japan, and it has a concrete use case for SODA. Here is their overview image of the SODA use case. At Yahoo Japan, storage users request daily storage operations: maybe a storage user wants to make a volume or create a bucket. In this system, the storage user submits a request ticket. Currently, a ticketing system deals with this request, and the storage admin performs this kind of daily routine operation following the request ticket. But they want to replace this ticketing system with SODA. If SODA can work with their in-house software user interface, SODA can work with this system through the open API. So once a storage user requests a volume creation, for example, SODA can automatically provision the volume on their on-premise file storage and object storage. This is the overview of Yahoo Japan's use case.

Now let me introduce a concrete user scenario for object storage. If an end user wants to put an object to SODA, SODA exposes just a single endpoint for end users. The end user knows only the single SODA endpoint and puts the object to it. This SODA endpoint is one component of SODA multi-cloud. This SODA multi-cloud component deals with the object following the pre-configured policy.
The Yahoo Japan storage admin can configure these policies in SODA. Following the policy, this object D will be forwarded to storage B. Storage B has a direct endpoint, and SODA will put object D to storage B's direct endpoint. But the real end user doesn't see this direct endpoint of storage B. The benefit is that users don't need to care about this kind of endpoint matter. If the end user uploaded the object directly to the direct endpoint, the end user would need to care about the direct endpoint. The storage admin may want to replace storage for some reason in the upcoming year, or maybe 10 years later, and when the storage admin renews the storage, the direct endpoint has to change. But end users don't want to care about that; they don't want to change their code. So if SODA can abstract the actual storage endpoint layer, the real end user doesn't need to care about this endpoint matter. That is the main benefit of this solution.

Okay, recent activity. To achieve this concrete use case scenario, we had many technical discussions among the Yahoo Japan team, the Japan community support team, and the SODA development community team. Through these discussions, we found many challenging matters. For example, how SODA multi-cloud handles object storage endpoints, between the actual SODA endpoint and the actual on-premise object storage. There is also the authentication matter, which is very complicated. Yahoo Japan needs to achieve multi-tenancy, so SODA needs to care about this kind of multi-tenancy: each user needs to belong to a tenant, and at this tenant level SODA needs to deal with the object storage secret key. These kinds of technical challenges were blocking items, so we had many technical discussions to resolve these issues.
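As an editorial illustration of the single-endpoint scenario described above (one SODA endpoint, an admin-configured policy, object D forwarded to storage B), here is a minimal Python sketch. It is not the SODA multi-cloud API: the class names `ObjectStore` and `SingleEndpoint`, the method names, the bucket name, and the `example.invalid` URLs are all hypothetical, and storage is simulated in memory.

```python
from __future__ import annotations


class ObjectStore:
    """Hypothetical on-premise object storage with its own direct endpoint."""

    def __init__(self, direct_endpoint: str) -> None:
        self.direct_endpoint = direct_endpoint
        self.objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data


class SingleEndpoint:
    """Illustrative gateway: users see one endpoint, policies pick the backend."""

    def __init__(self, url: str) -> None:
        self.url = url
        self._backends: dict[str, ObjectStore] = {}
        self._policy: dict[str, str] = {}  # bucket name -> backend name

    def set_backend(self, name: str, store: ObjectStore) -> None:
        # The storage admin calls this again when the hardware is replaced.
        self._backends[name] = store

    def set_policy(self, bucket: str, backend_name: str) -> None:
        self._policy[bucket] = backend_name

    def put_object(self, bucket: str, key: str, data: bytes) -> str:
        """Forward the object per policy; return the direct endpoint used."""
        store = self._backends[self._policy[bucket]]
        store.put(f"{bucket}/{key}", data)
        return store.direct_endpoint


gateway = SingleEndpoint("https://soda.example.invalid")
gateway.set_backend("storage-b", ObjectStore("https://storage-b-old.example.invalid"))
gateway.set_policy("reports", "storage-b")
first = gateway.put_object("reports", "object-d", b"payload")

# The admin renews storage B; the end user's call below is unchanged.
gateway.set_backend("storage-b", ObjectStore("https://storage-b-new.example.invalid"))
second = gateway.put_object("reports", "object-d", b"payload")
```

The end user's `put_object` call is identical before and after the storage replacement; only the admin-side `set_backend` call changes, which mirrors the benefit described in the talk. The sketch leaves out the tenant-level secret key handling mentioned as a blocking item.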
We already did this PoC, and as a result, the Yahoo team confirmed this use case with the SODA Jerba release, which is a previous release. So this is the recent activity of the SODA Japan community.

Up to here, I showed a brief explanation of the Japan community. Finally, I'd like to promote our event. We're going to have a virtual hackathon event named SODACODE. It's the first coding event by the SODA Foundation, and we are expecting many challenging development topics in the SODA project. So please visit this URL, see the information about this event, and join the event. That's all from my side. Thank you so much for listening to my presentation. Thank you. Bye.