Okay, I think we can start now. Thanks for joining this session; I hope you had a good coffee and enjoyed the nice Japanese snacks. Today I would like to talk about our work related to open source projects at NTT DATA. Unfortunately the title sounds a little boring, so I will introduce ourselves quickly first and then try to focus on the latter part, about some interesting projects we are working on, which I hope will sound more interesting to those of you attending this session.

Before moving on, let me introduce myself quickly. I'm Takashi Kajinami. You can find my activities on GitHub, and I'm mostly available in IRC channels on OFTC. I joined NTT DATA a few months ago, in October this year. My title may sound a little grand, but I'm really a software engineer working in the cloud space, and I'm now working on a project about confidential computing, which I will talk about later. My earlier focus has been OpenStack. Unfortunately it's not a Linux Foundation project, but I have been working in multiple open source projects, not only OpenStack but also things like Puppet and Ansible; mainly cloud and deployment tools and frameworks have been my focus.

Before we move on to the technical part, I would like to introduce how we relate to open source and how NTT DATA works on open source projects. I belong to a team called OSS Professional Services. We have two focuses in our team. The first one is technical support for infrastructure technologies based on open source software. I listed several examples here, like core Linux components and Apache projects; we have activities mostly in Apache projects like Hadoop and Spark, and we have also been focusing on other core projects like PostgreSQL and OpenJDK. We provide solutions as well as support to our customers, to resolve the problems they hit when using this software in actual business situations and production services. In addition to hands-on technical support, we are also working on R&D projects, so we invest in new, emerging open source based technologies like the Green Software Foundation or confidential computing, which I will explain later.

This chart explains our history. We started from a quite small core part: you might know this, but TOMOYO Linux is a security module in Linux, and that work started in the quite early days. We then expanded the scope from core Linux security to other middleware; at the top you see several Apache projects like Hadoop, Spark and even Kafka, and PostgreSQL has also been a long-term focus of ours. One interesting thing is Hinemos, which is monitoring software; we developed it and published it as open source, so you can download it and use it in your own projects, of course, if you're interested.

Our core strategy is this: I think open source is commonly used and is at the core of almost any business situation nowadays.
But we think the very important thing is not only using open source but also contributing back to it, so that we can sustain the whole ecosystem, to sustain our business as well as the community. We are trying to be a kind of catalyst between the upstream open source communities and technologies and the actual customers who are facing business problems. Our main activity is providing open source technologies, such as the Apache projects or PostgreSQL, to customers who are looking for a solution for their big data use case, or for a high-performance or highly fault-tolerant database for huge data. At the same time, we try to contribute our learnings back upstream: we submit bug fixes for problems we find in real situations, and we even contribute new features to support new use cases, so that we expand the open source community while helping the actual business of the customers using these technologies.

This next part is a bit of a promotional slide, but we have several contributors to open source. As I said, we work in multiple open source projects, and we make actual contributions that are recognized, with some of our people being core developers. I should have said this at the beginning, but the slides are available, so you can download and read them if you are interested.

Today I will talk about our recent projects. As I said, we have been focusing on many projects, so there are several interesting ones going on. We focus on PostgreSQL and OpenJDK, and some Apache projects like Hadoop, Spark and Kafka have been our focus, but in addition to these core technologies we are also trying to expand the ecosystem around this software, so that these core technologies can be more usable and useful for users. I would like to introduce these projects, which are of course open source; I will explain some challenges we are trying to resolve and the benefits we expect from the open source projects we are working on. Beyond these core technologies, we are also working on some emerging technologies to realize future use cases, so I will talk about that as well.

The first one is big data platform management. As I explained earlier, we have been working on several Apache projects like Apache Hadoop, Spark and Kafka, which allow users to collect and process huge amounts of data efficiently. But this is quite usual now: this kind of big data processing is very common. It was cutting edge five or ten years ago, but now people think of big data processing as an essential, common thing, and it actually drives many parts of the business. The problem is that there are several technologies you can use, and I think it is a common pain point for some open source software that properly maintaining a platform built on these technologies is sometimes difficult, because, for example, in the Apache ecosystem there are quite a few projects, like Hadoop, Spark and HBase.
You have to select and install multiple of these projects to meet your business or system requirements. You may sometimes need to install them from source, by building the Java artifacts yourself, or you might find some packages, but those might not be really up to date, so you cannot use the feature you need for your case. Also, because these services have several cross dependencies, you have to be careful about version compatibility between them. Some vendors might provide good distributions, but if you try to add more and more components, you may end up managing multiple vendors providing multiple products, which can mess up your whole environment.

To resolve this situation, we are working on a project called Apache Bigtop. This is a project within Apache, related to these data processing projects, but its focus is a little different from the core projects, I would say: the main focus is to provide easy installation of these data processing platforms, so that users can use them more flexibly and easily. In this project we build packages, for example the usual packages like RPMs or deb files, so that you can install things quite easily with yum or apt-get. We also provide containers, so that you can deploy these services more easily, without going to GitHub, downloading the code and building it yourself. The project also has several test patterns, and we run continuous testing, so you can find a tested combination of versions across multiple projects. If you deploy multiple services, like HBase and Hadoop, you get a tested set of versions, which helps users select the versions to install in their environment. And of course this is all fully open source: the sources used to create the builds are open source, and the manifests used to build the RPM and deb packages are open source too, so you don't really need to care about which vendor is providing them.

The second one is data spaces. This is a somewhat vague concept, but it is gathering more and more interest these days, especially in Europe. The concept of a data space is about data utilization across companies and industries. For example, it is quite useful and important for gathering information across a whole supply chain. In the current situation there are several companies involved in the whole supply chain, so it is not always easy to gather the whole picture: if you are interested in the energy consumption or the carbon output of your whole activities, you need to gather information across companies and industries. Usually this data is stored locally in the individual company or industry, but this activity aims to gather all of this data, not in one central place exactly, but by sharing the data among companies and industries, so that we can get the output and analyze the situation more widely and correctly.
The challenge with this is that, as I said, data will stay stored locally, because not all data can be shared with other people: some data might contain private information and some data might be business critical. So if you are trying to achieve this kind of data sharing, you have to be careful about policy as well. Another important point is to create consistent interfaces between these data stores, so that you don't need to implement one-off software just to connect database A to database B, and B to C. Several open source projects have been launched recently to support these activities, mainly in Europe, but I would like to list the two major projects we are currently working on.

The first one is Apache Camel. This is more of a technical thing compared to the second one I will explain later. Camel defines data flows between systems and data sources, so it can connect multiple systems, consuming data and providing data. This lets you define a flow from, for example, industry A to industry B, so that industry B can refer to industry A's data and do more flexible, new business analysis. The core part of Camel is that it handles message protocols and flows defined as routes. Using this, you can define very flexible data input and output formats, and also implement data processing policies, so that you allow access only to the specific data you want others to view; you can even convert some data, or hide some data, by processing it inside the flows, between the routes. So it defines flexible but reasonably secure flows between data sources for data sharing (a minimal example route is sketched below).

The second one is a wider project, not just a single component; it contains many things now. This is Eclipse Dataspace Components. It was initially built as a project for connectors between multiple data sources, but it is now expanding its scope after the publication of the Dataspace Protocol, which you can find at the link at the bottom. I won't explain the details because it is quite long, but there is work to implement that common protocol and the tooling to do data sharing based on it. Both of these are open source, and we are currently working in these projects, building some PoCs and also contributing back.

The third one is a little simpler, I hope: database management. As I said, PostgreSQL has been one of our main focus projects for quite a long time. Databases have been essential in systems, but the management of PostgreSQL, or any database, especially a production-ready database, is still complicated. As you know, many cloud vendors like AWS, Google and others already provide database-as-a-service offerings, but these are not really open, so they cannot be used outside of their clouds. We are trying to implement something similar but more open, based on open source software, so that it can be used in more environments, even on your own premises or in some kind of managed cloud. To achieve this, we are currently working on a project using PGO, a PostgreSQL operator.
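Going back to Camel for a moment, here is a minimal sketch of what such a route can look like in Camel's Java DSL. This is only an illustration, not code from one of our projects: the endpoint URIs, the "shareable" field and the JSON handling are placeholder assumptions, and it assumes camel-core, camel-jackson and camel-http are on the classpath.

```java
import org.apache.camel.CamelContext;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.impl.DefaultCamelContext;

// Illustrative Camel route: read records published by "industry A", keep only
// the records that are allowed to be shared, and pass them on to "industry B".
// All endpoint URIs and the "shareable" flag are hypothetical.
public class DataSharingRoute {
    public static void main(String[] args) throws Exception {
        CamelContext context = new DefaultCamelContext();
        context.addRoutes(new RouteBuilder() {
            @Override
            public void configure() {
                from("file:data/industry-a?noop=true")            // data source (placeholder)
                    .unmarshal().json()                           // parse each JSON document into a Map
                    .filter(simple("${body[shareable]} == true")) // policy: forward only shareable records
                    .marshal().json()                             // serialize back to JSON for the consumer
                    .to("http://industry-b.example.org/ingest");  // data consumer (placeholder)
            }
        });
        context.start();        // start consuming files and forwarding the allowed records
        Thread.sleep(10_000);   // keep the route running briefly for this demo
        context.stop();
    }
}
```

The point is that the sharing policy and the format conversion live in the route itself, so the same pattern works regardless of which concrete systems sit at either end.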
Coming back to PGO: as I said, this is a Kubernetes operator that manages, and automates the management of, PostgreSQL running in containers on Kubernetes. Like usual Kubernetes operators, it provides automation based on containers, so you can run it on top of Kubernetes without any customization. A very good point of this project is that it does not focus only on PostgreSQL itself but also provides several management features, like a monitoring stack, backup features, and additional components for high availability. So it allows you to build PostgreSQL with management features included, meaning you can set up PostgreSQL for production very quickly, with easy management.

The last topic is confidential computing. This is the project I'm going to work on, so I would like to talk a little longer about it. Did anyone in the room attend the session yesterday about confidential computing? Okay, so this may still be a new thing for you, so I will spend a bit more time on it. Confidential computing is a very new but actively developed set of features and technologies in the cloud space. I think most of the major clouds, the hyperscalers, already provide their own confidential computing, but it is still being actively developed, so you will see several updates in the coming few years, I think.

The core objective of confidential computing is to protect data processed in the cloud from the cloud vendor itself. If you put your data in the cloud and run your workload there, you usually have to trust the cloud vendor, because your data is there, and while it is being processed it is usually not encrypted. There are features to encrypt data in transit, meaning network traffic, using technologies like IPsec or TLS, and you can encrypt data on disk using the usual disk encryption features like LUKS, but protecting data while it is being processed was not really possible, because data has to be decrypted to be processed. Recently, however, CPU vendors have started providing features to hide the data being processed at the host level, so that you can process your data on the hardware but hide it from other processes, and even from the people maintaining the hardware.
Confidential computing uses these technologies so that you can encrypt the whole workload in the cloud: not only data at rest, meaning data on disk, and data in transit, meaning data on the network, but also data in use, which is the data stored in VM memory and being processed by the CPU. The other core part of confidential computing is this: you may ask the cloud to launch your VM with memory encryption enabled, but since you do not trust the cloud vendor, you cannot simply trust that they properly set up your virtual machine with encryption enabled. You need some way to make sure, to attest, that your workload is actually hidden from the cloud vendor. This is quite complicated, because you are running the workload remotely yet still have to make sure your virtual machine is running with the proper configuration, with encryption enabled. So there is another technology here called attestation, implemented by some chip vendors, which uses capabilities built into the processors or other chips to get a kind of fingerprint of the configuration and setup of the virtual machine, to ensure that it is being executed in a secure environment. This allows you to confirm that your virtual machine is running with proper encryption enabled, so you can be sure your workload is encrypted and hidden from the cloud provider without trusting the cloud vendor, trusting only the chip vendor, like AMD, Intel or others.

This is still a new technology, so while some of the features are already generally available in the processors, there is still ongoing work to implement them in software, especially open source software. I would like to introduce several projects here. The first one is VirTEE, which provides a tool set for virtualization-based TEEs. As I said, this capability is usually built into the processors, and many chip vendors, like AMD, Intel and Arm (and the vendors building Arm chips), have their own implementations, but they are gathering to provide common tooling across them. One example of the underlying CPU feature is AMD SEV; we believe this one is a bit more advanced, so we are currently looking at it heavily. AMD SEV allows you to encrypt the VM memory and also protect the VM memory from host activity, so you can ensure your data and workload are encrypted and free from injection or attack from the host side. This CPU feature also allows you to generate an attestation report, the kind of fingerprint that lets you verify your execution environment. These kinds of features resolve the two challenges for implementing a confidential computing cloud.

There are several other projects, because this is an emerging area and many projects are being started. The third interesting project is Confidential Containers, which is not directly under the Linux Foundation but is a new sandbox project under the CNCF. The main focus of this project is containers instead of virtual machines, so they do work to bring these features to the container space. There is a project called Kata Containers, which implements a container runtime based on virtual machine technology, and since version 3 they have already released features enabling this kind of confidential computing using AMD SEV and a few other CPU features.

Talking about OpenStack, the project I have been working on for a while: OpenStack is, I think, a very common solution nowadays for building a cloud platform, especially a private cloud, and it already supports the first generation of SEV-enabled guests, so you can launch a VM with its memory encrypted, using libvirt and KVM and the CPU feature, AMD SEV. Also, I noticed I didn't include this, but there is an organization called the Confidential Computing Consortium, under the Linux Foundation, which is also working on several layers of these technologies to build the whole confidential computing stack. This has not really reached the level of a standard yet; each cloud service is implementing these features, and some vendors are gathering to create common open source, so we are also trying to bring our use cases in, to build a generally available, common solution for a fully open source cloud with these kinds of features enabled.

I talked a lot about this, so I hope you found some interesting topics in it. Just to summarize the slides: we think open source is essential for business, and we have been, and will be, committed to contributing to open source, to sustain the community and to realize interesting things. As I said, our initial focus started with the core technologies, but we are trying to expand our scope to emerging technologies as well as ecosystem improvements. Personally, I think there will be quite good opportunities in the open source world, and these will be quite interesting for many others as well. We also welcome anyone: if you are interested in any of the projects I explained today, we welcome anyone working or collaborating with us. There are several other projects I didn't cover. Unfortunately the booth is already closed (I was about to say please come to our booth, but it's already closed), so if you have any interest in any of these projects or related work, please feel free to ask questions or reach out to me personally. I think that's all.

I didn't know much about your open source efforts, but it seems to me that big data is a huge interest for your company and confidential computing is going to be the new one. Since you say you work with customers and feed back to the community, why don't you have more versatile technologies or solutions based on open source software, like security by default? Security is a big interest for customers, right? But confidential computing is a pretty niche area of technology in itself. Why is NTT DATA focusing on big data and confidential computing rather than other areas, say Kubernetes, WebAssembly, programming languages and so on? That's my curiosity.

Yeah, I think there are still many interesting projects and interesting technologies in the world.
Honestly speaking, we are a big company, but these choices are based on our current strengths. We invested early in databases and big data processing, so we have several members who have been actively involved in those communities. That is why we decided to focus there, because we can properly sustain the group and its contributions. And for the cloud side, that is one of the reasons I joined NTT DATA, to make things smoother. We think the very important thing is to make sure we have a feedback loop and good involvement in the community, so that we use open source technologies while sustaining the community. It is a little difficult to keep that balance properly across many areas at once, so we are focusing on these two for now, in addition to the database-as-a-service work, but in the future we might be able to focus on other areas too, with the proper resources and people assigned.

Please speak into the mic.

Is there anyone in your company who decides which areas you are going to invest in and put people, engineers, onto, or is it more of a grassroots, bottom-up thing? That's my question.

Yeah, we do provide some services ourselves, but mainly we work with customers: our main business is providing systems to customers. We mainly talk with customers to gather new requirements and pain points, and we use that to decide new directions. And because we have been focusing on infrastructure technologies so far, the focus will stay within infrastructure technology, but based on feedback from the actual business field.

So in the past, I was pretty involved in a lot of the projects that y'all are involved with, in America. So I'm kind of curious: one of the common complaints, at least four or five years ago, was that a lot of the projects y'all are involved with are dominated by committers from certain companies. Hive would be dominated by Facebook, Hadoop had a lot of Yahoo, and Spark, I'm not sure, Databricks. Basically, a small number of companies kind of had a stranglehold. Now, obviously you have a number of committers and PMC members, which is really cool. So I'm just kind of curious: what has your experience been, being deeply involved in these communities but not in America, not in San Francisco, not at Facebook, but in Japan? I'd love to hear about that.

Yeah, I think the key point is to have a good balance between the business, your own activity, and the upstream activity. The challenge is mostly about how to make good contributions and build relationships within the community. I think the key is to share your use case and your situation very openly with the community, so that the community understands what you're doing; that then allows you to collaborate. As an open source maintainer myself, having maintained multiple open source projects, I know it's a little difficult to understand what a contributor is trying to achieve from just a single patch, for example.
So if you know the use case and the whole pain point, then you can understand the use case and see how to collaborate better. So rather than only using the open source software locally, be more actively involved with the upstream developers, even when a single company is essentially maintaining that project. Also, I know it's a little difficult, but having a medium or long term plan for the software, so that you can build trust around your contributions in the community, is quite important; it lets you be involved in the community at a kind of core level and do more flexible things, with smoother collaboration with the existing core members.

I'm not saying that wasn't an answer; I mean, it's a hard question to answer, right? I guess maybe a follow-up would be: in your experience with these projects, given the engagement that you're describing, which I think is indeed what you have to do, have you been able to get features that you felt you really needed prioritized? Has that been an issue? Have you been able to drive the things that you need? Especially since Hadoop and Spark are pretty big, pretty important projects.

Yeah, I think so. Of course we had some difficulties caused by conflicts, so I won't say everything went smoothly, but some of the prioritization was successful, I would say. It's a little difficult; ideally you get everything you want merged upstream, and we are trying to make that smoother, but the reality is that most of it has worked for us, although even for us some parts have been hard ones.

Thank you.

So I think we are running out of time. Thank you.

Just quickly: does NTT have guidance for your top contributors on how much time they spend trying to help the community's general interests versus how much time they spend trying to specifically contribute code that might bring a feature relevant to your customers?

We don't have very specific guidelines, but what we are trying to achieve, our strategy, is to be aligned with the community. These are mostly individual decisions; we don't have numbers like 30% contribution or anything like that, but individual engineers try to be involved in the community, and as a whole team we try to stay aligned with the community. Thank you.