Hello everyone. Welcome to the webinar. It is time to talk about DataMesh, a distributed data management solution for microservices, presented by Fred and Kenny from Broadbridge Corporation.

Let's look at the microservices challenge first. Usually there are four phases in introducing a microservices architecture: analyze, decouple, aggregate, and implement. The first thing we must keep in mind is the data: it is the key problem in current microservices architecture implementations. After several years of hard work, the industry and customers, with help from consultants, have achieved many results in application development and logic design. We have learned many design and development methods, such as domain-driven design, splitting methods, design patterns, and so on. The reason the microservices architecture is still not fully realized is that the main problems occur in data decoupling and fragmented data management.

So we need a solution, a low-risk solution. The most frightening thing for any company is moving data: the risks are high and responsibilities are difficult to clarify. But you have to face the data problems; otherwise the microservices architecture will never be implemented. The solution we are looking for is simple: "Just don't touch my database, and you can do whatever you like." So we need a data supply planning strategy for microservices.

Now, think of the problems we may encounter while transforming from a shared database to database-per-service. Yes, we all talk about database-per-service in the microservices architecture. How do we decouple existing data with low risk? How do we avoid performance impact, such as network round-trip issues and data exchange across services? It cannot be done in one step. So what is the transformation plan? What is the data management transformation plan? Here we come to the practical issues of database-per-service.
A large number of cross-service queries for linked data puts pressure on the network and on the services. Direct access to the databases of other services can solve some of those problems, but this anti-pattern approach will still put huge pressure on the databases. Mixing the two methods will mess up the system's data management logic; besides losing control of data access paths and security, system performance will still be a big problem. The challenge is how to keep database-per-service while still meeting the requirements of relational data and high-volume queries, and to remain service-independent without impacting the performance of other services.

This is the way to realize database-per-service. It fully follows the microservices principle of dividing and modeling systems in a business-oriented way. However, there are still some problems. First, how do we manage things when there are too many programs for data retrieval and synchronization? Every new data stream needs a new customized program, which is quite troublesome to implement. How do we ensure the stability of each data synchronization job? When you need to create a new dataset and then generate a new virtual table, the load on the source database is very heavy. And if you use event sourcing with an event store, rebuilding a dataset from the events is too slow to scale rapidly.

From this analysis we can see what decides whether the microservices architecture can be implemented smoothly. The most important keys are, first, the management capability over massive data provisioning and cache pipelines, and second, the efficiency of pipeline design and deployment. We need to build a data platform with standardized, unified management of data provisioning and routing, and a cache layer that can quickly create and restore virtual tables according to changing application requirements. That is the data mesh architecture we are talking about today.
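The event-sourcing concern above — replaying an entire event history to rebuild a dataset is too slow — is commonly addressed by pairing the event log with periodic snapshots, so a rebuild replays only the tail. A minimal sketch of that idea (the event shapes and `apply` function here are illustrative, not any product's API):

```python
# Hedged sketch: rebuild a dataset from a snapshot plus the event tail,
# instead of replaying the full event history. All names are illustrative.
events = [("set", "a", 1), ("set", "b", 2), ("set", "a", 3), ("del", "b", None)]


def apply(state, event):
    """Fold one event into the dataset state."""
    op, key, value = event
    if op == "set":
        state[key] = value
    elif op == "del":
        state.pop(key, None)
    return state


# A snapshot was taken after the first two events...
snapshot_index = 2
snapshot = {}
for e in events[:snapshot_index]:
    apply(snapshot, e)

# ...so rebuilding replays only the tail, not the whole history.
state = dict(snapshot)
for e in events[snapshot_index:]:
    apply(state, e)
```

Replaying two tail events instead of four is trivial here, but the same shape keeps rebuild time bounded when the history grows to millions of events.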
Data demand driven by applications needs to be satisfied by a large data supply chain. We could treat all databases, whether homogeneous or heterogeneous, as one huge database. The objectives are: the application does not need to know where the data comes from or how it is produced; the system behaves as one large cross-system virtual data system; the data required by the application is presented as a virtual table; data access is efficient; and there is no impact on the existing database systems. The implementation methods include data virtualization, data streaming, data snapshots, data caching, and data relay.

Why don't we use the traditional solutions? Because centralized data management cannot fulfill the needs of microservices applications. Such solutions include ETL, data virtualization, and the data warehouse or data lake. A centralized data platform and data processing architecture leads to the following drawbacks: limited flexibility in data supply and pipeline scheduling; wasted cross-region, cross-site, cross-cloud network bandwidth, with low throughput and high latency; a performance bottleneck that falls on the central data processing node; data transformation performed too early, so the data cannot be processed in parallel or cached efficiently; and a large performance impact on the data sources.

So the data mesh concept is the right choice. A distributed data management and supply architecture can meet the needs of microservices with the following characteristics: it supplies data to applications with high throughput and low latency; it has high data scheduling flexibility; it provisions data over the shortest path with the least amount of data movement; data processing work is distributed to different systems in parallel, matching the microservices architecture; data does not need to be aggregated onto a specific node, avoiding a single point of performance bottleneck; and each data cache pipeline runs independently, so pipelines do not interfere with each other.
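The "virtual table" objective above — the application reads a named table and never learns where or how the data is stored — can be sketched as a read-through cache in front of pluggable sources. Everything here (`DataMesh`, `register`, `read`) is a hypothetical illustration of the concept, not an actual DataMesh product API:

```python
# Minimal sketch of the virtual-table idea: applications read a named
# virtual table; only the pipeline knows how to reach the real source.
from typing import Callable, Dict, List


class DataMesh:
    """Map virtual table names to data sources, with a read-through cache."""

    def __init__(self) -> None:
        self._sources: Dict[str, Callable[[], List[dict]]] = {}
        self._cache: Dict[str, List[dict]] = {}

    def register(self, table: str, fetch: Callable[[], List[dict]]) -> None:
        # The pipeline, not the application, knows the source location.
        self._sources[table] = fetch

    def read(self, table: str) -> List[dict]:
        # Serve from cache; hit the source only on the first read, so
        # repeated application queries put no load on the source system.
        if table not in self._cache:
            self._cache[table] = self._sources[table]()
        return self._cache[table]


# Example: an "orders" virtual table backed by some remote system.
mesh = DataMesh()
mesh.register("orders", lambda: [{"id": 1, "total": 9.5}])
rows = mesh.read("orders")
```

The application only ever sees `mesh.read("orders")`; swapping the source from one database to another changes the registered fetch function, not the application.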
Now let's see how to build a data mesh infrastructure and how to use it. A data platform needs to serve a large number of data pipelines, directional channels, and supply requirements: using data pipelines to control the flow of data, using a data relay cache to ensure performance and throughput, using containers to achieve agile deployment that meets application requirements, and scaling pipeline performance to reduce data latency and maximize throughput.

This is a showcase of the data mesh platform. You can quickly provide data in the form of APIs or even static pages. Just three steps generate a data query API: data provisioning, caching, and building virtual tables. You can offload pressure from the data warehouse, and you can meet a large volume of external data queries through cache optimization. No coding is required; you don't need to write a single line of code.

Now let's look at some use cases. The first case is about inefficiency and delay in data exchange between new and old systems. With the traditional method, you usually have one program to dump data from the source database into a middle file, and another program to load data from the middle file into the destination database. You may have schedule A for the data query and schedule B for the data insert, so the total exchange time is A plus B. That is the major cause of the delay. With the data mesh platform, however, data exchange no longer requires writing a lot of programs: data is provided to all applications in real time with low latency. Data sources no longer need to spend manpower and time preparing data for the application side, and the data source and application sides no longer need to spend time communicating about data exchange. The problems this solves: inefficient data exchange between different business systems, and inefficient system data exchange across teams.
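The two exchange styles in the first case can be contrasted in a few lines. The traditional path runs schedule A (export to a middle file), then schedule B (import), so end-to-end delay is A plus B; a streaming pipeline forwards each change as it occurs. This is a hedged sketch with in-memory stand-ins for the source and destination databases; all names are illustrative:

```python
# Sketch: middle-file batch exchange (delay = A + B) vs. per-event streaming.
import json
import os
import tempfile

source = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]

# --- Traditional: program 1 dumps to a middle file, program 2 loads it. ---
middle = tempfile.NamedTemporaryFile("w", suffix=".json", delete=False)
json.dump(source, middle)            # schedule A: query source + export file
middle.close()
with open(middle.name) as f:         # schedule B: import file + insert rows
    destination_batch = json.load(f)
os.unlink(middle.name)

# --- Streaming: each change event updates the destination immediately, ---
# --- so there is no A + B scheduling gap between export and import.    ---
destination_stream = {}
for event in source:                 # stands in for a live change stream
    destination_stream[event["id"]] = event
```

In the batch path, a row written to the source just after schedule A runs waits a full A + B cycle before appearing downstream; in the streaming path it is forwarded as soon as the event arrives.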
The second case: inefficient batch processing and its performance impact. Batch processing causes a momentary burst of high system load, and the more data the applications need, the heavier the load on the data source, including on everyone else accessing it while the data is being exported. With the data mesh platform, data exchange no longer needs multiple programs or custom APIs. The data is sent and processed at a flat rate, so there is no momentary performance impact. The data source is read once to serve many provisions: you can provide data to many services without querying the data source again, even for new services. Supported by the cache mechanism, multiple applications can be supplied at the same time with high throughput and without impacting the source. This solves the following issues: batch processing of multiple large-scale queries; extremely poor efficiency when integrating and exchanging data across multiple business systems; and too many query requirements hitting a single database.

The third case: aggregation and correlation of multiple data sources, which often has poor processing efficiency and lacks parallel processing capability, because it is not easy to implement. Every demand and action requires querying the database or fetching data, keeping all systems extremely busy. Query, aggregation, and transformation processing starts only when the result is needed, which is extremely inefficient. Out-of-sync data input from different sources causes a lot of waiting and retries. A large number of cross-database relational queries produces a large number of round-trip calls. Again, the more data is required by applications, the heavier the load on the data source and the more severe the impact on performance. After introducing the data mesh platform, the aggregation and association jobs can be processed in parallel at the same time. The final data is ready before the application executes, and every request directly reads the final virtual table nearby.
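The third case above — reading multiple sources in parallel and materializing the joined result before the application needs it — can be sketched with a thread pool. The two fetch functions are stand-ins for real source systems; the names are illustrative:

```python
# Sketch: fetch two sources concurrently, then pre-join them into the
# "final virtual table" that applications will read nearby.
from concurrent.futures import ThreadPoolExecutor


def fetch_customers():
    # Stand-in for a query against the customer system.
    return {1: {"id": 1, "name": "Ada"}}


def fetch_orders():
    # Stand-in for a query against the order system.
    return [{"customer_id": 1, "total": 42.0}]


# Read both sources in parallel instead of sequential round-trip calls.
with ThreadPoolExecutor() as pool:
    customers_future = pool.submit(fetch_customers)
    orders_future = pool.submit(fetch_orders)
    customers = customers_future.result()
    orders = orders_future.result()

# Materialize the joined result ahead of time; applications then read
# this local table instead of issuing cross-database relational queries.
final_table = [
    {"name": customers[o["customer_id"]]["name"], "total": o["total"]}
    for o in orders
]
```

With N independent sources, the wait becomes roughly the slowest single fetch rather than the sum of all fetches, and the join runs once instead of per application request.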
The impact on the data source is greatly reduced thanks to the cache mechanism, and multiple applications can be supplied concurrently. For example, this solves problems like the data integration efficiency and stability required by an IoT system with multiple sites, the poor processing efficiency of cross-system data integration, and the database lock-up or performance impact caused by cross-system queries.

Case number four: cross-cloud data management. The common problems must be mentioned here. First, the network pressure is high. Second, the risk of large-scale overall system lock-up is high, and normal business operations are badly affected. With the data mesh platform, we can achieve cross-cloud caching, provide data to applications from nearby nodes, reduce the dependencies between services in the system, and avoid system lock-up without affecting normal system operations.

Here, let's see a video demo showing how to transfer data from an Oracle DB to a MongoDB. Okay, that is all for the data mesh architecture concepts we are talking about today. Thank you very much for watching. See you. Bye.
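The demo's relational-to-document transfer boils down to mapping each source row to a document and relaying it. Since the demo itself isn't shown here, this is a hedged sketch using an in-memory SQLite table as a stand-in for the Oracle source and a plain list of dicts as a stand-in for the MongoDB collection (no vendor drivers assumed):

```python
# Sketch of a relational-to-document relay: read rows from a SQL source,
# turn each into a {column: value} document, append to a "collection".
import sqlite3

# Stand-in "Oracle" source with a small users table.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE users (id INTEGER, name TEXT)")
src.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])

# Stand-in "MongoDB" collection: a list of documents.
collection = []

# The pipeline maps each relational row to a document and relays it.
cur = src.execute("SELECT id, name FROM users ORDER BY id")
columns = [desc[0] for desc in cur.description]
for row in cur:
    collection.append(dict(zip(columns, row)))
```

The same row-to-document mapping applies with real drivers; only the connect-and-read and the insert calls change to the respective client libraries.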