Let's start. Thank you for giving me the chance to introduce our work. I'm Jin Choi from the National Supercomputing Center at the Korea Institute of Science and Technology Information (KISTI). Today, I'm going to discuss our Chameleon project: extending open source Ambari for HPC.

Four years ago, this project was created in response to global trends that require HPC and big data convergence. According to the 2015 BDEC pathways report, the data analytics and computational science ecosystems use different hardware and software. For example, the data analytics ecosystem includes cloud services, the Hadoop file system (HDFS), and R applications. On the other hand, the computational science ecosystem uses containers, the Lustre file system, batch schedulers, and numerical libraries. But HPC and big data convergence is making that distinction disappear.

In the case of Hadoop, the leader of the big data ecosystem, the convergence is clearly visible. For example, HDFS now supports erasure coding, which is commonly used in HPC. YARN, Hadoop's resource manager, has begun to support GPU and FPGA scheduling for containers. As another example, RDMA-based Apache Hadoop from Ohio State University can be used to exploit performance on modern clusters with RDMA-enabled interconnects.

Intel recently mentioned the necessity of an integrated resource manager that can be applied to HPC, AI, and HPDA workloads. On the left side of the figure, cluster A consists of resources for HPC simulation and modeling using MPI. Cluster B targets machine learning and AI platforms such as TensorFlow and Caffe. Cluster C is configured for data analytics using Apache Spark and Hadoop. Each cluster needs its own resource manager: A, B, and C. But as shown on the right side of the figure, Intel's unified resource manager makes it possible to run applications from these different areas on the same cluster architecture.

Along with this global trend, our genome analysis scientists specifically tell us that some stages of their data pipelines have begun to use big data platforms, while the computation requires an HPC cluster and the data is stored in a cluster file system. In other words, we have to consider integrating big data platforms with HPC clusters and parallel file systems.

This is a bird's-eye view of our Chameleon project. Chameleon is a big data platform operation management system that takes the HPC environment into account. As shown in the figure, Ambari is a big data ecosystem operation management platform, and IML (Intel Manager for Lustre) is a cluster management system developed by Intel. Chameleon covers the full Ambari feature set and part of the IML feature set. So Chameleon allows you to install and operate the cluster file system through Ambari, and it helps you run Hadoop on your own cluster environment. Chameleon also provides a web user interface for advanced YARN monitoring and HPC resource monitoring.

Today's talk will cover, first, an Ambari overview focusing on how to extend Ambari; then a Lustre file system overview; and then Chameleon, focusing on what we have done.

Apache Ambari is designed to make it easy to provision, manage, and monitor the Apache Hadoop ecosystem. It is made up of an Ambari server and Ambari agents, and an agent runs on each host in the cluster. The Ambari server provides a REST API that enables host monitoring, package provisioning, and management from the Ambari web user interface (a short example follows below). Ambari has three extension points for custom service development: Ambari Views, Ambari Stacks, and Ambari Blueprints. An Ambari View is a plugin that provides a way to connect custom functions to the web user interface.
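Before moving on to Stacks, here is a quick illustration of the REST API mentioned above: a minimal Python sketch that asks an Ambari server for the hosts registered in a cluster. The server address, cluster name, and credentials are placeholder assumptions, not values from our deployment.

```python
import requests

AMBARI_URL = "http://ambari-server.example.com:8080"  # placeholder server address
CLUSTER = "mycluster"                                  # placeholder cluster name
AUTH = ("admin", "admin")                              # Ambari's default credentials

# List all hosts registered in the cluster via Ambari's v1 REST API.
resp = requests.get(
    f"{AMBARI_URL}/api/v1/clusters/{CLUSTER}/hosts",
    auth=AUTH,
    headers={"X-Requested-By": "chameleon-demo"},  # header Ambari expects from API clients
)
resp.raise_for_status()

for item in resp.json().get("items", []):
    print(item["Hosts"]["host_name"])
```

The same API family covers service installation and lifecycle commands, which is what the web user interface calls under the hood.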
An Ambari Stack defines the set of everything needed to define services such as HDFS and YARN. Ambari Blueprints provide an API to perform cluster installation. We considered only Ambari Views and Ambari Stacks for the Chameleon project.

If you want to include a new service in the Hadoop ecosystem and provide that service through Ambari, you need to become familiar with the following terms. Take the HDFS service as an example. The stack is HDP, that is, a distribution or package including the service. The service is HDFS. The components are the NameNode and DataNode of HDFS: the NameNode corresponds to the master category, and the DataNode corresponds to the slave category. The repo stands for the metadata repository. The category, master, slave, or client, determines the default lifecycle commands: a client does not include start and stop in its default lifecycle commands, a master must be managed individually, and slaves can be run all at once, for example when the service starts or stops (we will see a sketch of such a lifecycle script shortly).

Chameleon is an advanced extension of Ambari for provisioning, managing, and monitoring Apache Hadoop clusters and Lustre storage. If you want to configure this environment, you need to install Ambari and Chameleon, then install and configure the Lustre file system using Chameleon. After that, you can select HDFS-based or Lustre-based as the execution environment through Chameleon.

This is the Chameleon component overview. Chameleon provides the following services: the Lustre kernel updater, the Lustre manager, the account manager, and the YARN app monitor. We developed three Ambari Views for these services: the Lustre manager view, the account manager view, and the metric registry view. Chameleon also supports a web user interface for the advanced YARN application monitor and the HPC resource monitor.

Next, a Lustre file system overview. Lustre is an open source, object-based, distributed, parallel, clustered file system. In the TOP500, most supercomputers use the Lustre file system, which is adopted for its I/O performance in large-scale systems. Lustre's architecture is as follows. It is defined by three services: the management service, the metadata service, and the object storage service. Hosts can mount the Lustre file system through the Lustre network (LNet) as a shared storage file system.

Here are some differences between HDFS and the Lustre file system. First, the Lustre file system is POSIX-compliant, so it provides a normal POSIX interface, while HDFS is not a POSIX-compliant file system. Second, HDFS is an application-level file system, but Lustre is a system-level file system that requires a Lustre-patched Linux kernel. Finally, Lustre uses InfiniBand and provides its own network protocol, which is more efficient for data transfer than the HDFS HTTP protocol.

The following shows the Lustre file system and Chameleon. Chameleon runs an Ambari agent on each client node of the Lustre-backed Hadoop cluster, and it supports Lustre installation, configuration, and management.

Now, let me introduce the features of Chameleon. First, the Lustre operation management service. To build the Lustre file system operation management service, there are three Chameleon components: the Lustre kernel updater, the Lustre manager, and the account manager. Installing the Lustre-patched kernel requires the nodes to be rebooted, which is why we divided the Lustre operation management service into two components. Lustre settings and changes are available through the Lustre manager view, the Hadoop on Lustre service is provided by the Lustre manager, and LDAP-based account management is supported through the account manager view.
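To make those categories and lifecycle commands concrete, here is a minimal sketch of the Python control script that an Ambari custom service component ships with, written in the style of Ambari's resource_management library. The component and daemon names are hypothetical, not the actual Chameleon sources.

```python
# master.py -- illustrative control script for a master-category component
from resource_management import Script, Execute

class LustreManagerMaster(Script):
    def install(self, env):
        # Installs the packages declared in the service's metainfo.xml.
        self.install_packages(env)

    def configure(self, env):
        # Configuration files would be rendered from stack templates here.
        pass

    def start(self, env):
        self.configure(env)
        Execute("systemctl start lustre-manager")  # hypothetical daemon name

    def stop(self, env):
        Execute("systemctl stop lustre-manager")

    def status(self, env):
        # Ambari polls this to decide whether the component is alive.
        Execute("systemctl is-active lustre-manager")

if __name__ == "__main__":
    LustreManagerMaster().execute()
```

A client-category component's script would typically define only install and configure, matching the point above that clients have no start or stop among their default lifecycle commands.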
The Lustre kernel installation step is done by installing the Lustre-patched kernel on the MDS and OSS nodes and rebooting each node. The Lustre file system management step involves Lustre FS add, remove, mount, and unmount; MDS and OSS node setup; MDT and OST device settings; OST activate, deactivate, backup, and restore; and LNet configuration, start, and stop.

I would like to show a one-minute video about Lustre file system configuration. This is the Lustre view in the extended Ambari web interface. Add a Lustre file system and enter the file system name, lustre0. Set the management service node, and then set the MDT device and network for the metadata service node. After that comes the OSS node setup: the OSS1 server settings and the OSS2 server settings. The client node also completes its device and network configuration. Then verify and mount the file system. Each node is mounted on Lustre, and you can verify the Lustre file system with the lfs df command.

This is the Lustre manager view. If you click on lustre0, which we created earlier in the video, you will see a view where you can check or change the Lustre configuration. First are the metadata server settings. Next, in the OSS server settings, we can activate or deactivate each OST device and can add or remove devices. We can check the client mount point in the client settings. It is possible to start or stop a client's network in the LNet settings. We can back up or restore an OST; our current version only supports backup to a local directory.

Here is the account management service. Hadoop does not support strong authentication by default. It supports Kerberos for that, but Kerberos causes performance degradation. On the other hand, Lustre depends on POSIX-level account management. Account information therefore needs to be synchronized between the Hadoop cluster and the Lustre file system. This is the account manager view, where we can add or delete users.

Next, the Hadoop on Lustre service. Our Hadoop on Lustre service is provided as follows. On the left side, HDFS is an aggregation of node-local storage, while Lustre is shared storage. Except for the storage architecture, the YARN resource manager and MapReduce work the same way. So in our Hadoop on Lustre execution environment, we can switch between HDFS and the Lustre file system.

There are three related works on Hadoop on Lustre. First, the Xyratex project, whose MapReduce jobs showed theoretical performance gains on appropriately designed Lustre clusters with a fast network. Second, the Seagate LustreFS plugin, which is suitable for single tenancy and needs Kerberos to provide multi-tenancy. Lastly, Intel's Enterprise Edition for Lustre eliminates the shuffle phase, because LustreFS is a shared file system. In contrast, we developed our Hadoop on LustreFS execution environment as a custom Ambari service.

This shows the Hadoop on Lustre service's execution environment. It works for a diskless cluster backed by the Lustre file system. We use the secure container configuration for multi-tenancy, and Lustre acts as if it were a local directory for intermediate data. This is possible through the bind option of the mount command. Each node in the Hadoop cluster must have a local directory for intermediate data, which is the /tmp directory by default. Imagine a diskless Hadoop cluster with shared Lustre storage: each node can mount and use a local directory on a separate OST using the bind option, as sketched at the end of this part.

Again, let's take a minute to see how the Hadoop on Lustre service works. We will run a Hadoop MapReduce job as user1. In the Lustre manager service, click the Create Working Directory action.
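Conceptually, that Create Working Directory action amounts to a bind mount like the following minimal Python sketch; the mount points and directory layout are assumptions for illustration, not the actual Chameleon implementation.

```python
import os
import socket
import subprocess

node = socket.gethostname()
lustre_dir = f"/mnt/lustre0/hadoop-tmp/{node}"  # assumed per-node directory on shared Lustre
local_dir = "/tmp/hadoop-intermediate"          # assumed local path Hadoop expects

os.makedirs(lustre_dir, exist_ok=True)
os.makedirs(local_dir, exist_ok=True)

# mount --bind makes the per-node Lustre directory appear at the local path,
# so even a diskless node presents a "local" directory for intermediate data.
subprocess.run(["mount", "--bind", lustre_dir, local_dir], check=True)
```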
When the action completes, a tmp local directory for Hadoop intermediate data has been created. Then change Hadoop's base file system to the Lustre file system via the Switch to LustreFS action. This requires restarting the HDFS, YARN, and MapReduce services. After that, if you use the hadoop fs command, you see the result applied to the Lustre file system. Once this configuration is done, you can run wordcount and see results similar to running on HDFS. Wordcount is done. This shows the state before and after mounting the lustre0 file system, respectively: before mount, after mount. Here, lustre0 is mounted.

This is the result of running the hadoop fs command in our Chameleon environment, on HDFS and on LustreFS. When running a Hadoop job, the only difference between HDFS and Lustre is the red line: for the Lustre file system, add an additional JAR file using the -libjars option.

Next, the advanced YARN application monitoring service. The advanced YARN monitoring service was developed according to the following requirements. Users need to manage dynamic metrics, since the Hadoop resource manager REST API can only collect predefined, limited information, and metrics and time intervals should be configurable as needed. Also, there are lots of performance monitoring tools in the Linux ecosystem. This service uses TimescaleDB for time series monitoring data, and it provides application history data management. For the monitoring metrics, any Linux monitoring utility that works from a process ID can be used, and users can add, delete, and modify metrics dynamically through the metric registry view. When a metric is updated, the collector script is sent to all client nodes and registered in crontab. The YARN application monitoring service also provides graphs per node or per application.

TimescaleDB is an open source time series database optimized for fast ingest and complex queries. It uses the hypertable concept, which enables scalable data management for time series data: TimescaleDB automatically splits each hypertable into chunks, and each chunk is implemented as a standard database table. It provides a full SQL interface, supporting all SQL natively supported by PostgreSQL.

Many people use a NoSQL database as their time series database engine. This graph shows the preference for NoSQL databases; in the graph, more people prefer NoSQL, and in this table, InfluxDB, a representative NoSQL database, has the highest utilization. But we use TimescaleDB instead of a NoSQL database engine. In particular, high cardinality in time series is a common problem in some monitoring and event data workloads. Here is how TimescaleDB and InfluxDB perform on ingesting data: TimescaleDB outperforms on high-cardinality data sets.

The following shows our metric management structure. Generally, in existing databases, a metric can be added or deleted through ALTER TABLE, which causes sparse data. In our approach, a table is newly created after a metric is added, and our database changes the table name to track chronological order as follows, which produces dense data (a small code sketch follows below). For example, in the metric registry view, we can add a new metric with a process ID and a script.

This is Chameleon's YARN application monitoring screen. We can check core, memory, and container status information for an application in real time. Clicking the View button, we can also check the history of past applications.

Finally, the HPC resource monitoring service. This service provides HPC resource monitoring information through a web user interface. For the Lustre file system, the MDS and OSS are monitored.
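Stepping back to the TimescaleDB design described above, here is a minimal sketch of creating a hypertable and ingesting one collected sample with Python and psycopg2. The table layout, database name, and credentials are assumptions for illustration, not the actual Chameleon schema.

```python
import datetime
import psycopg2

# Hypothetical connection settings for the monitoring database.
conn = psycopg2.connect(host="localhost", dbname="metrics",
                        user="postgres", password="postgres")
conn.autocommit = True
cur = conn.cursor()

# A plain PostgreSQL table turned into a TimescaleDB hypertable,
# which TimescaleDB then partitions into chunks on the time column.
cur.execute("""
    CREATE TABLE IF NOT EXISTS yarn_metrics (
        time   TIMESTAMPTZ NOT NULL,
        host   TEXT        NOT NULL,
        pid    INTEGER     NOT NULL,
        metric TEXT        NOT NULL,
        value  DOUBLE PRECISION
    );
""")
cur.execute("SELECT create_hypertable('yarn_metrics', 'time', if_not_exists => TRUE);")

# A collector script registered in crontab would insert one row per sample:
cur.execute(
    "INSERT INTO yarn_metrics (time, host, pid, metric, value) VALUES (%s, %s, %s, %s, %s)",
    (datetime.datetime.now(datetime.timezone.utc), "node05", 12345, "rss_kb", 204800.0),
)
```

Because the interface is plain SQL, per-node and per-application graphs reduce to ordinary SELECT queries over the hypertable.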
In the future, the HPC resource monitoring service will also support GPU and InfiniBand monitoring. The user interface shows the compute partition and the storage partition respectively, and it provides dynamic node alignment: users can freely move each node within each partition. Node utilization, including health checks, is represented in color. Lastly, it enables users to access the nodes remotely via web SSH.

This is the Lustre MDS and OSS monitoring dashboard. We can align nodes dynamically: for that, click Modify Node and then the horizontal or vertical button. We can see the dynamic node information as a per-node graph, here for node 5 and node 8, or as a summary graph for multiple nodes, here for node 5 through node 8. Remote login is possible using the web console of each node. Here is node 7's web console: enter a user name and password, and then you can access the node remotely.

Conclusion. HPC and big data convergence is making the distinction between the data analytics and computational science ecosystems disappear. Following this trend, we developed Chameleon, an HPC-aware big data platform operation management system. Our Chameleon project helps you merge the Hadoop ecosystem and the Lustre file system. We uploaded a full video of the Chameleon services to YouTube; the following QR code is the YouTube link. Also, we will soon release the Chameleon source code on GitHub.

Thank you for listening to my first presentation. Do you have any questions? Thank you so much.