Ladies and gentlemen, good afternoon. I am very glad to stand here and give you a speech about RocketMQ. In 2012, Alibaba open-sourced RocketMQ, its third-generation distributed message engine. Through several years of technical improvement, RocketMQ is now capable of transferring trillions of messages during Alibaba's Double Eleven shopping festival. In November 2016, Alibaba donated RocketMQ to the Apache Software Foundation as an incubator project. That was a huge step for us, and it was not an easy thing to make it through Apache's competitive evaluation process. Okay, here is a brief overview of today's sharing. This is Apache RocketMQ's official Twitter account; everyone is welcome to follow it. Recently, we published an article on InfoQ giving a general view of Apache RocketMQ. Okay, let's begin our speech. Part one: from Alibaba to Apache. I forgot to introduce myself, may I have your attention please? My name is Xiaorui Wang and I come from Alibaba's middleware team. This is my profile. I'm an open source fanatic, and also an Apache fan. Before talking about RocketMQ, please allow me to introduce my company, Alibaba. Maybe some people don't know about this Chinese giant. Alibaba is a Chinese e-commerce company that provides consumer-to-consumer, business-to-consumer, and business-to-business sales services over the internet. It also provides electronic payment services, a shopping search engine, and data-centric cloud computing services. The Alibaba Group began in 1999 when Jack Ma founded the website Alibaba.com, a business portal to connect Chinese manufacturers with overseas buyers. In 2016, Alibaba's Double Eleven shopping festival, the world's biggest online shopping event, generated $16.8 billion. Alibaba is like a bridge that links the USA and China, as this picture describes. So what is Alibaba? It is an international marketplace with over 430 million active members from over 200 countries and regions. It is a truly global marketplace. 
You can think of Alibaba as China's Amazon and eBay combined. So next, let's get back to the main point, RocketMQ. Following its evolution, I will introduce RocketMQ's history. Firstly, Notify, RocketMQ's prototype product: it grew out of our e-commerce platform and came into play in 2007. It was designed for business-to-customer trading. For example, when you place an order on Taobao, the system sends messages to back-end services; the downstream services subscribe to them and carry out complex business logic. Around 2010, another Alibaba business unit, B2B, designed a new messaging engine named Napoli. It learned from the Apache Software Foundation's message engine, ActiveMQ, and we developed a new console for MQ resource administration and monitoring. So there were two messaging engines; Notify was designed for the trading business, especially its eventual-consistency transaction demands, and later they were combined into one message engine. Around 2011, we started to research a new architecture, not only for trading but also for stream processing, message accumulation, ordered messages, and so on. Through several years of technical improvement and evolution, the third-generation distributed messaging middleware, RocketMQ, is now capable of transferring trillions of messages during Alibaba's Double Eleven shopping festival. Last year, we donated RocketMQ to Apache. We hope to grow the base of contributors through the Apache Way, and we are happy to get valuable feedback and contributions from outside Alibaba, including a dashboard based on Bootstrap, the Apache Flume integration module, and the Apache Storm, Ignite, and Spark integration modules. Next, we will sum up the main scenarios for Apache RocketMQ. We divide them into five directions. The first one is application integration, such as an ESB: for example, you may want to integrate a Java application with another Go-language application. 
Secondly, loosening the coupling between applications; no doubt that's a main function of message engines. Thirdly, serving as the backbone for EDA or CEP architectures. As we know, CEP, complex event processing, correlates multiple messages within given time frames. Since 2012, RocketMQ has taken on the task of supporting Alibaba's Double Eleven shopping festival across multiple business types. During the 2016 festival, RocketMQ robustly provided stable infrastructure with a throughput of more than one trillion messages. We can see that clearly from these figures. Okay, that's all. Thank you very much for your attention. Next is my partner, Von Gosling. Welcome. Hello, everybody. It's finally my turn. This is my first journey to America. Yes, this is also the first journey for Apache RocketMQ outside of China. So, let's lighten the mood and get ready to start with me; I hope I am a qualified guide for Apache RocketMQ. I'm excited to stand here. My name is Von Gosling. You can call me Von. I came to Alibaba after graduation, and I am currently in charge of Aliware MQ. As Xiaorui Wang mentioned before, this is our commercial distribution of Apache RocketMQ. Besides, I am also an open source fanatic. My interests include distributed systems, cloud computing, and big data. I'm also an Apache RocketMQ committer. There is some more content in part two. This is the agenda. First, I will use some graphs to describe Apache RocketMQ's architecture, followed by RocketMQ's common features. Besides, I will introduce some higher-level commercial features in Aliware MQ. Second, I will cover the DevOps topic for RocketMQ: that's about monitoring and administration. Third, I will introduce full-stack performance tuning experience, from the operating system to RocketMQ itself, honed in our preparation for Alibaba's Double Eleven shopping festival. Last but not least, I will show the plan for the next generation of Apache RocketMQ: what do we want it to be? 
So let's take a close look at RocketMQ. Let's get started with the domain model. We can see five concepts: message, topic, group, queue, and offset. As we know, a message is the data payload. It includes three parts: header, payload, and properties. I will emphasize two important properties, the message ID and the message key. These are indexes for a message. The former is a unique system key that accompanies a message through its logical life, while the message key is a very interesting design: RocketMQ gives you a chance to index your messages according to your own business logic. That's the message key's real intention. Another important concept is the offset. Like a cursor, you can use its position and time properties to retrospect your previous messages. The other concepts are similar to traditional messaging, so there's no need to elaborate. RocketMQ consists of four parts: name servers, brokers, producers, and consumers. As shown in this picture, each of them can be horizontally extended without a single point of failure. Name servers provide lightweight service discovery and routing. Each name server records full routing information, provides equivalent reading and writing service, and supports fast storage expansion. Brokers take care of message storage by providing lightweight queue mechanisms with which messages are grouped by topic. They support fault-tolerance mechanisms and have the capacity of accumulating hundreds of millions of messages in their original time order. In addition, brokers provide disaster recovery and rich metric statistics. Producers support distributed deployment. Distributed producers send messages to brokers through multiple load-balancing modes. The sending process supports fast failure and has low latency. Consumers also support distributed deployment. They also support cluster consumption and message broadcasting, and provide a real-time message subscription mechanism that meets most consumer scenarios. This picture depicts how to send and consume messages in high-availability mode. 
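To make the domain model concrete, here is a minimal sketch in plain Java. This is an illustrative model only, not the real RocketMQ client API; the class and field names are assumptions. A message carries a topic, an optional business key, and a body, and a simple selector maps a sharding key deterministically to one of the topic's queues, so related messages always land in the same queue.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Illustrative model only -- not org.apache.rocketmq.common.message.Message.
final class Msg {
    final String topic;                   // logical destination, e.g. "TradeOrder"
    final String key;                     // business index, e.g. an order number
    final byte[] body;                    // opaque payload
    final Map<String, String> properties; // user and system properties

    Msg(String topic, String key, byte[] body, Map<String, String> properties) {
        this.topic = topic;
        this.key = key;
        this.body = body;
        this.properties = properties;
    }
}

final class QueueSelector {
    // Deterministically map a sharding key to one of n queues, so all
    // messages with the same key are routed to the same queue.
    static int select(String shardingKey, int queueCount) {
        return Math.floorMod(shardingKey.hashCode(), queueCount);
    }
}
```

The deterministic key-to-queue mapping is the idea behind the per-queue ordering discussed later: ordering only needs to hold within one queue, and the sharding key decides which queue that is.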
There are two important objects: the commit log and the consume queue. The commit log is an append-only log, while the consume queue is just like a catalog for messages. Each message queue has its own consume queue, and the queue is also the unit of distributed load balancing: we can balance consumption across these queues. Taking the scheduled message as an example, I will illustrate how a scheduled message is handled. When a producer sends a scheduled message, just like the delivery-delay concept in the JMS 2 specification, we provide several delay levels; the scheduled message is first stored according to its delay level. A timer thread scans these delay queues periodically, and when the time comes, it puts the message back into the commit log, so by the general processing logic the message is dispatched into the consume queue. Using this separation of data file and catalog file, we can easily extend our message service, as this picture depicts. The following picture depicts RocketMQ's server and client layered architecture. There are too many details here; today we are not planning to elaborate on it, but if you are interested, we can talk about it privately. Yeah, this is the client. Okay, next we will enter the next topic, Apache RocketMQ's core features. Well, so many features; I have listed many of them here. Ordered message: keeping the order of messages is an appealing feature of RocketMQ. It preserves ordering per queue, where each message is routed by a sharding key, such as a seller account or an order number. Broadcast message: you can use it to reach large groups of consumers quickly and simultaneously, but without a delivery guarantee. There's no more to say about that. Scheduled message and transaction message: the latter is on the way, not ready yet; I will describe these features later. Batch message: it is a new feature in the latest release. We provide a collection interface that allows you to send a batch of messages at one time. 
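The commit log and consume queue split described above can be sketched in a few lines of plain Java. This is a toy model under stated assumptions, not RocketMQ's actual storage code: every message body is appended once to a shared commit log, while each queue's consume queue holds only small index entries pointing back into it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of RocketMQ-style storage: one append-only commit log,
// plus per-queue "consume queues" that store only offsets into it.
final class Store {
    private final List<byte[]> commitLog = new ArrayList<>();
    private final Map<Integer, List<Integer>> consumeQueues = new HashMap<>();

    // Append the body once, then dispatch a small index entry to the queue.
    void put(int queueId, byte[] body) {
        int physicalOffset = commitLog.size();
        commitLog.add(body);
        consumeQueues.computeIfAbsent(queueId, q -> new ArrayList<>()).add(physicalOffset);
    }

    // A consumer reads by logical position within its queue; the consume
    // queue resolves it to the physical offset in the commit log.
    byte[] get(int queueId, int logicalOffset) {
        int physicalOffset = consumeQueues.get(queueId).get(logicalOffset);
        return commitLog.get(physicalOffset);
    }

    int depth(int queueId) {
        return consumeQueues.getOrDefault(queueId, List.of()).size();
    }
}
```

Because the consume queues carry only index entries, piling up unconsumed messages costs little beyond disk for the commit log, which is why massive accumulation does not hurt performance.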
One-way message: if you are not concerned about message reliability, you can use a one-way message, sending your message to the broker with no ack coming back from the broker. Traceable message: we can use it for message-link health checks and monitoring. Retrospective message: we can use a position or a time to retrospect previous messages. Message filtering: we can use a tag to get what we want. Our upcoming version introduces a new way to achieve this: like ActiveMQ's selectors, we use Structured Query Language 92 (SQL-92) to construct a message filter expression, allowing more fine-grained filtering. Massive accumulation: because we separate the data file and the catalog file, each queue's consume queue stores only the message index and consumption status. So we can say RocketMQ can accumulate messages as long as it has enough disk, without performance loss. Backoff strategy: it is different for producer and consumer. If a consumer always fails to consume, RocketMQ will send the message into a global dead-letter queue. Delivery quality of service: most MQ products claim the feature of delivering at least once, and RocketMQ is no exception. Currently, RocketMQ does not check for message duplication, leaving users to build or buy their own external global storage for deduplication. The next generation of RocketMQ, however, will support this feature. We hope to support several qualities of service for duplicate-message detection, eliminating duplicates over various timelines. So let's go into a little more detail about our on-the-way features. This is the scheduled message. There's no doubt it is like the Linux timer wheel algorithm, but we use a hierarchical timing wheel to get finer granularity. And this is another on-the-way feature of RocketMQ: the transaction message, for eventual consistency. Let's take an account transfer for example. Bob and Smith are friends; Bob wants to transfer $100 to Smith. What would he do? There are two important actions. 
Subtract $100 from Bob's account, and send a message to tell Smith's account to add the corresponding dollars. If we use MQ to solve this problem: firstly, we send a half message to the broker. Secondly, we do the local operation, subtracting the $100, in a local transaction. According to the local operation's result, we commit or roll back this transaction, and according to that result, the broker will commit or roll back the message, as this figure depicts. In this way, we can reach eventual consistency. So next, please allow me to introduce Aliware MQ, our commercial cloud distribution of RocketMQ. This is our portal. I will give a brief introduction to Aliware MQ's features. Firstly, in Aliware MQ we developed a high-availability design different from the open source one. From this figure, we can see clearly there are two important components, a controller and ZooKeeper. Here, the controller watches broker status changes, handles state machine transitions, and pushes the new status out. ZooKeeper maintains the persistent state machine, maintains the ephemeral broker status, and sends a notification when the master has a disaster. This is another Aliware MQ feature, the message trace dashboard. From this view, we can easily check a message's status, such as successful delivery or consumption failure. This is our visualization of the message trajectory. Recently, we published this feature, the Kafka message service: now you can send a message to the broker with a Kafka client and consume it using the RocketMQ client. This is the interactive web console. This is another exciting feature: Internet of Things. We can publish messages through the MQTT protocol. As we know, MQTT is a machine-to-machine, Internet of Things connectivity protocol. It was designed as an extremely lightweight publish-and-subscribe messaging transport, and it is widely used in the IoT area. So, the next topic is about DevOps: monitoring and administration. Firstly, let's look at this figure. 
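The account-transfer flow above can be modeled in a few lines of plain Java. This is a toy sketch, not RocketMQ's transaction API, and the names are illustrative: a half message is stored by the broker but stays invisible to consumers until the outcome of the local transaction commits or rolls it back.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy broker modeling RocketMQ-style transactional messages: a "half"
// message is stored but hidden from consumers until it is committed.
final class TxBroker {
    private final Map<Long, String> halfMessages = new LinkedHashMap<>();
    private final List<String> visible = new ArrayList<>();
    private long nextId = 0;

    long sendHalf(String body) {             // step 1: prepare
        long id = nextId++;
        halfMessages.put(id, body);
        return id;
    }

    void commit(long id) {                   // local transaction succeeded
        String body = halfMessages.remove(id);
        if (body != null) visible.add(body); // now deliverable to consumers
    }

    void rollback(long id) {                 // local transaction failed
        halfMessages.remove(id);
    }

    List<String> consumable() { return visible; }
}
```

With this shape, Bob's service sends the half message, runs "subtract $100" locally, then commits on success or rolls back on failure; Smith's side only ever sees committed messages, which is what gives the eventual consistency.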
This is our DevOps architecture. The bottom layer is about kernel trace points. RocketMQ has exposed its open API, and based on this open API we can use the mqadmin tool and a graphical dashboard to administer our MQ resources, metrics, and operations. So I will show you the most frequently used commands of mqadmin. As this picture depicts, if you run mqadmin, it will list some information like this. We can look at the output. If we use the cluster-list option, we can see output such as the cluster name, broker name, broker ID, address, broker version, in TPS, out TPS, the broker liveness, and the disk usage. Another option of mqadmin is consumer status: if you want to look at your consumer status, you can run this command. We can see another option of mqadmin, query message by key; that is what I mentioned before. If you run this command in a shell, you can see output like this; the QID is the queue ID. So how do we query message details? Because of space limits, I just list a fragment here; there is more information about a message, and you can try it on your own computer. With this command you can look at the details of your message, such as the message's topic, flags, keys, queue ID, queue offset, commit log offset, consume times, the born time, the store time, which host the message was born from, which host it was stored on, the system flag, and the properties. And this is just the tip of the iceberg of a message; you can try this command on your computer. So, coming to the next topic about RocketMQ performance tuning, I will share our thinking about RocketMQ performance tuning experience this year. Let's look at this picture. We abstract our performance work into this picture, and we tune our application in these three directions. For the Java application, we use clean code and static code analysis. As for concurrency, we use lock-free data structures and back-off algorithms. 
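As one concrete illustration of "lock-free data structures and back-off algorithms", here is a generic sketch in plain Java, not RocketMQ's actual code: a compare-and-swap retry loop that, instead of taking a lock, backs off for a short, growing pause when it loses a race.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.LockSupport;

// Lock-free counter: threads race with compareAndSet and, on contention,
// back off with a short, growing pause instead of blocking on a lock.
final class BackoffCounter {
    private final AtomicLong value = new AtomicLong();

    long increment() {
        long pauseNanos = 1;
        while (true) {
            long current = value.get();
            if (value.compareAndSet(current, current + 1)) {
                return current + 1;            // success, no lock was taken
            }
            LockSupport.parkNanos(pauseNanos);  // lost the race: back off
            pauseNanos = Math.min(pauseNanos * 2, 1_000); // cap the pause
        }
    }

    long get() { return value.get(); }
}
```

The back-off step is the point: under heavy contention, retrying immediately burns CPU on failed CAS attempts, while a brief exponential pause lets one thread make progress and keeps total throughput high.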
As for the Java virtual machine, we use the G1 garbage collection algorithm to replace CMS. We rely on the just-in-time compiler to inline the most frequently invoked methods, and we tune JVM options to avoid stop-the-world pauses, mostly for low-latency reasons. As for the Linux kernel, because Apache RocketMQ is heavily dependent on the page cache, we try our best to tune the dirty-page flush behavior and avoid page faults. Using these tips, we achieved a relatively satisfactory result. The following screenshot visualizes RocketMQ's behavior in throughput and latency on a machine with this configuration. We can see from this picture that RocketMQ has reached about 500,000 TPS with a message body size of 128 bytes. From last year's Double Eleven shopping festival replay, the whole RocketMQ broker fleet achieved 99.99% of delays within 10 milliseconds, and 99.6% of delays within 1 millisecond. So the next topic is about RocketMQ's future. Talking about the future is full of challenges, and Apache RocketMQ is no exception. We hope Apache RocketMQ, the fourth generation, will be oriented toward e-commerce; in this area we need high concurrency. The next area is the Internet of Things; in this area we must sustain massive online connections. The third area is finance; in this area we need broker high reliability. And the next is the big data direction; in this area we need high throughput. So this is our development plan. We have published it on our website. If you are interested in these features, please don't hesitate to join the Apache RocketMQ community. This is our ecosystem overview. We want Apache RocketMQ, based on OpenMessaging, to support online analytical processing, stream processing, and offline processing. Have you noticed OpenMessaging? Yeah, this is an interesting topic. I will illustrate OpenMessaging. 
As we know, messaging and streaming products have been widely used in modern architectures and data processing, for decoupling, queuing, buffering, ordering, replicating, and so on. But when data transfers across different messaging and streaming platforms, compatibility problems arise, which always means much additional work. OpenMessaging is a vendor-neutral and language-independent specification; it provides industrial guidelines for areas such as finance, e-commerce, IoT, and big data, and aims at developing messaging and streaming applications across heterogeneous systems and platforms. We want it to serve modern cloud-oriented messaging and streaming applications. So we are always very happy to help contributors, whether with small contributions, cleanups, or brand-new features. We want to have high-quality, well-documented code for each programming language, as well as a strong ecosystem of integration tools. We strongly value documentation and integration with other projects, and we will gradually improve these aspects. As for community, we have adopted a strategy similar to other Apache top-level projects. We interact with the open-source community through activities like meetups, workshops, and code marathons, hoping more and more contributors and committers will come to love Apache RocketMQ. That's my sharing. Thank you for your attention. By the way, I have listed some articles about Apache RocketMQ; except for the first one, the articles are written in Chinese. Many more details are in those links. Thank you. So, any questions? Yes? I don't know if you said it, but with 1.4 trillion messages through your system, was that only one cluster or multiple clusters? Multiple clusters. Yeah, in Alibaba there are about 2,000 machines, and we divide them into many clusters for different subsidiaries. As Xiaorui already mentioned, Alibaba has lots of subsidiaries: this cluster serves this subsidiary, that cluster serves another; it is not a single cluster. Yeah, so what's the maximum? 
Yeah, there is no fixed maximum that a single cluster supports. We have some tools, shell tools, to monitor and collect metrics across the global clusters. As you know, every cluster has a different machine scale: this cluster may have many, many machines, while that cluster may have only two machines. So we have no exact number. The one trillion messages is the global total, yes. I noticed that you spend a lot of time tuning the JVM and the Linux kernel. Are those places where you see a lot of slowdown for throughput? I mean, how significant are those changes? Is that where a lot of the limitation is, or is the slow link inside RocketMQ itself? Did RocketMQ, for example, gain from patches to the Linux kernel, rather than just tuning the Linux kernel? Yeah, we tune some Linux kernel parameters. As we know, for low latency, as far as memory access goes, there are two causes of high latency. The first one is memory reclaim: RocketMQ is heavily dependent on the page cache, so we must tune this first. The second is page faults: we use memory locking to lock as much of the page cache as possible, avoiding swap-out and swap-in, to achieve low latency. Besides this, we have tuned some algorithms in the message persistence path, for example using lock-free algorithms, yes. So with a master broker and slave brokers, can you have a failover between them? If the master broker dies, will the slave broker become the master? Yeah, yeah. In RocketMQ's design, a slave broker is just a consumer of the master, so when a master disaster comes, the standby broker will take over. So how do the producers and consumers know about the failure of the master? How do they recover from that? Do they go to the name server, or is it automatic? Yeah, in the client libraries. We use ZooKeeper to monitor the status of the master via health checks. Yeah. 
But I thought ZooKeeper was also in the open source? Yeah, your question is about the name server and the broker? No, the ZooKeeper concept; is that also in the open source? Yeah, yeah. Any questions? Yes? You mentioned the batch message. So your question is, yeah. We now provide an interface with one parameter, and this parameter is a collection; we use this collection to send a batch of messages. You can assemble your messages in your own code. It is not like Kafka, which uses an async buffer to accumulate messages and drains them to the broker when the accumulated size is reached; RocketMQ does not batch silently. Oh, yeah? Yeah, we use some metrics to log this, and statistics on this value to track which messages have not been consumed and which messages have not been persisted. This is an average value. Average value, yeah. Any questions? That's all. Thank you. Thank you.