Hello, everyone. Welcome to my talk today. My name is Mary Grygleski. I'm here to talk about Apache Pulsar and Apache Pinot, all about optimizing the speed and scale of real-time analytics using these two open source projects. I'm really happy to be here today. If you are interested in getting a copy of this slide deck, it is available if you have virtual access, and I have already uploaded the slides to the conference schedule site, so you should be able to get it there. If not, it is also available here through a Bitly link; it is just stored on my Google shared drive. So even if you miss it, don't worry, because this slide deck is already available for you to look at.
I'm a streaming developer advocate at DataStax. DataStax is based in California and is a leading data management company. We specialize, essentially, in database as a service, and we are very open source focused. We're all about Apache projects, first and foremost Cassandra. If you are a user of Cassandra, you have probably heard about our company, because the VP of the Apache Cassandra project is at our company too. You can use it as open source, of course, or you can launch it yourself into the cloud. Better yet is to use our database as a service. It's a NoSQL, big data, wide-column database, very fast and with very high resiliency. It never goes down. And now the company is also adding in streaming, so I'm going to give a bit of an introduction to the streaming side that we have at DataStax today, and that's Apache Pulsar. Essentially we're very much a cloud native platform; everything is meant to operate in the cloud.
You can still use our non-cloud version, but we're encouraging folks to use our cloud platform, because you'll realize it actually saves you a lot of time. Previously, I was a developer advocate doing Java work at IBM, where I was also involved a lot, mostly in fact, with open source projects. There's WebSphere, and there's also the open source version, Open Liberty. I also did a bunch of work with reactive systems, so if you pay attention to event streaming, that has always been something dear to my heart. I'm based in Chicago. I'm a Java Champion. I'm also the president and an executive board member of the Chicago Java Users Group. That's a volunteer-run group. I'm a really big believer in tech community: we all learn together and share things. I also help to co-organize a couple of IBM-sponsored meetups in Chicago. In my background, I have extensive experience in product and application design, development, integration work and deployment. You name it, I've done it. I specialize more in event-driven reactive systems, open source, and cloud-enabled distributed systems. So that's my area. These are how you can get hold of me, but I'll share my information with you towards the end of this presentation.
Now I just want to quickly bring something about this talk to your attention. I was supposed to be doing it with StarTree, because of the real-time analytics part, and the co-presenter was supposed to be Karin Wolok. However, she's no longer the community manager at StarTree. Mark Needham, who's based in England, in the United Kingdom, was very happy to do the talk; however, I think because of the short notice there just wasn't time to do it.
So I just want to bring him to your attention, because if you have more questions about StarTree or about Apache Pinot, he will be the person you can reach out to. These are his links. He's a developer advocate, but he's also strongly technical. His experience has been a lot in big data analytics, and he's a great guy to work with. So don't be afraid to reach out to Mark.
Okay. Now, let me give you a quick rundown of what we're going to talk about. First of all, what is real-time analytics? We'll spend some time on the basics. Then we can go into understanding the problems that we're trying to solve: the types of analytics use cases, examples of user-facing real-time analytics, and the evolution of real-time analytics over time. Next, once we know the kind of problem we are trying to solve, what is the solution? I want to present to you two open source solutions. One is Apache Pinot, which handles the real-time analytics part. The other is real-time ingestion: you're analyzing all the data that comes in, but how does it come in? That's where Apache Pulsar comes into play; it is an event streaming platform. So I'm going to introduce these two platforms to you, and bear in mind that this is going to be very much introductory level. If you want more information, I provide a lot of links that you can get to, plus at both companies, DataStax and StarTree, we have folks there to help you at any time.
Okay, so what is real-time analytics? Real-time analytics is the discipline that applies logic and mathematics to data. Data that comes in doesn't mean anything if you don't do anything useful with it. There's logic, there's mathematics; in fact we live in a world that relies a lot on mathematics and science. What real-time analytics does is provide insights for making better decisions quickly. This definition comes from Gartner, and I'm sure you're familiar with Gartner, a famous analyst company.
If you take a look, we are all talking about events these days. We hear about event-driven, event storming, event-driven microservices, serverless, all of these things, and all of it is built around the concept of events. In my own talk where I introduce Apache Pulsar at a much deeper level, I talk first about what events are, what exactly it is that we're trying to capture. It's harder to see, especially because when we first started programming, I bet all of us, myself included, learned how to do things the old-fashioned, traditional way. We never did any event-driven stuff because it's hard to do. We're dealing with concurrency; we're dealing with an asynchronous style of processing.
So what are events to computing? If I take the definition from dictionary.com, it says events are essentially a point in time and space. First of all, an event happens and it assumes a position in space, an XYZ coordinate if you want to bring the math in.
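To make the "point in time and space" idea a little more concrete, here is a minimal Python sketch of what such an event record could look like. The `Event` class, its field names, and the `latest_value` helper are all hypothetical names made up for illustration. Marking the record as frozen is my own assumption, reflecting the common practice of not modifying an event once it has been recorded.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Event:
    """Hypothetical event record: a value observed at a point in time and space."""
    timestamp: float  # when it happened (seconds since some epoch)
    x: float          # where it happened
    y: float
    z: float
    value: float      # what was observed


def latest_value(events):
    """Return the value carried by the most recent event, or None if there are none."""
    if not events:
        return None
    return max(events, key=lambda e: e.timestamp).value
```

Because the record is frozen, a change in the data shows up as a new event with a later timestamp rather than as an edit to an old one.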
And then there's also time. The data changes over time; if you plot time as a fourth dimension, you can see the value change over time. Interestingly, though, the way we process data for events is that once an event happens we treat it as immutable; you cannot change it. That brings in a lot of interesting questions about how we capture events, how we make sense of all the data captured by events, and how we save all of these events for later analysis, for example. In today's talk we're talking about real-time events, and that's why we want to get into events: I don't think there is any better way. If you don't use events, how can you capture all the changes of data in real time?
So events are one thing, and from events we want to draw some insight. We look into the events, we take the data in chunks, in huge volume, and we want to draw some conclusions from it to make sense of it. Based on the results of these insights, we arrive at a set of steps for taking action on the conclusions we have drawn. Say, for example, you have your data warehouse. Right now it's almost holiday time; in the US we celebrate Christmas, so companies and vendors are stocking up on Christmas presents, and certain things are very popular. So you keep track of your inventory, and then you realize what is popular this year, let's say some kind of
game thing that's very popular, and everybody's trying to get it. You can take a look over time and say, oh wow, every day there are so many people ordering, we make maybe 2,000 units and they're sold out within two days, so we know we need to increase the inventory. That is an insight drawn from events, and action taken on it.
Now, look at the value of data over time, with time on the x-axis and value on the y-axis. When an event happens, right away its value is high. Over time, though, the data becomes stale and its value drops. So what we're trying to do in real time is capture that value immediately when the event happens, send it to some processing, draw insights, and then take action on them.
Now, who is interested in this data? It can be analysts, management or users. That's what we want to highlight in today's talk: in this modern day and age, we're talking about giving users access to the data analytics, and that is becoming more important than just analysts looking at some business intelligence dashboard or management reading some reports. Those have less of a real-time flavor to them. I'll be showing you some examples of where we are seeing user analytics becoming more and more important.
Okay, so let's take a look at this quadrant from Mark, which I really like. You take real-time analytics and divide it into four pieces. There's machine facing.
That's more about observability. Machine facing could be your cloud infrastructure running, and you're observing it to see what the load looks like over time, how many transactions are being performed, all those things. Then, internally, you also have human facing. You take certain kinds of applications that make sense to the business, and these are human beings looking at them. Let's say I operate a store, the store is making sales, and I want a real-time dashboard that shows what the sales figures look like over time. That's human facing.
Now, if you look a little further, you bring a lot more complexity into this picture, and more complexity is actually more valuable for the business. For example, a recommendation engine. Let's say I'm shopping for a car and I want that car to be a certain way. About a year ago I was looking for a car, and I actually wanted a manual transmission. I always liked Honda, but unfortunately Honda doesn't make the CR-V or HR-V with a manual transmission, so I had to look elsewhere. If I input that information into a car search configuration tool, it can immediately come back to me and say, hey, you know what, get a Subaru, for example, or however you pronounce Subaru. And that's actually what I got. I finally found the car: that Subaru is an SUV, a sport utility vehicle, that actually has a manual transmission. It's really cool, I think. So that's one thing.
Another usage is fraud detection. We all have bank accounts, and we all know that there are always thieves out there. In real banks you get thieves trying to break into the bank and steal money. On the internet, in cyberspace, you also have hackers trying to commit crime and steal somebody's money. So the kind of system we can deploy is a fraud detection system that monitors your account. Let's say you are based in Tokyo, and then suddenly your bank account gets a debit: money was withdrawn in Hokkaido or somewhere. In that case there's a problem, and you have a fraud detection system that is event-driven in nature, so it knows immediately that something is wrong and gets notified. It grabs all the data: the customer lives in Tokyo, but there is a strange debit being taken in Hokkaido, so please check. They notify you and then you take action on it; you might say, oh, I'm on vacation, so it's not an issue. Okay, so those are some examples of real-time analytics. And also,
I think what provides even higher-level business value here would be, say, an order tracking system. These kinds of things are more external, because you're tracking, for example, a package delivery. Here in the US, with Amazon delivery, we order things and they get delivered on an Amazon truck, so the order gets tracked over time and gives you real-time data. I live in Chicago, so say I order something and it's being shipped from New York City to Chicago. I'll be like, okay, let me see, it is in downtown, but I live more uptown, so I follow it. Those are the things we need in order to analyze all the data and provide very useful real-time information to the end user.
Okay, so with that let me go into the next thing, which is exciting: types of analytics use cases. There are dashboards and BI, business intelligence, tools. Back in the day, when we were younger, we were helping to generate reports based on data from during the day. We collected data from spreadsheets from different departments, whatever it was, and then we took the data, drew some graphs and showed some movement; maybe you're working for a trading firm and it's the price of some instruments and how they change over time. As you can see here, the dashboards give you these, but they're not real time.
And then we live in a world where everything is artificial intelligence and machine learning. Machine learning is in fact a prime candidate for using real-time analytics, and for using some way of ingesting all this data, so it's a really good candidate for this kind of use case.
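To make the earlier fraud-detection scenario concrete (an account based in Tokyo that suddenly shows a debit in Hokkaido), here is a minimal Python sketch of an event-driven check. Everything in it is hypothetical: the `DebitEvent` fields, the `HOME_LOCATIONS` table, and the single location rule are illustration only, and a real fraud system would use far richer signals than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebitEvent:
    """Hypothetical debit event arriving from the bank's event stream."""
    account_id: str
    amount: float
    location: str


# Hypothetical lookup of where each account holder is based.
HOME_LOCATIONS = {"acct-1": "Tokyo"}


def check_debit(event, home_locations):
    """Return an alert message if the debit happened away from home, else None."""
    home = home_locations.get(event.account_id)
    if home is not None and event.location != home:
        return (
            f"Please check: account {event.account_id} is based in {home} "
            f"but was debited {event.amount} in {event.location}"
        )
    return None
```

The check runs once per incoming event, so the alert can go out as soon as the debit is seen. The customer can still dismiss it, for example because they are on vacation.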
Okay, and then another thing is user-facing analytics. Those first two are from older times, and now in newer times, we all have a LinkedIn profile, for example. By the way, this is a picture of Karin. She was supposed to be presenting with me, but she left me these slides so I can use them. But let me step through all of these first. Dashboards and BI tools, as you can see, are more like fixed graphs you can look at. Machine learning systems process huge amounts of data that travels through time and give you an analysis with an upper bound, a lower bound, a profile, an anomaly score, all these things. You can look at the data, draw an immediate conclusion and alert people; in this case it's the aggregation, for example.
And here is the one I wanted to get to, user-facing analytics. We all have a LinkedIn profile. As you can see, it says who viewed your profile and how many people looked at it over a spread from July 7 to September 27. So it gives you this basic analytics, which is very useful, because sometimes we like to know.
What I wanted to let you know is that this is powered by Apache Pinot. LinkedIn is a user of Apache Pinot, and they use it to produce this. As you can imagine, they have many users, and it gives them really fast processing of all the data about your profile, which is pretty significant. These are actionable insights; insight is what we want to draw. And why should I care? If we go back: operators are doing business monitoring, the analysts then look into business insights, and then the insights can end up in the hands of the users and customers, and they can also be used for making money, for monetization. So being able to have access to so much data basically helps transform the business world that we live in.
These are examples of user-facing real-time analytics. Over here: total number of users, 700 million; queries per second, 10,000 plus; and latency, the SLA is less than 100 milliseconds. So it is really fast, and the freshness of the data is in seconds. We all know the real-time analytics on our dashboard in LinkedIn, and there are also recruiters who use it to analyze candidates. Are they qualified? Am I looking for people with a certain number of years of experience in a certain area? Where are they located? All of these things. So the opportunities are limitless here.
And here is an example from Uber Eats. I'm not sure if it is in your area, but in the US we have Uber, and then there's Uber Eats, which is a food delivery service. Having a real-time component really helps here, because of what can happen.
Let me give you an example. I own a pizza place; I'm the restaurant owner, and I have customers ordering food from me. As you can see here, there are missed orders, downtime, all these things, and having them in real time is actually very useful. Say you are delivering pizza to a customer who wanted a mushroom pizza and you accidentally deliver a meat pizza. They're not happy, and they may go to the Uber app and say this pizza place is a bad place because they gave me the wrong thing. But you can get this information right away. As the owner, of course I don't want to lose my customer, so I might react right away and say, hey, let me make you the mushroom pizza, I'll deliver it immediately and replace it at no cost, and you keep the other one. Those are the kinds of opportunities you can look into; having something real time is certainly very useful.
Another example of real-time analytics is Stripe, which is a payment system. There's a lot to deal with on the financial side: financial reporting, detecting bugs and problems, detecting financial risk, managing Stripe's liquidity, auditing past actions, all the things that financial companies need. So Stripe also uses a lot of it. I'm not sure if you have Stripe here in Japan, but I believe it is global. The kind of problem we are trying to solve here is that teams are executing money movement and you have financial data consumers in different departments. So let's see what kind of problems we face; these are unique challenges and opportunities.
You have high precision and accuracy requirements: aggregations must be exact, and there are many small-unit transactions, plus small currency units that can vary too. There's also transaction-level granularity, and reports must be reproducible. And as you know, sometimes systems fail. Just think of being served by an airline when there are problems: you're ready to fly, let's say from Japan to the US, and suddenly the system doesn't work and the staff have to do things manually. Those things are a pain. You can have that in any situation and you have to address it. And, back to financial companies, there are strict compliance and security requirements too.
Now, as you can see over here, for internal analytics you can actually have higher latency, because internal is maybe less about making money and more about analyzing things. Freshness can be seconds to minutes, and concurrency is hundreds of users. But for external analytics we need it super fast, milliseconds of latency, without much delay. Freshness has to be in seconds, and concurrency can be millions of users.
Next, the evolution of real-time analytics. The real-time analytics landscape is rapidly changing. It started with OLAP, online analytical processing; OLAP systems were more like batch systems that work on collected data, and they have been evolving with these trends. What we used to have was internal-facing analytics, where data was much more structured, with approximate data and query consistency.
There were also queries where you can slice and dice, which is the kind of technique data warehousing uses. At that time it was more about gigabytes to terabytes of data. But look at today. We want user-facing analytics. The data is semi-structured. We also want strong data and query consistency with full SQL semantics, and the amount of data is going from terabytes into petabytes.
Now, how do we deal with building a user-facing real-time analytics system? On one side there is real-time ingestion, and we have to meet requirements for high dimensionality and high velocity of ingestion. Essentially, the system has to be highly available, very scalable, and also cost effective. On the other side, it has to deliver freshness in seconds, thousands of queries per second and milliseconds of latency. So very, very fast.
So we want to introduce you to Apache Pulsar. Let's take a look at Pulsar first. Apache Pulsar is an open source project for event streaming, as I mentioned already. It uses a pub-sub architecture much like Kafka, and I've been asked what the difference is. Well, there certainly are differences, because Apache Kafka was designed earlier, back around 2010 or so. At the time the cloud wasn't so much in the forefront yet; it was still starting. Yahoo, though, was already going into the cloud, and they realized that what Kafka was doing would not be able to address some of the concerns that any cloud environment would have, especially for event streaming.
As you know, event streaming, event programming, has a lot of moving parts. It's not like coding something static; it's more dynamic in nature. So Pulsar uses the pub-sub, producer-consumer type of architecture pattern, and it relies at runtime on the broker, which runs on the Java runtime. You can have more than one broker, and this helps to increase the throughput of your messages. Also, as you can see, ZK stands for ZooKeeper. As some of you may be aware, ZooKeeper is also an Apache project; as you can see, it's all very open source here. ZooKeeper is like managing a zoo: many things are happening, the data is changing, and it's there to oversee all of these clusters and the configuration. Look at it as an overall manager managing everything.
The broker, though, is the one at the core of everything. It coordinates and talks with the bookies, talks to ZooKeeper, and talks to the producers and consumers, so the broker is essentially the centerpiece, the central mind of everything. As you can see, the broker interacts with BookKeeper, the bookies. I want to point out that Apache Pulsar itself doesn't want to deal, so to speak, with managing all of the message logs. It's a lot of work if you think about it: you have to manage the messages that come in and deliver them, and there are many rules for delivering them. If you're familiar with messaging systems, there can be a huge chunk of data coming in and you need to chunk it.
There are many things. What about messages not being picked up, or not being acknowledged by the consumer? You have to keep them in the system. There are also rules about time to live, and dead letter queues: if messages can't be processed, for example, you need special topics to handle those. So Pulsar basically says: I am going to deal with the important part, working with your producers and consumers and delivering the messages. Let me ask BookKeeper, which is Apache BookKeeper, whose specialty is managing huge amounts of data: reading and writing to disk and managing it. And if you have data that's already in the system and you need to do searches and lookups, it is highly efficient at that as well. So that's the architecture of Apache Pulsar.
Okay, so real quick, let's step into what event streaming is, a step beyond just event messaging. Pulsar is not just a messaging thing. With messaging, we're dealing with messages that travel from one point, a source, to a destination. Streaming handles the whole, ongoing delivery of all these event messages, which makes it very powerful for systems that need it.
So let's take a look at what is driving the change. Why do we need event streaming? We already talked about real-time analytics being a good candidate to make use of event streaming. For example, we want the data to be real time; this will enhance your customers' experience and create a competitive advantage for your business. It also allows you to use data pipelines for machine learning and AI applications, and it helps with the scalability aspect.
You have many brokers handling all of the messages that come in, but the producer doesn't have to worry about where to send the messages. It only labels them: a message comes in with a topic, and the topic tells the broker how to route the message. That frees all of the producer clients from having to keep the address of the receiver. You decouple the sender and the receiver, so if you need to scale, it is actually much faster, because you just need the topic to handle all of the messages. Essentially that's what it is, and this slide just summarizes it: event streaming allows for real-time processing, making decisions in real time and not after the event, which is what our analytics need, and ingesting a high frequency of messages with very low latency.
Here I also want to point out streaming versus not streaming, so let's take a quick look. What we were used to in the past was not streaming: extract, transform, load, the ETL process, which is more batch. You ingest data, and as the data comes in it is persisted to some sort of data store, let's say a database. From there you can select the data, do whatever you need, and then push the data downstream. But as you can see, it's slower because you involve a data store in between. Any time you put a data store in the middle you get slowed down, no matter what you do; even if you keep making it faster, you still slow things down. With streaming, however, you're ingesting the data and it doesn't have to land on disk first; it's processed in memory, so you're processing the data right away as you ingest it.
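The topic-based decoupling and the process-as-it-arrives idea described above can be sketched with a toy in-memory broker in Python. This is a teaching model only, not Pulsar's client API or implementation; `ToyBroker`, `subscribe` and `publish` are hypothetical names. The point is that the producer only names a topic, and each subscriber handles a message the moment it is published, with no data store in between.

```python
from collections import defaultdict


class ToyBroker:
    """Toy in-memory pub-sub broker (an illustration, not how Pulsar is built)."""

    def __init__(self):
        # topic name -> list of subscriber callbacks
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a consumer callback for a topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Route a message by topic; the producer never sees who receives it.

        Each subscriber processes the message immediately on arrival.
        Returns the number of subscribers the message was delivered to.
        """
        callbacks = self._subscribers[topic]
        for callback in callbacks:
            callback(message)
        return len(callbacks)
```

Adding another consumer is just another `subscribe` call on the same topic; the producer code does not change, which is the decoupling that makes scaling easier.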
And what do you do with it? I'll mention Pulsar Functions later; a Pulsar function lets you transform the data that's flowing through the pipeline. Once you transform it, you can write it to a sink: an Elasticsearch sink, a database sink, and so on. So those are the advantages of streaming versus not streaming.
Now here is Pulsar; I want to give a five-minute intro to it. This is an open source project created by Yahoo. Yahoo contributed it to the Apache Software Foundation in 2016, and it became a top-level project in less than two years, because everybody recognizes how important it now is to do event streaming in the cloud. Pulsar is designed with the cloud in mind: a cloud-native design, cluster based, and with a multi-tenant model. Essentially you can separate data into different units, namespaces of operations, and that multi-tenancy is already built in, so it's much nicer that Pulsar handles all of this for you.
Again, it separates compute from storage. Compute is the broker, handling all of the messages and the policies: getting messages from producers, delivering them accordingly, and handling dead letter queues and time to live. Everything else about storing the messages it throws over the fence to Apache BookKeeper. Because they are separated, any time you need to add a node to your cloud infrastructure, the two can scale independently, and this makes your job easier. With some of the older messaging systems, if you need redundancy or need to increase the number of nodes in a cluster, you have to do a lot of manual work.
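Going back to the transform step mentioned above, where a Pulsar function reshapes data before it is written to a sink, here is a minimal Python sketch of what such a transformation could look like. As far as I know, Pulsar's Python runtime can run a plain function named `process` that takes the incoming message and returns the outgoing one, so the sketch follows that shape. The JSON message format and the `quantity`, `unit_price` and `total` fields are assumptions made up for this example.

```python
import json


def process(input):
    """Transform one message as it flows through the pipeline.

    Assumes (hypothetically) that each message is a JSON order with
    'quantity' and 'unit_price' fields, and adds a computed 'total'.
    """
    order = json.loads(input)
    order["total"] = round(order["quantity"] * order["unit_price"], 2)
    return json.dumps(order, sort_keys=True)
```

The returned string would be published to the function's output topic, and a sink connector reading that topic could then write it to something like Elasticsearch or a database. How the function is deployed and wired to topics is left out here.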
So that's the thing. And also, you can write your client code in Java, in C#, in Python, and in Go, and there are other community contributions too; I believe Rust is there, and Scala is there too, if you're into those things. Okay. Also, it guarantees your message delivery: if a message successfully reaches a broker, it will be delivered to the intended target, so don't worry about it if your network goes down. Once your message reaches a broker, the broker will take care of things for you, so that's guaranteed. And then there are the serverless functions I talked about. Essentially, these serverless functions work kind of like AWS Lambda: they're small and bite-sized, and you can transform the data as it flows through the data pipeline within the cluster, so it's very efficient, very convenient, and lightweight. It also has tiered storage offload, so if you have data that's becoming cold and stale and you don't want it taking up room in the main storage, you offload it to long-term storage. So it handles a lot of the infrastructure concerns for you, and it frees you up; if you're a developer, it actually makes your life easier. Okay, so what is Apache Pulsar? Over here, I think I already mentioned that it came from Yahoo, and there's an increasing number of GitHub stars, contributors, and so on. Now, who else is using it? I wanted to point out Yahoo in here. Yahoo Japan is also one of our biggest users, and then there are other companies listed here, and there are more that are not shown here, but these are the major initial adopters of Apache Pulsar. Pulsar's architecture is different, but I just want to point out the producer and consumer: that's what you write your code in, the producer and consumer clients, and they interact with the broker. The broker communicates with BookKeeper and ZooKeeper, and if you need to look up messages, you go through the broker too.
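To give a feel for how small one of these functions can be, here is a minimal sketch. As far as I understand Pulsar's language-native Python interface, a function can be a plain Python function named `process` that takes the incoming message and returns the value to publish downstream. The event fields (`amount_cents`, `amount_usd`, `is_large`) and the 10,000-cent threshold are assumptions invented for this example, and deploying the function to a cluster is not shown.

```python
import json


def process(payload):
    """Transform one message as it flows through the pipeline.

    Assumes each message is a JSON string carrying a hypothetical
    "amount_cents" field; the return value is what gets published
    to the output topic.
    """
    event = json.loads(payload)
    # Lightweight enrichment: derive a dollar amount and flag large orders.
    event["amount_usd"] = round(event["amount_cents"] / 100, 2)
    event["is_large"] = event["amount_cents"] >= 10_000
    return json.dumps(event, sort_keys=True)


# Local dry run with a made-up event, no cluster needed.
print(process('{"user": "alice", "amount_cents": 12345}'))
```

Because it is just a function, you can run and unit test it on your laptop before it ever touches a broker.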
All right, so I won't step through all of these. They do get quite deep into why you would want Pulsar, but I just wanted to highlight a couple of Pulsar-specific features. There's the data pipeline we talked about: think of it like water flowing through a pipe. Data gets into Pulsar, Pulsar does all the transformation, and when it's done, it outputs the data to a sink. That's the idea of the data pipeline. And what does the transformation is Pulsar functions: they're lightweight, and they allow complex stream processing in a very lightweight way. It's kind of like AWS Lambda or Google Cloud Functions. It only supports Java, Python, and Go for now, but there will be more languages supported. And then there's also Pulsar schema. Any time you do distributed messaging over the wire, you need to serialize your data, to flatten your data structure, so you can get it over the wire, and at your destination you need to deserialize it and reconstruct the object. That's a lot of work. So if you use this feature called Pulsar schema, it will keep track of the changes to your data structure, so that's just something to let you know about. There's also Pulsar IO. I talked about sources and sinks; you need different connectors for them. You can output to a Kafka sink, a source could be Cassandra, whatever it is, and if the connector isn't there, you can write your own using Pulsar IO. Okay, so these are the DataStax flavors. I don't want to spend too much time on them because these are, so to speak, our commercial side of things. This is Astra Streaming, and you get a $25 credit if you want to use it to test things. Essentially it's Pulsar already managed in the cloud, so it makes things much easier.
That's $25 per month, $300 a year, so give that a try; I have the link in my resources section too. There's also Luna Streaming, with enterprise support, and it's open source too, if you want to go the open source route. Now, back to here. So, for building user-facing real-time analytics, we use Pulsar, and what do we combine it with? Apache Pinot. With Pinot, I just want to quickly point out that, as you can see, there are different sources sending in events; it's a very event-driven way of doing things. Then there's real-time ingestion: you have Pulsar coming in, there's the Pinot controller, and it also segments things off across the different servers using ZooKeeper, and then the brokers handle all of the queries. It consumes, indexes, and serves. One thing I wanted to point out is the star-tree index in here. It's actually very fast, it uses their own algorithms, but like I said, you can also talk with Mark and he will give you more information. And in summary, as you can see, latency is in milliseconds, freshness is in seconds, and concurrency is millions of users. Okay, the star-tree index: I won't go into all the detail. And these are the other companies using Pinot: Stripe, which I mentioned, and LinkedIn is a big one too. And with that, I will just go on to share these with you: these are the Apache Pinot resources and how to reach Mark, and he also has this example of using Pulsar and feeding it into Pinot as a simple test case. And then for Apache Pulsar, as I mentioned, astra.datastax.com is where you go to sign up for a free account and get the $25 credit. And I guess it's here. Okay, back here. These are just additional discounts and things. Oops, I think I have to go back. Okay.
And then if you want Pulsar on YouTube, this is the link, and I also have my own stream on Twitch every Wednesday at 2pm Central time. Please follow me; I do a lot of hands-on coding. Also, follow us at Pulsar Neighborhood, which is our wiki page, and you can contribute to it. With that, I want to thank you very much. I believe I'm out of time now, but thank you so much for having sat through my talk, and feel free to reach out to me, and also to Mark for more Apache Pinot questions. Thanks a lot. Enjoy the rest of your conference.