So I'm going to talk about analytical dashboards and how to build them at scale. This is one of the efforts we have at Flipkart; it is part of the analytical platform we are building, and you can check it out at our booth.

So what's so special about this? Everybody has seen live dashboards and knows how to build them, but there are two differentiators here. First, live: we want dashboards that update with live information as it flows in. You have clicked, you have placed an order, you are browsing our website — we want dashboards that reflect the latest of what is happening on the website. Second, analytical: most dashboards are not analytical in nature; they are used mostly for monitoring. Yes, we want them to be used for monitoring, but we also want to allow our users to drill down, filter, and do the other sorts of things they normally cannot do on a regular dashboard. So those were the two motivating factors: show business users what is actually happening inside Flipkart, and make the dashboards powerful enough that users can dig into the data a little deeper.

Now, what do we have as a system? Flipkart has largely adopted a service-oriented architecture, so we have a lot of services. These are microservices, each taking care of a small part of a process and dealing with a couple of entities in that process. They are orchestrated to go and update entities — to update the process data — and along the way they generate update events. So we have a lot of live events. By the way, for those wondering why we need events at all — why not just go to the databases and pull the information out — that would be very cumbersome: with hundreds of services, going to each database every few seconds and pulling data out is not going to scale at all. So what we have done is make sure that services push update events in a well-defined semantic format.

Once we have those events, there are certain challenges in building live dashboards. It is not as simple as collecting the events, running them through some processing pipeline, storing them in a database, and querying. There were three major challenges we faced when we started building this analytics system. The first is metric definition. These are not just random events; a metric is a function of fields in them. As I said, we have a set of services that deal with entities and generate update events. Metrics are not just counts of how many events happened — that is the common case in web analytics: number of page views, number of sessions, number of checkouts, a conversion funnel, and you are done. That is not the case here. What we have are events that are associated with an entity and that have changed the state of that entity.
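To make that concrete, here is a hypothetical sketch of what such a well-defined update event might look like; the field names are illustrative, not Flipkart's actual schema:

```java
import java.util.Map;

// Hypothetical shape of an update event pushed by a service (illustrative
// field names only -- not the actual Flipkart event schema).
public record UpdateEvent(
        String entityType,                 // e.g. "order"
        String entityId,                   // which entity this update applies to
        String eventType,                  // e.g. "ORDER_PLACED"
        long eventTimeMillis,              // when the change happened at the source
        Map<String, String> changedFields  // new field values on the entity
) {}
```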
So you are dealing with a set of entities at a certain point in time. You have to take these events, find the entity, make the changes reflect on those entities in the data warehouse, and extract the relevant fields from there. Then, on each event or on a batch of events, we have to do statistical analysis: suppose we want to calculate an average, a percentile, a histogram, or distinct values — we have to collect a batch of events and then associate these values with a set of dimensions. Every event has a set of dimensions associated with it, and the dimensions set the context of the update. For example, when you place an order, the dimensions would be: who was the customer? What product did they buy? At what time? How many units? These are the dimensions we get with each event, and we have to calculate the metric and report it against those dimensions.

The second challenge is scale. Let me put that earlier slide up again: there are hundreds of microservices catering to tens of thousands of requests per second, and in the process they generate lakhs of events — hundreds of thousands of events. So we get into the second set of challenges, the scale challenges. First you have to merge each event with its associated entity; then you have to find the dimensions from those entities — look them up in the dimension tables and find the actual dimension values. For example, a product is usually identified in an event by an ID, but that product actually has multiple levels of dimensions within itself: colour, brand, form factor. There are any number of attributes hanging off these dimensions, and we want to provide filtering and slicing-and-dicing capabilities on those as well. So whenever an event comes into the system, you have to go back, look up all the associated dimensions, and embed them in the event at hand.

Then, once you have processed the event, you have to write it back, and since there are so many events you really need a data store that supports a high write rate. And while querying: these dashboards update at a high frequency, say every minute, and with, say, 5,000 users pulling these metrics, you need a data store with low query latency — on the order of a few milliseconds, not more. The requirement that came to us was around 10 milliseconds: that was the cap at the 99.9th percentile within which we had to serve a metric for the dashboard. And a dashboard typically doesn't contain just one measure; it contains a lot of measures, so when the entire dashboard refreshes for 5,000 users, that is far more than 5,000 calls to you. Finally, since all our measures are reported against dimensions, you need a database that supports storing measures across dimensions.
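As a rough illustration of the per-dimension statistics described above, here is a minimal sketch (not our production code) of computing a grouped average over one batch of enriched events:

```java
import java.util.*;

// Minimal sketch of per-dimension aggregation over one batch of enriched
// events: group by a chosen set of dimensions and keep a running sum/count,
// from which the average falls out. Illustrative only.
class BatchAggregator {
    record Enriched(Map<String, String> dims, double value) {}

    static Map<String, double[]> aggregate(List<Enriched> batch, List<String> groupBy) {
        Map<String, double[]> acc = new HashMap<>(); // key -> {sum, count}
        for (Enriched e : batch) {
            StringBuilder key = new StringBuilder();
            for (String d : groupBy) key.append(e.dims().get(d)).append('|');
            double[] sc = acc.computeIfAbsent(key.toString(), k -> new double[2]);
            sc[0] += e.value(); // running sum
            sc[1] += 1;        // running count; average = sum / count
        }
        return acc;
    }
}
```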
Coming back to the store: that means if I want to filter on certain dimensions, or on some combination of them, the query should be fast — there should be built-in support for multidimensional queries in the data store itself.

Then, reliability. Since it's an analytical dashboard, it can be used by analysts monitoring the progress of a process, or even by control towers in our supply chain, where people are watching the entire supply chain pipeline and want to know what is happening in it. So reliability is a very big challenge for us. Most of the dashboarding architectures we have seen don't care about reliability that much; they show the general trend of what is happening, not the actual values. The challenge at hand for us was dashboards that are actually accurate, and to meet that accuracy standard we have to maintain the consistency of the data and we have to be fault tolerant.

To give a simple use case around consistency: again, we have update events that change the state of the process. An event e1 occurring at time t1 triggers an update in another entity, say event e2, in the same time frame. When you join or aggregate data across these events, you have to take into account that you are joining events from the same time window. You cannot join an entity's update at time t1 with an update at time t2 — the results will be wrong; they will not reflect the current state of the process.

And faults are something you have to deal with in any distributed or cloud environment. Faults can happen in various places: your source system can go down, your own analytics system can go down, your data store can go down. Everywhere there can be downtime, we have to anticipate it and build fault tolerance into the analytics system. Again, with traditional dashboards this is usually not the case: if a fault occurs, they simply accept it and move on, because they are used for monitoring purposes and less for analytical purposes.

Having said all that, what is the solution? Real time at scale equals stream processing equals Kafka plus Storm — that is the standard set of technologies everybody uses. How many of you know Kafka and Storm? A good number. For those who don't, a quick introduction. Kafka is a queuing message system built at LinkedIn for a similar use case: gathering a lot of events from a large array of servers and doing analytics on them. It's reliable — with the 0.8 release they are also supporting replication, building more reliability into it — and it is distributed at its core: partitioning of events is built in. When you create a topic, you create partitions for that topic, and you can distribute those partitions over multiple brokers. Second is Storm. Storm needs no introduction, but still: Storm was built by Nathan Marz, who is — or was — at Twitter; I think he has left Twitter.
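As an aside, here is a minimal sketch of pushing an update event into a Kafka topic, using the current Java producer client (this talk predates that API); the broker address, topic name, and payload are illustrative. Keying by entity ID sends all updates for one entity to the same partition, preserving their order:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.*;

// Minimal sketch: publish one update event to a Kafka topic, keyed by entity
// ID so that all updates for the same entity land on the same partition.
public class EventPublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092"); // illustrative address
        props.put("key.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                "org.apache.kafka.common.serialization.StringSerializer");
        try (Producer<String, String> producer = new KafkaProducer<>(props)) {
            String eventJson = "{\"entityId\":\"order-42\",\"eventType\":\"ORDER_PLACED\"}";
            producer.send(new ProducerRecord<>("update-events", "order-42", eventJson));
        }
    }
}
```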
Anyway — Nathan Marz built Storm for stream analysis of the Twitter stream, and it is again a distributed stream processing system. It has two constructs to work with: one is a spout and one is a bolt. A spout basically pulls data from some kind of a queue and sends it on to bolts; bolts are units of computation that run on the incoming events sent by the spout, and you can distribute spouts and bolts independently across multiple machines. Right now Storm is practically synonymous with stream processing, so without much deliberation we went with Kafka plus Storm, and it has proved very stable up till now. What we do is take all the events from all our systems and funnel them into Kafka, and a Storm topology pulls those events from Kafka and does the processing.

But the story doesn't end there; there's a lot going on behind the storage story. We are not so fortunate with the storage choices — or rather, in the storage domain we have a great many choices. These are the requirements against which we narrowed down the various storage systems. Multidimensional support, which I talked about: the store should be optimized for queries on multiple dimensions. Optimized for time series queries: these dashboards necessarily display your system at every interval of time, so all queries are time series in nature. Low query response time. High write support, which we have already discussed. And it should be scalable: as adoption of the system grows, the amount of data grows exponentially with it, because whenever we integrate one system we are not integrating one metric, we are integrating on the order of hundreds of metrics from that system. So it has to be scalable, and for us the best way to scale is to scale out horizontally.

So what were the options? There are a lot of them: there's Druid, there's OpenTSDB, there are other good time series databases out there; we had Vertica, we had plain HBase. There are a lot of choices in that field, but we settled on OpenTSDB, the reason being that we had used OpenTSDB in the past, it proved stable, and it met most of our requirements except for low query response time — and even there, the response times were acceptable, within our 99th-percentile target. So we went with that. The only problem was that OpenTSDB does not support Kerberos. Whenever you build a data warehouse you have to be mindful of the security concerns around aggregating data in one place; to address exactly that concern, the Hadoop ecosystem comes with an authentication protocol, Kerberos — and OpenTSDB doesn't work with Kerberos. So, just mentioning it in passing: we ended up implementing a very scaled-down version of OpenTSDB for our purpose.
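To make the spout-and-bolt wiring concrete, here is a minimal sketch of such a topology, using current org.apache.storm package names (at the time of this talk these classes lived under backtype.storm). The spout stands in for a Kafka spout and just emits a canned event, and the bolt stands in for the dimension lookup step; both are illustrative, not the production components:

```java
import java.util.Map;
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.spout.SpoutOutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.topology.base.BaseRichSpout;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

public class DashboardTopology {
    // Stand-in for the Kafka spout: emits a canned update event.
    public static class EventSpout extends BaseRichSpout {
        private SpoutOutputCollector out;
        public void open(Map<String, Object> conf, TopologyContext ctx,
                         SpoutOutputCollector collector) { this.out = collector; }
        public void nextTuple() {
            out.emit(new Values("{\"entityId\":\"order-42\",\"eventType\":\"ORDER_PLACED\"}"));
        }
        public void declareOutputFields(OutputFieldsDeclarer d) {
            d.declare(new Fields("event"));
        }
    }

    // Stand-in for the dimension lookup / enrichment bolt.
    public static class EnrichBolt extends BaseBasicBolt {
        public void execute(Tuple input, BasicOutputCollector collector) {
            // Real pipeline: join event to its entity, look up dimensions.
            collector.emit(new Values(input.getStringByField("event")));
        }
        public void declareOutputFields(OutputFieldsDeclarer d) {
            d.declare(new Fields("enriched"));
        }
    }

    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("events", new EventSpout(), 2);
        builder.setBolt("enrich", new EnrichBolt(), 4).shuffleGrouping("events");
        new LocalCluster().submitTopology("analytics", new Config(),
                builder.createTopology());
    }
}
```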
Moving on to metric definition — what did we do for metric definition? The challenge, as I've already said, is that a metric has to be a function of fields in an event. But there was another issue: with Storm, if you know its architecture, to build a topology you have to write Java or Clojure or some such code, compile it, and deploy it on the Storm cluster. That was not acceptable for us, for the plain reason that we have both business users and technical users, and we didn't want to invest engineering bandwidth in writing a Storm topology for every different metric. What we ended up requiring was a DSL through which we could express our queries and simply deploy them on the Storm cluster. At that time Summingbird was not out yet, so you couldn't get that kind of DSL from there, and Trident — Storm's higher-level compute layer — was still in alpha. So we needed a DSL that analysts and business users could pick up easily, and most analysts and business users are comfortable with SQL, or SQL-like constructs: they know how it is used, what it conveys, its syntax and semantics.

So we searched and found something called Esper. How many of you know Esper? Hardly anybody, I think. Esper is a complex event processing library, and the most interesting part is that it provides a SQL-like language: you push events into it, declare the type of processing you want using a SQL-like construct, and Esper processes it. There is a detail hidden in how we use Esper, though. Esper is a complex event processing system usually used for monitoring event streams rather than processing them — monitoring is Esper's first-class concern and processing is on the side. We needed the reverse: processing as the first-class construct. So we had to modify — not modify Esper itself, but modify the way we used it — to get at the processing. We read the source code, found the open interfaces, and used the processing part rather than the monitoring part.

So this is how our topology looks. On the left-hand side, systems push events into Kafka. From Kafka, spouts pull events and hand them to the dimension lookup and enrichment bolts; the enrichment bolts join each event to its entities, do a dimension lookup, and put an enriched event back into another Kafka topic. The enriched event is a nested object: the original event plus the associated entity plus the associated dimensions. The second part of the topology takes these enriched events from Kafka and feeds them to an Esper bolt — we mounted Esper inside one of the bolts — and the Esper bolts convert the incoming events into a metric name, the associated dimensions, the value, and the timestamp, and write that into OpenTSDB, which understands exactly that: a metric name, associated dimension tags, and a value. So far so good, but that is not the end of it, because we hadn't yet really taken care of reliability.
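To give a flavour of the Esper part, here is a minimal sketch of declaring a metric in its SQL-like language, using the classic Esper 5.x API; the event type and statement are illustrative (a one-minute average of order value per category), not one of our actual metric definitions:

```java
import com.espertech.esper.client.*;

// Minimal sketch: a metric declared in Esper's SQL-like language (EPL).
public class MetricDefinition {
    public static class OrderEvent {
        private final String category;
        private final double amount;
        public OrderEvent(String category, double amount) {
            this.category = category; this.amount = amount;
        }
        public String getCategory() { return category; }
        public double getAmount() { return amount; }
    }

    public static void main(String[] args) {
        EPServiceProvider esper = EPServiceProviderManager.getDefaultProvider();
        esper.getEPAdministrator().getConfiguration()
             .addEventType("OrderEvent", OrderEvent.class);
        // One-minute batched average of order value, grouped by category.
        EPStatement stmt = esper.getEPAdministrator().createEPL(
            "select category, avg(amount) as avgAmount "
          + "from OrderEvent.win:time_batch(1 minute) group by category");
        stmt.addListener((newData, oldData) -> {
            for (EventBean row : newData) {
                System.out.println(row.get("category") + " -> " + row.get("avgAmount"));
            }
        });
        esper.getEPRuntime().sendEvent(new OrderEvent("mobiles", 12999.0));
    }
}
```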
We hadn't taken care of a number of aspects, so let me talk about them, starting with reliability. There will be outages at different points in time, upgrades that need a system restart, and metric definitions with bugs that need to be redeployed. For all of these you have to replay the entire processing pipeline, and one of our learnings is that you cannot delete your old data — not after one day, not after two; you keep it for a good seven or fifteen days. The reason is that bugs usually get detected over time: you debug, find the cause, redeploy, and then you need to replay. For replaying, the classic technique is to checkpoint your computation at regular intervals; once you have checkpointed — declared that up to this point the computation is fine — then when you have to replay, you replay from the previously stored checkpoint. All of this is already built into Storm, in what is called a transactional topology; you can go and check it out. A transactional topology allows checkpointing of a batch of events: you hand a batch of events to the transactional topology, and at any point, if there is an exception or an issue, the entire batch is discarded and replayed.

So we used transactional topologies, and for batching we did something a little different: we batched on time rather than creating arbitrary batches. And this time was event time — if you go back, every entity has a timestamp for when it was created — so we batched events on the entity create time. That allowed us to group events occurring together in one place and then do statistical analysis on them: in an hour, or in a minute, how many events occurred, how many Mi 3s were bought by customers, and so on. So we could calculate statistics, and we could do window joins. There are two types of joins in the streaming world. The favoured one is the lookup join: you have a database of dimensions or other entities, you look up by ID, and you join. The other is the window join: you have two streams of events, you align them on time, and you join the events within the same time window. Time batching gave us window joins, and it ties back to the earlier discussion: events that triggered other events in the same time range, in the same transaction, could now be tied together. Another advantage is tolerance of out-of-order events. A lot of the time a source system goes down — sometimes for hours, for various reasons — with all the events it needs to send queued up; when they arrive, you have to attribute them to the right time windows, otherwise your graphs and patterns will not be correct. Time batching lets us tolerate out-of-order events as well.

So, putting time batching and transactional topologies together, we arrived at the second version of the design: the original solution was the Storm topology feeding the store, and this is what we substituted in between.
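Here is a minimal sketch of the two ideas just described — bucketing events into windows by their event time, and joining two streams only within the same window; it is illustrative, not the actual Storm code:

```java
import java.util.*;

// Minimal sketch of event-time batching and a window join: events from two
// streams are bucketed by (event time / window size), and only events that
// fall in the same bucket are joined. Illustrative only.
class WindowJoin {
    static final long WINDOW_MS = 60_000; // one-minute windows

    static long bucket(long eventTimeMillis) {
        return eventTimeMillis / WINDOW_MS; // same bucket => same time window
    }

    record Event(String entityId, long eventTimeMillis, String payload) {}

    static List<String> join(List<Event> streamA, List<Event> streamB) {
        Map<String, Event> index = new HashMap<>();
        for (Event a : streamA) {
            index.put(a.entityId() + "@" + bucket(a.eventTimeMillis()), a);
        }
        List<String> joined = new ArrayList<>();
        for (Event b : streamB) {
            Event a = index.get(b.entityId() + "@" + bucket(b.eventTimeMillis()));
            if (a != null) joined.add(a.payload() + "+" + b.payload()); // same entity, same window
        }
        return joined;
    }
}
```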
So now we have the event stream feeding something called a time-batch bolt, which creates time-based queues in an HBase table (I'll describe how in a moment). We used HBase for these time batches because it gives us sorted keys and good time-based lookups. This is the schema. Table one was our event queue. Its key schema was: event namespace (basically the processing namespace, or the process ID); a slot, which is just a key identifying a time slot; and then a batch ID. The batch ID is nothing but the event timestamp mapped to the interval it falls in — so if you are working with one-hour intervals, the time batches will be 11 o'clock, 12 o'clock, 1 o'clock, and so on. The actual value of the event — the value in that bucket — was stored as a column, as the event JSON. So the HBase table structure is: the row key is effectively a batch ID, each column holds an event value, and the column name is the timestamp. That's how we get a batch.

There was a second table which was an update log — a journal — of this event queue. Remember I said we have to be tolerant of out-of-order events? What we want is that whenever an older batch, a batch back in time, is updated, we get to know about it. This is where the journal came in handy: whenever a batch that is older in time — not within the current time window — is updated, we go back and create a journal entry. In this table the key is the batch ID plus a version of that batch. The version is simply a monotonically increasing number appended to the end of the batch ID, which says this batch ID has a new version. So what we get is a set of keys increasing in order, and the last, most recent key tells us which batch ID was updated most recently. Then we built a simple time-batch spout which reads from this event update queue: it keeps an iterator over the update queue and finds out which batch was recently updated. We iterate from the last checkpoint, find the updated batch, pick it out, read the entire row — whose many columns together constitute one batch — assemble them into a batch, and hand it to the transactional topology. The rest of the topology stayed the same; nothing changed there.

Putting it all together: the first topology enriches the events; the second topology batches those enriched events; and the third topology reads from these time batches and writes to the dimensional store. Together, this whole system let us achieve reliability and accountability to an amazing degree. We could even debug: whenever there's a discrepancy in the metrics, we can narrow down which time range it is in, go back, replay those events, and find where the problem lies. And we have proven time and again that we match the offline analytics system to the last rupee and to the last event count.
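As an illustration of that HBase layout, here is a minimal sketch of writing one event into the time-batch event queue, following the row-key scheme just described; the table and column-family names are placeholders, and it uses the current HBase client API:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;

// Minimal sketch: row key = namespace + slot + batchId, column qualifier =
// event timestamp, cell value = event JSON. Names are illustrative.
public class TimeBatchWriter {
    static final long BATCH_MS = 3_600_000; // one-hour batches

    public static void write(String ns, String slot, long eventTime, String eventJson)
            throws Exception {
        long batchId = (eventTime / BATCH_MS) * BATCH_MS; // interval start
        byte[] rowKey = Bytes.toBytes(ns + ":" + slot + ":" + batchId);
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("event_queue"))) {
            Put put = new Put(rowKey);
            put.addColumn(Bytes.toBytes("e"),        // column family
                          Bytes.toBytes(eventTime),  // qualifier = timestamp
                          Bytes.toBytes(eventJson)); // value = event JSON
            table.put(put);
        }
    }
}
```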
So, the learnings. Replayability, I think, is the most important tenet when you are building analytical dashboards. Next, event and entity schemas should be declared up front; you should not be dealing with unstructured entities, for the plain reason that debugging, and building accountability, becomes a big problem otherwise. You need checkpointing at regular intervals so that you can replay from checkpoint one to checkpoint two — essential for debugging, or for correcting a mistake from that window of time. Bootstrapping: when a new metric definition is deployed, you want to run it on past data to see whether the results are correct; so you need bootstrapping — some people call it sandboxing — so that you can test new metric definitions on new processes in isolation. Sidelining: being a reliable system, you cannot lose events; if you see a problem with an event, sideline it — take the erroring events aside and play them back later. And finally, audit all the time; your system has to be audited at all times.

So we've got about 10 minutes for questions — and if we have anyone from Nintra in the audience, I was told to ask for you, so please come up to the stage. Thank you very much.

Hi. The question is: when you used transactional bolts in Storm, did you face any issues in production with respect to throughput, and how did you go about optimizing the batches? — Okay, so the number of events we were receiving was very high, and we ran these topologies at two different frequencies. The batching frequency was very high — batches were being created every few milliseconds — while the topology that computed the actual metrics ran at a lower frequency, around half a minute to one minute, so that in the meantime we had accumulated enough batches to parallelize over. That was one thing. Second, we were receiving so many events from so many different systems that we could distribute the computation like crazy — distribute all of it across time. So we didn't actually see any issues running the batch topology. Did that answer the question? Yeah.

Another question: actually, I have two. The first is about infrastructure — with the amount of data going into Kafka every day, how many machines have you deployed, and what events per second do you see and expect? The second: do you have any machine learning deployed on these clusters, and how does it work in tandem with Storm or Kafka? — We haven't deployed any machine learning up till now. On events: we see around 2,000 to 4,000 events per second in the lean period, and around 100,000 at peak. The deployment right now is a small cluster — we are not processing all the events; we discard a lot of them — so it is not more than 10 machines: five machines for Kafka, which collect all the events, and five machines for running the Storm topologies. — Do Kafka and Storm co-exist on the same nodes, or are they separate? — We started off co-hosting them, but later we separated them because of scale issues.
Thank you. And we also run two different ZooKeeper ensembles, one for Kafka and one for Storm, because Storm's transactional topology writes its checkpoints back into ZooKeeper — ZooKeeper is the database for that — and we had to separate it out because it was overwhelming Kafka's ZooKeeper. Thank you.

Hi. How are you sharding the data that you are storing — is the sharding automatic, or do you have your own sharding policy? — No, it is automatic; we haven't tuned OpenTSDB at all so far. — You guys...? — No, as I said, we are using two instances of OpenTSDB for two separate purposes; and because we couldn't make it work with Kerberos, we wrote our own very scaled-down version of OpenTSDB — the same constructs, the same principles, but written ourselves; it came to a couple of class files, that's all. — And is the data written once for the dashboard and archived after a while, or kept? — Okay, so we had started out experimenting with the Lambda architecture, though we never really completed it. Our policy was to keep the data in the streaming dashboard — in OpenTSDB — for only as long as it takes the batch processing system to catch up. That was configured for three days or seven days — I don't remember exactly, but between three and seven days, not more.

Hi. We had a use case on real-time analytics, and we were using just Storm. I wanted to know why you went with Kafka plus Storm rather than just Storm, since Storm can also do your stream processing. — Storm is for stream processing; Kafka is for storing events. Kafka is a very unusual message queue, different from traditional message queues, in that you can have multiple consumers on the same queue: the consumers maintain the state of how far they have read, not the server. If you take, say, RabbitMQ or some other queue, the server maintains how many messages a consumer has pulled out; in Kafka it is reversed — the consumer does the state management — so you can have multiple consumers reading from Kafka. That's point number one. The second advantage is that Kafka does not delete messages — theoretically; in practice there's a retention policy and you can make it delete messages — so you can go back and start reading messages from the start. That gives you bootstrapping, and it gives you replayability. That's why we went with Kafka.

Okay, for the next question we have one right up on the screen there — third one from the bottom: can you talk about the size of data in GBs? — I don't have statistics on the size of data in GB; what we track is the number of events we process. As I've already told you, we see 2,000 to 4,000 events per second in steady state, when there's no peak, and around 100,000 events per second at peak.
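On the earlier point about replayability: because the consumer, not the broker, owns its read position, a consumer can rewind to the beginning of a topic and re-read everything still retained. A minimal sketch with the modern Java consumer (this talk predates that API); the broker, group, and topic names are illustrative:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.*;

// Minimal sketch: rewind to the start of a topic and replay retained events.
public class ReplayConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");
        props.put("group.id", "replay-demo");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("update-events"));
            consumer.poll(Duration.ZERO);                    // join group, get partitions
            consumer.seekToBeginning(consumer.assignment()); // rewind: replay everything
            while (true) {
                for (ConsumerRecord<String, String> r :
                        consumer.poll(Duration.ofMillis(500))) {
                    System.out.printf("offset=%d value=%s%n", r.offset(), r.value());
                }
            }
        }
    }
}
```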
— And what were the benchmarking metrics? — We had benchmarks from the customers' requirements: I should have metrics updated every minute; query time should be no more than a few milliseconds — say, not more than 10 milliseconds at the 99th percentile. These were our guidelines, our SLAs, and we engineered the product to meet them. In stream processing you cannot process an unbounded amount of data per event, so we restricted ourselves to three dimensions, not more — but those dimensions were very large: the product and user dimensions run into millions by themselves. As for event size, we observed that we didn't get events larger than about 10 KB; that was the highest we saw.

Question here: after OpenTSDB, what do you use for dashboarding? — We created our own dashboards using NVD3, which is built on top of D3.js. We didn't adopt any existing dashboarding solution because none of them provided slice-and-dice — basically, filtering — capabilities, so we had to build our own. — Is it a dynamic dashboard that anybody can compose? — Not yet; the dynamic part is something we are building right now. Once we had the processing in place, a separate team went ahead and created the dashboards for the users. — You mentioned SQL on top — the SQL was just for creating the processing, right? — Right; the team responsible for creating metrics and measures and the team responsible for consumption are two different teams, and we developed both parts in parallel: one finishes its job, and the second team carries on.

Another question: do you think Trident is stable as of now? Would you want to move away from Esper to Trident? — Yes, we may want to; we could. But again, we would have to provide our own DSL on top of Trident, or enable business users by converting SQL into Trident constructs. Trident has some general query-like constructs, but you still have to write code and deploy it on Storm. What we want is for a customer to just declare the processing pipeline and be done with it, with our system taking care of deployment, making it live, everything. So today somebody declares it in the SQL-like construct supported by Esper — it is called EsperQL, by the way.

Okay, we have time for one more question. But before we do — as you can see up there — please do not do any self-promotion or anything like that; we'd appreciate it if we stick to questions about the actual content. Thank you very much. We have time for one more question after this; we'll take these two and then that's it. Once again, you can take any other questions to sb.lk/hasgeek. Thank you.

Yeah, so we are building the audit system right now, end to end: an audit system that collects these checkpoints, finds out how many events were processed, and reports any anomalies. Coming back to throughput: throughput right now is around updating 100 measures. Again, as I said, we receive a lot of events, but we don't have that many measures yet.
So it was around 100 metrics for now, not a lot more than that. On tuning, we learned that every system needs its own databases: if Storm is doing its checkpointing in ZooKeeper, you cannot share a common ZooKeeper cluster with another system; you have to decouple them. Another thing is logging. If you install plain vanilla Storm, it logs a lot — sorry, Kafka logs a lot — and you have to tune that down, otherwise it will eat up your IOPS in flushing. — Does it depend on whether your processing is CPU bound? — Yes. If you noticed, we actually separated out CPU-bound nodes and IO-bound nodes: the CPU-bound work was concentrated in the Esper bolts, and the IO-bound parts were kept separate in the topology, so you can scale them independently.

— How is this different from the Lambda architecture? — As I said, I can't say whether it is different or the same; we started out with that aim in mind and haven't reached that point yet. The idea was that stream processing — the more real-time processing — would be done by Storm, our stream processing engine, while in parallel we developed the batch processing pipeline. But we can't really say, because nobody has truly shared an experience of running the Lambda architecture. Whatever Nathan Marz has said about it is still quite theoretical, and people like Jay Kreps, who created Kafka at LinkedIn, have written posts about the fallacies of that architecture, which may or may not hold — that is his point of view, and you shouldn't take it as settled either way; the Lambda architecture remains largely a point of view. It's still very theoretical, and people are experimenting with it. There's only one prototype that has come out so far — Summingbird, from Twitter — and apart from that there has been no other system, other than what we saw yesterday.

Okay, perfect. That's all the time we have for now. Everybody, please give a big round of applause.