Hello and welcome, my name is Shannon Kemp and I'm the Chief Digital Manager of Dataversity. We'd like to thank you for joining today's DM Radio webinar, All the Difference: Rapid Discovery on Modern Data, sponsored today by Zoom Data. It is a deep dive continuing the conversation from a live DM Radio broadcast a few weeks ago, which, if you missed it, you can listen to on demand at dmradio.biz under podcasts. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. If you'd like to chat with us or with each other, we certainly encourage you to do so. Just click the chat icon in the upper right-hand corner for that feature. For questions, we'll be collecting them via the Q&A section in the bottom right-hand corner of your screen. Or if you like to tweet, we encourage you to share highlights or questions via Twitter using hashtag DM Radio. As always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and additional information requested throughout the webinar. Now let me turn the webinar over to Eric Kavanaugh, the host of DM Radio, to introduce today's webinar and speaker. Eric? Hello and welcome. Hello, hello, hello everybody. Thank you so much for your time and attention today. This is your host, Eric Kavanaugh, host of DM Radio. Very excited to be online once again with our partner, Dataversity. Thank you, Shannon Kemp, very much. And yes, indeed, the topic for today is really fascinating stuff, folks. I've been tracking this for a while now, and it's really cool stuff. So I'm excited that you took the time to join us today. All the difference, rapid discovery on modern data. 
Well, we'll be hearing from yours truly, of course, but also Anurag Tandon, Vice President of Product Management at Zoom Data, a very, very interesting company that has come along and is really changing how we look at the whole concept of discovery. And it's just in time, I have to say. So what is modern data? Well, by and large, it's big data. That's what people are talking about, and there are many different kinds out there. Think of all the sensor data. That's actually an aircraft cockpit right there. Boeing these days has thousands of sensors all over their new planes, feeding data constantly in all sorts of different ways, obviously for safety precautions, but also just to understand what's going on. What is the plane doing? How does it work? How can we gather some more information for keeping these passengers safe, for example, but also for really understanding how these different parts of the plane interrelate? What are the dependencies involved? Tons and tons of sensor data. Log files are everywhere these days. Think about the cloud these days. There's so much heavy-duty cloud infrastructure out there that every one of those cloud applications has log files. Well, this has been a trend now for about 10 or more years, really. Frankly, there are all kinds of interesting innovations coming out of companies that are gathering log files and then using them. Machine learning, of course, can be used on log files to expedite the process of understanding what's happening. I mean, can you just imagine trying to troubleshoot a Google infrastructure situation? Can you imagine? How would you even do that? Well, this is the kind of modern data that we're talking about. There's just tremendous amounts, absolutely vast scale of data out there. Social media, of course, has a tremendous amount of data. 
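As a small aside on the log-file point above, the first pass an analyst makes over logs is often just counting patterns before any machine learning comes in. A minimal sketch, with made-up log lines in an assumed "timestamp LEVEL message" format (real cloud log formats vary widely):

```python
import re
from collections import Counter

# Invented log lines for illustration; the format is an assumption.
log_lines = [
    "2017-03-01T10:00:01 INFO request served in 12ms",
    "2017-03-01T10:00:02 ERROR timeout talking to storage backend",
    "2017-03-01T10:00:03 INFO request served in 9ms",
    "2017-03-01T10:00:04 ERROR timeout talking to storage backend",
]

# Count occurrences of each log level -- a crude first signal
# an analyst might extract before reaching for anything fancier.
level_pattern = re.compile(r"^\S+\s+(\w+)\s")
levels = Counter(
    m.group(1) for line in log_lines if (m := level_pattern.match(line))
)
print(levels["ERROR"])  # 2
```

At real scale this counting would run in parallel across many machines, but the per-line logic stays this simple.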
Just being able to tap into the fire hose from Twitter or Facebook or LinkedIn or any of these folks and to understand what's happening out there, very useful for retail, obviously very useful for understanding all sorts of different trends in the marketplace. I've got a concept I keep throwing out these days called real-world data at scale. And that's what we're talking about here. There is so much data to be analyzed these days that traditional methods are simply not going to cut it anymore. And so modern data really refers to all these new kinds of data or relatively new kinds of data. Some of it has been around for a long time. A lot of it is JSON, for example. But JPEGs, unstructured data, all kinds of documents. There's lots of machine learning going on in the document space these days, just taking actual Word documents, PDF documents, being able to scan them at scale and work with them. Well, you have all this data out there. How are you going to be able to analyze it all? Network packets. Just think about the size of some of these corporate networks these days. It's just massive. It's absolutely massive, especially if you're using things like Google Docs, for example, or Microsoft online. Think about the complexity of the situation. Well, what's going on out there? One thing to keep in mind is that everything is in flux. Hardware, data sources, data staging processes, data volumes, flow. Governance is obviously a very hot topic these days and for good reason. There are a number of regulations that are coming down the pike in short order that are really going to force organizations to be much more responsible about their data practices. Where do you get the data, what are you doing with the data, who has access to the data, all the roles and responsibilities. And just the exploratory side of all that is pretty important, too. How do you analyze new data sets? How do you know if this other data set is going to provide value to your business? 
You have to be able to explore all that stuff. And the fact that everything is in flux, well, it's a blessing and a curse, isn't it? It's a blessing in that there are lots of new ways to do things. The curse is that there are lots of new ways to do things. And I think what we're going to see is a fairly turbulent period over the next three to five years as new ways of accessing data and mining data and discovering data are going to slowly but surely supplant old-fashioned ways of doing all that kind of fun stuff. So everything is changing. That's a big deal. So how are we going to solve that problem? Well, one big word you're going to hear time and time again in the world of high technology these days is parallelism. Parallelize. How do you parallelize jobs? Well, that whole Hadoop movement we've talked about on the show before, that all boils down to parallelized processes. And if you can parallelize these processes, you can get jobs done much, much faster. So we used to see, as you can see, 10x performance improvement every six years or so. Now it's like 1,000x, and that's just an approximation. Well, what we're going to hear today from our friends from Zoom Data is that there are new ways of being able to explore data. And there's a reason for that. There's a reason why we have to focus on that. So I was trying to think of the best way to set this up and I realized that we should talk about this whole concept of digital transformation and look at the leaders out there. Look at the leaders in the world today. Who are they? Well, Uber and Airbnb obviously are two poster children for whole new business models that are upending entire industries. If you think about the scale of these solutions, it's just massive. It's absolutely stunning. But really, what they did is the same thing. So if you step back and think about what Uber has done, what Airbnb has done, and there are several others out there, we'll talk about one in a second. 
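To make the parallelism point above concrete, here is a minimal Python sketch of the fan-out idea: the same independent jobs run one after another, then mapped across a pool of workers. A thread pool is used only to keep the example simple and portable; real CPU-bound work would use a process pool or a framework in the Hadoop/Spark mold.

```python
from multiprocessing.pool import ThreadPool

def job(x):
    # Stand-in for one independent unit of work,
    # e.g. summarizing one shard of a large file.
    return sum(i * i for i in range(x))

inputs = [10_000] * 8

# Serial: one job after another.
serial = [job(x) for x in inputs]

# Parallel: the same independent jobs fanned out across workers --
# the essence of the "map" step that Hadoop-style systems popularized.
with ThreadPool(processes=4) as pool:
    parallel = pool.map(job, inputs)

assert serial == parallel  # same answers, just computed concurrently
```

The key property that makes this work is that the jobs don't depend on each other, which is exactly why so much big-data tooling is built around this shape of computation.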
Well, these are behemoths, and I would argue they're straws in the wind. So if you haven't heard that expression, well, a straw in the wind means there's a tornado coming. That's what a straw in the wind is. It means there's a tornado nearby. So you better batten down the hatches and figure something out. What these guys did is they identified massive opportunities. Of course, with Uber, it's car sharing, it's like taxi service. With Airbnb, it's the hotel industry. These are massive industries that got turned sideways by these two companies and several other competitors out there. Well, what did they really do? They saw the opportunity and they realized they had to create a new kind of infrastructure, a completely scalable infrastructure that's bulletproof. They also really deconstructed business processes and thought through what is the necessary information you have at each point in the business process. It's a very critical thing that they did. And frankly, mobile plays a big part of this. I think mobile has in many ways electrified the entire user experience design industry for applications because guess what? There's a small amount of space you have to work with, and so the smart companies have really figured out how to design a layout that has tremendous efficiency baked into it. So if you think about the old way of interacting with software, like even your browser, for example, you've got File, Edit, Share, Review, Communicate, I'm just looking at the WebEx thing I'm using right now, Participant, Event, Window, Help, et cetera, lots of submenus underneath that. Well, you can't really navigate through that kind of process on a cell phone. So what they did is they re-architected the processes and they re-architected the user experience, the actual design and the workflow, and they broke down what step you need to be at and what options should be available at that step, and no more options are presented to you. 
So then they re-architected these processes at scale, but what's underneath all of that? Data, tons and tons and tons of data, both batch-oriented data and streaming data. Obviously with Uber, for example, there's a mountain of streaming data, anyone who's ever used the service can see what an amazing innovation it has brought to where you can actually see a representation of that car coming around the block, it turns on your screen when the car turns, that is a streaming architecture of just epic scale, and it really changed things. Think about the bus industry, just taking buses to and from work or to school or wherever, now the forward-thinking companies, forward-thinking cities and organizations are using that kind of technology. So you can see where this bus is, you don't have to be like standing on the street, looking way down the highway, oh, when's that bus gonna come? You can see where it's coming. So these are straws in the wind about how things are changing, and who else has changed things? Well, Amazon, and what's the big concept here at play? Economies of scale, right? Think about how an economy of scale can disrupt any particular industry, vertical, concept, business model, whatever the case may be, Amazon brings a tremendous economy of scale, and Amazon did exactly what Uber and Airbnb have done, they only did it at, I would argue, a larger scale, and I did some research here, I didn't know this, but you can see that the arrow underneath Amazon points from A to Z, so the message is that they sell everything from A to Z, and they keep their customers smiling. That's pretty clever, I have to say, and Amazon is an absolute force to be reckoned with. Think about how much data they have to deal with, think about how much data they process every single day. 
I actually have some experience these days working with Amazon.com as a distributor, a friend of mine has a pepper sauce company, and I've learned how much money Amazon makes on that stuff. For what it's worth, you go online, I now realize why they're making so much money, but it's a very efficient process that they have. We sell for $10 a bottle, they make five bucks, they take 50% of that, which actually reminds me of a story about Egypt in the old days under Hosni Mubarak, the rule of thumb was if you wanted to do business with Egypt, you had to go through that family, and they keep 50%, you get 50%, they get 50%. So complexity, this is the challenge that we face down, you start thinking about all the data that's out there, well this is just an image to portray complexity, analytics and complexity don't really get along, let's just be honest, you have to simplify what you're looking at in order to understand it. A complexity picture like this is just gonna boggle the mind, so what we need to do is use tools and technologies to be able to filter out the noise and get to the signal, and that's what's really happening these days, it's happening in traditional ways, meaning batch, but also in these new streaming ways and these new streaming architectures. So let's face it, if you're gonna base decisions on analysis, you need to have some clarity, so again, we have to find a way to cut through the weeds basically, to separate the wheat from the chaff, to find the needles in the haystack and then focus on them, and that is gonna be critical to success going forward. So I thought of a great analogy because I'm a big fan of history and myth and so forth and there's this great story about the Gordian Knot and Alexander the Great. 
For those who don't know the story, there was the Gordian Knot, which was the tightest knot in the world, no one could untie it, and it was a big challenge that was presented to people who came into this town, this ancient city. Alexander the Great came along, and they challenged him to untie the Gordian Knot, so he took out his sword and he sliced it open. I would say that analysts in the modern world, struggling with modern data, really do need to think outside of their current toolbox. They need to think in different ways about how we're going to whittle all this stuff down, how we're gonna navigate through the complexity of modern data and all its different structures, and that's another good point I wanted to throw out. When people talk about unstructured data, what they're really talking about is data that is not in a relational database structure. So data warehousing, which is where most business intelligence has come from over the past 30 to 40 years, well, that's all built around a relational model, and that relational model is very useful, especially when dealing with large amounts of data. Because again, you remember earlier, I talked about how everything is in flux, let me throw this slide up here just for a quick second. Traditional approaches for dealing with data management are all built around the constraints of the time, so 20 and 30 years ago, guess what? Processors were pretty slow, we weren't using these FPGA processors at all, we weren't using, for example, the different kinds of processors, the graphics processors, we were just using CPUs. Well, graphics processors can do a lot of different things, and they're also very good for parallelizing execution, for parallelizing computation, right? So you think about what has changed? 
Well, network speeds are much faster, storage is much cheaper, processors are faster, you're talking about taking an equation, let's say, with 10 variables and dramatically modifying half of them, well, that's big stuff, that is why we're experiencing such turbulence right now, that's why there is such a massive movement toward digital transformation these days, because you have to, we're collapsing business processes, collapsing all kinds of different things, business models even, and reinventing how we do business. So all this comes into play, and the last thing I'll say before handing off to Anurag here, is that whenever possible, businesses really need to start looking at supplanting old, batch-oriented, latency-laden processes with real-time processes, with streaming data. There's a whole lot of excitement around streaming data these days, and since we talked about Amazon, I just took a screenshot from their website about Amazon Kinesis, they have all kinds of different streams, and this is actually one really interesting case where the consumer world is mimicking the world of enterprise IT, because in streaming, you have all kinds of streaming data, think of stock tickers, for example, that's a good example of streaming data, but also you can have streaming data from anything, from oil rigs, from IoT devices, think about the number of IoT devices out there today, let alone the number there's gonna be in three to five years, there's just an astonishing amount of data out there, and the old systems, data warehousing, are simply not going to be sufficient for managing that environment, so we need new ways of collecting the data, new ways of processing the data, new ways of discovering the data, just trying to understand what's out there, and that discovery is really important stuff, especially these days with all these new data sources, because the analyst needs to be able to quickly determine, is this data set gonna be useful, is it clean, is it relevant, does it 
have some meaning in my business, whether that's web traffic detail, whether that's order information, point of purchase, for example, point of purchase data comes through, in a lot of cases these days, in real time, and so a company like a retailer that has a really good, robust streaming architecture can see what you're buying at the store, know who you are, correlate that information to your profile that they have, and be able to therefore respond to you much more effectively and much more quickly if you have a problem. This is all getting back to customer experience, and so with that I'm gonna hand it off to Anurag Tandon of Zoom Data, who's gonna talk about how things are changing and why modern data really does require a new approach. So I'm gonna try to give you the keys, Anurag, there it is, take it away. Thank you so much, Eric. Great, some of the points that you were making were the same and similar things that I was thinking of as well, as we were gonna talk about this rapid discovery on modern data. So as I bring up my slides here, which I'll try to keep to a minimum, and then also maybe show you a little bit of what we do at Zoom Data. I was thinking about sort of the same sort of things. I started my career in BI back in 1999, in the late 90s, it was the days of the RDBMSs, Relational Database Management Systems, where we were constantly working with companies, enterprises in the Fortune 500, Fortune 1000, 2000, depending on how you wanna look at it. And it was all about transactions. It was all about human-initiated events that companies were tracking. It could be a sale, a purchase, a service call, a check into a hotel, maybe I'm paying a bill and that's being tracked and so on and so forth. These are all transactions that were human-initiated, and we were looking at tens of millions of rows or maybe hundreds of millions of rows in the extreme cases. 
And retail and financial services were essentially the spearheading industries in the data world. They had the most data, they had a leg up in terms of looking at it, analyzing it, using it to their advantage, et cetera, and a lot of other industries were doing the same, but maybe not at the same scale. And the conversations were all around this OLTP versus OLAP model. I don't know if everyone is familiar, but OLTP stands for Online Transaction Processing, OLAP stands for Online Analytical Processing. And so if you think of the differences between the two, the transaction processing is about just making sure that the transactions are captured, they're communicated back to the backend systems, your point of sale systems are up to date, and so on, and the information about the transaction is getting captured. The analytical processing was about rolling up that transaction information to higher levels of groupings or aggregations, and then taking a look at the holistic picture across the organization of what's going on. And these were still very, very structured sets of data, in tabular form, relational schemas in Oracle, SQL Server, tons of other tools and so on. But the contention was that organizations were trying to figure out how to separate these workloads of OLTP versus OLAP, and how to make sure that their relational databases, in some cases the same databases that were doing both kinds of processing, could handle the contention that used to happen between the two. That's what organizations were trying to grapple with, to separate and make sure that they have data warehouses that are set up to do that kind of processing and separate out the transactional loads. So that was essentially what I saw starting out in BI, and analytical processing was, for lack of a better term, a stepchild of transactional, because the DBAs were all about making sure that the business keeps running, right? Who cares about analysis? 
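The OLTP-versus-OLAP split described above can be shown in a few lines: capturing individual transactions versus rolling them up into aggregates. A toy sketch using Python's built-in SQLite, with an invented table and figures:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (store TEXT, amount REAL)")

# OLTP side: capture individual transactions as they happen.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("west", 25.0), ("east", 5.0)],
)

# OLAP side: roll those transactions up into aggregates
# to see the holistic picture across the organization.
totals = dict(
    conn.execute("SELECT store, SUM(amount) FROM sales GROUP BY store")
)
# totals == {'east': 15.0, 'west': 25.0}
```

The contention Anurag mentions arises when both kinds of query hit the same database: inserts want small, fast writes while aggregations want long scans, which is exactly why data warehouses split the workloads apart.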
But for the savvy companies, the smart companies, they had figured out that analytics is really important. It wasn't a new concept in the 90s, it had been around for a while, but it was still taking a while to catch on. So from a data scale standpoint, that's where we were in the 1990s, and if we talk about the 2000s and the 2010 decade and so on, there have been some data changes. As I talk about that, I'll come back and then also discuss not just the data changes, but also changes that are happening alongside, I think Eric alluded to some of that in terms of the everything-in-flux slide and all the stuff that's going on with storage, memory, with processing, with even user habits and their expectations in terms of what's going on. So I'll come back and talk about that a little bit. All right, so coming back to the data in the 2000s, if you will, we started to look at not just transactional information, but also interactions. So as organizations were understanding the value of their data better, they were collecting more and more. The internet was mainstream. Everybody had websites, everybody had web applications, mobile applications were coming to the fore. And organizations were capturing all of these interactions. And as they were doing that, they were also trying to find newer ways of storing that information, as well as newer ways of analyzing that information. So unstructured or semi-structured formats and schemas started to come to the forefront, more like in the mid to late 2000s, NoSQL and search types of storage systems came out as well and into the 2010s and so on. And from an industry standpoint, if transactional analysis was more retail and financial services spearheading that, in terms of interactions, the tech companies, the Googles, the Facebooks of the world were starting to recognize the value of those interaction streams and capturing that, and government was big into getting all of that information captured. 
In terms of batches, if in the 1990s companies were looking at weekly processing or monthly processing, I talked about OLTP versus OLAP, the same concept can be applied to even this world, even though we don't necessarily use those terms anymore, or not as much. In the 1990s, people were happy with weekly or monthly batches, right? And daily was, oh my God, we have daily information. In the 2000s, daily was the norm, right? And people were wanting to see information from yesterday and be able to act on it today and then see what happens tomorrow. Now, as we're getting into the 2010s, not only are we talking about newer kinds of data collections, so observations, right, and Eric touched upon this quite a bit, so I'm not gonna necessarily reiterate, but with the IoT and the sensor data and everything like that, there's just an explosion of observational data collection that's happening where humans are not even in the picture. Machines are communicating with each other and they're recording things, they're sending things, they're storing things, and then, with machine learning and AI and everything like that, even analyzing things without humans getting involved, right? Maybe not until the very end, sometimes. And so there's tons and tons of data sources that we're seeing organizations accumulate and assimilate, data lakes coming out as places to basically store all of this observational information, and all kinds of industries. It's not just about retail, it's not just about financial services or tech. Every industry out there is being disrupted and is recognizing the value of this data and how to analyze it, how to get ahead of the competition. So in terms of timing, where monthly and weekly was good enough in the 90s and in the 2000s daily was the norm, now we're looking at the freshest information possible. So streaming real-time information directly to the user, or maybe near real-time. 
So if every second update is too much, every minute update is not out of the ordinary. Every 15 minutes, every hour is becoming more and more the norm. And we have some data that we've collected from surveys that also kind of validates these assumptions and these premises. So there's a lot of things that have happened on the data side in terms of the changes that are happening. Alongside the data changes, you look at tool sets and how applications are getting deployed, right? In the 1990s, companies were primarily using these desktop-based decision support systems or executive information systems, the DSS, EIS, these kinds of tools. Companies like MicroStrategy, BusinessObjects, Cognos, there were quite a few who in the early 90s were primarily desktop-based and were starting to move a little bit towards the web. And that's what I kind of saw coming in to MicroStrategy, that's where I was working in 1999. And, you know, toward the end of that decade, web servers and web-based deployment, web-based availability of this information was starting to take hold. The IT function was maturing around the same time and becoming the provider of this information to other business users. But largely, from a user standpoint, there was a handful of users who were using these desktop tools, and they were mostly in IT. And, you know, what they were doing was maybe limited ad hoc exploration, but for the most part, what was happening was that they were helping produce these canned reports of data and making them available in a print type of fashion or exported into like a spreadsheet or some other kind of format, or made available through a website, you know, if the company were really savvy. In the 2000s, web had taken hold, and so a lot of this information was actually going to the web. 
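The refresh cadence mentioned above, updating every minute rather than on every event, amounts to bucketing incoming events into time windows. A minimal sketch with an invented event feed (timestamps assumed to be ISO-formatted strings):

```python
from collections import defaultdict
from datetime import datetime

# Invented event feed: (ISO timestamp, value).
events = [
    ("2017-05-01T09:00:05", 3),
    ("2017-05-01T09:00:40", 1),
    ("2017-05-01T09:01:10", 7),
]

# Micro-batching: bucket events into one-minute windows so a
# dashboard can refresh once a minute rather than per event.
windows = defaultdict(int)
for ts, value in events:
    minute = datetime.fromisoformat(ts).replace(second=0, microsecond=0)
    windows[minute.isoformat()] += value

# windows == {'2017-05-01T09:00:00': 4, '2017-05-01T09:01:00': 7}
```

The same bucketing idea scales from this toy loop up to the 15-minute and hourly cadences mentioned above; only the window size changes.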
So if you think of a pendulum, if the 90s was more around desktop-based tools and analysis, that pendulum was shifting from desktop to the web, as people were looking at economies of scale and making information more widely available and so on, with web-based access becoming more common. And then laptops, towards the end of the 2000s, we saw a huge increase in processing power on the laptop, right? Or even on mobile devices, with CPU, memory, et cetera, et cetera. So we again started to see some of these desktop-oriented tools like Tableau and Qlik come along. And as IT was getting bogged down with a lot of things that they were doing for the organization, business users were starting to say, no, we got this, we can take this on ourselves. We know which data sources we need to go to, what exploration we need to do. We can load them into our Tableau data extract or into the QlikView in-memory system, we can load it in and we can analyze because we're really just talking about millions of rows of data and gigabytes of data which my laptop can handle, right? And then if I find something that's interesting, I'm gonna be able to share it with other people through the web. So there was this sort of desktop and web kind of hybrid. But then in the 2010s, now what we're seeing is a shift back to the web, the cloud, with unlimited distributed scale in terms of how you can process. And in terms of the tooling that's involved, your desktop-based tools that were ingesting this data into memory are again starting to kind of break down in terms of the scale of data that we're talking about now with observational data. And so it's been very interesting in terms of the ride that I've seen in the last 18 to 20 years in terms of these shifts in the industry from desktop to web, back to desktop, back to web. 
And so in terms of your tooling and what you're doing, depending on what kind of analysis you're doing, what kind of use cases we're talking about, it's interesting to see different tools can fit different modes, because transactional analysis hasn't really gone away. Interaction analysis hasn't really gone away. We're just talking about newer and newer use cases emerging with newer sets of data, newer ways of recording them, newer ways of storing them, etc. So that was sort of a long-winded explanation, but I thought it was kind of useful to set the stage for what it is that we're seeing at Zoom Data in terms of the types of analysis that are out there and that companies are doing with these kinds of data sets. So here's a little information, a little data. Of course, how can we present without showing any data? This is a survey O'Reilly conducted in 2017. And if you take a look at the top left, a lot of the newer applications, the newer use cases, the newer projects that people are working on are non-RDBMS related. So this is a huge slice, but this is not one kind of data source. This is just a grouping of all the non-RDBMS types of data sources that people are using for newer projects. So the relational database systems are there, very much so. But they're more and more used for legacy or traditional applications, while the newer use cases where the data sizes are really massive are getting into the NoSQL, the cloud-based data sources, the Hadoops of the world, and then the whole different kind of stack with streaming and so on. In terms of companies looking to go to production or in production with big data, here's sort of the distribution that we're seeing of companies that have big data projects in production or multiple in production, et cetera. So it's a fair chunk. There's still room for growth, but these bars are just gonna keep going up. 
In terms of streaming data, or, more accurately, the freshest data possible: information that's one second old is not necessarily needed for every use case. You may not even have updates at the second level in every use case, but we're definitely seeing a barrage of new applications and new uses for streaming data that is as fresh as one second old. But more common is every minute, every hour, and of course every day and so on. And then in terms of the analytics and how they are getting deployed in the marketplace, what we're also seeing is this growth of embedded analytics. So analytics is becoming more and more contextual, where you don't necessarily wanna fire up another tool to do the analysis. What users want is, in the context of what it is that I'm working on, I want the analytics to come to me. So ultimately, if you look at the macro, macro, macro level, over the last, whatever, 30, 40 years, when we're talking about data and getting insights out of the data, ultimately the cycle that we need to follow is data, to analysis, to insights, to decisions, to action, and to results. And then those results feed back to the data, and the loop kind of continues. So at the macro, macro level, that is what data is for, right? And so those applications of data, or the insights that you wanna get out of the data, the benefit has remained the same. It's as you drill down into the lower levels of detail, the mechanics of how we do that, the tactics of how we do that, which applications we provide, et cetera, et cetera, are changing, but the purpose is still the same. And so as we talk about getting from data to insights to action, how can we shorten that cycle by bringing data directly to the place where the action is gonna be taken, right? And that's the purpose of the embedded analysis and embedded analytics. 
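To ground the "freshest data possible" idea above, here is a minimal per-tick computation of the kind a streaming view performs: a rolling average updated as each new value arrives. The ticker values are invented; a real consumer would read from a stream API rather than a Python list.

```python
from collections import deque

def rolling_average(stream, window=3):
    """Yield the average of the most recent `window` ticks as each arrives."""
    buf = deque(maxlen=window)  # old ticks fall off automatically
    for tick in stream:
        buf.append(tick)
        yield sum(buf) / len(buf)

# Invented price ticks standing in for a live feed.
ticks = [100.0, 102.0, 101.0, 105.0]
averages = list(rolling_average(ticks))
# first three averages: 100.0, 101.0, 101.0
```

Because the function only ever holds the last few values, the same logic works whether the feed delivers one tick a day or thousands a second, which is the point of streaming-first designs.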
And that's exactly what we're seeing with the advent of the cloud, with the advent of these embedded analytic tools, and with users who are really savvy about data and clamoring for insights: what they're looking for is in-context analytics. To give a few examples of the kinds of applications and use cases we're seeing out there, and I'm not gonna read through all of them, I'm just trying to tie the use cases to the types of data people are looking at. There's a global investment bank we're working with that is monitoring real-time trades, looking for anomalies and market opportunities. They're dealing with billions and billions of rows and thousands of transactions per minute, and their traders are actually monitoring the data as it streams to their monitors, watching things happen, and occasionally they'll take an action based on that right away. But this is still transactional in nature. If you look at others, for example cyber crime monitoring or cable TV usage, these companies are also looking at billions of rows of observational data about different kinds of uses and different kinds of opportunities. Sometimes those are associated with revenue generation, sometimes with risk mitigation, and sometimes with just cost reduction, right? These are the kinds of things businesses want in order to generate value from data. So maybe we'll just switch over to why we built Zoomdata and what we think we can do in this space. Given that we're not the only BI company in the world, and BI has been around for so long, what is it that we wanna bring to the market that is differentiated, that is useful, that we can build a business around?
So if you think about that, the way we wanna paint the picture is this: what we at Zoomdata are trying to do is make it easy for people, and people could be anybody, any business user, to explore, understand, and then share data-driven insights in a world where everything is constantly changing. The constantly changing part is really important, and I'm glad, Eric, that you touched upon this, because it is really true of the industry at this stage. Building a tool, an application, a product at a time when you're not sure where people are gonna deploy, what kind of infrastructure they will use, what data sources you're gonna connect to, how people will visualize, and where people will use this information is really hard. What we wanted to make sure of is: how do we incorporate all of that into our architecture so that we can adjust and navigate the changes happening in this environment? As we look back at our predecessors in the space, the challenge they have been having as big data has evolved is that their tools are hard to use. They were focused on the power user, so they were loading up the interfaces with lots and lots of features for the power user, and with heavy preparation, semantic modeling, or whatever the case might be, to prepare the data so it could be ready to analyze. There was a long lead time between the time the data was available and the time the data was ready for analysis and actually analyzed, right? That made the distance really, really long and made the tools really hard to use.
These tools were also not able to scale. They were deployed either on the desktop, we talked about the desktop and the web, or, for the ones running on servers, those servers were very monolithic in nature, so you couldn't break them apart, you couldn't scale them, you couldn't distribute the workloads, and ultimately user scalability was hampered. And then there were performance problems, because these tools were designed for relational data. When you look into semi-structured, unstructured, and streaming kinds of sources, first of all, how do you connect? You can't just take a standard ODBC or JDBC interface, connect to that data source, and expect the performance to be great. Sometimes you have to go native to the data source and use the APIs the data source offers to optimize those queries. Seeing those challenges, we wanted to learn from them and not repeat the same types of mistakes. So we built an architecture that can excel where others fail: providing the flexibility to connect to any kind of data source, with a framework that lets you spin up a new type of data connector depending on the types of APIs the data source offers. So we made it really extensible at that layer, and also very extensible at the UI layer, where we know there are tons and tons of ways of visualizing data. If you look at any charting library out there, it probably offers hundreds of types of charts. If we were to try to replicate all of them ourselves, it would take a lot of engineering horsepower, and the opportunity cost of investing in that area would be really high when we could instead work on some of the pieces that are really hard: the flexibility, the scale, the security, and so on. So having extensibility at the visualization layer was pretty important as well.
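The extensible connector layer described above can be sketched roughly as a plugin registry, where each connector speaks the native API of its source instead of forcing everything through a generic ODBC/JDBC path. This is a minimal illustrative sketch; the class and function names are hypothetical, not Zoomdata's actual SDK.

```python
from abc import ABC, abstractmethod

# Hypothetical sketch of a pluggable connector framework. Each connector
# translates a generic query description into the native calls its data
# source understands. All names here are illustrative only.
class Connector(ABC):
    @abstractmethod
    def run_query(self, query: dict) -> list:
        """Translate a generic query description into native source calls."""

_registry: dict[str, type] = {}

def register(source_type: str):
    """Class decorator that registers a connector class for a source type."""
    def wrap(cls):
        _registry[source_type] = cls
        return cls
    return wrap

def connector_for(source_type: str) -> Connector:
    """Spin up the connector registered for a given source type."""
    return _registry[source_type]()

@register("solr")
class SolrConnector(Connector):
    def run_query(self, query: dict) -> list:
        # A real connector would issue native Solr search/facet calls here.
        return [f"solr:{query['q']}"]

@register("jdbc")
class JdbcConnector(Connector):
    def run_query(self, query: dict) -> list:
        # Generic fallback path: build SQL and go through a standard driver.
        return [f"sql:SELECT * WHERE {query['q']}"]
```

With this shape, supporting a new data source means adding one registered class rather than changing the core engine, which is the extensibility point being described.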
And then we needed to make that embeddable.

Yeah, sure. If I could just chime in real quick for a second, Anurag, you reminded me of something I wanted to talk about, and it's in that third, scale, section you have on your slide there: no data movement. It seems to me this is one of the major changes happening right now in the world of information management. We've gone from an era of constantly moving data around, with ETL as the main method, to a whole movement to leave the data where it is. And that's all because we can access it in a scalable way now, right? In a much more efficient way than we could five or ten years ago, right?

Absolutely, absolutely. I was gonna get there, but thank you for reminding me. The data movement part is really important, because that is at the core of what Zoomdata does. We don't wanna replicate the distributed scale that you have in your big data lake or big data store of choice. We have a telco customer storing their data in a Snowflake environment in the cloud, where it's potentially trillions of rows, right? Imagine having to first suck all of that data into your own in-memory store, which is proprietary and which nobody else can use. That whole ingest, storage, and maintenance is going to be really hard. So either we replicate everything that Snowflake or other tools in the space do, or we aggregate the data to a level we can actually handle, and then you lose the fidelity, the lower level of detail, that you sometimes wanna see when exploring data. So why not just let the data live where it is? We can use caching and other high-performance strategies to optimize the kinds of queries we run and how we run them.
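The no-data-movement idea boils down to pushing the aggregation down to the source rather than pulling raw rows into the BI tool's own store. A minimal sketch, using Python's built-in `sqlite3` as a stand-in for a remote warehouse like Snowflake (the table and data are invented for illustration):

```python
import sqlite3

# Sketch of "no data movement": push the GROUP BY down to the store
# (sqlite3 stands in for a remote warehouse) so only the small
# aggregated result crosses the wire, not the raw rows.
def trade_volume_by_symbol(conn: sqlite3.Connection) -> dict:
    rows = conn.execute(
        "SELECT symbol, SUM(qty) FROM trades GROUP BY symbol"
    )
    # A handful of aggregate rows come back, even if the raw table
    # holds billions or trillions of rows.
    return dict(rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (symbol TEXT, qty INTEGER)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?)",
    [("AAPL", 500), ("KIWI", 300), ("AAPL", 200)],
)
print(trade_volume_by_symbol(conn))  # {'AAPL': 700, 'KIWI': 300}
```

The design choice is the same at any scale: the expensive scan happens where the data already lives, and the tool only ever handles the summarized answer.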
And I'll talk about that in a second in terms of what we do, but why replicate? Why set up an architecture that's designed to fail? So that's the idea we have. If you were to take away a couple of things about what Zoomdata does: we have this streaming approach, what we talk about is streaming queries, right? Streaming to us is not just about streaming data, because you can have a streaming pipeline where data is coming in through Kafka or other kinds of queues into a landing area, and then you're analyzing either the stream itself or analyzing it after it lands. We're also streaming inside of our own application, from the backend all the way to the front end. As data comes in, we're able to send chunks of information to the front end and update it in real time, or near real time as the case might be. We also have a mechanism of breaking up a large query into micro-queries, so that the user is not waiting on one large query to return, twiddling their thumbs or stepping out for a coffee while the query runs, and then coming back to see whether it failed or is still spinning with no indication of when it's gonna finish, right? Or bringing down the machine, God forbid. So we have these strategies of breaking up queries so we can quickly show the data to the user as it returns. It may still be partial information, but it gives the user enough of an indication of what they requested: is what they're getting back what they really want?
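The micro-query mechanism just described can be sketched as a generator that splits one large scan into small chunks and yields a running partial aggregate after each one, so the front end can render a trend long before the full result is in. This is a hedged, simplified sketch; the chunking scheme and names are illustrative, not Zoomdata's implementation.

```python
# Sketch of the micro-query idea: break one big aggregation into many
# small chunks and emit a partial result after each chunk, so the UI
# can draw (and the user can judge) the trend early.
def micro_query(rows, chunk_size=1000):
    total, seen = 0, 0
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        total += sum(chunk)
        seen += len(chunk)
        # Each yield is a partial result the front end can render now;
        # cancelling the consumer cancels the remaining chunks for free.
        yield {"progress": seen / len(rows), "running_sum": total}

rows = [1] * 5000
for partial in micro_query(rows):
    print(f"{partial['progress']:.0%} -> {partial['running_sum']}")
# 20% -> 1000, 40% -> 2000, ... 100% -> 5000
```

Because the generator is lazy, abandoning it mid-iteration (the "that's not what I wanted, cancel it" case from the demo) means the remaining chunks are simply never computed.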
And they can either continue to wait for the information to come back, or decide this is not what I requested, or I've gotten enough of an insight into the data and can start analyzing it right then and there. So let me make it more concrete. I'm gonna skip over some of the names; there are a few customer slides I've put in, and we'll send this packet out so you'll see them. Let me go do a quick demonstration and make sure I'm still logged in.

Yeah, and the whole concept of micro-queries is really central to the approach here, right? Because a traditional query will do whatever scanning it needs to do and only return once the full set is there, whereas you guys took that animated-GIF approach, the slow rasterization of the image, so you can see it as it comes through. I think that's actually very clever because, let's face it, attention spans get shorter by the day, and you have to get stuff to people quickly, right?

Right. And so you can see, this thing is updating, and while it's updating, the dashboard is available to me, so I can interact with each of these widgets and add more charts as we go along; the interface is not blocking me at all. I can even change the query from user state to grouping by zip code. That's not what I wanted, so we go ahead and cancel the previous query and set up a new one; it's not gonna cause too much load on your system, but it's also unblocking the user. And here, even though it's at 23%, 24%, I can already see what the trend of the data is, and I might be good with it. I might be willing to start zooming into more detail. I could start using these data points to filter other information and see how that affects the rest of my data. That's very crucial for a big data application that's gonna run on larger and larger amounts of data.
Because ultimately you can't break the laws of physics. If it's a large amount of data, it will take a significant amount of time to get information back if you're actually going against all the raw information. You can definitely aggregate it up and use strategies like that, but how do you provide a user experience that doesn't bog down the user? A couple of things I didn't really touch on; let me go back to the interface and show you. This is the only interface we have, a completely web-based interface, and it's very simple. In this case, for sales purposes, we've added these screenshots to explain what each of these dashboards does. Here is more of the out-of-the-box interface. I think this is gonna be logged out here as well.

Yeah, another key thing I think you started to show here is the drill down, drill up, drill across, right? That's what's critical for discovery: being able to navigate and move all around, right?

That's exactly right. So here we go. Here's more of the out-of-the-box interface, where each of these dashboards actually takes a screenshot of what it contains so you can see it. Come back here. Here's a pretty cool one that we like to show. Here's a demo where you can actually text this number right now, if you have your phones, something like Apple 500, right? Let me do that from my phone. I'm going to my messaging app, I'm going to this phone number, and I'm gonna say Kiwi, if I can spell it right, and then 300. This is essentially an order-taking app, right? And I just did this; somebody did orange from this 408 number, and you can see how the data is changing. Other people are doing it too. You can keep doing this.
And so what's going on is, when you're texting, it goes into a Kafka stream, the Kafka stream gets enriched by a Spark Streaming component, the data then lands in Solr, and Zoomdata analyzes it in real time and pushes results directly to the interface. So this is a great example of the kinds of real-time data pipelines, and I see people are having fun with this thing, pipelines that are very critical for understanding your data and staying in touch with what's going on.

Yeah, quickly, you reminded me of something I was talking about in my presentation, and that is that organizations are gonna find that, as you can leverage this fresh data, as you call it, the freshest data in real-time streaming, there are downstream latency-oriented processes that can really go away, right? If you can solve a problem now, well, a stitch in time saves nine, right? I think we're gonna see a pretty significant transformation as more and more organizations realize that you can pull in, analyze, and visualize real-time data, and that's gonna allow you to not have to worry about certain things down the road, right?

That's exactly right. And to the point that this is the interface every user can use, we've tried to keep it really simple. If you go into, say, this search-based tool, this is sitting on Solr, and I can go into this data pad and start searching, because that's what the search data source allows me to do. I can quickly do search-based queries, and it also allows me not only to look at the raw information of hotels where bed bugs are, but to look at them by different countries or cities, drill down into lower levels of detail, and then zoom in to, say, different types of rooms in New York City or different types of hotels, and so on.
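The text-to-dashboard pipeline in that demo (SMS text, Kafka topic, Spark Streaming enrichment, Solr landing, live dashboard) can be sketched end to end as a chain of stages. Below is a pure-Python simulation under stated assumptions: a `Queue` stands in for the Kafka topic, a plain function for the Spark enrichment step, and a dict for the Solr index; none of this is the real infrastructure.

```python
from queue import Queue

# Simulation of the demo pipeline: SMS -> Kafka -> Spark enrich -> Solr
# -> dashboard. Every stage here is a deliberate stand-in.
def enrich(msg: dict) -> dict:
    """Stand-in for the Spark Streaming step: parse 'KIWI 300' into fields."""
    fruit, qty = msg["text"].split()
    return {"fruit": fruit.lower(), "qty": int(qty), "sender": msg["sender"]}

def run_pipeline(texts):
    topic = Queue()                # stand-in for the Kafka topic
    for t in texts:
        topic.put(t)
    index = {}                     # stand-in for the Solr index
    while not topic.empty():
        doc = enrich(topic.get())
        index[doc["fruit"]] = index.get(doc["fruit"], 0) + doc["qty"]
    return index                   # what the live dashboard would chart

orders = [
    {"sender": "408-555-0100", "text": "KIWI 300"},
    {"sender": "650-555-0101", "text": "APPLE 500"},
    {"sender": "408-555-0102", "text": "KIWI 100"},
]
print(run_pipeline(orders))  # {'kiwi': 400, 'apple': 500}
```

In the real deployment each stage runs continuously and in parallel; the point of the sketch is the shape of the flow: raw events are enriched in flight, land in a queryable store, and the aggregate view updates as each event arrives.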
And so you can make sure you don't go and sleep in those hotels. That's really what's cool about the space now: being able to navigate and look at different and disparate types of data sources, search-based, streaming-based, regular data, big data, what have you. In five years' time, big data won't even be a term; everything will be big anyway. So that's really what we have to offer in this particular space. Let me come back and wrap up. I'm not gonna spend much more time on a couple of these things. The last thing I wanted to say is that last year we actually put together a set of master classes. We went to different industry thought leaders and said, come talk to us about what you're seeing in this space, very similar to the webinar we're doing today. These are people who are very accomplished and knowledgeable about the space and talk to companies all over; they came and participated, we recorded a bunch of small, bite-sized videos, and you're welcome to come watch them on our website.

Yeah, that's all good stuff. We do have a couple of questions here in our last five or six minutes. I threw out the concept of governance at the top of the hour. How do you see governance changing in this world of big data discovery, especially when you can navigate through all these different systems so quickly? How does that change things, in your opinion?

Right, right, that's a great question. What we aim to provide is this data exploration type of use case, right? Where I have tons of data in my big data lake, for example, and I don't know what to do with it. I don't know which of it is valuable. I don't know if I should combine it with other data sources and present it as an analytical data set, maybe curated, maybe not.
And so the use case we're finding is this exploration of the different, large data sets that companies have, trying to get value out of them and understand what users will wanna see from this data before exposing it to the users, right? That is one very prominent use case where governance is not necessarily as much of a concern; it comes after the fact. And when it comes after the fact, then you have a set of curated data that IT, or maybe even the business or a line of business, is managing, but with some governance applied across those data sets. Then you can say, here is how we will expose some of this discovered data; we'll make it available, but we'll also allow you to add your own data to it or explore other kinds of data side by side. So it's all about how you manage the environment and how you give access. And some of this new data catalog type of thing is emerging, where companies are trying to put together a company-wide schema on top of all the data they have, and that might also give users information about what is governed data, what is more accurate data, and so on.

Yeah, that's a good point. I'm glad you gave that little demo at the end there around the hotel rooms, because it was a good visual for understanding all the different kinds of data and how quickly you can drill through, drill across, and drill around. That's really gonna be the key these days: being able to problem solve and troubleshoot, or just identify opportunities, as fast as possible, without having to jump around from one system to another. You want that discovery to be a place that has access to all these different environments, right? Yep, that's good.
Yeah, and the other thing to keep in mind here, back to the whole concept of structured data and SQL queries and so forth, is that traditional data was very limiting, because back in the day we had to strip out context in order to get little bits and bytes across those thin pipes into a data warehouse and do some analysis. Now, I'm not gonna say this new way of doing things has obviated the concept of the data warehouse, but it certainly has changed things. I think it's very good for stochastic use cases, right? For just exploring and trying to understand what's going on in a particular marketplace, what the opportunities are. That's where the power of this discovery really comes into play, right? Yep, yep, absolutely. Yeah, I think for more deterministic use cases, like reporting to the SEC or to any regulatory agency, that's where you're gonna wanna keep your data warehouse and those practices in place. Would you agree?

Right, yeah, and we definitely come across those kinds of questions all the time, where companies look at us and say, hey, are you gonna be able to do this? Are you gonna be able to do that? This is how we've been doing it for the last 10, 15 years. And we tell them: you are going to have different kinds of data and different kinds of use cases, so use this for that; traditional BI is not going away. It is going to exist, and it is going to be something that you want. Ultimately, no one tool is going to be your Swiss Army knife that does everything, but think of the tools as the different blades of the Swiss Army knife, where you have a collection of tools. Depending on the kinds of use cases you run across, Zoomdata might be one of those tools, SAP might be one, Oracle might be another, Power BI might be another.
And depending on what it is that you're doing, we don't think one tool is going to solve all of the problems, but organizations should think about having maybe two or three that cover different kinds of use cases.

I love it. Good advice, good demo, good perspective. Thanks for walking us through the last 20-odd years, because a lot has changed, and it really does force us to rethink how we've been doing things. I think the future is very, very bright for data discovery and data mining and all this kind of fun stuff. So folks, we do archive all these webcasts for later viewing. I'm going to hand it over to Shannon Kemp to take us out. Thanks so much once again.

Thank you, Eric. And thank you, Anurag, for a great presentation. As Eric mentioned, that is all the time we have for today. Just a reminder: look out for a follow-up email by end of day Friday with links to the slides and the recording of this session. The recording will be published to dmradio.biz, where you can also check out the podcast from the radio show. Thanks, everybody. Thanks to our attendees for being so engaged and for all the great questions. I hope everyone has a great day. Thanks again to Zoomdata. Thanks. Thank you.