Hello, everybody, and thank you for joining us today at the virtual Vertica BDC 2020. Today's breakout session is entitled "The Road to Autonomous Database Management: How Domo Is Delivering SLAs for Less." My name is Sue LeClaire. I'm the Director of Marketing at Vertica, and I'll be your host for this webinar. Joining me is Ben White, Senior Database Engineer at Domo.

Before we begin, I want to encourage you to submit questions or comments during the virtual session. You don't have to wait. Just type your question or comment in the question box below the slides and click submit. There will be a Q&A session at the end of the presentation, and we'll answer as many questions as we're able to during that time. Any questions that we aren't able to address, we'll do our best to answer offline. Alternatively, you can visit the Vertica forums to post your questions there after the session. Our engineering team is planning to join the forums to keep the conversation going. Also, as a reminder, you can maximize your screen by clicking the double-arrow button in the lower right corner of the slides. And yes, this virtual session is being recorded and will be available to view on demand this week. We'll send you a notification as soon as it's ready. Now let's get started. Ben, over to you.

Greetings, everyone, and welcome to our virtual Vertica Big Data Conference 2020. Had we been in Boston, the song you would have heard playing in the intro would have been "Boogie Nights" by Heatwave. If you've never heard it, it's a great song. To fully appreciate that song the way I do, you have to believe that I am a genuine database whisperer. Then you have to picture me at 3 a.m. on my laptop, telling a Vertica cluster off, getting myself all psyched up. Now, as cool as they may sound, 3 a.m. Boogie Nights are not sustainable. They don't scale. In fact, today's discussion is really all about how Domo engineered the end of 3 a.m. Boogie Nights. Again, welcome.
I am Ben White, Senior Database Engineer at Domo, and as we heard, the topic today is the road to autonomous database management, and how Domo is delivering SLAs for less. The title is a mouthful. In retrospect, I probably could have come up with something snazzier, but it is, I think, honest. For me, the most honest word in that title is "road." When I hear that word, it evokes thoughts of the journey and how important it is to just enjoy it. When you truly embrace the journey, often you look up and wonder: how did we get here? Where are we? And of course, what's next, right?

Now, I don't intend to come across as too deep, so I'll submit that there's nothing particularly prescient in simply noticing the elephant in the room. When it comes to database autonomy, my opinion is merely, and perhaps more accurately, my observation. Let me offer some context. Imagine a place where thousands and thousands of users submit millions of ad hoc queries every hour. Now imagine someone promised all these users that we could deliver BI leverage at cloud scale in record time. I know what many of you must be thinking: who in the world would do such a thing? That news was well received, and after the cheers from executives and business analysts everywhere, and the chants of "keep calm and query on," finally start to subside, someone turns and asks, "That's possible? We can do that, right?" This is no imaginary place. This is the very real challenge we face at Domo. Through imaginative engineering, Domo continues to redefine what's possible. The beautiful minds at Domo truly embrace the database engineering paradigm that one size does not fit all. That little philosophical nugget is one I picked up while reading the white papers and books of some guy named Stonebraker. So to understand how I, and by extension Domo, came to truly value analytic database administration, look no further than that philosophy and what embracing it would mean.
It meant, really, that while others were engineering skyscrapers, we would endeavor to build data neighborhoods with a diverse topology of database configurations. This is where our journey at Domo really gets underway, without any purposeful intent to define our destination, not necessarily thinking about database-as-a-service or anything like that. We had planned this ecosystem of clusters capable of efficiently performing varied workloads. We achieved this with custom configurations for node count, resource pools, configuration parameters, et cetera. But it also meant concerning ourselves with the unintended consequences of our ambition: the impact of increased DDL activity on the catalog, and system overhead in general. What would be the management requirements of an ever-evolving infrastructure? We would be introducing multiple points of failure. What are the advantages, the disadvantages? Those types of discussions and considerations really helped to define the basic characteristics of our system. The databases themselves needed to be trivial, redundant, potentially ephemeral, customizable, and above all scalable, and we'll get more into that later.

With this knowledge of what we were getting into, automation would have to be an integral part of development. One might even say automation would become the first point of interest on our journey. Using popular DevOps tools like SaltStack, Terraform, and ServiceNow, everything would be automated. This included everything from larger, multi-step tasks like database designs, database cluster creation, and reboots, to smaller routine tasks like license updates, moveouts, and projection refreshes. All of this cool automation certainly made it easier for us to respond to problems within the ecosystem. But these methods alone still left our database administration reactive, and reacting to an unpredictable stream of slow-query complaints is not a good way to manage a database. In fact, that's exactly how 3 a.m.
Boogie Nights happen. And again, I understand there is a certain appeal to them, but ultimately, managing that level of instability is not sustainable. I mentioned an elephant in the room. That brings us to the second point of interest on our road to autonomy: analytics. More specifically, analytic database administration. Why are analytics so important? Not just in this case, but generally speaking; I mean, we have a whole conference set up to discuss it, and Domo itself is self-service analytics. The answer is curiosity. Analytics is the method by which we feed the insatiable human curiosity, and that really is the impetus for analytic database administration. Analytics is also the part of the road I like to think of as a bridge, if you will, from automation to autonomy. And with that in mind, I say to you, my fellow engineers, developers, and administrators: as conductors of the symphony of data we call analytics, we have proven to be capable producers of analytic capacity. We take pride in that, and rightfully so. The challenge now is to become more conscientious consumers.

In some way, shape, or form, many of you already employ some level of analytics to inform your decisions. Far too often, though, we are using data that would be categorized as lagging. Perhaps you're monitoring slow queries in the Management Console. Better still, maybe you consult the Workload Analyzer. How about a logging and alerting system like Sumo Logic? If you're lucky, you do have Domo, and you monitor and alert on query metrics like these. All are examples of analytics that help inform our decisions. Here at Domo, the incorporation of analytics into database administration is very organic; in other words, pretty much company mandated. As a company that provides BI leverage at cloud scale, it makes sense that we would want to use our own product to be better at the business of Domo. Adoption stretches across the entire company.
And everyone uses Domo to deliver insights into the hands of the people who need it, when they need it most. So it should come as no surprise that we have, from the very beginning, used our own product to make informed decisions as it relates to the application's back-end engine. In engineering, we call our internal system Domo for Domo. Domo for Domo, in its current iteration, uses a rules-based engine with elements of machine learning to identify and eliminate conditions that cause slow query performance. Pulling data from a number of sources, including our own, we could identify all sorts of issues: global query performance, actual query counts, success rate (for instance, success as a function of query counts), and, of course, environment timeout errors. This was a foundation, right? This recognition that we should be using analytics to be better conductors of curiosity. These types of real-time alerts were a legitimate step in the right direction.

For the engineering team, though, we saw ourselves in an interesting position as far as Domo for Domo. We started exploring the dynamics of using the platform not only to monitor and alert, of course, but also to triage and remediate. Just how much autonomy could we give the application? What were the pros and cons of that? A big part of that equation is trust: trust in the decision-making process, trust that we can mitigate any negative impacts, and trust in the very data itself. Still, much of the data came from systems that interacted directly, and in some cases indirectly, with the database. By its very nature, much of that data is past tense and limited: things that had already happened, without any reference or correlation to the conditions that led to those events. Fortunately, the Vertica platform holds a tremendous amount of information about the transactions it has performed, its configurations, and the characteristics of its objects, like tables, projections, containers, resource pools, et cetera.
This treasure trove of metadata is collected in the Vertica system tables and the appropriately named data collector tables. As of version 9.3, there are more than 190 system tables, while the data collector is a collection of 215 components. The system tables provide a robust, stable set of views that let you monitor information about your system resources, background processes, workload, and performance, allowing you to more efficiently profile, diagnose, and correlate historical data such as load streams, query profiles, Tuple Mover operations, and more. Here, you see a simple query to retrieve the names and descriptions of the system tables, and an example of some of the tables you'll find. The system tables are divided into two schemas: the V_CATALOG schema contains information about persistent objects, and the V_MONITOR schema tracks transient system states. Most of the tables you find there can be grouped into the following areas: system information, system resources, background processes, and workload and performance.

The Vertica data collector extends system table functionality by gathering and retaining aggregated information about your database cluster, and it makes this information available in system tables. A moment ago, I showed you how to get a list of the system tables and their descriptions; here we see how to get that information for the data collector tables. With data from the data collector tables and the system tables, we now have enough data to analyze that we would describe as conditional, or leading: data that allows us to be proactive in our system management. This is a big deal for Domo, and particularly for Domo for Domo, because from here we took the critical next step, where we analyzed this data for conditions we know or suspect lead to poor performance, and then we can suggest the recommended remediation.
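The two queries described on the slides look roughly like the following. This is a sketch against Vertica's documented V_CATALOG and V_MONITOR views; the exact column sets can vary by version:

```sql
-- List the system tables along with their descriptions
SELECT table_schema, table_name, table_description
FROM v_catalog.system_tables
ORDER BY table_schema, table_name;

-- List the data collector components and what each one records
SELECT DISTINCT component, description
FROM v_monitor.data_collector
ORDER BY component;
```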
Really, for the first time, we were using conditional data to be proactive in our database management. In record time, we tracked many of the same conditions that Vertica support analyzes via scrutinize, like tables with too many projections or non-partitioned fact tables, which can negatively affect query performance. And just as Vertica itself suggests, if a table has a date or timestamp column, we recommend partitioning by month. We also track catalog size as a percentage of total memory, and alert thresholds can trigger remediation. Requests per hour is a very important metric in determining when to trigger our scaling solution. Tracking memory usage over time allows us to adjust resource pool parameters to achieve optimal performance for the workload. And of course, the Workload Analyzer is a great example of analytic database administration. I mean, from here one can easily see the logical next step, where we are able to execute these recommendations manually, or automatically via some configuration parameter.

Now, when I started preparing for this discussion, this slide made a lot of sense as far as the logical next iteration of the Workload Analyzer. I left it in because, together with the next slide, it really illustrates how firmly Vertica has its finger on the pulse of the database engineering community. In 10.0's Management Console, ta-da, we have the updated Workload Analyzer. A column has been added to show tuning commands, and the Management Console allows the user to select and run certain recommendations; currently, the tuning commands that are allowed are analyze statistics. But you can see where this is going. Using Domo with our Vertica connector, we were able to pull the metadata from all of our clusters. We constantly analyze that data, checking for a number of known conditions. We build the recommendations into scripts that we can then execute immediately via actions, or save for manual execution at a later time.
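As an illustration, condition checks of the kind just described can be expressed directly against the system tables. These are sketches, not Domo's actual rules; every threshold here is hypothetical, and column availability should be checked against your Vertica version:

```sql
-- Flag tables carrying an unusually high projection count
-- (the threshold of 6 is purely illustrative)
SELECT projection_schema, anchor_table_name, COUNT(*) AS projection_count
FROM v_catalog.projections
GROUP BY projection_schema, anchor_table_name
HAVING COUNT(*) > 6
ORDER BY projection_count DESC;

-- Find unpartitioned tables that have a date/timestamp column:
-- candidates for the partition-by-month recommendation
SELECT t.table_schema, t.table_name, c.column_name, c.data_type
FROM v_catalog.tables t
JOIN v_catalog.columns c ON c.table_id = t.table_id
WHERE (t.partition_expression IS NULL OR t.partition_expression = '')
  AND (c.data_type ILIKE 'date%' OR c.data_type ILIKE 'timestamp%');

-- Approximate the requests-per-hour metric used to gate scaling,
-- mapping it to an action (thresholds and action names are made up)
SELECT COUNT(*) AS requests_last_hour,
       CASE
         WHEN COUNT(*) > 50000 THEN 'scale_out'
         WHEN COUNT(*) < 5000  THEN 'scale_in'
         ELSE 'hold'
       END AS recommended_action
FROM v_monitor.query_requests
WHERE start_timestamp > NOW() - INTERVAL '1 hour';
```

Checks like these are what get wrapped into scripts and wired to thresholds so they can fire as actions or be queued for manual review.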
And as you would expect, those actions are triggered by thresholds that we can set. From the moment Eon Mode was released to beta, our team began working on a serviceable auto-scaling solution. The elastic nature of Eon Mode's separated storage and compute clearly lent itself to our ecosystem's requirement for scalability. In building our system, we had worked hard to overcome many of the obstacles that came with the more rigid architecture of Enterprise Mode. But with the introduction of Eon Mode, we now had a practical way of giving our ecosystem at Domo the architectural elasticity our model requires. Using analytics, we can now scale our environment to match demand. What we've built is a system that scales without adding management overhead or unnecessary costs, all the while maintaining optimal performance.

Really, this is just our journey up to now, which begs the question: what's next? For us, we'll expand the use of Domo for Domo within our own application stack. Maybe more importantly, we'll continue to build logic into the tools we have by bringing machine learning and artificial intelligence to our analysis and decision making. To further illustrate those priorities, we announced support for Amazon SageMaker Autopilot at our Domopalooza conference just a couple of weeks ago. For Vertica, the future must include in-database autonomy, and the enhanced capabilities in the new Management Console are a clear nod to that future. In fact, with a streamlined and lightweight database design process, all the pieces would be in place to deliver autonomous database management itself. We'll see. Well, I would like to thank you for listening, and now, of course, we will have a Q&A session, hopefully a very robust one. Thank you.