 Hello, and welcome. My name is Shannon Kemp, and I'm the Chief Digital Manager of DataVersity. We'd like to thank you for joining the latest installment of the Monthly DataVersity Webinar Series, Advanced Analytics with William McKnight, sponsored today by Vertica and Pure Storage. Today, William will be discussing 2021 trends in enterprise analytics. Just a couple of points to get us started. Due to the large number of people that attend these sessions, you will be muted during the webinar. For questions, we will be collecting them via the Q&A in the bottom right-hand corner of your screen, or if you'd like to tweet, we encourage you to share highlights or questions via Twitter using hashtag ADV Analytics. And if you'd like to chat with us or with each other, we certainly encourage you to do so. Just click the chat icon in the bottom right-hand corner of your screen for that feature. And as always, we will send a follow-up email within two business days containing links to the slides, the recording of the session, and additional information requested throughout the webinar. Now, let me turn it over to Miroslav from Pure Storage and Jeff from Vertica for a brief word from our sponsors. Hello and welcome. Thank you, Shannon. Thank you everyone for joining. This is Jeff Healey from Vertica. I'm really excited to be part of this webcast where Miroslav and I are going to talk about how Vertica and Pure Storage can address your variable workloads on-premises with the cloud-optimized architecture. Welcome, Miroslav. Thanks, Jeff. Happy to be here and looking forward to this webinar. Sounds great. Thanks again. So, I'm just going to cover a quick overview on Vertica, explain a little bit about this cloud-optimized architecture and how we work with Pure Storage to bring all those benefits down to on-premises and hybrid environments. So, just a very quick overview on Vertica. It's a unified analytics warehouse. 
And if you read any of our case studies, what you'll see is that our customers choose us largely based on high performance at scale. And when we get questions around what Vertica is, it's three things. First, it's a SQL database, a SQL data warehouse. And just as with all of them, you can load and store very large volumes of data for really fast analytics. It's a column store. It's a massively parallel processing system. Second, and I'd say the area that draws the most interest these days, is analytics and machine learning. And we've made a major investment in this part of the platform. It's all one platform. Excuse me. And what you can do is end-to-end machine learning, particularly predictive analytics. Now, there are a lot of built-in analytical functions, everything from time series to pattern matching, and the list goes on and on, but also machine learning: a lot of algorithms, logistic regression, naive Bayes, have been rewritten to take advantage of the full cluster. What does that mean? It means that you can create, train, and deploy advanced analytics and ML models at a massive scale. And what we say is that there's no need to down-sample. You can build your models either in Vertica or outside of the database, for example in Python, and deploy them in-database against the full corpus of the data, which leads to more accurate results. And then finally, third, last but not least, it's a query engine. So many organizations we speak to say, that's fine, we know Vertica is performant. My data is in ORC and Parquet, JSON, Avro, other data formats, and it's going to stay that way, at least for now. If you have a query engine that I can point at those data formats and run my queries in a performant, scalable way, we'll have a conversation. So SQL data warehouse, analytics and machine learning, query engine, and it's all one product. 
That's one of the primary reasons it's unified: you can analyze that data in place, outside of Vertica. So what do we hear from a lot of our customers when they're considering Vertica against many of our competitors? Well, one of the challenges they have, particularly if they're going to the cloud with some of these variable workloads, is that they're concerned about scale and performance, right? So think of all these data unicorns out there, the Ubers of the world, the Etsys of the world, the Wayfairs, where they have a lot of seasonality. They need a separation of compute and storage architecture because they want to scale the data volumes and the users separately. They could, for marketing reasons, have 30 marketing managers who want to dive in and determine how effective the campaigns are, say, at the end of the quarter. But they're not going to be using that system 100% of the time over the course of the year. So you need to be able to handle those. That's what we mean by variable workloads: being able to scale up users and data volumes separately. It's becoming table stakes, and it's super important in the more modern data analytics platforms. So if you're working with your vendor, make sure they have that type of capability. Now, more importantly, that capability is often limited to just the cloud or clouds. And many organizations we speak to say, hey, we're very interested in the clouds. In fact, maybe the new workloads are going to go there. But we're going to have a lot of our workloads also reside in on-premises data centers or in hybrid environments. And so we don't want to sacrifice any of the capabilities that are offered through the cloud. And that's what Vertica in Eon Mode does. This is a separated compute and storage architecture that uses S3 as the storage layer. 
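To make the "scale users and data volumes separately" point concrete, here is a minimal sizing sketch in Python. It is not Vertica's sizing logic; the function name `size_cluster` and the figure of 10 concurrent users per compute node are hypothetical, chosen only to show that a quarter-end burst of 30 extra marketing users changes the compute footprint while the storage footprint stays put.

```python
import math
from dataclasses import dataclass


@dataclass
class Sizing:
    compute_nodes: int
    storage_tb: float


# Hypothetical planning figure, for illustration only.
USERS_PER_COMPUTE_NODE = 10


def size_cluster(concurrent_users: int, data_tb: float) -> Sizing:
    """Compute follows concurrent users; storage follows data volume."""
    nodes = max(1, math.ceil(concurrent_users / USERS_PER_COMPUTE_NODE))
    return Sizing(compute_nodes=nodes, storage_tb=data_tb)


# A normal week versus quarter end, when 30 marketing managers join in.
normal = size_cluster(concurrent_users=5, data_tb=200.0)
quarter_end = size_cluster(concurrent_users=5 + 30, data_tb=200.0)

print("normal:     ", normal)
print("quarter end:", quarter_end)
```

With a coupled architecture the two numbers would have to grow together; here only `compute_nodes` moves between the two calls.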
And what we're able to do is deliver the cloud economics promise on all the major clouds, but also on premises by virtue of Pure Storage. And Miroslav will dig into that. The key benefit of Vertica's cloud-optimized architecture is being able to scale the infrastructure linearly, right? So when you want to scale out, you do it really quickly in a very seamless way. You want to isolate analytical workloads so you can support various groups in your organization: your data scientists working on machine learning projects, your BI team handling workloads that may be around dashboards or, you know, executive-level performance. Improving the database operation. Yes. Sorry, so sorry. The pop-up that you have there for the WebEx is showing. Oh, it's covering. Perfect. Thank you very much, Shannon. Yep, you can improve database operations: faster node recovery. And then you can hibernate, right, if you want to stop and start your analytics. This is the whole idea of a very versatile, flexible architecture. I'm not going to cover all of these, but a very popular way that we've found, rather than just listing a bunch of companies, is to show a typical day in the life, how our customers are improving daily lives through analytics. So everything from IoT analytics with Philips around predictive maintenance, to The Trade Desk being able to have full insight with their 500-node cluster on AWS across 10 petabytes of data to determine how ads are performing, to The Climate Corporation with an agritech use case where farmers are actually doing analysis with a software offering that embeds Vertica to determine how they can gain better crop yields, and then Intuit for your taxes, making sure that the software is optimized so that you continue to use their products and have more affinity with the brand. So many, many different examples, and they all rely on Vertica for analytics. That's it for me. 
I just want to point people to Vertica in Eon Mode for Pure Storage. Miroslav is going to talk a little bit more about Pure Storage and how this combined offering can really help you get all those benefits from the cloud in on-premises and hybrid environments. Go to vertica.com. And now I want to pass it off to Miroslav. Thanks, Jeff. So, you know, a lot of people in analytics have heard of Pure Storage but don't really know that much about what we do and our products, especially FlashBlade, which is the scale-out distributed storage platform that pairs really well with analytic environments and that we work with Vertica around. So let me just very quickly cover Pure Storage FlashBlade. Pure is a really huge believer in simplicity, and at a very high level FlashBlade is big, fast, and simple. Big: we scale to a really huge amount of storage. Fast: we can handle gigabytes of throughput per second at low latency. And simple: there's almost no learning curve, and it's really easy for non-storage people to do sophisticated storage things using FlashBlade. Next slide, please. When it comes to Vertica Eon Mode and on-prem storage, what Vertica needs is something that fits this entire bill, something that provides data safety and can deliver, you know, reliable, consistent storage access, durable access across failures, and performance at scale. As Vertica scales in terms of compute, it needs the performance of the storage to scale as well as the capacity for data growth. That performance and capacity scalability should be linear so that the overall response is predictable and it's easy to understand how to plan for it. Because Vertica is used for a broad range of use cases and we want to be able to consolidate many different analytics-related applications on the same storage, it needs to be tuned for everything, which is sort of our way of saying whatever workload you throw at us, we're going to do well. 
And it needs to be easy to manage because in analytic environments, a lot of times people are not, you know, dedicated storage administrators. They have other jobs around analytics. So being able to manage a large distributed petabyte-scale storage appliance with almost no storage background is a big plus. So when you roll all of that together, you get Pure Storage FlashBlade for Vertica Eon Mode. And that delivers on that separation of compute and storage, which is the hallmark of Vertica Eon Mode. Next. You know, I've mentioned FlashBlade; this is what it looks like. It's a bladed architecture, like the name implies. We have different chassis with blades that plug into them. Each blade is an independent, powerful data processing and storage unit. It runs our operating system, which we call PurityFB, that manages everything, including the distribution, the durability, and the software-defined networking. And on the back, there's a scale-out network fabric that interconnects everything. And that's part of the simplicity story, because what it does is take all of the node-to-node, blade-to-blade networking off the table, and we manage it. We do it in a software-defined way that delivers load balancing without having to mess with network switches or anything else. And it simplifies all of the spaghetti cabling in the back of many analytic clusters, because you really only need a small number of cables from your infrastructure networks into the FlashBlade, and then the FlashBlade takes care of the rest. Next, please. So FlashBlade was really born for this era of unstructured data. It was conceived, designed, and launched in, I guess, the mid-2010s: we started working on it around 2013 and launched it around 2015. And at that point, it was already clear that there was going to be this exponential growth of unstructured data. More and more IoT applications were already in production. 
We saw the trend, and that was the design point for FlashBlade and what we built it to deliver. Next, please. You know, I already mentioned FlashBlade is a bladed architecture. Each one of our chassis is 4U and can hold up to 15 blades. Those blades are either 17 terabytes or 52 terabytes for denser deployments. Next, please. And when you look on the back of a FlashBlade, what you see is these fabric modules toward the top. And this is where the network plugs in. If you have more than one chassis, there are actually external fabric modules that the chassis plug into, and then those external fabric modules plug into the customer's network. But all of this is managed using software-defined networking by the PurityFB operating system. And I also want to point out that they're redundant. So you have two redundant network fabric modules inside of each chassis, and there are four redundant power supplies to make sure that everything stays up and running. Next, please. And when you sort of look inside the sheet metal of the chassis, what you see is that the blades plug into the center plane, which also holds the power supplies and the fabric modules. So everything is there. It's self-contained, and it provides high levels of redundancy and high levels of performance, even inside of a single chassis. And we can actually stack up to ten of these chassis together and deliver up to eight petabytes' worth of raw storage. Next slide, please. Each one of these blades, as I mentioned, is essentially an independent computer. There is a motherboard. It has an Intel Xeon SoC that runs most of the Purity OS software. There are DRAM modules, and then there are these flash daughterboards. And these are managed by programmable FPGAs that run our own code. And what that gives Pure is the ability to address all the different NAND dies in parallel and be very smart in terms of how we parallelize the I/O and how we preserve and optimize that NAND life. Next slide, please. 
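The "up to eight petabytes" figure can be checked against the other numbers quoted here (up to ten chassis, up to 15 blades per chassis, 17 TB or 52 TB blades). A small sketch of that arithmetic, assuming decimal units (1 PB = 1000 TB) and taking "raw" to mean capacity before any overhead; the function name is ours:

```python
# Figures quoted in the talk.
MAX_CHASSIS = 10
BLADES_PER_CHASSIS = 15
BLADE_SIZES_TB = (17, 52)


def raw_capacity_tb(chassis: int, blades_per_chassis: int, blade_tb: int) -> int:
    """Raw capacity in terabytes: chassis x blades per chassis x blade size."""
    return chassis * blades_per_chassis * blade_tb


for blade_tb in BLADE_SIZES_TB:
    total_tb = raw_capacity_tb(MAX_CHASSIS, BLADES_PER_CHASSIS, blade_tb)
    # Assumption: decimal petabytes, 1 PB = 1000 TB.
    print(f"{blade_tb} TB blades: {total_tb} TB raw, about {total_tb / 1000:.2f} PB")
```

A fully populated ten-chassis system with 52 TB blades comes to 7,800 TB, which rounds to the eight petabytes mentioned.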
Oh, and there are supercaps to make sure everything is safe. And I mentioned the PurityFB operating system. In addition to managing all those resources, it is really built for parallelism. It is a key-value database that's distributed across all the blades. Everything dynamically adjusts to new blades being added or, you know, in the rare case that there's a blade failure, everything will automatically self-heal and self-adjust. And this gives us the ability to host billions and billions of objects on a FlashBlade. You know, I think some of our engineers have posted blogs showing how they're testing, like, 67 billion objects, 100 billion objects. It's built for parallelism. And on top of that native KV database, we have object and NFS. And let me just wrap up really quickly. Next slide, please. All of this is managed through Pure1, which is our cloud management interface, where you can manage it through a single pane of glass with a web browser, or even an app on your phone. Plus we have machine-learning-driven capacity planning and workload planning that's part of it. And if you use virtual machines, we have VM analytics that lets you get deeper insight into what's going on. So with that, let me just kind of wrap up and again suggest you go to vertica.com slash pure to learn more about our joint solution. Thank you very much. All righty. Well, thank you so much to both of you for kicking us off. And thanks to Vertica and Pure Storage for sponsoring. If you have questions for either Jeff, Laura, or Miroslav, feel free to submit them in the Q&A in the bottom right-hand corner, as they will be joining us for the Q&A at the end of the webinar today. Now let me introduce our speaker for the series, William McKnight. William has advised many of the world's best-known organizations. His strategies form the information management plans for leading companies in numerous industries. He's a prolific author and a popular keynote speaker and trainer. 
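The idea of a key-value store spread across blades that adjusts when a blade is added or lost can be illustrated with rendezvous (highest-random-weight) hashing. To be clear, this is a generic technique swapped in for illustration; the talk does not say how PurityFB actually places data, and the blade names and key names below are made up.

```python
import hashlib


def score(blade: str, obj_key: str) -> int:
    """Deterministic pseudo-random weight for a (blade, key) pair."""
    digest = hashlib.sha256(f"{blade}:{obj_key}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def owner(obj_key: str, blades: list[str]) -> str:
    """Rendezvous hashing: the blade with the highest weight owns the key."""
    return max(blades, key=lambda blade: score(blade, obj_key))


keys = [f"object-{i}" for i in range(1000)]
before = ["blade-1", "blade-2", "blade-3"]
after = before + ["blade-4"]  # a new blade is added

# Keys whose owner changed when the blade was added.
moved = [k for k in keys if owner(k, before) != owner(k, after)]
print(f"{len(moved)} of {len(keys)} keys moved when blade-4 joined")

# Simulate a failure of blade-2: only the keys it owned need a new home.
survivors = [b for b in after if b != "blade-2"]
displaced = [k for k in keys if owner(k, after) != owner(k, survivors)]
```

The useful property is that adding a blade only moves keys onto the new blade, and losing a blade only relocates the keys that blade owned; everything else stays where it was, which is one way a system can "self-adjust" without reshuffling all of its data.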
And with that, I will give the floor to William to get his presentation started. Hello and welcome. Hello. Thank you, Shannon. Thank you, Jeff and Miroslav. I really enjoyed looking at the chassis and the blades. I could do that for quite a while. I consider them works of art. So thank you for that. I hope to see some in person in this new year. And that's what we're here to talk about, this new year, 2021. I want to share with you the trends in enterprise advanced analytics as I see them and get you ready for the year. I've been introduced. A little bit about me there and our consulting group offerings: we do a lot of strategy, training, and implementation. So we're here to talk about trends, and I don't want you out there chasing any trend that you might hear about. But trends are important, and they're important for these reasons. Before I launch into what they are, I want you to know that it's imperative to see the trends that affect your business and to know how to respond. And these are trends that I think should work their way into your plans, all things being equal. So if you're looking at a couple of different ways to do things and one of them is a trend that you believe in, then that's the way to go right there. Because you don't want to find yourself in an architectural black hole, doing things that are nonstandard, that only you do in the universe, things that vendors will only be marginally supporting as the years go on. And we do know that's important because we have to get more efficient over time. A lot of the things I'm going to share with you are really about that, about getting more efficient over time. So I hope to give you some business ideas here as well as technical architectural ideas that make you a leader, not a follower, a leader of tomorrow. And we need them for sure in our businesses. Trends can advance maturity while also solving business issues. 
So you're not going to get a budget for staying on trend. If you see some trends here and you take them to your organization, you're not going to get a budget for doing that. You'll get a budget for doing things for the business. So do keep that in mind, as excited as you may get. There are ways to do this. There are ways to work this stuff into your enterprise, into your purview, into your resume, et cetera, while not getting direct budget for it. And that's really the trick of the information management leader of tomorrow. It's only good for the company. Information management leaders must pick their winning approaches and get on board. I hope to give you some ideas about that here today, because the money tree doesn't exist. You have to be selective about where that money goes and hitch your trend pursuit efforts to a budget that's delivering ROI. Yeah, I'm big on that. All right. Now, as we end this very peculiar year, I must say some things about it, because we're all hopeful that 2021 is going to be different. It hasn't really started out that way, but we can remain hopeful. Hope springs eternal, I always say. So 2020: there were those who were less impacted, and I did plenty of consulting in 2020; business was up. You all are out there still doing things for sure, very progressive things. But those who were less impacted and less worried about 2020, less brought down to their knees, brought to a standstill, and slowed down, had things in common. You were cloud first. You were already microservices based. You already treated data as a separate function. You knew it was important to the business. Your development was agile, not waterfall, and not agile in name only, which is probably a mnemonic I should coin here, "agile in name only," because I see that a lot. You already had your master data, whether it was in MDM or not. You had your master data developed somewhere that people were coming to and using repeatedly. Those of you who did these things were less impacted by 2020. 
And those of you that had this attitude were also less impacted. I like this cartoon here. "How did you survive the coronavirus, Dad?" "I just kept working." You know, I think I just kept working too and did as best I could, with respect going out to those people who were not able to just keep working and had to take extra care of the family or themselves. I get that, but organizations that were able to just keep working in spite of it all were less impacted. Now, last year at this time, I gave you these trends, and I'm going to put them up here and show you what they were. You can judge for yourself how well I did or not. I'll put an asterisk next to last year because it was that peculiar year that we had. But still, like I said, you were doing things. Data takes steps to the balance sheet. I think I'll give myself a C or D on this one, because we were just way too distracted in our government to deal with things like this. Although I will still assert that data meets all the criteria for being there, and it's a matter of time. Motion and sensor-based time series data. Yes, I believe this was the data that did explode. I think it's so important a trend that it's going to continue to be enough of one to be a trend in 2021 as well. So I'll give myself maybe a B on that one. This intelligence interfaces up people. Yeah, somewhere around an A or B on that one, because we got away from reports in large part and started pushing data directly into business processes. It's really great to see. And of course, artificial intelligence began to do some things. ETL will be nearly automated. Well, I think the vendors put out some more things. Maybe not enough things, in my view, but they put out some more things to make the work more automated. I saw the adoption happen maybe slower than I would have expected. Maybe part of this is the year we had, but maybe that's a B or C. Cloud object storage. Yeah. A on that one. A on that one. 
You all are adopting cloud object storage for just about everything. Maybe even to your peril, but it's definitely being adopted. Anything that's really storing data, you know, whether it's object storage or Pure or whatever, anything that's storing data is on the up, because we are storing more and more data. And this maybe gets back to the presentation I gave a few months ago on the year 2045, where we looked way ahead. Now we're looking ahead one year, but when I looked that far ahead, I thought maybe organizations would not have to store so much data. But we're definitely, in the actionable future, storing a lot of data. More edge AI: you'll hear that again. I think that will be a continuing trend. The highest use will be training AI algorithms. That didn't happen. It did not take off in 2020 like I thought. Maybe I'll give myself a mulligan on 2020 for that, but maybe a C or D on that one. Explainable AI. I think a lot did happen in explainable AI, a lot of initiatives by the AI vendors to accomplish this. But I'm going to bring that back as a trend for this year and say that not too much happened, not enough happened, really, to give myself a good grade on that one. Kubernetes and containers: A on that one. All my clients, all the clients I talked to, the companies I talked to, are doing Kubernetes and containers. So this is really strong, and it really took off last year, and it will continue to take off. I think it's really enough to be considered a trend this year as well. Hybrid databases, those that are operational and analytical at the same time: they kind of peeked up over the surface, but there's still a lot of stagnation in terms of database usage and the choices that were made there. So moving on now, if I get into the 2021 trends, these are some things to look out for that are maybe unique to this year. 
But these are some things you've got to look out for, and I'll give myself an out here by saying that if these things happen in the wrong way, 2021 will be slow, maybe even slower than 2020. I don't think so. I think we're starting off on a good foot, well, not in terms of the pandemic, but in terms of vaccine rollout. There is hope. We're learning to live with the restrictions around us and keep things moving. Resiliency of corporations: if you didn't have it, you have it now, or you didn't survive, or you're looking pretty, pretty pale. Prioritization of forward factors: you're able to look forward and get back on track and get back into the roadmaps that you had before and start delivering the things you were planning on before. And then some. You know, that's going to be great. Continued preparedness awareness: yes, we learned about how prepared we were for things in 2020, didn't we? And I don't have a great handle on where you are in terms of preparedness. Some were great in terms of that and moved forward, and some were not. And so I just don't have a generalization about how prepared we are as businesses for things like the events that happened in 2020. But as we continue to learn about that, it will affect what we do in terms of the top trends for 2021 and beyond. Now this is the gratuitous slide that everybody is saying something about for 2021. I think it is true in this case. I think remote work will definitely continue. I see some plans to, you know, return people maybe mid-year, is what I'm hearing. Despite whatever, you know, the situation is, we'll say. But I think we've learned to do remote work. It's working, and some percentage of it will stay off site for the long haul. So hopefully that's good for most of you. Led by cloud computing capabilities, a strong tech spending rebound in 2021. Yeah, I think in 2021 the CXOs are ready to release the floodgates. So get ready. Get ready. 
I hope you have a great foundation in place for the onslaught of project work in and around the information and analytics area. There's been a lot of development in terms of storage providers and strong storage growth. For example, AWS storage revenues are approaching $10 billion. Who would have thought that a few years ago? And they're doing things like providing SAN for the cloud. I think Pure was actually first with this, with their Cloud Block Store in 2019. AWS is adopting that. Also, things like services that let you programmatically set your service level agreements for IOPS and throughput, and automatic tiering and replication, automatically moving data to colder storage tiers. Things like this are happening around storage. There's a lot of development there, and we're taking advantage of that. Artificial intelligence, Kubernetes approaches, and automation are driving corporate tech budgets, and driving them higher. Leading organizations are increasing their focus on AI and ML, and this begins the several trends around AI and ML that I have for you. That's going to increase significantly; budgets for AI and ML will increase significantly this year. And by the way, I will not preface everything I say today with "in my opinion," but of course it is. Some companies looked for opportunities to scale back to mission critical in 2020; those organizations are probably not the ones that are moving forward hard into artificial intelligence. Like I say many times, you can't skip levels of maturity, and this is a maturity level. And if you took a step back, you've got to get back to that level before you're going to move up into AI and ML. You've got to get your data together. You have to get your architecture together, your people together, and so on, and your tech stacks, which I'll come to a little bit later. You have to have data scientists as well. And I see the number of data scientists going up. I had a client, a huge client, of course; they have 100 data scientists. 
I mean, real data scientists on board. And I think it's paying off, as long as these scientists are doing data science type jobs and they're not stuck doing data wrangling type jobs. Many of them are doing the data wrangling because our data architectures are not in place. And so another trend has got to be to get our data architectures in place so that we can take advantage of data science. We're going to see a wider range of AI and ML use cases, with particular focus on things like customer experience and things like automation. And we're still going to continue to look to deliver clear business return on investment with our AI projects. So hopefully you have a good grounding in business return on investment, because guess what, you're going to need it. Most organizations will have a lot of models in production, like an expansion of their models: 25, 50, 100 models in production. And what I see is a growing gap between the haves and the have-nots when it comes to artificial intelligence. And so therefore it's hard to do trends, because these trends are going to apply to one or the other, most likely the ones that are the haves. But those are the ones that are going to survive and thrive, I believe, in the years to come. Jumping on some of these trends, maybe now's the time to make sure that you are one of the haves the next time you think about it. Combining human expertise and AI: as we shift towards deeper data-driven decision making, collaborative AI is kind of what it's all about right now in terms of where we're going with our AI solutions. So, collaborating between what humans provide best and what machines provide best. We're not trying to do everything with machines right now, maybe one day. But right now we're complementing and augmenting human capabilities, never replacing them. And we're looking at new ways of orchestrating how tasks are performed. We call this a human-AI hybrid solution. So look for those types of solutions this year. 
Model deployment will take center stage. Yes, I said this last year, but I'm ready to say it again. And I believe it's going to be true. Data scientists are going to continue to wrangle data, and that's a huge sticking point in terms of being able to move forward with model development. Let them do their model development and get out of the data wrangling business because we've grown our architecture maturity. Models are getting more sophisticated, but the data wrangling is increasing, and there are continued challenges to data maturity as a result. So once we define a use case, we're able to put that into scaled production. This whole cycle takes most organizations probably about a month to go from use case to production. And that's a long time. That's too much time, and we're going to have to, you know, get that down. And the way I can think about it best, in terms of how we're going to do that, is with the discipline of MLOps. Organizations are going to continue to struggle without MLOps. And I do say that's the trend, really: organizations struggling without MLOps, maybe knowing it, maybe not knowing it, maybe knowing it and trying and still having a hard time with it, because it is hard. And if you don't have DevOps in place, it's that much harder, because this is the introduction of new operations processes to the organization. So the discipline of MLOps, I think, is going to be a trend. I hope it's more successful than what I'm thinking, though. I think that's going to be a struggle point for the year, but a very important one, and those few top organizations that can break through with MLOps are going to reap a lot of rewards for it. More edge AI. Yeah, we have 25 billion or so connected devices, growing to a projected 75 billion by the year 2025. And we've all been using them, knowingly or not. 
And now, because our enterprises have become quasi software development factories, turning out applications, building mobile applications, and supporting IoT, enterprises have jumped into embedded databases at the edge in a big way. So databases at the edge, not just data at the edge, and also AI baked into the chips at the edge, doing AI processing at the edge in real time, on maybe limited data, but on real, real-time data. And so, for example, mobile airline applications do a lot in this way, as does smart city type of work. Those are good barometers for the future. And there's a growing need to infer more from data and then make decisions without sending data to the cloud, just making the decisions right there on a limited set of data. That means you have to have a data ecosystem in your enterprise where analytics are developed based upon large amounts of data, maybe back at HQ, and continually pumped out to the edge for use in edge type processing. Yeah. Edge AI is going to be huge. And you have chip startups out there like SambaNova, Graphcore, who else, Syntiant, Wave Computing. These are appropriate for edge processing. And these are startups that you need to keep an eye on. They are definitely providing some interesting technology for the enterprise: high-performing AI chips. And there isn't a cloud vendor out there that isn't focused on edge AI, on this edge computing. Finally, demand for intelligent edge applications is rising rapidly in the automotive, smart factory, and smart home industries. With widely available, efficient edge ML development tools and semiconductor companies launching new ML features, the adoption of edge ML applications will become a major trend. Now, explainable AI. Yes, it's back. And yes, I do believe that over time this trend will evaporate. I send you back to my presentation on the year 2045 and the highly intense competition between the U.S. 
and China, and things like having to explain AI maybe holding us back, and we won't let that happen. So this is hot, though. This is hot for the foreseeable future. And given the interest in better interpretability and explainability of machine learning models, cloud providers will invest in and enhance their ML offerings to offer a full suite of responsible ML and AI capabilities. Most AI systems right now are black boxes. We need to be able to explain the governance, the services, and the lives that produced machine-generated decisions. And with the incoming administration, I feel even stronger about this being a trend for the year. Data lakes. You're building them, for sure. You're asking about them a lot, that's for sure. Maybe too much; that's another question. But you're building them. And the rise of the lakehouse is the real trend here for 2021. That is, maybe you have a warehouse, and maybe you started or built one or more data lakes, and now you want them to work together. What that combination primarily is, is the ability of the warehouse to reach into the cloud storage as necessary. These structures also live on a pipeline, with the cloud storage serving as staging for the warehouse. So that's the symbiotic relationship of the so-called lakehouse, or whatever you want to call it. Now, data lakes have almost become synonymous with cloud storage. Early data lakes utilized Hadoop or HDFS, and many didn't jump in then but jumped in when cloud storage presented a better option. And you've told me resolutely that cloud storage is a better option for you for the data lake. I agree. Data lakes are a strong trend. But again, it's one of many strong trends when it comes to data. Certainly hybrid solutions are the strongest trend here: the ability for some data to be in the cloud and some data to be on premises.
This is going to continue with us for quite some time. As enterprises grow more comfortable with the automation process, the automation of data integration will grow. And that will increase the number of lakes, warehouses, on-prem environments, off-prem environments, and so on, and all these efforts around the data lake. And then we can shift our focus to management and access. And when we can do that, we're going to do more. So you may recall that the ETL vendors promised that ETL was going to clean up the spaghetti. It was going to clean up the tens of thousands of scripts that we have doing things. And to some degree, it did. But to a large degree, it just gave us as enterprises more capabilities to add to the spaghetti. And so we are more complex than ever. That could be a trend as well. We're hopefully at the peak of our complexity, and at some point, things will simplify. But in 2021, let me tell you, things are going to get more complex before they get less complex. So be prepared for that. At the same time, we cannot be sitting and waiting for things to clean up a little bit before we move forward. We just can't afford to do that. That's why it's important to be raising the foundation all the time, raising the foundation on our technology stacks. Technology stacks a few years ago used to be pretty simple. There was source data. It came into a data warehouse sitting on a database, and it did that through a data integration tool. Then we put a business intelligence tool on top of that, didn't we? Wasn't that lovely? That was it. And now we have so much more. So like I said, it's gotten more complex. ETL, yeah, that's part of the deal. But streaming is becoming more interesting, and we are finding more and more use cases, especially around that edge data and IoT data, where streaming makes the most sense. We're talking here about Confluent, Kafka, or StreamSets, tools like that.
And for the data lake, you know, it could be Databricks, EMR, HDInsight, or Dataproc. For your data warehouse, something like a Vertica. We're big fans of Vertica. Machine learning, like Dataiku or DataRobot. The operational database, you're going to have to have that. That plays into this whole approach of implementing machine learning models. All right. That could be Vertica as well, something like that. Data security. Now we've got Okera and Privacera. We need that security over the top. We need that security to be able to span the cloud and our on-premises environments. We need to be able to set profiles and make them work throughout the enterprise, and these tools help you do that. Data governance tools like Collibra, Alation, Informatica, and Waterline. There are others. Workload management tools. Yeah, some of us need that in our stack now. New Relic maybe, Unravel, Pepperdata, things like that. And finally, storage. We need to take that under control, especially our on-prem storage. It could be something like a Pure. Now, I talked about MLOps, and DevOps is also going to be a strong trend for 2021. The idea behind DevOps is continuous delivery. DevOps teams will focus on bolstering the feedback that flows between development and operations teams to introduce better visibility into the process. And I always say, the more visibility that we have, the more we can see insights and see better ways to do things. When you're looking at a very small piece of the pie, and everybody's looking at a small piece of the pie, it's like the parable of the blind men and the elephant. Everybody's touching some part of the elephant and thinking it's this or that. And really, somebody's got to be looking out for the whole thing. DevOps helps that person look out for the whole thing. And it helps all of us, really, when we get into it.
And so we're going to see more people realize that they need to put more effort into their DevOps pipeline processes and validation in the new year. Now, I touched on MLOps, but let me go into that a little bit more here, because this is a really strong trend. Again, it's a trend that we're going to have our challenges with, but we're going to try. And I hope, and I know, some will succeed. Maybe more will succeed than I'm thinking. Some of you are holding your breath to get that first ML model into production. Okay, I get that. But once you do that, and you want more, and you don't want to be starting from scratch each time, start to think about MLOps early enough. It applies the DevOps principles to ML delivery. The ML process primarily revolves around creating, training, and deploying the models. Once trained and validated, models are deployed into an architecture that can deal with large quantities of often streamed data to enable insights to be derived. And the result is an ML-based application. ML success, however, is not a given. It requires people, processes, and platforms that can operate in the responsive, agile way that organizations are looking to operate today. And that is an approach that we call MLOps. I believe organizations that use third-party MLOps solutions are going to save time and spend less on model deployment than those that try to build their own solution. So there's one vendor area that I think is hot, or will be hot: MLOps. Third-party MLOps solutions can help organizations save on infrastructure costs, speed up model deployment, and reduce the operational burden on the data science teams. That means less wrangling for them, less confusion for them, and more streamlining of their work, which is very important in 2021.
I am speaking all the time about streamlining the work of the data scientists, because I find that if they're able to focus, they can make a major impact on our organizations. I want to facilitate their work. I want to let people like me, us non-data scientists, do the work that they can do but probably shouldn't be doing. And that's a big focus of my consulting as I go into the new year. Automation. Yes. When I'm asked, okay, William, where can we deploy MLOps? Or where can we deploy ML or AI in the new year? I start by saying, well, what can be automated? Let's start with some of that. That's a great place to get your feet wet, and it delivers ROI. It may not be the biggest ROI ever, but it gets that ball rolling, I think. And so if we can't think of anything better, that is definitely out there: automation with the help of bots today. So any ETL product, for example, getting back to ETL and data warehousing, any ETL product worth its salt should be doing a lot of this automation that we see here. And I'm afraid data integration, for better or for worse, is going to be one of those things that gets highly automated as time goes on. In 2021, organizations are also going to turn to automation tools like robotic process automation, which will increasingly be leveraged with other intelligent tools like text analytics, document understanding, and more. So look for things that you can automate within your enterprise in 2021. Open source. Yeah, maybe we're going to have a slightly different relationship to open source in 2021. A lot of vendors out there are re-architecting their products to open source, and they're becoming winners in the market, not necessarily based upon their technology, but on how they execute. How they execute, and their customers: these are some of the things that are going to define some winning vendors in the new year.
And we're going to be adopting those winning vendors, maybe that goes by definition, but more and more. There's a talent wall that exists, which is a natural part of our ecosystem, and it pushes us. To address that talent wall, we're seeing the rise of a hybrid open source business model, where open source data analytics companies are monetizing closed source, out-of-the-box capabilities that deliver open source innovations while requiring less time and resources for enterprises to get their value. So you might look at something like what Anaconda is doing or what Rancher Labs is doing to demonstrate what I'm talking about here. I'm very impressed there. It's easier than ever, I would say, to adopt competitive software. And so this is driving the vendor marketplace, our sponsors and everyone in that space, really, to do better. And end users must understand what a vendor means when they say they're open sourcing a service. For example, will there be an open source community that will support the software? Just how will it be supported? So there are still questions around this, but it's a strong trend towards open source. And finally, Kubernetes. The data analytics stack goes Kubernetes. Yes, for both open source and commercial. You all are doing that now. And one reason is kind of obvious. It's giving rise to the ability to spin up clusters on a task basis. So you might be spinning these up for big data, data warehousing, machine learning. Ephemeralism is a word for it. So things that have maybe a short anticipated lifespan; maybe they stick around, I don't know. But it's the ability to just spin up clusters, spin up code, go at things, give things a shot, and see what sticks. There's going to be a lot of that kind of approach. And you've got to be okay with that, by the way.
If management is stuck in a waterfall approach, that organization is going to be slower to develop things, because you've got to get everything right. You've got to pin it down, and it's got to be right and perfect. And technical people, I should say, do not want to work in those types of environments. So we also want to get away from the server nature, the nature of knowing and being intimate with our servers. This whole idea of serverlessness is another strong trend, and it goes hand in hand, really, with Kubernetes. We're not trying to guess how much data there's going to be or how much programming there's going to be. We just get started. This architecture is the very substance of the revamped Cloudera Data Platform, for example. It's also leveraged by Google for Spark on K8s on Cloud Dataproc. Ultimately, it's enabling the new workloads of the new year. Now, another generalization of the new year is that we are at the start of general AI. For those of you that caught my presentation on 2045, I spoke at length about this, about GPT-3, and how excited I am for it, and the possibilities and so on. Yeah, it's all about text and language and communication, but there's so much there, so much possibility for summarizing things, for expanding on things, for finding similarities in language, and so on. There are going to be GPT-3-like things that come into audio, that come into pictures, et cetera, in this new year. It's just going to be really exciting. And a lot of you know what general AI is, right? It's when the machine has the capacity to understand or learn any intellectual task that a human being can. I'm not saying that we'll be there in the new year, but we're moving in that direction. We're at the start of it, the cusp of it, right? We're at the beginning of that bell curve. We're starting to see that go up, and we're starting to see this idea of machines taking on human capabilities becoming stronger and stronger.
And we will see how many years of achievement lie ahead in terms of this, but probably a decade, and then we'll be there. So it'll be an exciting decade for sure. So this is one thing to keep in the back of your mind in terms of how you provision things. It's extremely actionable. I almost didn't leave this in, because it's more of a bigger deal than, okay, we're trying to formulate our workloads and architect for them, which is my main focus. But it's something to keep in the back of your mind. I've gone through the trends, and hopefully some of them have resonated with you. Hopefully with some of them you're like, yeah, we're already doing that. I get that. We're going to do more of that. I feel better about it. And then some maybe were new and hopefully interesting and something to work on. But there's more maturity in moving imperfectly than in merely perfectly defining the shortcomings. Anybody can throw stones and define your shortcomings. But how do you get the thing moving forward? That is where the skill comes in. Build your credibility so that you're not starting from scratch on this and you have people that will listen. Build your successes so they know this will be another one. Don't be afraid to fail. Don't talk yourself out of having a new beginning. These are things that I would counsel more and more at an executive level about how the organization should be. Don't have an organization where people are afraid to fail or to have a new beginning. We're at a great new beginning here. And as we start to get back to the office and get back to normal, that's another new beginning. We're going to have that. It's going to be exciting this year. Have an open mind about things. No plateaus are comfortable for long. So I gave you some trends here. Hopefully it's not just, you know, I took in a presentation, it's one of many, and I'll just put that on the shelf and move on. You're on a plateau right now, and it cannot be comfortable.
That is not the environment that we're in. So hopefully I gave you some things that will help you move up to the next level. The resistance is not about making progress. Everybody wants to make progress and probably move in the direction of these trends. The questions I see are more about the journey. How long is the journey? How much will it cost? How much risk is involved? Who are the people in that journey? So finally, I'm going to leave you with some winning approaches in 2021. You can't go wrong if you do these right, and notice I did say if you do them right. I'm not saying that all cloud computing is going to be great and wonderful. There are right ways and wrong ways, but there are definitely right ways now to do these things. And if you consider yourself a mid-size enterprise and up, these all apply. Cloud computing, for sure. Also on-prem and hybrid computing, doing the right thing there; that applies. Artificial intelligence; I've made a big point of that. Data lakes and data warehousing; I made big points about those. Master data management; I didn't make a point about that, but I definitely believe in it as a foundational component of our development. Yeah, Kubernetes. Yeah, automation. Data quality. Of everything on here, data quality is the one that drags forward from a decade ago, and it remains as strong and important as it ever was, because we haven't solved it yet. We continue to get more data and do more things and get distracted and deal with poor data quality. Sometimes we know that it's causing problems. Sometimes we don't. Sometimes we learn. Organizational, well, let's get this one first: graph data, yeah, that's huge. Organizational change management for all the things we do, bringing the people along. DevOps and MLOps, yeah, for sure. Data catalogs, yeah. Data governance, data lineage. How about that?
How about knowing where all that data comes from? That's going to be increasingly important in a world of compliance. And I see that as part of data governance. Data governance needs to be actionable. It needs to be visible. It needs to be doing things for the applications that are doing things for the bottom line of the business, not just sitting in an isolated corner of the enterprise where the executives would say, we have data governance? I didn't know we had data governance. Or, the worst: we have data governance, what are we getting out of that? Okay, that's not the type of data governance, or data anything, really, that we need in 2021. So I wish you success in this new year, and hopefully I've added a little bit to the planning for the year. And keep checking in with us here every month for an advanced analytics presentation. Next month, I think I'm talking about data integration. I can't wait. That's an area that has a lot happening in it that I want to share with you. But for now, I'll turn it back to Shannon, and maybe we have some questions for us. We do, William. And thank you so much for another great presentation. And I'll invite Jeff and Miroslav to join us in the Q&A today as well. I'm diving in here: with ETL automated, does it include error resolution? What was that word, Shannon? Error what? In ETL automation, does that include error resolution? Yes. Okay. Good question. Yes, I would say it does: being able to resolve processes that have errors, if you can map out how you would resolve things. That's something that an ETL tool should be doing automatically. Is that waking you up in the middle of the night? You know, the proverbial wake-up in the middle of the night, if you will. But yeah, definitely error resolution, and really anything that has to do with coding of any stripe, is getting more and more automated.
So, you know, GPT-3, for example, and that whole area can do some automated coding based off of human language. And so this is an area that we see working its way into many vendor offerings. Perhaps Jeff can speak to some of those areas within Vertica, or Miroslav about some of that within Pure, but I definitely see it. Yeah. The only other thing I'd add is that with Vertica, it's more around ELT, so being able to apply a lot of built-in analytical functions for data preparation. We certainly don't believe that we can replace all these incredible ETL tools that we integrate with; it's just a different type of approach, in database. What is your view on incorporating business data stewards into data operations? Assuming that's for me: business data stewards into data operations. Well, I try to get them involved mainly in data. In terms of the operations, I think that's part of their day job and maybe not part of when they have their data steward hat on, per se. But I would like the data stewards to come from the business and be impacted by the decisions that we make and implement around their data, wherever it may exist: data warehouses, master data management hubs, data lakes, et cetera. And so I view their role as a data steward as having more to do with the data, but I am hoping and trying to select data stewards that operate with that data on a regular basis. I hope that makes sense. I don't see operations as a direct part of the data steward program, but I see it as usually part of their role otherwise. Sure. Jeff or Miroslav, anything you want to add to that? No, I think I'm going to cover that portion. Go ahead, Miroslav. Yeah, no, I thought that was pretty comprehensive. One thing that I'd say around things like that is, what we've typically seen in the industry is that as more of these workflows evolve, the software ecosystem to make that work smoothly also evolves.
My sense right now is that data stewardship as part of the pipeline and workflow is not really supported by a lot of existing software systems. As those evolve, that will become easier, but it's certainly a valuable thing to consider. Perfect. And thank you both so much. And thank you all for these great presentations and comments. But that is all the time that we have for today. And thanks to all of our attendees for hanging out and being engaged in everything that we do. Just a reminder, I will send a follow-up email by end of day Monday for this webinar with links to the slides and the recording of this session. So thanks to Vertica and Pure Storage for sponsoring and helping make these webinars happen. And I hope everybody has a great day out there. Thanks all. Thanks everyone. Thank you everyone. Take care. Take care guys. Bye-bye.