Thank you very much, and good morning. AutoML has become, yes, a hot topic, and we wanted to explore it. My colleague Ben Lorica and I have been looking at some trends, doing research on AI adoption in industry. Last year, when I was here, I mentioned that we had done a survey. We followed up with two more surveys over the course of the year, and I did a big report about data governance. The other trend we wanted to follow up on was AutoML. Ben Lorica has recently left O'Reilly Media and is now doing some of his own work; he has a new podcast called The Data Exchange, and in the first episode Ben and I were talking about this topic along with some related ones. If you want to grab the slides, they're online; if you take a photo on your phone, it should load OK.

AutoML is interesting because it's showing up in academia and it's showing up in industry. Clearly, there are a lot of indications that this is a growing trend. And I really loved the keynote this morning from Óscar Méndez, where he talked about how Stratio is working with AutoML; a lot of the key points he brought out there are central to the thesis of what we've been finding in our industry research.

As an overview: nearly all of the public cloud vendors are promoting some sort of AutoML service. The tech unicorns, Uber and others, are also developing these kinds of services for the data platforms they use internally, and some of these are going open source. We'll take a look at a survey of those open source projects as well as a survey of the different vendors in the industry. There are also a lot of smaller tech startups promising to democratize machine learning. And I think the real business value there is that there is a lot of hiring pain.
It's very difficult for companies and traditional businesses to staff up with really good data science talent, so a lot of what's happening is really about alleviating those hiring pains, and we'll explore some of that. But given all the buzz in industry, what does this really mean?

First, a little bit of background. I love the history of science; I love looking back at where some of these origins came from. There are a few things I've written over the past year, and talks I've given, that delve into the history of data science and also some of the future trends. There's a small book out recently, "Fifty Years of Data Management and Beyond," and a couple of others, including a keynote from the Rev conference, that I think speak to this. Really, we can trace some of the origins back to John Tukey in 1962, where he was writing about empirical data analysis for the first time. That struck the tone for an interdisciplinary approach, which has since grown into data science.

Now, in terms of machine learning and AI, those terms do get used interchangeably a lot, so I wanted to differentiate them to some extent. You may have seen this from Mat Velloso on the difference between machine learning and AI: if it's written in Python, it's probably machine learning, and if it's written in PowerPoint, it's probably AI. Unfortunately, there's a lot of truth to that. But I would like to point out maybe a more effective kind of distinction. Machine learning is largely about tools and technology. Generally speaking, when we're talking about machine learning, we're talking about some sort of differentiable system: working with gradients, understanding how to find optimal points along a gradient. And that's the case whether you're talking about deep learning, reinforcement learning, or any of these.
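To make that "differentiable system" idea concrete, here is a minimal gradient descent loop. The function, starting point, and learning rate are all illustrative choices, not from the talk; the point is simply the shape of the core loop that deep learning and reinforcement learning both build on.

```python
# Minimal gradient descent on f(w) = (w - 3)^2, whose minimum is at w = 3.

def grad(w):
    # derivative of (w - 3)^2 with respect to w
    return 2.0 * (w - 3.0)

w = 0.0    # arbitrary starting point
lr = 0.1   # learning rate (step size)
for _ in range(100):
    w -= lr * grad(w)   # step downhill along the gradient

print(round(w, 4))  # converges toward 3.0
```

Everything in modern machine learning frameworks elaborates on this loop: more parameters, automatic differentiation, and smarter step-size rules, but the same idea of following gradients to an optimum.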
AI, though, from my perspective, and what we're seeing emerge as a good working definition for AI, is when those kinds of tools, not just machine learning but other tools as well, come into play in changing social systems. What impact do they have on groups of people and how they behave? How do they augment them? How do they change behaviors, et cetera? That's the working definition I'll use in this talk as well.

So when we look at where there's evidence of intelligence, where we can see intelligent behavior, there are three areas, and I'll add a fourth based on this morning's keynote. Certainly we can talk about recognizing complex patterns. For instance, when we use supervised machine learning, typically we're trying to generalize and recognize some sort of pattern. And it's interesting: if you look at the benchmarks for this, even among human experts who are effectively trying to recognize the same patterns, whether in finance or really any other space, the human experts hit diminishing returns at about 95% accuracy, often because that's where experts start to disagree. So what's been interesting with benchmarking in AI lately is where we can establish benchmarks of human expert performance and then have machine learning models come in and start to approach that 95% mark, because that's where machine learning is effectively as good as what human experts do.

But there's a deeper kind of intelligence that we see, and that's where, again, machine learning is augmenting social systems. I think the real thing to look toward is not necessarily machine learning models that predict things, but rather machine learning models that help augment teams of people. Think of teams of people plus automation, complementing the relative strengths of both.
There's another area that I'll throw in; it's kind of a subcategory. There's a great talk by Michael Jordan, one of the most preeminent professors in machine learning, at UC Berkeley's RISELab. Michael Jordan made the case that markets are also a kind of codified intelligence: you see effectively intelligent behaviors in markets. They're doing very interesting work at Berkeley on economic studies of markets to understand learning mechanisms. Really, really fascinating work. Unfortunately I don't have any pointers to it, but he gave this at the AI conference in San Jose a few months back.

The final area I'll talk about is where we begin to explore and exploit very high-dimensional spaces. As Óscar mentioned in his keynote earlier today, people are good at maybe three to four dimensions. The kinds of problems we work with in machine learning typically have at least 20 dimensions, and some have thousands. When you get into those really high-dimensional spaces, people aren't all that effective, but machines can be. This is where we start to look at really sophisticated math: tensor decomposition, things like that. So we see evolutionary software; we see retroduction, or what you might call abduction, in terms of logic; we see generative adversarial systems. All of these are really non-human forms of reasoning, and they're getting a lot of traction lately.

To break it down: the first one has been a generally accepted definition for machine learning, and here's Microsoft, where they beat the human benchmarks for speech-to-text recognition; again, they were doing about 96% accuracy to get that. And then in terms of augmenting social systems, what's interesting is that we have a long backlog.
I know I've shown this a lot before, but in this context there's certainly a long backlog of work on understanding the intersection between social systems and automation. This goes back into the 1950s: really fantastic work by people like Fernando Flores and Humberto Maturana, and also other people summarizing them, like Ranulph Glanville. I definitely recommend checking that out as a kind of blueprint for AI going forward.

In terms of these non-human forms of reasoning, though, these kinds of high-dimensional spaces, there are some pretty interesting things going on. If you haven't seen "Everybody Dance Now," I may have shown this last year, but it's a fascinating study, the video especially, where they take videos from professional dancers, music videos, ballet, et cetera. And, well, I'm not a very good dancer, but they can run people like me through it, and suddenly it looks like I'm performing like a ballerina. Really fascinating what you can do with generative adversarial networks that way. The other side of it is where we're seeing much evolution in terms of attack surfaces. This is one of my favorites: an interactive tutorial on GitHub by Kenneth Co, out of UCL in London, looking at how to attack machine learning models, especially computer vision models, using injected procedural noise.

Okay, so as far as AI goes, there are some interesting components so far. If we're going to talk about AutoML, we need to understand the bearing it has on these different components. You've got deep learning, reinforcement learning, and a lot of interesting work in embeddings and language models; we had a workshop yesterday on some of that. Knowledge graphs are really rising, and they're distinct from what goes on in deep learning; the two really augment each other. And there's also another area called weak supervision.
Again, I think these are complementary approaches, and I would really like to avoid using the word "predict." The term "predictive analytics" has been used a lot for data science and machine learning, but it brings trouble with it: the word "predict" sort of implies that you have a crystal ball. And if people, especially business leaders, come into working with machine learning thinking that AI is some kind of crystal ball, we all get in trouble. So instead of prediction, I'd like to focus more on how people make decisions, and how we can make informed decisions, whether for executives or for customers. I think this fits pretty well with some of what Cassie was saying in her keynote, too.

I'll go through these quickly, but you'll have the slides if you want to dig in; I like to try to provide primary sources. Deep learning has been around for a long time, but we really had the breakthrough in 2012 with AlexNet. Reinforcement learning goes back to notions of optimal control theory that have been around for many years, but the excitement is relatively recent; of course, DeepMind did really well. Ray is a very interesting project out of UC Berkeley's RISELab, and I predict that Ray will become a very substantial open source project. Another thing: if you really want to dig into what's current with reinforcement learning, I highly recommend the lectures by Danny Lange, who's at Unity 3D, the gaming company; they've been doing fantastic work with reinforcement learning. Language models, again, we talked about yesterday at the workshop. So much work has happened in this field; it broke open in early 2018, so it's less than two years old, and it's totally changed how we approach natural language. But there are trade-offs.
It's also very expensive to do the latest and greatest, so there are cost and energy trade-offs coming. Knowledge graphs have been a little bit harder to define, but certainly I can point to some contemporary examples. I think the Fabio Petroni article there, "Language Models as Knowledge Bases?", is pretty interesting. We're now seeing where we can take language models and use them, for instance with document QA, where you establish a kind of question-answer interrogation of the model; it's a way of developing knowledge bases and knowledge graphs. I'll also point to a really great article from John Bohannon at Primer AI about constructing knowledge bases and knowledge graphs. Primer AI, a week or two ago, broke the world record for the state of the art in named entity recognition by doing some of this kind of work; John Bohannon led that.

And finally, again as background, when I talk about weak supervision: there's fantastic work coming out of Stanford, and now the University of Washington and also Waterloo, where some of the leading professors in AI have been working on a project called Snorkel. The idea is to have more mathematical modeling, if you will, for the lineage of what goes into constructing a dataset. The interesting realization there is that any dataset we have has human judgment in it. At some point somebody made the judgment to collect the data, to collect it in particular ways, and to make particular kinds of trade-offs; they made judgments about how to name the measurements, fields, or columns being collected. When you start to understand, in more mathematical, more functional terms, how those judgments went into assembling a dataset, then you can begin to make more intelligent decisions about how to balance the mix of what goes into a training set for a machine learning model.
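To give a feel for the weak-supervision idea, here's a toy sketch in the spirit of Snorkel: several noisy, hand-written "labeling functions" each vote on unlabeled examples, and the votes are combined into training labels. The labeling functions and the documents are invented for illustration, and Snorkel itself fits a generative model over the votes rather than taking a simple majority.

```python
ABSTAIN, SPAM, HAM = -1, 1, 0

# each labeling function encodes one noisy human judgment about the data
def lf_has_link(text):
    return SPAM if "http://" in text else ABSTAIN

def lf_all_caps(text):
    return SPAM if text.isupper() else ABSTAIN

def lf_greeting(text):
    return HAM if text.lower().startswith("hi") else ABSTAIN

def majority_label(text, lfs):
    # combine the non-abstaining votes by simple majority
    votes = [lf(text) for lf in lfs]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    return max(set(votes), key=votes.count)

lfs = [lf_has_link, lf_all_caps, lf_greeting]
docs = ["BUY NOW at http://spam.example", "hi team, notes attached", "no signal here"]
labels = [majority_label(d, lfs) for d in docs]
print(labels)  # [1, 0, -1]: spam, ham, and one unlabeled example
```

The value is exactly the point above: the judgments baked into a dataset become explicit, inspectable functions, so you can reason about where your training labels came from.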
And I believe this will become especially important in the context of bias, fairness, and ethics. One person I'd highly point toward, if you want to dig into this in more detail, is Lukas Biewald, who created the company that was called Figure Eight; they were acquired at least a year ago. But focusing for a moment on deep learning: deep learning is difficult, largely because of the costs, and Lukas has some great advice; they built their company off this advice. Number one is the interesting realization that algorithms are not nearly as important as datasets. We have a large backlog of algorithms. There's a great study that Lukas was pointing to, and I've got the illustration at the bottom: if you look at breakthroughs in AI and measure the time from when the algorithm was first published to when the breakthrough was recognized, in this study the mean time was 18 years. But if you look at the time between when the dataset required for that breakthrough first became available and the breakthrough being recognized, the mean time was three years. So across the board we can point to the fact that we have a large backlog of algorithms, but it isn't until we really get some sort of dataset that we can prove the case and make it concrete.

Following from that emphasis on datasets, the other points there are about active learning and transfer learning as ways to leverage deep learning models that enterprise companies, or startup companies, can take more advantage of. You don't have to be Google, Amazon, Facebook, et cetera, just to train deep learning models if you're leveraging transfer learning and active learning. And this really bears on AutoML. If you want to approach the most contemporary, sophisticated machine learning using deep learning models, there are really three ways to go about it.
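Here's a deliberately tiny sketch of the transfer-learning pattern described above: a "pretrained" feature extractor stays frozen, and only a small head is trained on a handful of labeled examples. The extractor, the data, and the task are all made up for illustration; in practice the frozen part would be a large pretrained network rather than a hand-written function.

```python
def frozen_features(x):
    # stand-in for a pretrained backbone: maps a raw input to a feature vector
    return [x, x * x, 1.0]

def predict(weights, feats):
    return sum(w * f for w, f in zip(weights, feats))

# tiny labeled fine-tuning set: label is 1 if x > 2, else -1
data = [(0.0, -1), (1.0, -1), (3.0, 1), (4.0, 1)]

weights = [0.0, 0.0, 0.0]   # only the head's weights get trained
lr = 0.01
for _ in range(200):
    for x, y in data:
        feats = frozen_features(x)
        if y * predict(weights, feats) <= 0:   # perceptron-style update on mistakes
            weights = [w + lr * y * f for w, f in zip(weights, feats)]

correct = sum(1 for x, y in data
              if (1 if predict(weights, frozen_features(x)) > 0 else -1) == y)
print(correct, "of", len(data), "fine-tuning examples classified correctly")
```

The economics follow from the structure: the expensive part (the backbone) is trained once by someone else, while the cheap part (the head) is all you retrain for your use case.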
One is you build your own. But because of the costs and the expertise required, that's limited to what we would call the hyperscalers, the cloud vendors, probably a dozen companies. There might be a few more joining the list, but it requires a big investment, and unless you own a public cloud, your company is probably reluctant to make a very large investment in the required hardware, especially because the hardware is evolving so rapidly and will be swapped out. So what most people in enterprise are doing is reusing models that have been trained somewhere else. By the good graces of Apple, Google, Microsoft, and others, we're able to take those pre-trained, large-scale models and do some fine-tuning, applying transfer learning to use them for our use cases. The third option is to leverage services: will the cloud vendors and others provide services through AutoML so that we can take advantage of their infrastructure and customize models for what we need?

In the breakdown, when we looked a year ago at industry adoption of AutoML, it was a single-digit percentage. When we looked six months later, it had gained quite a bit. For now, reuse through transfer learning is really going to be the norm, and I've got a sampler of different tutorials if you want to dig into that. But AutoML is what comes after this.

So, in terms of the business case: as I mentioned, Ben Lorica and I have been doing industry surveys, and we completed three for O'Reilly. They're all freely available downloads; if you click on the slides, it'll take you to the download pages. Among other things, we looked at what the priorities were for enterprise companies in terms of their machine learning adoption: where were they spending money on projects?
The top priority was definitely understanding what's going on inside of models, but right after that we see automated model search and hyperparameter tuning, basically getting into AutoML kinds of features. And by the way, in the latter survey that we did, half of the respondents were looking toward incorporating some sort of AutoML. When we break it down by sector, it's interesting: finance had less AutoML, I think for some very interesting reasons; finance tends to want more control and more accountability, and they're not usually the earliest adopters. But healthcare actually had a pretty large uptake, and within IT itself there's a lot of work in AutoML.

Another thing about these surveys is that we were really doing what we'd call a contrast study. We looked at companies that have been deploying machine learning in production for five years or more, and at companies that haven't even gotten started. What's the contrast between those two types of organizations? What have the more mature practices learned? Here's part of that contrast: we can show that the mature practices are doing much more work with AutoML, whereas the companies at the in-between stage, that are just evaluating, are starting to get there. Obviously, the companies that haven't started with machine learning yet are at zero percent. What this indicates is that once you get into a practice of machine learning in production in industry, you find that there's value in AutoML; it may not be apparent up front, but it's a good thing to be looking at.

Another thing I should add from the surveys: we were surprised about funding levels. When we did our surveys, we asked how much people were going to invest in AI, and we expected answers around 5%, 10%, or 20%, with a catch-all bucket for 20% or more.
To our surprise, among the companies at a mature stage, the ones that have been deploying machine learning for five years or more, 43% said they were investing 20% or more of their overall IT budget in AI projects. So what we're seeing is a gap between the companies doing work with AI and the ones that are not, and the gap is widening. When you look at some of the next slides, we'll show what it takes for a company to go from zero to becoming competitive; as Óscar mentioned earlier this morning, that's a substantial transformation, the kind of thing that doesn't happen overnight and probably takes years.

So, in my estimate, these are the hazards, if we take a kind of survival-analysis view. The companies that aren't even getting started are usually stuck because they lack the technical infrastructure: too much tech debt, data is siloed, and their data teams really can't get access to the data they need. Another big one is where the company culture just doesn't recognize the need for machine learning. A lot of executives and board-level members grew up with Six Sigma; they grew up thinking that uncertainty is a bad thing, and now they're having to confront probabilistic systems that embrace uncertainty, and they don't understand it. Another point, which may be changing, is that in the business units, the line units, there's a lack of people on the product side who really understand how to translate from the technology to the business use case. Having line-unit product managers who really understand what's needed for machine learning, that's a big gap in industry.

Given these hazards, we'll come back and analyze this a bit in some of the later slides. But the key takeaway here: the firms that have realized returns from machine learning are doubling down on their investments.
The firms that haven't even gotten started are going to take years to go through the transformations needed to become competitive, and I don't think they have years. What that leads to, probably for about 50% of enterprise firms, is mergers and acquisitions: they'll be at risk at some point in the not-too-distant future. The first-mover advantage, though, goes to the hyperscalers, the cloud vendors with the big AI teams, et cetera. They're monetizing their leads, and AutoML services are one of the ways they take advantage both of their lead and of the liabilities the other companies have. So let's explore that.

Now, in terms of AutoML, I think it calls for some definitions; a lot of it is still at the research stage. This is a business talk, so I wanted to draw an analogy from business, and admittedly this is a very gross generalization. In business you can think of a workflow as: I have some pre-qualified leads; I go out and do my business development; I find ways to offer some terms; we decide on a statement of work; we get a purchase order; we perform the work; we get the work accepted; we invoice; we collect on the invoice. That's one way of thinking of a workflow in business. Machine learning has a very similar workflow, where we've got data we're working with and some outcomes we're driving toward. On the one hand, in data science we're often working toward executive decisions, a lot of what Cassie was talking about; on the other hand, we're working toward customer experience and how machine learning fits into products. I'll really focus on the latter, but it applies to both. So let's say this is a kind of idealized machine learning workflow: you start out with data preparation, which in data science is of course always a lot of the work, maybe 80% of it.
Then you go toward feature engineering and feature selection; from there you build some models and optimize the learners and the models; from there you evaluate the results, which is another very hard problem; and from there you get into integration, deployment, scaling, et cetera, for the use cases. So let's break this down. In terms of AutoML, each one of these stages now has vendors and open source projects providing AutoML features, and some of the vendors are really looking at the entire workflow, the end-to-end life cycle.

At the first part, we can talk about a couple of things. One is data cleaning: how can you start to automate what goes on in data cleaning? I think one of the better projects I've seen so far is called HoloClean; it's out of the University of Waterloo, and it's part of the Snorkel project. Another thing that comes in at an early stage is meta-learning, which is to say: can I start a project by using what has been learned from previous configurations of similar projects? I think meta-learning has a long way to go; it's a lot of data mining of the institutional history of an organization.

The next part, though, is feature engineering. Feature engineering is of course a very hard problem; it typically requires a lot of depth in statistics, probably a lot more than most data science teams will have. One of the better experts in this area, I would point to Gavin Brown at the University of Manchester; there's an excellent paper they have on the stability of feature selection and how that leads to workflow reproducibility. But there are features now for beginning to automate the feature selection and feature engineering. And this is very promising, because it's one of the areas where AI can augment people for whom the work is probably very complicated, maybe out of their depth, myself included.
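As a minimal sketch of what automated feature selection looks like at its simplest: rank candidate features by absolute correlation with the target and keep the top k. Real AutoML feature selection is far richer (stability, interactions, redundancy, as in the Manchester work), but this shows the basic shape. The data and feature names here are invented.

```python
from math import sqrt

def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length lists
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

# candidate feature columns, plus a numeric target column
features = {
    "signal":  [1.0, 2.0, 3.0, 4.0, 5.0],   # strongly related to target
    "noise":   [2.0, 1.0, 2.0, 1.0, 2.0],   # unrelated
    "inverse": [5.0, 4.0, 3.0, 2.0, 1.0],   # strongly (negatively) related
}
target = [1.1, 2.0, 2.9, 4.2, 5.0]

# rank features by the absolute strength of their relationship to the target
ranked = sorted(features,
                key=lambda f: abs(pearson(features[f], target)),
                reverse=True)
print(ranked[:2])  # the two informative features come out on top
```

This filter-style ranking is the entry point; automated systems layer model-based scoring and stability checks on top of the same loop.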
Training models: when it comes to having your data prepared so you can begin to train different models, this is often called hyperparameter optimization. The idea is: how can we control the hyperparameters that go into building a model, and then, once we've developed a number of different models, go and compare them? This is where the first starts of AutoML traditionally came from, out of projects like Hyperopt, and a lot of people still think of AutoML as just this part. One of the messages we got out of doing this industry research was: no, actually, AutoML spans the end-to-end life cycle; there's much more than just this optimization part. But it is where there are substantial cost savings, particularly in deep learning. As far as really sophisticated hyperparameter optimization for deep learning goes, there are companies like Determined AI that I think have a really interesting offering.

Once you train models, what's interesting is that the evaluation of different machine learning models is difficult. The different algorithms used to train learners have different means of evaluation, and it's very tough to compare them apples to apples. Sometimes you're looking at R-squared, sometimes you're looking at Gini; depending on what sort of algorithms you're working with, they're entirely different kinds of metrics. And if you don't understand the subtleties of what those metrics mean, then when it comes time to deploy and put this in production, you can get in real trouble. So having more automated means to augment how you do evaluation is very important. The other thing coming into play here is that once you've created a number of machine learning models, you may have several with relatively similar performance that are good in different areas; they have different strengths and weaknesses.
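The hyperparameter-optimization step described above can be sketched as random search: sample configurations, score each with the same evaluation function, and keep the best. The "model" here is just a quadratic scoring function standing in for a real train-and-validate step, and the hyperparameter names and ranges are invented for illustration.

```python
import random

random.seed(42)  # deterministic for the example

def validation_score(lr, depth):
    # stand-in for training a model and measuring validation performance;
    # this synthetic surface peaks near lr=0.1, depth=6
    return 1.0 - (lr - 0.1) ** 2 * 50 - (depth - 6) ** 2 * 0.01

best_cfg, best_score = None, float("-inf")
for _ in range(200):
    # sample one hyperparameter configuration
    cfg = {"lr": random.uniform(0.001, 0.3), "depth": random.randint(2, 10)}
    score = validation_score(cfg["lr"], cfg["depth"])
    if score > best_score:
        best_cfg, best_score = cfg, score

print(best_cfg, round(best_score, 3))
```

Tools like Hyperopt replace the uniform sampling with smarter strategies (Bayesian optimization, early stopping), but the outer loop, and the cost savings from automating it, look just like this.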
And so that's typically when you would use ensembles, and some of the AutoML services are beginning to provide ensemble creation, which I think is a very natural thing for this step.

Then finally, as you start to move models into production, there are a few different areas that are very important. For instance, SageMaker has had very sophisticated services on AWS for doing auto-scaling, and that's certainly part of moving models into production. There's a whole other area of MLOps that concerns this, but I think this is where the machine learning workflow starts to hit the operations side. The other thing is that a lot of machine learning these days is going into embedded products. They're not necessarily running on a cluster somewhere; they may be running inside a smartphone, or on an edge device, IoT. For that, we're looking at model compression, distillation, and other ways to take a big machine learning model that was trained on a cluster and shrink it down to fit the form factor of the embedded device, both for size and for power, while still making sure it has the accuracy that's needed.

So those are the categories of what we're seeing with AutoML, but there's one real problem here. When we look across the board at what the companies are offering, I see a lot of claims, especially from some of the smaller vendors. Often they'll say that they're going to democratize AI or machine learning, and that they're going to optimize models for you, but they don't explain exactly what they're optimizing. When we use the word "optimize" with machine learning, we could mean many things. Even if we're only talking about accuracy, I can spell out a few: precision, recall, specificity, perplexity. They all mean different things depending on the use case, and they can produce wildly different results if you confuse them. But there's more to it than accuracy.
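To make that metric point concrete: accuracy, precision, and recall computed from the same confusion counts can tell very different stories, which is why "we'll optimize your model" claims deserve the follow-up question: optimize which metric? The counts below are made up to mimic an imbalanced fraud-detection task.

```python
# confusion counts: true/false positives and negatives
tp, fp, fn, tn = 8, 2, 40, 950   # the model misses most of the rare positives

accuracy  = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall    = tp / (tp + fn)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
# accuracy looks excellent while recall reveals the model misses ~83% of fraud
```

A vendor optimizing accuracy alone would call this model a success; a vendor optimizing recall would call it a failure. Same model, same data, opposite conclusions.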
There's also the matter of understanding uncertainty, and being able to report it and leverage it. There are different types of costs: the cost to train the model, in terms of how much it costs to run cloud computing or your own data center, but also the cost in time. How long does it take to go from training a model to getting it out to production? So we can think in terms of both monetary cost and time, and their trade-offs. And there are many other areas: certainly model drift, fairness and bias, privacy. All of these come into play in how you optimize a machine learning model. I think one of the key takeaways here is that when there are claims in industry, we should be pushing back and saying: you don't want to optimize just one dimension, you probably want to optimize about 20 dimensions. Along with that, one of the problems in comparing AutoML offerings is that there are really no good public benchmarks yet. I think this will evolve, and there are certainly some academic steps toward it, but without that it will be difficult to compare the different vendors.

As for the landscape: if you go to the link here, we've got a public spreadsheet, a Google Doc, where we've been listing the vendors and also the open source projects. For instance, we have several vendors doing really sophisticated work with AutoML: certainly Stratio and DataRobot, Microsoft, Google, IBM, Amazon; I know they all have really interesting offerings. What we're trying to do here is break out who's offering what. For instance, DataRobot has a very interesting approach.
It's the end-to-end life cycle; they're augmenting what a data scientist or a programmer would be doing in their daily work. Whereas some of the others, I mentioned Determined AI, are more focused on a particular kind of offering: optimizing deep learning hyperparameter selection. I would love to get feedback about this; we're trying to collect more industry intelligence, so if you have anything to offer, please email me or catch me on Twitter. We're also tracking the open source projects and what they're doing; I think that's one of the key messages coming out of this talk, and I'll have it at the end.

If you want to go into more detail, I know this is supposed to be a business talk, but if you want to see some of the real detail about AutoML and the latest research, I've got a couple of pages here of recommended articles and papers for a deeper dive. The first one is probably the best: "Awesome AutoML Papers" by Mark Lin, which he's updating regularly, so if you go back and check it, every week there's new material there. That's probably your first go-to stop to see what's the latest. And for an overview, I think that Microsoft paper about what Microsoft learned, how software engineering had to change because of machine learning, is also a really great study.

Another point here: what we're talking about in terms of AutoML perhaps doesn't happen as a service per se; maybe the way it begins to get integrated into the workplace is through projects like this. This is where Deep TabNine worked with language models, GPT-2, doing very sophisticated natural language work applied to GitHub repos.
And so if you can learn to predict natural language, sorry, I used the word predict, if you can learn to understand sequences of programming by learning from GitHub repos of source code, then you can build out auto-completion, very, very smart auto-completion for programmers. And I have a hunch that we'll see a lot more of this in the near future. Certainly there's a lot of program synthesis, projects like AutoPandas to automatically generate pandas code, and related things for SQL. So I think we'll be seeing more of this, where the smarts get injected into the workflow. Another thing that's really changing this, I've shown this slide before, but I can't overstate it: hardware is changing and evolving so rapidly. It used to be that we talked about software engineering and we would say that process is very general. You have agile and it applies to a wide range of different projects. And then software is also general, abstracted. Hardware is way down low; you hardly ever mess with it. So, you know, focus on the process and the software. But over the past few years, that's entirely flipped, and now hardware is moving faster than software, and software is moving faster than process. The companies that don't understand that are mostly bewildered. The companies that do understand that are really taking advantage of it. And the hardware that's changing is not just in terms of processors; we're seeing really intelligent front-ends on memory fabric, what's going on with switches, just across the board in terms of edge computing, low-power ML. But certainly the processors too. If you look at the wafer-scale engine that's come out of Cerebras recently, they have 1.2 trillion transistors on one chip, and the memory is 40,000 times faster than the largest GPU. And there are some projects that are really using this, like Jupyter and Arrow, some of what's coming out of Ray as well.
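The idea mentioned above of learning token sequences from source code can be illustrated with a toy bigram model. Real systems like Deep TabNine use large neural language models; this is only a sketch of the underlying next-token-prediction idea, trained on a made-up four-line corpus.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus_lines):
    """Count which token tends to follow each token in the corpus."""
    following = defaultdict(Counter)
    for line in corpus_lines:
        tokens = line.split()
        for prev, nxt in zip(tokens, tokens[1:]):
            following[prev][nxt] += 1
    return following

def complete(following, token):
    """Suggest the most frequent continuation seen after this token."""
    if token not in following:
        return None
    return following[token].most_common(1)[0][0]

# A tiny invented "repo" standing in for millions of GitHub files.
corpus = [
    "import numpy as np",
    "import pandas as pd",
    "df = pd.read_csv('data.csv')",
    "arr = np.zeros(10)",
]
model = train_bigrams(corpus)
```

Scaling the same learn-the-sequences idea from bigram counts up to a transformer trained on all of GitHub is, roughly, the leap these auto-completion products made.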
And so I think one of the punchlines here really deconstructs what's going on with AutoML. This was a talk from Jeff Dean, and he was saying how the current bottleneck, if you will, is that to have a solution for doing machine learning in production, you have to hire the right machine learning expertise, you have to have the data, and you have to have the computation. And what he's saying here is that if you go back and ask where it is that companies get blocked in terms of moving machine learning into production, a lot of it has to do with the talent crunch: they just can't hire up enough of the machine learning expertise. Now the flip side of this is that Google and other companies very similar to it have spent a lot of time acquiring as much of the ML expertise as they could. So now that they've got it all, they're gonna go back and sell it to us. And the way that they sell it to us is by having us buy a lot more computation. And that's kind of the punchline of AutoML. So we can talk about that more in detail. I'm kind of running out of time, but I think that it's a little bit of a cynical view, but it's also a very realistic view. And so to finish up, I talked yesterday about data governance, and Oscar was talking about the importance of data governance, and I'd like to show how that fits here. So there is a talk, if you wanna dig in, about data governance; there's a lot more resources there. And so here's the notion: when you introduce machine learning into production, one of the first things that happens is that your operations change, your business operations change, and the kinds of metrics and measurements that your operations team needs to collect are very, very different. I did a track at OSCON this year. IBM had sponsored an ML Ops day, and we had fascinating case studies from Capital One and Comcast, GitHub and others.
Probably the best out of that was a talk by Donald Miner, who's been giving a webinar and other talks entitled Vital Signs: from an operations standpoint, what changes with machine learning, and what do you need to measure? What do you need to think about as an operations team? So ML Ops is emerging. Meanwhile, since 2018 we're seeing a kind of global reckoning about data governance, and data governance is being taken much more seriously now. Data governance is another way of saying getting your data in order so that you can leverage it with machine learning. So we really see where these three thrusts start to co-evolve. Number one, you do have to have the data governance in place; it's a non-starter if you don't. Number two, you have to embrace ML Ops, or you won't be able to manage your operations well. But if you do those things, then you're in a very good position to start to take advantage of AutoML. And again, I think that Oscar had similar remarks about this with what Stratio is doing. Certainly if you talk with DataRobot, they'll have a very similar view. But I would say look at these three areas as co-evolving; I think eventually they'll become one general category. And to show evidence of that, in particular, I'm really interested in watching these companies which are not the hyperscalers, they're not the cloud vendors, they're sort of a younger generation. So Uber, Lyft, Airbnb, Netflix; LinkedIn definitely falls in this category, even now that Microsoft has bought them. But what's interesting is, when I went out to do some of this research, I found that all of these companies have projects where they collect metadata about data set usage, the business usage of their data, and then they build knowledge graphs to represent that metadata. And number one, they're doing it because of compliance. GDPR was sufficiently vague, so they had to respond, and this is one way to respond.
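The pattern of collecting usage metadata and representing it as a knowledge graph can be sketched with simple triples. The data sets, relations, and team names below are invented for illustration; the real systems (Amundsen at Lyft, Databook at Uber, DataHub at LinkedIn) are far richer.

```python
# A tiny metadata knowledge graph as (subject, relation, object) triples.
# All names here are hypothetical examples.
TRIPLES = [
    ("rides_2019", "used_by", "pricing_model"),
    ("rides_2019", "used_by", "churn_model"),
    ("rides_2019", "contains_pii", "true"),
    ("driver_profiles", "used_by", "churn_model"),
    ("churn_model", "owned_by", "growth_team"),
]

def objects(triples, subject, relation):
    """All objects linked to a subject by a relation."""
    return [o for s, r, o in triples if s == subject and r == relation]

def subjects(triples, relation, obj):
    """All subjects linked to an object by a relation."""
    return [s for s, r, o in triples if r == relation and o == obj]

def pii_exposure(triples, model):
    """Data sets feeding a model that are flagged as containing PII --
    the kind of GDPR compliance question that motivates these graphs."""
    feeds = subjects(triples, "used_by", model)
    return [d for d in feeds if "true" in objects(triples, d, "contains_pii")]
```

The compliance query and the business-upside queries run over the same graph, which is the co-evolution point: once the metadata exists for GDPR, asking "where is our data science spend going?" is just another traversal.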
But then number two, once they do it, they realize there's business upside. They have a much better view of where their money is being spent in data science: how can they train people or provide better tools to offset some of the repeated costs? And Lyft in particular has found pretty incredible cost savings on that. And the other part is that it allows them to get a global perspective on the data sets being used in their business, and how they can then perhaps even launch new business lines. Certainly in the case of Lyft and Uber and companies like this, they've made their money by monetizing data sets in the first place. And so if they can find new ways, new combinations of data sets, new ways to monetize those, that's how they grow their business. Most of these projects are open source. So I think that there are some excellent vendors in the AutoML space, and there are also these supporting open source projects. And with that, I'm out of time. So thank you very much. I look forward to answering any questions. Do we have time for some questions? A minute. A minute, okay, one question, and then I'll go outside and take any other questions. Anybody? If not, I'll be outside. Yes! Can we predict how many questions there'll be? No, sorry. Oh, why that? I mean, it's gonna happen. That's a good question. So yeah, prediction is going to be kind of a mechanism, but I think it's dangerous. I think instead we need to understand how uncertainty works. It's not simple; it's not one dimension. So if we can use uncertainty in forming decisions, I think that's a better sort of cognitive framework to bring into the business applications. If you're expecting prediction, you're expecting a crystal ball. And I mean, I think it may be just a matter of time too, because we do have a lot of people, how shall I say, who are my age or older, who are on boards of directors, and when they hear the word prediction, they think about something that's too simple. Yeah, great question.
Okay, I'll be outside. Thank you very much. I hope I didn't go over too much.