scale tracks. So my name is Ryan. I'm based out of Malaysia, so it's great to be back here in Singapore and in Southeast Asia. My last Southeast Asia session was in Bangkok three or four years ago, so it's good to see some familiar faces. Okay, so I'm going to talk about predictions, but first a quick overview of MySQL. You've probably seen this in the keynote and in my colleague's presentation. MySQL has been around for 27 years now. It's a very solid transactional database, and because of that large user base, MySQL has become really solid and stable. And from the MySQL team, we have just added a great innovation into MySQL: the native, built-for-the-cloud HeatWave database engine. So this is what I'm going to talk about today, the cloud-based MySQL HeatWave engine. And not forgetting the many open source applications supporting MySQL that make it really, really popular. Okay, so MySQL HeatWave is one of the great innovations from Oracle Labs: we have added a special plug-in to the InnoDB engine. Bear in mind that this is a cloud-only, managed database in Oracle Cloud as well as in AWS. And if you're running Azure, you can also take advantage of HeatWave using the fast interconnect from Azure to Oracle Cloud. So essentially, HeatWave is a special plug-in to the InnoDB engine. As you know, InnoDB is very, very good at transactional workloads. It's really stable. Customers have been telling us that the replication is rock solid, and with replication you can really build massive hyperscale applications. But there's still something we wanted to improve, especially OLAP workloads, which traditionally don't work well in InnoDB because of its row-based design.
So what we have done is design this HeatWave engine to automatically convert the row-based format into a hybrid columnar format, and then make use of the elasticity of the cloud to scale the database engine for you. This was introduced in Oracle Cloud in 2020. Today, we can scale up to 64 nodes, which means you can create a HeatWave cluster with 64 nodes in the cloud, capable of handling a data warehouse of around 60 terabytes. So you can move 60 terabytes of data into the cluster. Essentially, when your transactional data, your product and sales information, sits in InnoDB, you can push this data automatically to the HeatWave cluster, and the data will be split, sharded, and distributed across the HeatWave nodes in columnar format. When you run your analytical query, it will be automatically pushed down to the HeatWave cluster, and the data is held in memory. So you get distributed processing as well as in-memory processing to give you the performance, and you can take advantage of multiple nodes in the cloud. One of the reasons we chose to do this in the cloud is that you can scale the resources up and down, as opposed to running on-premises, where you would need to procure and buy servers. That also gives us a lot of flexibility in adding new features and functions to the HeatWave cluster. Okay, so in the interest of today's topic, I'll talk about machine learning. Machine learning is not something new, right? Every day we use apps like Facebook, Grab, or Uber, and you can see that all these apps really make our lives easy. For example, you go to Facebook and you see the news feed from your favorite friends; the things that are important to you pop up in your news feed. And that's all because of machine learning models.
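A minimal sketch of what that load-and-offload flow looks like in SQL (the `orders` table and its columns are hypothetical; `RAPID` is the internal name of the HeatWave secondary engine):

```sql
-- Define HeatWave (internally "RAPID") as the table's secondary engine
ALTER TABLE orders SECONDARY_ENGINE = RAPID;

-- Push the table into the HeatWave cluster, where it is sharded
-- across nodes in hybrid columnar format and held in memory
ALTER TABLE orders SECONDARY_LOAD;

-- Analytic queries now offload automatically; EXPLAIN shows whether
-- the optimizer chose the secondary engine for this query
EXPLAIN SELECT o_custkey, SUM(o_totalprice)
FROM orders
GROUP BY o_custkey;
```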
In yesterday's talk, they discussed how they use machine learning to do learning to rank, moving the content that matters to the user to the top. So that's essentially what people are doing today, right? Using machine learning to build intelligent systems. Booking.com, right? A recommendation system so that people can quickly book a package on Booking.com; they want to turn clicks into sales with that recommendation system. Also Uber Eats, to be able to send the right notification at the right time to the right user. All of these are done using machine learning. And if you're in a different industry, there are many different use cases: banking and fraud, retail, telco, and other industries; I'm just going to fly by this. So just last week, coming here, I received this SMS. I'm sure you receive SMSes like this as well, right? This is obviously spam. Can you see? It says, thank you, Tommy, for your payment of 2,000 ringgit. And very quickly, just by looking at this, you know it's spam, right? It's a personal mobile number, the language is not properly structured, and the contact number is really not the bank's official contact number. So this can be a good use case for telco. Every day we see all this, and machine learning is going to help us. But it's not that easy to implement machine learning, because it takes a highly skilled person to do it, right? A data scientist: you need a person who is really good with Python, really good with machine learning algorithms, able to get the right data and structure the data in a way that is good for learning. These are the four machine learning pipeline stages that are common to all machine learning tasks, and in order to get a tuned model, you really need to iterate to get the model right.
And that's not very productive, especially if you have lots of data and you need to come up with a model in good time. Okay, I have a quick demo here to show you what a typical machine learning process looks like. You have data in data frames, and then you build a model, right? Here you need to choose a model, the right algorithm for the dataset, so that it produces the model you want. Just go through it, and you see that in the model there are lots of parameters you need to tune in order to generate an accurate model. So that's typically how you would approach machine learning with a traditional pipeline. In HeatWave, we are automating these tedious machine learning tasks with AutoML. This is Oracle Labs technology: instead of you doing all these tasks, you automate them with the AutoML engine. And with that, you don't really need very strong Python skills; you can leverage the SQL knowledge you already have. You can use SQL to invoke AutoML to create a model. So these are the things we innovated and put into AutoML. The first one is that in pre-processing, we make sure the dataset is complete. The second step is to sample the data down to a good size for training; the goal is really to come up with an accurate model in a short amount of time. And the third one is that we have a proxy model to start with. Instead of iteratively searching for the model, we start from a proxy model and fine-tune along the way to get to the tuned model. So with that, we have the engine within the HeatWave database to allow you to do machine learning. These are the three algorithm categories that we support right now: classification, regression, and forecasting.
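For contrast, the manual pipeline in the demo can be sketched with generic scikit-learn code. This is not the demo's actual notebook; the dataset and parameter grid are stand-ins chosen for illustration:

```python
# A typical manual ML pipeline: load data, split it, pick an algorithm,
# and iterate over hyperparameters until the model is accurate enough.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# The "lots of parameters you need to tune": a small illustrative grid.
grid = {"n_estimators": [50, 100], "max_depth": [4, None]}
search = GridSearchCV(RandomForestClassifier(random_state=42), grid, cv=3)
search.fit(X_train, y_train)

print(search.best_params_)
print(search.score(X_test, y_test))
```

Every choice here, the algorithm, the grid, the split, is a manual decision the data scientist has to make and revisit; AutoML's pitch is to absorb those decisions into the engine.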
Soon to be available is anomaly detection. So you can tackle different problems using these algorithms. And if you work with scikit-learn, these are the algorithms that we support; when you run AutoML, the engine will pick, depending on the category, the algorithm that gives you the most accuracy in the shortest amount of time. So, putting this together, doing machine learning in HeatWave is really just having a set of data loaded into the HeatWave engine and essentially four SQL calls. You can see there's ML_TRAIN; there's scoring, for you to test the accuracy of the model; there's loading the model into the HeatWave cluster; and then, depending on whether you want batch or real-time prediction, there are two different functions you can use. Essentially these are stored procedures that you invoke on HeatWave. To be more specific, these are some of the commands you can use. I'm not going to go through each one, but if you're familiar with SQL, it's just fitting values into the parameters: predict table, explain row, and so on. I'm going to talk about explain a little bit, because the explain function is very important for a machine learning model. Okay, so I just want to run through a few illustrations to show you what happens in the backend. When you have the data loaded into the InnoDB engine and you execute ML_TRAIN, the data will be loaded into the HeatWave nodes in the cluster, and the training will happen on the HeatWave nodes. Once the model is trained, it is stored in the model catalog, and then you can start using the model to predict by feeding it a new dataset.
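Put together, those four calls might look like the sketch below. The schema, table, and column names are made up for illustration; the routine names follow the HeatWave AutoML stored procedures (`sys.ML_TRAIN`, `sys.ML_MODEL_LOAD`, `sys.ML_SCORE`, `sys.ML_PREDICT_TABLE`), but check the current documentation for the exact signatures in your version:

```sql
-- Train a classification model on the training table; the handle
-- of the trained model is returned in @model
CALL sys.ML_TRAIN('bank.train_data', 'subscribed',
                  JSON_OBJECT('task', 'classification'), @model);

-- Load the trained model from the model catalog into HeatWave memory
CALL sys.ML_MODEL_LOAD(@model, NULL);

-- Score the model against held-out data to check its accuracy
CALL sys.ML_SCORE('bank.test_data', 'subscribed', @model,
                  'accuracy', @score);
SELECT @score;

-- Batch prediction: write a prediction for every row to an output table
CALL sys.ML_PREDICT_TABLE('bank.test_data', @model, 'bank.predictions');
```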
Okay, so when you run the predict-table function, it will get the model from the model catalog, load it into the HeatWave nodes, and then return the result to your application. The same goes for the explanation: again, the model will be loaded, and it will explain why a prediction was made a certain way, telling you which significant features were used in making that prediction. Okay, and because this is SQL-based, you can actually use any notebook, Jupyter, Zeppelin, or any SQL tool, to work with HeatWave machine learning. Okay, so I have an application developed here to show you how you can use this. There's a bank marketing dataset, and the model is trained to predict which customers are likely to subscribe to a new product. That gives the call center a good list of prospects who are likely to sign up, so they can turn those prospects into customers and get the maximum return out of the call center. The model has been trained, and all I need to do is load the model into the cluster, and then I can show you the explanation capability in HeatWave. So I'm going to show the model explanation. It tells you, for this particular customer, why they would sign up for the product given the call duration with this customer. And you can see there are other features, such as the employment variation rate, the Euribor rate, and also the number of people employed, that are important in making the prediction that this customer will sign up for the product. With this information, the call center can pick and choose the customers who are most likely to subscribe to the product. And in terms of scoring, you can check the accuracy of the model, and there are many different metrics you can use to evaluate the model. And here you can run this prediction in batch.
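The explanation step is invoked the same way. A hedged sketch using the table-level explain routine; the table names are hypothetical and the exact argument list varies between HeatWave versions, so treat this as the shape of the call rather than a definitive signature:

```sql
-- Generate per-row explanations: which features (call duration,
-- Euribor rate, ...) pushed each prediction up or down
CALL sys.ML_EXPLAIN_TABLE('bank.test_data', @model,
                          'bank.test_explanations', NULL);

-- Each output row carries the prediction plus feature attributions
SELECT * FROM bank.test_explanations LIMIT 5;
```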
For example, I'm creating a random table, I run the model on this table, and it gives me a prediction of which customers are going to sign up for the product. Okay, so let me just go to the next one. Okay, so putting this all together, using the actual SQL, you get an idea of how easy it is to use this in HeatWave. I'm just connecting to the HeatWave cluster, and this is the same dataset I showed you. So, connected. And these are all the models in the model catalog. I'm just going to load the bank marketing model. Just load it. And this is the prediction on the table: we just specify the test dataset, the model loaded in the HeatWave cluster, and the table that will store the result. So that's how easy it is to use the model with SQL. Okay, we also did a benchmark against others. You can see that, because of the in-memory cluster and the AutoML technology, we are way faster than Redshift ML, and in terms of cost, it's just 1% of the cost of Redshift ML. So with that, very quickly, that's an overview of what you can do with HeatWave. Besides being a very fast OLAP engine, you can also run machine learning on HeatWave. And if you want to try it out, because this is on the cloud, you can actually sign up on Oracle Cloud with a trial account. You get 300 US dollars of free credits or 30 days, whichever comes first, to try out HeatWave and machine learning. So with that, thanks for your time, and I'm here to answer questions if you have any. I see a bunch of questions. Here first.

Hi, cool technology. So how much influence do I have on the architecture of the model itself? If I say, okay, I want this many layers, I want skip layers, and so on, do we have any more influence than just, okay, I want to classify?

So the question is how many different types of algorithms we support. Today we support classification, regression, and forecasting, and anomaly detection is coming.
So yeah, the list of algorithms is pretty much those three that we have currently.

We're talking about model training, but then how about model serving? I see that this is a very notebook-driven environment, which is great for prototyping, but for actually putting this model into production, how does that work?

Okay, so it's all SQL-driven. Once you have the model, it's stored in the model catalog. In order to consume the model, you load it into the HeatWave cluster. In terms of the application, it's just simple SQL to invoke the model for prediction: you could do a batch-style predict on a table, or you could do a row-based prediction. It's all through SQL. So in the application, PHP or Node.js, you use a MySQL connector, connect to the HeatWave cluster, load the model, and then run the stored procedures for prediction.

Sorry, I have an architecture question regarding how this works. Are you actually doing the querying and the model training separately, or are you able to do some form of distributed training on a cluster, one of those cluster systems, basically?

Right. So HeatWave is a cluster system. When you run ML_TRAIN, the work is actually distributed to the cluster nodes, and training happens where the data is distributed, so you get distributed training. It's all done by HeatWave, automatically.

All right. Thanks. One last question. Will this be published as open source, or how much of it already is?

Good question. This is not open source, yeah. This is only in the cloud. That's why you can try out HeatWave with a trial account.

All right. Well, thank you very much for this wonderful talk and for answering all these great questions. Thank you. Thanks.
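To make the serving answer concrete: from an application, "run the stored procedures for prediction" is just ordinary SQL issued through a connector. A small Python sketch (the connection details, table names, and the `predict_table_sql` helper are hypothetical; PHP or Node.js would work the same way through their own connectors):

```python
# Sketch: invoking a HeatWave ML model from an application.
# Batch prediction is a single CALL to a stored procedure.

def predict_table_sql(test_table: str, model_var: str, out_table: str) -> str:
    """Build the CALL statement for a batch prediction."""
    return (f"CALL sys.ML_PREDICT_TABLE('{test_table}', "
            f"{model_var}, '{out_table}')")

sql = predict_table_sql("bank.test_data", "@model", "bank.predictions")
print(sql)  # CALL sys.ML_PREDICT_TABLE('bank.test_data', @model, 'bank.predictions')

# In a real application you would run it with a connector, e.g.:
#   import mysql.connector
#   conn = mysql.connector.connect(host=..., user=..., password=...)
#   cur = conn.cursor()
#   cur.execute("CALL sys.ML_MODEL_LOAD(@model, NULL)")  # load once
#   cur.execute(sql)                                     # then predict
```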