We'll start when he gives the time. Thank you. So, good evening everyone. My name is Ajay Sik, from corporate sales at Talend, responsible for driving Talend's business in ASEAN. Again, this session is more about you: helping you in your journey with Talend, and how Talend can really help you integrate big data in real time.

Before we start, a quick question: how many of you have been working on Talend for, say, six months or more? Raise your hand. Anyone? No? So you're pretty new, like me. I joined Talend a month back, so we're sailing in the same boat.

First of all, we keep talking about big data. Hadoop, Kafka, so many things are coming up these days, right? What's the reason behind it? Things were going pretty well five years back; all of a sudden we're talking about big data, Hadoop, Kafka, Spark, MapReduce. The thing is, there has been a paradigm shift in how data is generated today. There are so many different types of devices. Everyone has a mobile or a tablet, everyone is online. Ninety percent of today's data was generated in the last two years. Can you imagine the speed at which data is being generated today? And it will continue for the next year, the next five years, ten years.

Now, relational databases and data warehouses still work today; they are not obsolete. But when you're talking about unstructured data — on YouTube, hundreds of thousands of videos are uploaded every hour; Amazon is selling 450 items every second; people are uploading videos and images — it becomes really difficult for a traditional relational database or data warehouse to store that data, process it, and do it in real time. That's a huge challenge any IT developer, admin, or architect is facing today.

At the same time, technology is evolving every day. Hadoop was invented a few years back, then came MapReduce, now it's Spark, and it will keep changing every year or two. That brings a big dilemma for every IT executive. On one side there are so many challenges to face; on the other side, even with intelligent developers like you working on the latest technology, you spend six to nine months learning one technology, and within six months there is another one. How do you keep pace with this innovation? That's a big challenge.

Five years back, most decisions were based on experience. You have a data warehouse, a typical ETL tool, some hand coding, and reporting. The reports go to decision makers who make a decision: the product is not doing so well, maybe we need better promotion, maybe we need to reduce the price — but that turns into lower margins and more marketing spend. Today, take the example of an e-commerce company. The biggest challenge e-commerce vendors face is shopping cart abandonment. How do you deal with 90% of people abandoning their shopping carts? Just imagine, for Amazon or Lazada: if they can improve by one percentage point — instead of 90%, only 89% of people abandon their carts — it leads to a huge increase in revenue. How do you do that? You have to make sure you're leveraging the latest technology to address those challenges.
You process the data in real time. For example, if I'm trying to buy something from Amazon and there's a chance I might move to another website, right away Amazon can pop something up for me: okay, why don't you try this? Chances are I'll end up buying something from Amazon after all, which leads to more revenue.

Now, why Talend? When we talk about Talend, the first thing that comes to mind is that it's open source, so it's not that expensive. Yes, of course, being open source brings the TCO down and helps increase ROI, but that's not all. Open source means a big community supporting you. Whatever new innovations appear in the market, we'll be the first to bring them to you. You start working on Talend now and you're part of that community — for Spark, for example, we have 900-plus connectors, so you have access to the latest technology. Something goes wrong, you have a question? There's a community forum where you can raise your query. Instead of working on a proprietary solution where you have to rely on the vendor, here you're supported by the community. That's why Talend is the market leader today in the big data integration and cloud integration space, as per Gartner and Forrester.

Of course, any vendor will say "we are the best," so let me give you a few examples, since we're talking about real-time big data. One of the biggest companies, GE — everyone is aware of it, one of the biggest conglomerates — uses Talend across GE Electric, GE Transportation, and GE Healthcare. How did they leverage Talend for real-time big data integration? On the power side, they have wind farms with around 22,000 wind turbines. The challenge was making sure no turbine goes down, because any downtime leads to lost revenue. So they used Talend to integrate the data in real time. Every turbine generates about one terabyte of data every day — now think about 22,000 turbines. They process this together with weather forecasts: if the wind direction will be this way, they adjust the direction of the turbine, which leads to more energy production. At the same time, they do predictive maintenance, so there's no downtime on any turbine. GE's objective was simply to improve the process by 1% — they call it the power of 1%. If we can improve our current production by 1%, we are happy. That 1% improvement led to $1 to 2 million of savings per turbine per year, which is about $2 billion over a period of 15 years. Think about that: a 1% improvement led to $2 billion of savings and more revenue. Who doesn't want that? Every company does.

Another example is Air France-KLM. Their challenge: every company wants a 360-degree view of the customer — a very fancy term, right? We want to know everything about the customer, to understand them better so there is less churn: customers are happy, they stay with us, they don't go to the competitors.
Using Talend real-time big data integration, here's what they did. One: if I'm flying to, say, Thailand or Indonesia, then depending on my buying behavior they'll suggest which travel destination I'd like and give me good pricing, so I don't move to another airline. Two: if I'm at the airport, supposed to be at gate five or six, and the gate changes, right away I get a message on Facebook — because they've connected their social media — telling me the gate has changed, go there. Three: it happens many times, right? We put our stuff in the seat-back pocket, and at landing we forget about it because everyone is in such a hurry. If you forget something, then before you check out, before you even reach security, you get a message: you have left something behind, please come and collect it. Now think of the effect of all this on the customer — how happy the customer will be. If I lose my iPad and I get a message from KLM, I'm at least 50% less likely to move to another airline.

So these are some examples of how real-time big data integration helps companies make data-driven decisions rather than relying on experience: facts are available to the decision makers, the business can collaborate with the technical team, and better-informed decisions get made. That's pretty much the overview. Now I'll pass it on to Sangju, our technical consultant, who will walk you through the technical part. Thank you.

Thank you, Vijay. Okay. So, what Vijay shared just now about Amazon, about KLM — today I'm going to show you that you can also do what he's describing, using Talend. Okay, who here still does programming? Anyone? Okay, good, keep your hands up. Those of you programming in Java, .NET, C? Okay. How about MapReduce? Okay, a few. How about Spark? Oh, Spark has more. Spark Streaming and Storm? Okay, still a good handful. Today I'm going to show you the tool — everyone here can do Spark Streaming and Storm too.

Here's the agenda: a technical overview of Talend; Talend real-time big data; then a recommendation engine — we'll do a demo as though you go to Amazon, buy something, and it recommends you a product in real time. Then airplane parts maintenance alerts: we have sample airplane sensor data, we send it through a machine learning component, and after that we do real-time prediction based on test data. And the last one: create your own adventure — everything I share with you today, you can go back, download, and play with. It's all open source in the sandbox. You just need about 8 gig of RAM and an i5, and your laptop can run this.

Okay, transforming the way we work. I just asked about the coding. Today, using Talend, all of you can do whatever Vijay talked about without any more coding. We are a code generator: we generate Spark, we generate MapReduce, we generate Java code. You don't need to know how to code against all the underlying pieces like HBase, HCatalog, Impala and so on — you just need to know how to use them, and we generate the code on your behalf. Second: learn once, apply many.
So you see on the chart here: we do traditional data integration, from AS/400, DB2, Teradata. We also do big data integration — moving data into Hadoop, moving it out, running MapReduce and so on. We do application integration: ESB, web services, Camel, REST APIs. We do cloud integration, moving data into the cloud; we also have integration platform as a service; and lastly, master data management, for a single golden record of truth. The point of this slide is that you do not need to install five solutions, learn five different applications, or integrate the five of them. We built everything from the bottom up. It's an Eclipse-based environment, and to use any of these you just click: the function changes, the UI stays the same. Learn once, and you can do ESB, big data, MDM, data modeling, all in one tool — all open source, free.

Next, a future-proof solution. We work very closely with the open source community; our engineers also work with Google and others. Everybody heard about Apache Beam? Apache Beam is up and coming from Google, and we work with them closely. At the end of this year, in our version 7, Apache Beam will be available. What that means for you: you can use Apache Beam without needing to know how to code it.

Analyst recognition, just quickly: on the Gartner Magic Quadrant for data integration we are a leader — the only open source vendor there. On Forrester, for big data fabric — that's moving data inside big data, big data-related work — we are also a leader.

To summarize the available solutions: we started with data integration, as a company, 10 years ago in Paris, France; after that we moved to the US. Then we added big data, application integration, master data and so on. At the top here you see what you can download free today and play with on your laptop, no charge. Below is the subscription, similar to Red Hat Linux: what you get for the enterprise — deployment, monitoring, execution, scheduling and so on. You can still do the job yourself and schedule it yourself, or with the subscription we provide this end to end.

This is the Eclipse development tool everyone is familiar with, so let's go through it. On the left-hand side, number one, is the repository, which stores your jobs, your metadata, your routines, your documentation. On the right-hand side, two, is the palette. As I mentioned, we have 1000-plus components and connectors, all free, no charge: you want to connect to SAP, Oracle, Teradata, Hadoop, Impala, Hive and so on, it's all available free. Three is the perspective: switch between any of them — ESB, big data — and the screen changes; no need to uninstall or start a new application, just select here. Four is the main design area, where you do your ingestion and transformation. And number five is where you specify the properties of the selected component.
Now, a question: if I want to move data out of Hadoop, how many clicks do I need — three, five, or nine? How many steps to bring data out? Five? Okay, we can do it in about three steps. First, on the right-hand side, in the palette, we find the elephant — we find Hadoop — and select the Hadoop component out, tHDFSOutput. You can do the same for other sources: drag Oracle out, drag Teradata out. Next, after you drag it out, you must specify the elephant: whether it's red, blue, or green. Anyone follow me on red, blue, and green? Red is MapR, blue is Cloudera, green is Hortonworks — the three famous elephants running around. So you specify the elephant, and then the version of the elephant. We work closely with all these Hadoop distributors, and we're always one version behind them: if Cloudera is now at 5.9, we are at 5.8. What that means for you is you don't need to worry — whatever the latest features of the distro, you'll have them about three months later. After that, you put in your user, password, host name and so on, and you can get the data out. So today you can get data out of HDFS, out of Hadoop, without going off to learn scripting for a month. Honestly, I don't know how to script it out either.

We also have the Hadoop configuration wizard. You know, inside Hadoop there are a lot of animals running around: ZooKeeper, Kafka, Impala, Hue, HBase, HCatalog and so on. You don't need to configure all of them. As long as you point to the Hadoop top layer — for Hortonworks that's Ambari, for Cloudera it's Cloudera Manager — all the underlying components, all the underlying animals, will surface. So you don't need to configure them one by one.

A summary of the 900-plus components: you can see them here — big data, business intelligence, cloud, custom, traditional databases and so on. Big data components: Neo4j, NoSQL, MongoDB, HCatalog, HDFS; distributions: Amazon EMR, Cloudera, Hortonworks, MapR, Pivotal HD. In summary, you can connect to 99.5% of anything out there. Even if you don't find it here, we support bringing your own JAR: put your JAR in and you can connect. And even if you don't have a JAR, as long as the data source has its own Java API, we can connect — because we are Java, open source. So you can move any data in and out.

Common connectors: these are examples of the 100-plus connectors we have, traditional ones like DB2 and AS/400 through Netezza and Teradata. For each of these you have components for input (data in) and output (data out), plus slowly changing dimensions, stored procedures, commit and close, bulk load, and so on. As an example: if you want to move data from AS/400, merge it with something in DB2, and push it to Pivotal Greenplum, you can do it in five minutes. You don't need to learn how to integrate the four of them yourself.

We generate the code automatically — native MapReduce, Pig, Java, or SQL. You drag the components in and the code is right there to see; some people even use this tool to learn how to code. There is no black-box proprietary software: in both the subscription version and the free version you can see all the code. Some people take the code and deploy it in WebSphere or WebLogic — it still runs. We build once and run anywhere, so you can run the build inside different JVMs.
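To give a feel for what that generated read boils down to, here is a minimal hand-written sketch — not Talend's actual generated output — that pulls a file out of HDFS with the plain Hadoop Java API. The NameNode address and file path are placeholders.

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsRead {
    public static void main(String[] args) throws Exception {
        // Point at the cluster's NameNode; host and port are placeholders.
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:8020");

        try (FileSystem fs = FileSystem.get(conf);
             BufferedReader reader = new BufferedReader(
                     new InputStreamReader(fs.open(new Path("/data/customers.csv"))))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Downstream, a job would map/filter and load this elsewhere.
                System.out.println(line);
            }
        }
    }
}
```

In the Studio, those connection details are exactly what the component's properties panel captures for you.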
Any questions so far? Yes? Okay, so the question is about collaboration and automated testing when using the open source version. In the open source version we don't have this, but in the subscription version we have collaboration: we support SVN and Git, so versioning and merging of code is there. For testing, in the subscription version we support Maven and Jenkins, so you can go from deployment and testing through to production. We also have a debugger and a test creation tool, but those are in the subscription. Any other questions? No? Then I'll jump to the big data part.

Okay, Talend real-time big data. Upgrade from MapReduce to Spark with one click. MapReduce has been around since Hadoop appeared, about 10 years back; then Spark came along a few years ago. So if you already have MapReduce jobs you've built, and now Spark comes along — do I need to go and learn it for a month? No. With Talend you can do it with one click: the job designed here is converted to Spark. And even that's not good enough — the latest is Spark Streaming. If you want to convert your Spark batch job into a streaming job, you similarly change the property from Spark batch to Spark Streaming or Storm. The code is automatically regenerated as Spark Streaming or Storm, and you can use it. Here's an example running on Spark Streaming.

Big data: create high-quality, trusted information. Now your data is all in Hadoop. If you want to do any of this — explore, profile, monitor, parse, standardize, reconcile, match, enrich, share — do you need to learn Spark to do it? Not to worry: with Talend, all these components are out of the box. You just drag, design your job, and we generate the Spark or MapReduce code for you, and you run it.

Lastly, the generated code is native. There is no additional black-box layer where the code has to be converted into something else before it runs; it's actual Java code that is sent in. You can even take the build, keep it somewhere else, and throw it in whenever you want, or use the Oozie scheduler to run it.

This is the lambda architecture — I wanted to show this for the real-time and batch sides. On the left-hand side you have the batch layer, where you take all the historical clickstream and web logs and store them here. We process that using Spark batch or MapReduce; then, using machine learning components, we keep the learned model here. On the real-time side, data comes in — the latest approach is using Kafka. You push through Kafka, store it in NoSQL — Cassandra and so on — transform it, keep it here. In real time, as the clickstream comes in, you merge it with the recommendations from the model, and you recommend to the consumer on the spot. In this case, if I'm buying a bicycle from Amazon and this bicycle is black and red, you recommend me a black glove and a red shirt. Later we'll go through this example, and how you can download the sandbox and play with it yourself.
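For intuition, here is a minimal sketch of that speed layer — a Spark Streaming job subscribing to a Kafka topic, written against the Spark 2.x Kafka 0.10 integration. The broker address, topic name, and the stubbed "recommend" step are assumptions for illustration, not the demo's actual code.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

public class ClickstreamSpeedLayer {
    public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setMaster("local[2]").setAppName("speed-layer");
        // Micro-batches every 5 seconds.
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(5));

        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "localhost:9092");
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "clickstream");

        JavaInputDStream<ConsumerRecord<String, String>> stream =
                KafkaUtils.createDirectStream(
                        jssc,
                        LocationStrategies.PreferConsistent(),
                        ConsumerStrategies.<String, String>Subscribe(
                                Collections.singletonList("clicks"), kafkaParams));

        // Speed layer: for each micro-batch of clicks, look up the
        // batch-built model (stubbed here) and emit a recommendation.
        stream.map(ConsumerRecord::value)
              .foreachRDD(rdd -> rdd.foreach(click ->
                      System.out.println("recommend for click: " + click)));

        jssc.start();
        jssc.awaitTermination();
    }
}
```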
Machine learning support: we also integrate closely with machine learning in Spark. From Spark MLlib we have all of these available — the ALS model, classification, decision trees, K-means, K-means streaming. So data scientists don't need to code it themselves; they can leverage these, throw the data in, and the model is there. Afterwards you use tPredict to apply the model, and the recommendation comes out. Again, no need to code — all out of the box.

So now we come to the recommendation engine demo. Any questions? If not, I'll continue. Yes? Okay, the question is: when you need to deploy a big data job, how is it deployed into Hadoop to run? We are a code generator for MapReduce and Spark. Talend itself generates the MapReduce or Spark code, packages it nicely, and pushes it all the way to Hadoop YARN to run. So on the outside it's very light — it's the ELT mode: extract, load, and transform. You don't need a very big server, because after code generation you're leveraging Hadoop to do the running.

Visualization? Visualization of data, yes, we do have that. The newest one is Data Preparation, also free. This data preparation tool allows business users to go into Hadoop or into Oracle, take the data out into a nice Excel-like grid, and do data cleansing on their own. It's also open source, now version 2.0 — you can all download and play with it.

So here's the simulation we're going to do. Step one, we run the batch that generates the model. Step 2A, we run this recommendation engine. Step 2B, we open a WebSocket to listen to the incoming clickstream. And step 2C, after we do the matching, we push the recommendation back to the user using Ajax. This is the website we'll use: as though you're going to a telco, maybe Singtel, and you want to buy a data bundle — big data bundles, basic bundles and so on — and the recommendation comes out here. The page has two parts: one is the WebSocket — when I click, the WebSocket takes the event and pushes it to Talend; then we have the Ajax coming back. A quick view of the source: this part here is the WebSocket, which I'll run a listener for on the other side; this part here, after the recommendation is complete, pushes the recommendation back.

So this is the Studio — let me make some more space. This is job number one, the one listening to the clicks. We're using Apache Camel, open source. If we click on this component, it's listening for people clicking in. You see there are two rows, because I just ran a test to make sure it works; later, when you click, the rows will increase. Looking at the component: using Apache Camel, it's simply listening on my localhost at port 9902 with all the parameters — the website fires each click to 9902. After I take the data in and do some processing, I push it into Kafka.
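As a rough picture of what that listener route amounts to, here is a minimal Apache Camel sketch — a hand-written illustration, not the demo's actual job. The endpoint path, topic name, and the camel-kafka URI style are assumptions (the Kafka endpoint syntax varies across Camel versions).

```java
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.impl.DefaultCamelContext;

public class ClickListener {
    public static void main(String[] args) throws Exception {
        DefaultCamelContext context = new DefaultCamelContext();
        context.addRoutes(new RouteBuilder() {
            @Override
            public void configure() {
                // Listen for click events the page pushes over a WebSocket,
                // then hand each message to a Kafka topic for processing.
                from("websocket://localhost:9902/clicks")
                    .log("click received: ${body}")
                    .to("kafka:clicks?brokers=localhost:9092");
            }
        });
        context.start();
        Thread.sleep(Long.MAX_VALUE); // keep the route alive for the demo
    }
}
```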
Now, step number two: the real-time recommendation. What we have here: first, I take the input from the Kafka topic and extract the JSON from the Kafka message. I apply the recommendation product model that I built in the earlier step one and do the matching: check whether this customer is someone I know, check my products. Then I write a JSON message back and push it into Kafka. The last step is getting it from Kafka back to where the website call came from: number three, I extract from Kafka, do some processing, then push it back via Ajax to the page.

Okay, I'm still here. Here you see only two rows. Let me run the demo. I select one option — see the right-hand side here, the suggested products. The plan comes out here, the recommendation and so on. I can click something else and it changes; another number changes. I clicked about three times. If we go back there: five rows, two plus three — I clicked three times. And on the input stream, the first part, there are eight rows coming in.

Yes? The question: if you have an issue while running this, can you put a breakpoint in one of the components to see what's happening inside? Yes. You see, this is a log — this component prints out the log, so we can keep whatever information we want. We also have components like tDie — when the job dies, you can print a message — and a log catcher to catch normal logs or, if you want, fatal errors. All this information is pushed to our monitoring screen. And you can break the flow: right-click a component and you have an option to deactivate it, so the job runs until there and stops. You can also go step by step: there's a debug run, so every time a message comes in, you can step through. As for the source code: you can put a breakpoint inside there too. If you want to see the source code, just click the Code tab. This is the source code — this Kafka streaming job. If you want, you can take this whole source, deploy it in your own environment or into Hadoop, and it will still run. It's Java Spark code.
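Stripped of the Talend components, the heart of that step-two pipeline is a Kafka consume-process-produce loop. Here is a minimal sketch with the plain Kafka Java clients (assuming a recent client version, for poll(Duration)); the topic names and the hard-coded recommendation standing in for the model lookup are illustrative assumptions.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class RecommendationRelay {
    public static void main(String[] args) {
        // One Properties object for both clients, for brevity;
        // keys a client doesn't use are ignored with a warning.
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "relay");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
             KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            consumer.subscribe(Collections.singletonList("clicks"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> r : records) {
                    // Stand-in for the model match against the click JSON.
                    String recommendation = "{\"product\":\"suggested-plan\"}";
                    producer.send(new ProducerRecord<>("recommendations", r.key(), recommendation));
                }
            }
        }
    }
}
```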
If it's okay, I'll go to the next example, where Vijay shared how machine maintenance can reduce turbine failures and such. So that's it for the first one, the real-time streaming recommendation: you can all do it today — download it later and go play. Can I go to the next one? Okay.

Airplane maintenance alerts. 24,000 sensors on the fuel system help predict failures days ahead, avoiding grounding the airplane and incurring huge costs: detecting needed repairs 10 to 20 days in advance and locating them within 5 to 6 hours. A leading airline is using this to keep their planes in the air rather than on the ground — similar to the GE turbine examples.

So the example we have here: number one, a training set. What does the training data look like? It's a set of training data generated using Excel. These rows are all good; if you scroll down, these are the rows that need maintenance. We take this data and feed it into a machine learning model. I take the training data, put it through the model encoder, and feed it into a logistic regression model — anyone familiar with logistic regression? Out of this, we have the model. We use the test data here to predict the outcome — that's the batch testing. And the last part is real time: we send two records in, and in real time we tell you whether these two sensor readings show any errors.

I'll run this job now. Same as what you saw just now: we send the training data in, then create a model, and we use that model inside the prediction step here — the prediction component refers to it. Just like we shared in the slides, we have many models from Spark you can take and go; we're using the logistic regression model in this case. After that, we take two records in — this information here holds two sample records — and based on them, the prediction model says whether each is good or bad. Let's run it. This run does three things: build the model, do the batch testing, then the real-time part. Similarly, you can see the code for this one and how it's written. I have a local Spark, so it's running on the local Spark on my machine. If you look back at the job, you can see in real time what percentage is done. All completed — all of this is Spark. If we look at the output: these two records were sent in at the end, and based on the model, the first one is good and the second one goes for maintenance. So now, with your own IoT data, with your own web logs, you can do your own data modeling and so on without always having to go to the data scientist. Any questions on this?

Yes? The question: these connectors are very good for joining all the data together, but if you want to do the cleansing and matching inside Spark, does the developer still need to go down to the code level and write the Spark part? Okay, so we have a lot of processing components on the right that do things like filtering, aggregation, mapping, denormalization and so on, and those components are already Spark-enabled. You just drag them in. Say you want to filter the employee records by whether they're staying in Singapore, or normalize them based on some field — the components are on the right-hand side here, so you just use them; no need to go and code. You see, under Processing: filter, map, replace, replicate, sorting, top, unique, unite, all of this — all Spark-enabled. Just drag them in. Data cleansing, data profiling, and data matching are all available inside Spark too.
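To picture what those Spark-enabled components generate, here is a rough hand-written equivalent of a filter-plus-aggregate step in Spark SQL for Java — the file name and column names (country, salary) are made up for the illustration.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import static org.apache.spark.sql.functions.avg;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.count;
import static org.apache.spark.sql.functions.lit;

public class FilterAggregateSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .master("local[*]").appName("processing").getOrCreate();

        // Hypothetical employee file with columns: name, country, salary.
        Dataset<Row> employees = spark.read()
                .option("header", "true").option("inferSchema", "true")
                .csv("employees.csv");

        // Filter then aggregate: roughly what a filter component feeding
        // an aggregate component would generate.
        employees.filter(col("country").equalTo("Singapore"))
                 .groupBy("country")
                 .agg(count(lit(1)).alias("headcount"),
                      avg("salary").alias("avg_salary"))
                 .show();

        spark.stop();
    }
}
```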
Yes? How do you test how effective the machine learning is — does that fall to the data scientist? You can change the coefficients, change the parameters; how effective it is, that's up to the data scientist. We just provide the model and the tool. Instead of the data scientist going off and coding it in R, he can just say, okay, maybe I want to change this coefficient. Let's look at this one. Sorry? We are using the Spark machine learning library — Spark MLlib. Go to the Spark website and they have all the machine learning components there; normally you'd need to learn how to connect to them manually, but if you use Talend, you skip that step and just use them. So, okay — I'm not a stats guy, but: logistic regression, elastic net mixing parameter, regularization, thresholds, all of this — the data scientist will know how to adjust them. He just doesn't need to code. Though they do love to code.

Yes? Again, about the Java developer: if he's doing the Spark matching, should he write the lines of code himself, or come here and drag and drop? Totally right — for a Spark developer, of course, writing a line of Spark code is more efficient than drawing a component that generates five or ten lines of code. The whole idea for the enterprise is that the developer who builds this might move on to other work, and the next person taking over may have challenges reading the hand-written code — staff move around quite a lot. With this, from a management perspective, it's easier to manage: the next person comes in, looks at the components and the models, and can see more or less how it runs. It's all generated as Spark, and Spark engineers are in high demand nowadays, so the next person would otherwise have to spend time digging through code. So it's more about maintainability. But of course, those who like to code will continue to code.

Yes? For Spark, do we support all the underlying Spark functions? We support most of them — we'd need to check the specific one you're asking about. For example, in machine learning there are about 20-plus models, and we now support up to 16. We are constantly building and adding them: last January we supported 11; by June it was 16. As the need grows, we convert and support more of what the open source folks produce. And besides the 1000-plus components you see here, we have a community at exchange.talend.com — there are about 200 components there, ranging from PDF conversion to connectors for other systems. Okay, I'll continue.
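To make those knobs concrete, here is a minimal Spark MLlib sketch in Java of the logistic-regression train-and-predict flow from the maintenance demo. The toy feature vectors, labels, and parameter values are invented for illustration; they are not the demo's data.

```java
import java.util.Arrays;
import java.util.List;
import org.apache.spark.ml.classification.LogisticRegression;
import org.apache.spark.ml.classification.LogisticRegressionModel;
import org.apache.spark.ml.linalg.VectorUDT;
import org.apache.spark.ml.linalg.Vectors;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

public class SensorMaintenanceModel {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .master("local[*]").appName("maintenance").getOrCreate();

        // Toy rows: label 1.0 = needs maintenance, 0.0 = healthy;
        // the two features stand in for encoded sensor readings.
        StructType schema = new StructType(new StructField[]{
                new StructField("label", DataTypes.DoubleType, false, Metadata.empty()),
                new StructField("features", new VectorUDT(), false, Metadata.empty())});
        List<Row> training = Arrays.asList(
                RowFactory.create(0.0, Vectors.dense(0.2, 0.1)),
                RowFactory.create(0.0, Vectors.dense(0.3, 0.2)),
                RowFactory.create(1.0, Vectors.dense(0.9, 0.8)),
                RowFactory.create(1.0, Vectors.dense(0.8, 0.9)));

        // The knobs mentioned above: iterations, regularization,
        // elastic net mixing, and the decision threshold.
        LogisticRegressionModel model = new LogisticRegression()
                .setMaxIter(10)
                .setRegParam(0.01)
                .setElasticNetParam(0.8)
                .setThreshold(0.5)
                .fit(spark.createDataFrame(training, schema));

        // Score two fresh readings, as in the real-time part of the demo.
        List<Row> fresh = Arrays.asList(
                RowFactory.create(0.0, Vectors.dense(0.25, 0.15)),
                RowFactory.create(0.0, Vectors.dense(0.85, 0.95)));
        model.transform(spark.createDataFrame(fresh, schema))
             .select("features", "prediction").show();

        spark.stop();
    }
}
```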
So, you've seen the two demos — now, if you also want to play, how can you? We have the Talend Big Data Sandbox. You can download it — the URL is public — or just Google "Talend big data sandbox version 2". The sandbox is a step-by-step cookbook that teaches you how to play with the demos; inside there are a few. One, the retail recommendation, which I shared with you. Two, sports statistics using IoT — I'll show a few slides on this later: based on a guy playing softball, running around and throwing the ball, the camera captures the data, we convert it in real time, and push it to a website over a REST API, starting from the ingestion part. Three, a clickstream example: real clickstream coming in, and you can see the data. Four, ETL offload: processing your web logs with big data rather than the traditional data warehouse. And lastly, analyzing Apache web logs.

The cookbook walks you through step by step, including how to download. You need at least about 8 to 10 gig of memory, and maybe about 20 gig of hard disk space: the initial download is fairly small, about 3 gig, then it asks you which elephant you want to download. Give it about an hour and it pulls the relevant distribution on Docker, so the whole stack — Kafka, ZooKeeper, everything — is up and you can play. In the five examples, everything is step by step: step one, create the Kafka topic; step two, push to Kafka. You just click play, next, next, next, you see the job run, and the output appears at the bottom. The retail recommendation, which I shared, includes Kafka, machine learning, and Spark Streaming. The IoT camera one — the softball guy throwing — uses Kafka and Spark Streaming, and you see this nice screen where the graph changes every 10 seconds. Those are the five examples you'll find inside.

Okay, that's the end. My last slide is the Talend solution. We run natively on big data and in the cloud — it's the actual source code, written open source. An adaptable and extendable architecture: you can start small with just data integration, then upgrade when you want, and the features will be there; you won't need to reinstall our stuff. One platform: ESB, web services, REST APIs, big data, data integration, master data management and all of this — traditionally you'd need to install five products, but this is the one. And lastly: it's a user-based subscription. We charge per developer, not on data volume or connectors. Any questions?

Yes — what version of Spark is this? 2.0. We track quite closely: the latest is 2.0, and we are on 2.0. Yes — let's say somebody is using this tool and creates an application; what kind of deployment artifact does it produce? Is it a JAR? Yes, it's a JAR. You can take the JAR and run it even in WebSphere and it will still run — it's just a normal JAR. Do you plan to support Scala? So far, not yet; on the integration side, we are still Java-based. Is it possible to integrate with Jenkins or Airflow? Maven and Jenkins, yes. Airflow — sorry, I don't know. Maven and Jenkins, yes. And how do we handle version control? We support SVN and Git — this is in the subscription part. We have an admin server that controls the users and projects; you connect it to SVN or Git, and once connected, all the developers who log in have their versions controlled — checked in and out through SVN or Git.
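One pointer for when you reach the cookbook's first step, creating the Kafka topic: from Java, that step looks roughly like the sketch below, using Kafka's AdminClient. The sandbox itself may well do this with the kafka-topics CLI instead, and the topic name here is made up.

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateDemoTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // One partition, replication factor 1 -- enough for a sandbox.
            admin.createTopics(Collections.singletonList(
                    new NewTopic("retail_clicks", 1, (short) 1)))
                 .all().get(); // block until the broker confirms
        }
    }
}
```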
What are the prerequisites to set up this IDE? If you download the open source version, everything is there — no prerequisites, as long as you have enough memory: 64-bit, an i5, 20 gig of hard disk, and everything is included. Spark is also in there — a local Spark, for easy testing, so you don't have to send the job to Hadoop every time to test. Just now I was running on my local machine; mine is 16 gig. Yes — good question: if we install in the enterprise and cannot connect to the internet, how do we download the libraries? For the enterprise we have one download link, about 11 gig, containing all the JARs. You put that into your enterprise environment and you don't need to go out to the internet to find anything — like a local repository, yes.

One good thing: in the open source community, we update every six months, so look out for the next version coming in June, with more new features. And at the end of the year, version 7 — we're currently at 6.3, with 6.4 due in June — will be integrated with Apache Beam, for those of you looking forward to using Apache Beam. Sorry? Documentation — okay. For the components on the right-hand side, the documentation is all available: the download is about 13,000 pages, or you can just go online, component by component. And if you want to document a particular job, we have a very nice feature: right-click, generate documentation as HTML, so the engineers don't need to document manually. Say I have a simple CDC job: after I complete it, I close the job, right-click, generate doc as HTML — a nice HTML zip file, finished. I go to my D drive, extract it, overwrite, and there you have the job nicely documented. So: documentation of the job itself, plus documentation of all the components, all available online.

Yes — how frequently do we release new versions? Every six months, with the new features requested by customers and by the open source community. In the upcoming 6.4 we will be more fully integrated with Azure and also Google; we're already quite integrated with AWS. For example, if you want to spin up an EC2 instance or RDS from here, we can do it — some companies use us to spin up a few EC2 instances or an EMR cluster, do some processing of their data, and then shut it all down, all via a Talend job. Is the API open? Yes. For all the components on the right-hand side, we use JavaJet technology, so you can create your own component and put it in yourself, or go to our community at exchange.talend.com, download one, and drop it in. One example: we don't have a PDF component here, but in the community somebody built one based on a Java PDF library, so you bring that component in and you can extract data from PDFs. If we don't have something, you can develop your own component and add it. If not, thank you very much. We're still here — come and talk to us. Do you have any questions?