SiliconANGLE Media presents theCUBE, covering Alibaba Cloud's annual conference. Brought to you by Intel. Now here's John Furrier.

Hi, I'm John Furrier with SiliconANGLE, Wikibon, and theCUBE. I'm the co-founder, based in Silicon Valley in Palo Alto, California, and I am here in Hangzhou, China, for the Alibaba Cloud conference in Cloud City. It's the biggest cloud computing conference here in China. I'm excited to be here with Dr. Min Wanli, who's the Chief Data Scientist and General Manager of the Big Data Division at Alibaba Cloud. Dr. Wanli, thank you for spending the time.

Thank you for having me.

We have seen a lot of data in the conversation here at the show. Data technology is a big part of this new revolution. It's an industrial revolution that we've never seen before, another generation of technology. What does data technology mean to Alibaba?

Okay, it means everything. First of all, technically speaking, it's technology for handling massive real-time data and streaming data, in a variety of different forms. For instance, in the mobile app [unclear], there is the customer behavior: the clicking, the browsing of the digital image of each piece of merchandise, asking for the price, or comparing against another similar product. All these behaviors are translated into data. This data is then merged with the archive data to update the profile of this customer, their interests, and to detect whether there's a good match between the current merchandise and the customer's intent. If the match is good, then we will flash it to the top priority, the top spot, to try to increase the conversion rate. If the conversion rate is high, then our sales are high. So DT, data technology, means everything to Alibaba.

It's interesting. My observation here is so fascinating, because in the old days applications produced data that was stored on drives. It goes to data warehouses, and they analyze it.
You guys at Alibaba Cloud are doing something fundamentally different and exciting, in the sense that you have data, people call it data exhaust or data in general, but you're reusing the data in development, in real time. It's not just data exhaust or data from an application. You're using the data to make a better user experience and to make the systems smarter and more intelligent. Did I get that right?

Exactly. This is a positive feedback loop. In the old-fashioned way, you archive the data for offline analysis, for post-event analysis, to identify whether there's any room for improvement. That's fine, but now people cannot wait, and we cannot wait. Offline is not enough. So we have to do this in real time, online, in a feedback fashion, in such a way that we can capture exactly the right moment, understand the intent of the customer, and then try to deliver the right content to the customer on the fly.

Jack Ma, your boss, and also Dr. Wang, who I spoke with yesterday, talked about two things. Jack Ma talks about a new revolution, a new kind of industrial revolution, a smarter world, a better society. Dr. Wang talks about data flowing like a river. I mean, here, Hangzhou is an example. But it highlights something that's happening across the world. We're moving from a slow, batch world to data that's in motion, always in real time. They're not necessarily mutually exclusive, but they're different: data lake or data river or whatever word you want. I don't really like the term "data lake" personally. I think that means batch to me. But batch has been around for a while. On real time, you mentioned streaming. This is something that's happening, and it's impacting the architecture and the value proposition of applications. It's highlighted in the Internet of Things. It's highlighted in exciting examples that we're seeing, like the ET Brains.
Can you share your view and your project around ET Brain? Because that is not just one vertical. It's healthcare. It's industrial. It's transportation. It's consumer.

Good question. First of all, I concur with you that the data lake will exist. It will continue to exist because it's got its own value. Our ET Brain, for example, actually emerged from a data lake, because it has to learn all the benchmarks, the baseline model, the basic knowledge from the existing archive data, which is a data lake. However, that's not enough. Once you have the knowledge, you have the capability, but you need to put that in action. So we are talking about data in motion, data in action. In action, how do we do that? Once you have the training samples, all the training data from the data lake, you train the brain. That's enough. Then, in the next step, you want to couple the brain with the real-time streaming data and generate real-time action, in a pre-emptive way rather than a post-event, reactive way. Take, for example, T&T, travel and transportation, and traffic management. Currently, all the authorities have access to real-time information, and then they do a post-event analysis: oh, there's a traffic jam, and then they want to do some mitigation. However, the best scenario is if you can prevent the traffic jam from happening in the first place, right? How can you foresee that there could be a traffic jam happening 10 minutes from now, and then take a pre-emptive strike to try to prevent that from happening? That's the goal ET Brain wants to achieve in traffic management. Take the ambulance case, for example. The ET Brain receives the message, say, that the ambulance is going to go to point A, pick up a patient, and rush that patient to hospital B.
And then it immediately calculates the right routing, the driving directions, and the ETA to every intermediate intersection, and then tries to coordinate with the traffic lights, the traffic signals. All this systematic integration creates an on-demand green wave for the ambulance, whereas in the past the ambulance just relied on the siren.

Yeah, this is fascinating, and I'd like to get your thoughts, because you bring up something that's important and I'd like to connect the dots. Real time matters. If you're crossing the street, you can't be near real time, because you could get hit by a car. But latency matters too, and the quality of the data has to be good. I was talking to an executive who's laying out his architecture for a smart city, and he said, "I want the data in real time," and the IT department said, "Here it is, it's in real time." He says, "No, that's last year's data." So the data has to be real time and the latency has to be low.

Exactly, I completely agree. The latency has to be low. Unfortunately, in the current IT infrastructure, very often the latency exists and you cannot eliminate it, right? You have to live with it. So the ET Brain accommodates that fact. In fact, we have our own algorithm, designed in a way that it can make a short-term prediction: based on the data collected five minutes ago, it can project what the data would be in the next five minutes, the next ten minutes, and then use that to mitigate, to offset, the latency. We found that to be a good strategy because it's relatively easy to implement and it's fast and efficient.

That's a fascinating conversation. I'd like to get your thoughts on connecting that big data conversation, or data conversation, to this event.
This is a cloud computing event. We at theCUBE and SiliconANGLE and our Wikibon research team go to all the events, but sometimes the big data events are about big data, Hadoop or whatever, and then you have cloud events talking about DevOps and virtual machines. This conference is not just a siloed topic. You have cloud computing, which is the compute, it's the energy, it's the unlimited compute potential, but it's also got a lot of data. You guys are blending it in. Is that by design, and why is that important?

It's by design. Actually, you cannot separate cloud from big data, and you cannot talk about big data without referring to cloud, because once the data is big, you need huge computation power. Where does that come from? Cloud computing. That means the data intelligence, all the value, requires a good technological tool to unleash it. What's the tool? Cloud computing. For example, the first time IBM came up with smart planet, smart city, that was 2005 or 2006. Around that time there was no cloud computing yet, or at most it was at the emerging stage. Then we saw what happened: the smart city gradually became IT infrastructure construction, but not DT, data technology. So they invested billions of dollars at the infrastructure level and they collected so much data, but all the data became a burden to the government: to save and archive the data, or to protect the data from hacking. These days, if you have cloud computing available, you can do real-time analytics to unleash the value at the first moment you receive the data. Then later on you know which data is more valuable and which data is of less value, and you know how much you want to archive.

Our Wikibon research team put out research this past year that said IT is no longer a department. It's everywhere. It's everywhere, it's your DT, data technology. It's a fabric. But one thing that's interesting, going back from 2005 to now, is that it's not only the possibility of unlimited computing; now you're seeing wireless technologies significantly
exploding in a good way. It's really happening. That's also going to be a catalyst for change. What are your thoughts on how wireless connects in? Because you have all these networks, you have to move data around, it has to be addressable, and you have to manage security. That's a heavy load. What do you do? How are you guys doing that?

Okay, very good question. We faced this challenge a couple of years ago. We realized that in China, in the Chinese domestic market, the users are migrating from PC to mobile, and the mobile phone has WiFi and interacts with a lot of APs, access points. So how do we do the tracking and recognize the ID, the identification? All this stuff created a huge headache for us. This time, at this conference, we announced our solution for mobile, for mobile cloud. What does that mean? Essentially, we have a cloud infrastructure and product designed to do real-time integration and data cleansing of the mobile data. By mobile I mean wireless as well. Wireless means even Bluetooth or even IoT; our IoT solution also supports that. So this is an evolving process. In a way, the first solution is probably less than perfect, but gradually, as we expand into more and more application scenarios, we will augment the solution and try to make it more robust.

You guys have had a good opportunity. I certainly met with Karen Liu about Alibaba Cloud's opportunity in North America, the United States, and around the world. Alibaba Group and Alibaba Cloud have had a great opportunity, almost a green field, almost a clean sheet of paper. But you have a very demanding consumer base here in China. They are heavily on mobile, as you pointed out, and they love applications. So the question I want to ask you, and I'd love your thoughts on this: how do you bring that consumerization velocity, the acceleration of the changing landscape of consumer expectations and their experience, to small businesses and enterprises?

Okay,
very good question. Having a large customer base and demanding customers in China helps us to harden our product, harden our solution, and reduce the overall cost through the economy of scale. Once we reach that critical point, our service is inexpensive enough that the small and medium businesses, SMBs, can afford it. In the old days, SMBs wanted to have access to high-performance computing, but they did not have enough budget to afford a supercomputer. These days, because our products, the computation product, the cloud product, the big data product, are efficient enough, the total cost is affordable. And you see that at least 80% of our customers are actually SMBs. So we believe the same practice can be applied to the overseas market.

And bring the best practices of the consumer and the scale of the cloud to the small and medium enterprises, and then they buy as they grow. You don't have to buy a lot up front.

They buy on demand, as they need.

That's the benefit of the cloud. The compute is great, and as you've got greatness with the compute power, it's going to create a renaissance of big data applications. We see that. What is your relationship with Intel and the ecosystem? Because we see you guys have the same playbook as a lot of successful companies in this open source era: you need horsepower, and open source. What is Alibaba's strategy around the ecosystem and the relationship with Intel, and what are you guys going to do with partners?

Yeah, first of all, we're really happy that we have Intel as our partner. Take our most recent big data hackathon, for the medical AI competition. We just closed that competition, that data hackathon. It was a very, very fascinating event. Intel provided a lot of support. All the participants in this data hackathon did their computing leveraging Intel's products, because they do image processing, and then we
provided the overall computing platform. Okay, this is a perfect example of how we collaborate with our partners, our technology partners. Beyond Intel, in terms of the ecosystem, first of all, we are open in building our ecosystem. We need partners. We need partners from a pure technology perspective, and we also need partners from the traditional vertical sectors, because they provide us domain know-how. Once we couple our cloud computing and big data technology with domain know-how, the subject matter expertise, together the marriage will generate huge value.

Okay, that's fantastic. And believe me, open source is going to grow exponentially, and by 2025 we predict that it's going to be an obvious step. From the Linux Foundation you see amazing work; you see the Cloud Native Foundation. I want to get your thoughts on the future generation.

You mean open source?

The future generation is using open source. They're younger. You guys attract younger demographics in your employee base. You have a cloud native developer now emerging. They want to program the infrastructure as code. They don't want to provision servers. They want [unclear]; the brains have to be in the infrastructure. They want to be creative. You're bringing two cultures together, right? And you've got AI, a wonderful trend. Machine learning is doing very well, right? How do you guys train the younger generation? What's your advice to people looking at Alibaba Cloud who want to play with all the good toys, machine learning, you know, AI? They don't necessarily want to configure switches, right?

Yes, very good question. Actually, this is related to our product strategy. Like today, we announced our ET Brain. We are going to release this and share it as a platform to net all the creative minds, the creative brains, okay, people trying to leverage this brain and then do the creative job rather
than worry about the underlying infrastructure, the basic stuff. So this is the part which we want to share with the young generation, to tell them: unleash your creativity, unleash your imagination. Don't worry about the hard coding part. We already built the infrastructure, the backbone, for you. Imagine anything you think possible, and then try to use the ET Brain to explore that. We provide the necessary tools and the building blocks.

And the APIs?

And the APIs as well, yes.

Okay, so I want to get your thoughts on something important to our audience, and that is machine learning, the gateway to AI. What is AI? Software using cloud. People argue that AI hasn't really yet come on the scene, but it's coming. We love AI. Machine learning is where the action is right now, and people want to learn how to get involved with machine learning. So what's your view on the role of machine learning? Because now you have the opportunity for a new kind of software development, with a lot of math involved, right? That's something that you know a lot about. So are there going to be more libraries? What's your vision of how machine learning moves from a bounded use case to more unbounded opportunities? Because if I'm a developer, I want the horizontally scalable resource of the cloud, but I'm going to have domain expertise in a vertical application. So I need to have a little bit of specialism this way, and it's going to be up this way.

Okay, let me put it this way. First of all, for the young generation, or for people who are really interested in AI or want to work on AI, my recommendation is, first of all, you've got to learn some mathematics. Why? Because all of AI and machine learning is about algorithms, and all the computation, and also the optimization, how to speed up the convergence of the algorithms, right? So all this math is important. If you have that math background, then you have the capability to judge, or to see next, which
algorithm or which machine learning software is suitable to solve the vertical problems. Very often the most popular algorithm may not be. So you've got to have the capability to differentiate, to make the right choice. This is the first recommendation. The second recommendation: try to do as many toy examples as possible. Try to get your hands on them. Don't stop at looking at the function specification, "oh, this is the function, the input, the output." You need to get your hands dirty, get your hands on the real problem, so that you can get a feeling for how powerful it is, how bad or how good it is. Once you have this kind of experience, you gradually build up the capability to make the right choice.

This is fascinating, Doctor. Scientists and creatives, UI: we're seeing evolutions in user experiences, more art, right? So culture is important. But for machine learning and algorithms, sometimes you have to have a lot of tools, right? If you have one tool, you shouldn't try to use it for other jobs. So, to bring this up: how should a company architecting its business or its application look at tools? Because on one hand it's the right tool for the right job, but you don't want to use a tool for a job it's not designed for, to your point. What's your advice and philosophy on the kinds of tooling, and on when to engage platforms, the relationship between platforms and tools?

Okay, let me put it this way. This is a decision based on a mixture of different criteria: first, the technology perspective, and second, the business perspective. I would say, if your company's critical competence is technical stuff, then you've got to have your own tool, your own version. If you rely only on some existing tool from other companies, then your whole business is actually dependent on that, and this is the weakest link, the most dangerous link, right? However, very often, developing your own version of the tool
takes forever, and the market wouldn't give you so much time, right? So you need to strike a balance: how much you want to get involved in self-development, in-house development, and how much you want to buy in.

And time as well.

Yes. And another one is that you've got to look at the competitive landscape. If this tool has already existed for many years, with similar products in the market, then it's probably not a good idea to reproduce or reinvent it. Why not buy it? You take that for granted, take that as a fact, and then you build a new fact, right? So this is another one, in terms of the maturity of the tool, and you need to strike a balance. And in the end, in the extreme case, if your business, your company, is doing an extremely innovative, first-of-a-kind business or service, you probably need some differentiator, and that differentiator is probably a new tool.

Final question for you. For the audience in America, in the Valley, what would you like to share, from your own personal perspective, about Alibaba Cloud that they should know about, or that they might not know about and should know?

Okay. I worked for 16 years, and to be frank, I knew nothing about Alibaba until I came back. As an overseas Chinese, I was so ignorant about Alibaba until I came back. So I can guess, more or less, that in the overseas market, US customers probably don't know that much about Alibaba or Alibaba Cloud. So my advice, from my personal experience: first of all, Alibaba is a global company, and Alibaba Cloud is a global company. We are going to go global. It's not only a Chinese company. We are going to serve customers in overseas markets across the entire world: Europe, North America, and Southeast Asia. So we want to go global, first. And second, we are not only doing the cloud; we are doing the blending of the cloud, big data, and vertical solutions. I call
this VIP: V for vertical, I for innovation, P for product. So VIP is our strategy, and the VIP, the innovation, is based upon our cloud product and the big data product.

And data is at the center of it.

Data is the center of this, and we already got our data technology, our data practice, from our own business.

It was just e-commerce, and you solved some hard problems. The ET Brain is a great playground of AI opportunity. You must be super excited.

Yes, yeah.

You having fun?

Yes, a lot of fun. A very rewarding experience. A lot of dreams really come true.

Well, certainly when you come to Silicon Valley, I know you have the San Mateo office.

San Mateo, right.

This is theCUBE's coverage of Alibaba Cloud. I'm John Furrier, with Alibaba Cloud, with Dr. Wanli. Thanks for watching.