Hi, thank you for having me here. Thanks, Jinmeh, Donor, and the organisers. So basically I'm here today not to try to sell the company I'm working for, but I think we're working on some stuff that's pretty cool. To give a little background: I used to work at a research institute in Singapore, I2R, where we did automatic speech recognition. What that means is that I would basically sit around writing papers, building up a publication record, mentoring students, and it can be very far removed from actual applications, from things that actually get built into a product. From there, I had a chance encounter with a friend who was starting a company. Well, he was a vice president at a company in Silicon Valley, and he said, we're building this really cool chip over here, why don't you join us and make something useful for once instead of writing papers all the time. So that's what I did.

So I'm at this company called NovuMind. The whole team is in Santa Clara, about 20 strong, but we take a full-stack approach: people working on the hardware, designing RTL, doing board design, the software stack, applications, AI engineering and models, the whole shebang, basically. Going from something very academic to a startup is a life-or-death situation: every single minute counts. You've got to ship product or you lose the job, basically. That's very exciting to me, and I also feel there's a chance to build something that's useful. So, one reason why I decided to move, and this is going to get a bit technical.
One reason I decided to move is the founder, Dr. Ren, who used to be a distinguished scientist at Baidu. He did a lot of breakthrough work in heterogeneous computing, and he was one of the first people to start using GPUs to train AI models.

So what is NovuMind? This is going to look a bit less like a slide deck in a couple of minutes, but basically we're a full-stack AI company, and at the core of what we're trying to build is an AI chip. You may see a lot of companies nowadays saying they're building an AI chip, but what does that really mean? Look at what comes to mind when you talk about artificial intelligence, at what you can do today: many things are simply not possible unless you have the compute, and the data, to train the models to do them. We believe that by creating the right technology, we can enable people to bring artificial intelligence models down from server platforms with a lot of compute to something that's more embedded in nature. Maybe I'll just skip through this one.

I think the key thing is this: if you look at what people claim to be AI chips today, there is a whole bunch of different companies building neural network accelerators. On one end, you have things that are very low power, but with very little power you can't get a lot of performance, so you can't do a lot of compute, and the amount of compute processing power you have controls what an application can do. On the other end, you have higher power profiles, all the way up to a GPU card from NVIDIA: 250 watts can do a lot of processing, but it's very costly.
It's very costly in terms of the power budget. Imagine you have a drone flying around: you can't stick a GPU card in it and start doing real-time computer vision analysis. That's why what we're trying to do here, I believe, is find a sweet spot where we can provide a lot of compute performance within a low power budget, and that enables more applications to come about. So that's why we're building the company with the chip at the core of it, and from there we hope it can enable more applications.

It's been an exciting journey, because after joining the company about a year and a half ago, starting from almost nothing, we got to the point where we had a first prototype on FPGA, followed by the ASIC, the application-specific IC, and got the chip back from the foundry. Every day there's a lot of excitement: boards break, there's lots of firefighting going on, and every single moment I feel we're bringing this chip towards product, towards the market. We're almost at the point where we're ready to show some cool applications and start talking to people about what more exciting applications we can enable.

So this is where we're trying to place ourselves: low power, but GPU-class performance. If you look at what an NVIDIA Xavier can do, say, we're looking at a lower power profile but an equivalent amount of performance. The performance numbers here are tera-ops, trillions of operations per second, and there are a lot of tricks people use to actually get that number. If you run a GPU, for instance, it's usually a 16-bit or 32-bit floating-point calculation; to get a low power profile, you might do things like quantization, or certain kinds of custom architectures, and that's how you get that performance. So what is it we do that makes this possible?
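The talk mentions quantization only in passing and doesn't specify the chip's actual scheme, but the basic trick of trading numeric precision for power and silicon can be sketched as symmetric int8 post-training quantization. This is a generic illustration, not the chip's real method:

```python
import numpy as np

def quantize_int8(weights):
    """Symmetric post-training quantization of a float tensor to int8.

    Returns the int8 tensor and the scale needed to recover
    approximate float values (w is roughly q * scale).
    """
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

# Example: quantize random "weights" and measure the reconstruction error.
w = np.random.randn(64, 64).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = q.astype(np.float32) * scale
print(np.abs(w - w_hat).max())  # worst-case error is at most scale / 2
```

The point is that int8 multiplies are far cheaper in energy and area than 32-bit floating-point ones, which is one of the standard ways accelerators hit high tera-ops numbers at low wattage.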
So I think this is probably the key slide for explaining what we're trying to do here. If you look at the evolution of computing, you start with traditional CPUs, general-purpose computers, and then you get more and more specialized, going from the CPU to DSP chips to GPUs, with an increasing amount of parallelism in your compute. GPUs have lots and lots of cores; a graphics processing unit has a lot of hardware to run a lot of stuff at once. The hardware becomes more and more specialized, so at the expense of not being able to run any general-purpose algorithm, you get very specialized hardware that runs certain kinds of applications much faster.

What we claim, and we actually filed and got a patent on this, is that with the NovuTensor chip, the ASIC we designed, we go a step further: with so much specialization, you can get an equivalent amount of compute for much less power. What kind of specialization are we talking about? This is a neural network accelerator, really. If you look at what people are doing in artificial intelligence, it's basically deep learning, a lot of neural networks, and a lot of the topologies are very similar. In computer vision, for instance, everybody is doing a convolutional network. You really don't need a lot of operations: you need a 3x3 convolution, a 1x1 convolution, roughly a ResNet kind of architecture. If you build something specific to that, target one thing and do that one thing really, really well, then that's all we need.

So, starting from last year, we began the evolution of the hardware. There's a lot of iteration involved in designing the RTL, and we started with FPGA cards: we basically built an entire neural network processing core into a Virtex. Even then, last year we demoed some applications at CES, and that was exciting for us. Then near the end of last year we got our first
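The operator set described above (3x3 convolution, 1x1 convolution, element-wise sum, the building blocks of ResNet) is small enough to write down directly. Here is a naive NumPy sketch of a ResNet-style block built from exactly those three ops; it illustrates the math only, not how an accelerator would implement it:

```python
import numpy as np

def conv2d(x, w, pad=0):
    """Naive 2D convolution: x is (C_in, H, W), w is (C_out, C_in, k, k)."""
    c_out, c_in, k, _ = w.shape
    if pad:
        x = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    _, h, wd = x.shape
    out = np.zeros((c_out, h - k + 1, wd - k + 1), dtype=x.dtype)
    for o in range(c_out):
        for i in range(h - k + 1):
            for j in range(wd - k + 1):
                out[o, i, j] = np.sum(x[:, i:i + k, j:j + k] * w[o])
    return out

def residual_block(x, w3, w1):
    """ResNet-style block from the three ops mentioned in the talk:
    3x3 conv -> ReLU -> 1x1 conv, then an element-wise sum with the input."""
    y = np.maximum(conv2d(x, w3, pad=1), 0)  # 3x3 convolution + ReLU
    y = conv2d(y, w1)                        # 1x1 convolution back to C_in channels
    return x + y                             # element-wise (residual) sum
```

Because these few inner loops dominate the entire workload, hardware that does only this pattern can drop everything a general-purpose core carries around, which is the specialization argument in the talk.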
version. With the same logic, we worked with one of the foundries to build the ASIC, the application-specific IC, and the chip came back, and today we have this guy. It's hiding under the fan here, with memory banks here and whatnot, and it plugs through a PCIe interface into the main motherboard. So this is one device; I brought one back for a client, so I don't have it with me today, it's at the client's side. With the core chip, it's able to run pretty much most computer vision types of neural networks.

How does this work? With the device you just saw, there's a host CPU with host memory, and that general-purpose computer takes care of the control logic for the application. It talks to our device, which is a PCIe board with an on-board DMA controller, our core, which is the part that runs the neural network processing, and on-device memory to hold all the weights of your neural network.

We also developed an application workflow, and this is really my main job nowadays: develop the application workflow and build a runtime, so that it's actually simple to use this stuff, because a new piece of hardware is useless if people can't make full use of it. In artificial intelligence, a lot of machine learning work has the same shape: you have a data set and a specific problem to solve. Say you want to recognize cats: you collect lots of pictures of cats, gather a training database, and train the model. From there, we provide a toolchain to compile everything to run on the hardware, and after that you can run a benchmark. So we have this entire workflow that we're setting up as well.

Some example use cases we've had so far. As a full-stack company, we don't just do the chip; we also do everything all the way up, AI engineering and model training as
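The workflow just described (gather a dataset, train, compile for the chip, benchmark) can be sketched as a pipeline of stages. Every function name below is a hypothetical stand-in for illustration, not the actual toolchain:

```python
# Illustrative end-to-end workflow: dataset -> train -> compile -> benchmark.
# All stage names here are made up; only the shape of the pipeline is real.

def gather_dataset(label):
    # Stand-in for collecting labelled images (e.g. lots of cat pictures).
    return [f"{label}_{i}.jpg" for i in range(1000)]

def train(dataset):
    # Stand-in for training; returns a "model" artifact.
    return {"n_samples": len(dataset), "topology": "resnet-like"}

def compile_for_chip(model):
    # Stand-in for the compiler that lowers a trained model to the accelerator.
    return {"binary": f"{model['topology']}.bin"}

def benchmark(compiled):
    # Stand-in for running the compiled binary on the device and timing it.
    return {"ran": compiled["binary"], "fps": 60}

result = benchmark(compile_for_chip(train(gather_dataset("cat"))))
print(result["ran"])  # resnet-like.bin
```

The design point is that the user only touches the first two stages; the compiler and runtime hide the hardware details.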
well. Sometime last year, we started off by building this smart endoscopy system. There's a video clip here, but I'm going to skip it. Basically, what it does: there's a busy group of doctors at a hospital in West China, and you take an endoscope and stick it into somebody's colon. It sounds a bit gross, but you get a live feed from inside, running at 30 frames per second, and the system classifies in real time whether the patient you're looking at has a diseased colon or not. All of this runs in real time, and that's possible because of the hardware acceleration.

OK, I'm going to skip this. Then CES last year, that was January, a really hectic time in Las Vegas, building real-time classification; this is basically the ImageNet visual classification task. We benchmarked it with the first version of our hardware: we were getting almost half the performance of a roughly 120-watt desktop GPU. This second column here is today's ASIC version, which is a bit more powerful.

So let's talk about some applications; mostly we've been doing computer vision. Here's a clip of a demo: this is real-time object detection running at about 60 frames per second through our chip. This is basically a YOLO model, it's open source, and we just compile everything and put it on our chip, which runs it. Even though we largely serve only certain kinds of architectures, mostly CNNs with 3x3 convolutions and element-wise sums, topologies like ResNet- and VGG-type networks, we're also looking into using these topologies to support many other applications, because the topology is one thing: as long as you have the data, you can train it. So we can probably do pretty much everything else you can do in AI: as long as you can write the topology and have the data, you can train it and build the system. It's just an
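One detail the talk glosses over: detection models like YOLO emit many overlapping candidate boxes per frame, so real-time pipelines finish with non-maximum suppression as a post-processing step. A minimal NumPy version, assuming boxes in [x1, y1, x2, y2] form:

```python
import numpy as np

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression.

    boxes: (N, 4) array of [x1, y1, x2, y2]; scores: (N,).
    Returns indices of the boxes to keep, highest score first.
    """
    order = scores.argsort()[::-1]  # best-scoring box first
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        # Intersection of box i with all remaining boxes
        x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_r - inter)
        # Drop boxes that overlap the kept box too much
        order = order[1:][iou <= iou_thresh]
    return keep
```

In a setup like the one described, the convolutions run on the accelerator while a light step like this typically runs on the host CPU.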
interesting one, so let me move to the next one. OK, I don't have a live demo of this, unfortunately, but super-resolution is another interesting use case we've been playing with. If you have a lot of low-res images, you can train a neural network that takes low-resolution images, tries to fill in the fine structure, and generates a high-resolution image. How does this work? Take lots of, say, 4K content, downsample it, and train a neural network where you feed in the low-res downsampled data and try to generate the high-resolution original. If you train on lots and lots of pictures of cats, the network is going to learn what cats look like, and if you give it a very blurry cat, it will draw the whiskers in for you, basically. That's the general idea.

We're seeing potential applications here: say you have a very big, nice, expensive 8K TV, but there's no 8K content. What are you going to do? Take whatever content you have, from YouTube or elsewhere, feed it through a neural network, and it generates that content for you, all done on the fly, live, using the hardware.

This ties into the vision of the founder, the vision of the startup: having a chip with the compute capability to enable more interesting applications. We want to enable not just IoT but what we call intelligent IoT, so that more expensive and interesting computer-vision-type applications can all move from the cloud to the edge. For example, look at speech recognition today: a lot of it runs off the cloud, because you don't really have the kind of compute in your phone to run everything locally for things like large-vocabulary speech recognition or real-time object detection. And
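The training-data recipe described above (downsample high-resolution content to make input/target pairs) is simple to sketch. A minimal NumPy version using box-filter downsampling; real pipelines usually use bicubic resampling and random patch cropping, but the pairing idea is the same:

```python
import numpy as np

def downsample(img, factor=2):
    """Box-filter downsampling: average each factor x factor block.

    img: (H, W, C) float array with H and W divisible by factor.
    """
    h, w, c = img.shape
    return img.reshape(h // factor, factor, w // factor, factor, c).mean(axis=(1, 3))

def make_training_pairs(frames, factor=2):
    """Turn high-res frames into (low-res input, high-res target) pairs,
    the way super-resolution training data is made from e.g. 4K content."""
    return [(downsample(f, factor), f) for f in frames]
```

The network is then trained to map the first element of each pair back to the second, which is what lets it later invent plausible fine detail for genuinely low-res input.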
we're seeing that by having a platform with sufficient compute to run the models you really want, you can do all these things on your drones, or on devices with a small power budget. So that's what we're aiming for, and hopefully we'll have a dev kit out for people soon so you can play with it. That's what I have for you. Questions? OK, Q&A is later as well; I think most of us will hang around, so let's get through all the talks, then hang out and talk.