Hello guys, I'm Ken from AI Singapore. Today is FOSS day number 4, and I'm going to talk about RPA. I developed an open source RPA tool called TagUI. I was previously at DBS, Hewlett Packard, ADP, and A*STAR, so I have moved around a bit. While I was at DBS, I worked in a test automation job. What I did was write automated test cases, so that whenever there is a release or an upgrade, instead of manual testers sitting in rows to test the application, making thousands of clicks, my programs do that automatically. I was there for one and a half years, then I decided to leave the company, go to Eastern Europe with my wife, and develop an open source RPA tool. So that's my story. After I came back, I joined AI Singapore as an engineer to continue developing the open source RPA tool by adding ML and AI capabilities, while keeping it open source and free.
I'll start with a demo. What's RPA? RPA means robotic process automation. If I summarize it in one line, it tries to replicate human behavior and interaction around user interfaces, basically the front end of applications. So that means mouse clicks, key presses and so on.
Some time ago, my wife wanted to change to a new phone, and she wanted to pick the best number from M1. On the M1 website, you can select numbers and see the list of numbers available. So I used the tool that I made to run an automation that gets a nicely formatted Excel spreadsheet of the available numbers from M1, along with the price of each number if you want to purchase it. Let me show you an example of that. I start the automation by clicking on the icon, and now it goes to the M1 website. Let's say you want to select a number ending with 8.
You notice that over here it has selected 8, and now it automatically clicks search. In the back end, it is grabbing all the available numbers and the price for each number. You can execute this tool from the command line, from an API, or from a web service; it has a built-in web server already. Right now I'm running in verbose mode. You can also run in quiet mode, where it only outputs the results. This is just the demo; in the actual case, it ran through a few thousand records over six hours.
After running, you will see that it opens Outlook and automates the UI to send an email with the spreadsheet to myself, for example. All this typing is done by the automation tool, using visual recognition. It looks for the To box and the subject, types things in, and attaches the file; all these actions are automated. This is what the industry calls RPA. I have sent the mail out already, but I think NUS mail will take some time to send it.
While it's on its way to my mailbox, I'll show you an example of RPA from a commercial perspective. UiPath recently got Series B funding of over $100 million, so their valuation is now more than a billion. It's a unicorn. They develop commercial RPA software. You can see their version of RPA is somewhat similar to what I'm doing: replicating the user interaction on the UI, on the front end. There isn't any back-end integration; it's purely on the front end. In this case, they are automating the entry of invoices and so on. You see all these mouse clicks, key presses and keystrokes. That is automation.
Let me go back to my mailbox. I should have received the mail that I sent out just now, with the lucky M1 numbers attached. So I open the file.
I can see the numbers that have been extracted from the website. This is an example of a simple personal use case. Businesses can use RPA for more repetitive, mundane work, such as handling invoices, CRM entries, and data entry. All these processes can be automated.
Incidentally, one of our recent users is Yu Chen Huang, from the NUS Institute of Systems Science. In fact, the other day when I came home, I saw something like 10 emails from him with support requests and suggestions. So last night I was going through them, making new commits for upgrades and so on. I finished at 7:30 AM and caught a few hours of sleep, and here you can see he has sent many more replies since. I think he's one of the earliest adopters of this tool, and it's really great to be able to help him improve the tool so that it does what he needs. He's also pairing up with the National Library to implement this tool in some of their processes right now. There's also an accounting firm that I'm working with to implement RPA in their work processes, but that's under NDA, so I cannot say the company name or what we're doing.
If you Google TagUI, you can go to the GitHub project page. So this is the project: 2,000-over stars, maybe in the top 0.01% of GitHub projects. This is the TagUI repository. The name is "tag" as in tagging the user interface. And you see the logo. I'm a fan of open source. The logo is meant to represent two things. First, it looks like a robot with two eyes. Second, and I'm not sure whether it shows up on the screen, there is a letter L here and a letter T here. Why L and T? Because I'm a fan of Linus Torvalds, so I put that into the logo as an Easter egg. And you can see that on the back of my screen there are only three stickers: GitHub, Tux the Linux penguin, and my organization.
So this tool can do things like take human languages. There are 20-over human languages you can write your automation script in: Malay, Chinese, Hindi, some Eastern European languages, some other European languages, and so on. It converts the script into automation code. There's visual automation and character recognition. There's a Chrome extension for recording your actions, and integration with R and Python. It's also easy to integrate with API calls. Let me show you a little bit more.
This is where the tool started. This was when I joined DBS. That's me three years ago; I was about 25 kg heavier. Ever since I joined DBS I lost a lot of weight, working too hard. That was my first contact with UI-based automation: automating test cases and user interaction on the front end. While I was there, I thought there might be more value in doing this UI automation on production systems, because over there I was only allowed to touch staging and development systems, never production. I think it's more exciting to do it on production. I was about to become the team lead for a new agile team, but I knew I had to find something more exciting to do, so I decided to pack my bags and go to Eastern Europe with my wife for a year.
This is Serbia; I was there most of the time. This is the Nikola Tesla Museum. Although Tesla made his name in the US, he was actually a Serbian from Eastern Europe. Slavic-style home food. Budapest in Hungary, attending some of the meetups for machine learning and AI. I was in Chiang Mai for a while because it got too cold at some point and I couldn't take it. We spent a while in Chiang Mai and then flew back.
Most of my time was spent purely coding. From the first line of code in December 2016, it was open source right from the start. I find that a great distribution channel, so that I can get feedback and keep iterating.
There have been 13 releases so far. The code has been used by thousands of users. Because it's open source, I can't really track the downloads, but for npm it's almost 20,000 now, and existing installations are about 1,000-plus based on peripheral data. I don't really track, so that's an estimate.
This was when I was coding. I code on a Mac using vi, and then port it over to Windows. It's cross-platform, so you can use it on Windows, macOS and Linux, which is not usual for commercial RPA tools; they have focused solely on Windows. This was at a co-working space in Chiang Mai. It was 4 AM. I'm pretty much a night person, so I just go through the night, go home in the morning, and sleep. This was in Serbia, visiting a very beautiful city. That's my wife.
Then I got back, and that's when my story with AI Singapore started. I spent a few more months finishing up the code until it was production ready. Then I met the director of AI Singapore, Mr. Lawrence. I was telling him that I was going to go back to work soon, because doing open source I had no income for more than a year, so my savings were shrinking all the time. He said, "I'm forming a new team, would you like to join me?" So that informal meetup became an informal interview, and that's how I joined AI Singapore.
You can find out more about AI Singapore from our website. It's a government-funded organization that tries to make it easier for organizations in Singapore, in both the public and private sectors, to enjoy the benefits of machine learning and AI. AI and machine learning usually have a very high entry barrier in terms of hardware, talent, and skill set, so we want to make it easier for people to get started. There are a lot of initiatives under this program, grouped into three different thrusts.
I belong to the one under industry innovation, which deals more closely with the private sector. As part of my job there, my mandate is to continue developing this tool and keep it open source and free, while adding machine learning features. So I'm the one conducting workshops, doing training, doing marketing, and doing development and support for user requests.
Let me go straight to what RPA is, with a bit of background. Robotic process automation is automation on the UI layer. It's purely software automation; there are no physical robots. That said, I have purchased, on my own account, a robotic arm and a camera, because I would like to integrate this type of automation with the physical world. That's more exciting to me, so I'm doing it in my own time right now. But robotic process automation in reality is not a real physical robot. In a deployment, you would probably see a row of laptops running VMware or whatever, with the automation running in the background. This is what I meant: mouse clicks and keyboard entries.
Blue Prism is one of the earliest; it started in 2001. There are two unicorns right now, Blue Prism and UiPath, both robotic process automation software providers. The market size is 0.5 billion. Right now it's going to be 0.5 billion in a couple of years' time; sorry, four years' time.
Opportunities and challenges. People ask, why now, when RPA has been around for so many years? The reason is its close association with AI and machine learning. That creates the potential for certain tasks to be automated, especially tasks that are high in volume and repetitive in nature. That is why adoption right now is in banks and insurance. In Singapore's context, for example, DBS, Singtel, JPMorgan, NOL, Service School: those are the companies using it right now.
A large part of the reason why it's all large enterprises is the entry price of RPA. A typical installation will cost more than 100,000 per year. So unless you are a large enterprise, it's hard; an SME can't really come up with that kind of budget for this type of automation. So I thought it would be nice to have an open source alternative. I started developing the code in the hope that SMEs who want to enjoy the benefits of this emerging type of automation, but don't have the budget for expensive software and expensive consulting firms, at least have an alternative in the open source world.
These are the commercial vendors. By the way, if you search for TagUI and go to the repository, you can find this slide deck under Prezi Slides, so you don't have to take notes. You can just Google TagUI, and from the home page you can see Prezi Slides, which is what you're seeing here. You will also see an RPA workshop, which is a two-hour workshop I conducted last week at my office. It's a run-through, in an hour, of how to use the tool and what it can be used for. If it's guided, great. If not, you can run through it by yourself. It should be pretty straightforward to understand, especially if you have a developer background.
Okay, let me close it for now. I will not run through the video demo here, but just to let you know (where's my Chrome?) there's a Chrome extension for it, which you can download from the Chrome Web Store. It lets you record your actions on websites and helps you find web element identifiers and so on, which makes it easier to write an automation script. You start recording, you stop, and what you get is an executable script that can be run directly with the software. Okay, let me close this, and this. I'll skip forward. These are some other demos, but I'll do them live to be more interesting.
So I'll skip through these as well. These are the key features of TagUI. Here is the architecture diagram, for those of you who are technically inclined. It's built on a suite of open source tools that are mature, established, and mostly well supported. I built the integration to Chrome directly, natively through CDP, so that I don't have to go through another library. That gives me the shortest, most direct path from the automation tool to Chrome, so I have better control. Otherwise, if there's a bug midstream, I can't fix it myself because that code belongs to its contributors, and what happens if they become the bottleneck for my tool? So as far as I can, I want to build direct integrations with the software and libraries I'm working with. It's the same for R and Python: there's native integration with both, and SikuliX for OCR and visual automation.
Let me just go back. These are some of the examples, so I will go straight into doing demos. It's a CLI tool, but you can run it from a desktop icon, through an API call, or through a scheduler: Windows Task Scheduler, or on Linux and Mac you can run it from crontab. So you can run it from various channels. It also has a built-in web service server, so you can just enable it right away on Linux and macOS.
Okay, let's go to the samples. I'll try the Yahoo sample, and I'll run it using Chrome. By default it runs headlessly, so you won't be able to see what's going on except for the text output. Let me run it in the foreground so you can see. In this example, I'm automating the process of going to Yahoo, searching for GitHub, taking some screenshots, then going to DuckDuckGo and typing some stuff. All these actions are automated. It's capturing a screenshot right now. In the back end, it's doing all this work, and there are log files to check exactly what's going on, what the run time was, and how long it took.
Now it's going to DuckDuckGo to type some other stuff over there and capture a screenshot. All these samples are built into the repository. If you go to the flow samples, you can see one, two, three, four, five, six and so on, each with a note on what the sample does and what it teaches you about the tool. Together they cover pretty much all the different functionalities of the tool.
I'll show you the GitHub example. Sorry, I should run it in the foreground so you can see what's going on. In this example, the tool goes to GitHub and downloads a repository to my hard disk. It goes to GitHub, and all these actions of clicking download and saving the file are automated. Then unzipping the file is automated too. At the end, it makes an API call to show how easy that is. It's a one-liner: api blah blah blah, then you get a response and so on. Of course, you can make more complex API calls; that is supported as well.
Let me do another one. For those of you in website testing, you can do things like test automation. Sorry, I should run it in Chrome so you can see it in the foreground. Now it's going to Yahoo and doing some validation of search and results. When I run it in test mode, it automatically checks all the test cases you want to test, and reports the results and whether each one passes or fails. There's a negative test case at the end, as you'll see. Let me open that file. What I want to show you is that if you run in test mode, at the end you get an XML file that you can pass on to your CI tools. Let me drag it over here. You can sync this XML file with tools such as Jenkins, CircleCI and so on for your test automation needs.
There are some other examples. I want to show you the integration with Python, because it's hot. You can write Python code directly in the automation script itself.
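The test-mode XML report mentioned a little earlier is only described as a file that CI tools can pick up. As a rough illustration of what such a report commonly looks like, here is a minimal sketch that writes a JUnit-style XML document with Python's standard library. The JUnit-style layout is an assumption (the talk does not name the format), and the `build_report` helper and the test names are hypothetical; this is not TagUI's actual output.

```python
import xml.etree.ElementTree as ET


def build_report(suite_name, results):
    """Build a JUnit-style XML string (assumed format, not TagUI's own).

    results is a list of (test_name, passed, message) tuples.
    """
    failures = sum(1 for _, passed, _ in results if not passed)
    suite = ET.Element(
        "testsuite",
        name=suite_name,
        tests=str(len(results)),
        failures=str(failures),
    )
    for test_name, passed, message in results:
        case = ET.SubElement(suite, "testcase", name=test_name)
        if not passed:
            # A failed case carries a <failure> child describing what went wrong.
            failure = ET.SubElement(case, "failure", message=message)
            failure.text = message
    return ET.tostring(suite, encoding="unicode")


# Hypothetical results, including one negative case like the one in the demo.
report = build_report(
    "yahoo_search",
    [
        ("search box is present", True, ""),
        ("results page shows github", True, ""),
        ("page title equals wrong text", False, "title did not match"),
    ],
)
print(report)
```

A CI server that understands this kind of report can then show pass and fail counts per run without any knowledge of the tool that produced it.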
So this is Python code. You can call it by saying py, followed by the Python code you want to run, and get the result. Or you can do something like py begin, then a huge chunk of Python code with your machine learning library, and then show the results. So here I run the Python code.
An example use case: you get some inputs from a website or application, you train a machine learning model, for example for loan applications and approval, and you get a decision. Based on the model's decision, the flow then takes other actions downstream in some other application you want to automate. In a sense, you can build an end-to-end process, from inputs at the beginning, through some decision making, to outputs. Let me proceed.
Another example is the R demo, something like this. There is integration with R for those of you who are data scientists. In the back end, I wrote an R interface to handle the syncing and communication with TagUI. It's the same for Python: when running Python, I'm actually running a background process for TagUI. And the same for SikuliX for visual automation.
The last demo I want to show you is, let's say, Chinese. You can write in 20-over languages, and it can be as simple as changing the default language you want to use. How I did the translation is by using Google Translate to capture the words and so on. After I built the tool, I used the tool on itself to generate the 20-over languages by going to Google Translate and getting the results back. But I only manually checked Chinese and English; I'm not familiar with the rest. So let me demo the same Yahoo flow, but in Chinese. It runs the same first Yahoo example, but now in Chinese, and you see all the Chinese characters here. Let me quit it and change it to some other language.
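To make the multi-language idea concrete, here is a minimal sketch of how a step written in one human language could be moved to another by going through English as a reference language. The keyword table, the language names, and the `convert` helper are all hypothetical and only illustrate the approach; they are not TagUI's actual language files or code.

```python
# Hypothetical keyword table: English reference keyword -> local keyword.
KEYWORDS = {
    "chinese": {"click": "点击", "type": "输入"},
    "german": {"click": "klicken", "type": "eingeben"},
}


def to_english(step, language):
    """Replace a leading local keyword with its English reference keyword."""
    for english, local in KEYWORDS[language].items():
        if step.startswith(local):
            return english + step[len(local):]
    return step  # no known keyword, leave the step unchanged


def from_english(step, language):
    """Replace a leading English keyword with the local keyword."""
    for english, local in KEYWORDS[language].items():
        if step == english or step.startswith(english + " "):
            return local + step[len(english):]
    return step


def convert(step, source, target):
    """Convert a step between two languages via the English reference."""
    return from_english(to_english(step, source), target)


print(convert("点击 search-button", "chinese", "german"))
```

Only the keyword at the start of each step changes; element identifiers and other arguments pass through untouched, which is why one script can be maintained in whichever language its author prefers.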
Let's say I want to run in, I don't know, maybe German. For a script that's written in Chinese, I can easily convert it into a third language: German, Vietnamese, Serbian, Polish and so on. So it's very straightforward to use and easy to maintain for users in their own natural languages. And this one is German.
I have a couple of minutes left, so I can take some questions. Otherwise, later on backstage, we can have a one-to-one conversation. Any questions?
I saw the video before. It seems that you built it with PHP?
Yes. In fact, to make it work I had to learn PHP. The translation engine is in PHP, but the automation code is in JavaScript. I also had to write shell and batch scripts, and Python for visual automation, and there's the Python and R integration.
Was there a reason for using PHP?
It's easy and straightforward. I can get what I need directly without adding a lot of dependencies. I usually try to minimize dependencies.
I have a question. I have done a lot of similar work using PhantomJS and then shifted to Nightmare. How is this different from Nightmare?
Okay, the underlying framework here is CasperJS and PhantomJS. Nightmare is based on Electron. The newer one would be Puppeteer, for more up-to-date releases. So what's the difference? I guess those are more for test automation in general. For this tool, if you go to the home page and look under the cheat sheet, you can see all these English-looking steps. The goal is to make it as easy as possible to convert an idea, a flow that you describe in simple English, Chinese, or whatever your language is, into automation code. Say you have these few lines of natural-language steps; what you get at the end is hundreds of lines of JavaScript code that runs. So it's basically a translator. That's the goal.
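As a toy illustration of that translator idea, the sketch below turns a few English-looking steps into JavaScript statements. The step grammar is simplified, and the generated function names (`open_url`, `click_element`, `type_text`) are placeholders invented for this example; they are not real CasperJS or TagUI calls, and the real generated code is far longer.

```python
import json


def translate_step(step):
    """Translate one English-looking step into a JavaScript statement.

    The emitted function names are hypothetical placeholders.
    """
    words = step.split()
    if step.startswith(("http://", "https://")):
        # A bare URL means "open this page".
        return f"open_url({json.dumps(step)});"
    if words[0] == "click" and len(words) == 2:
        return f"click_element({json.dumps(words[1])});"
    if words[0] == "type" and len(words) >= 4 and words[2] == "as":
        text = " ".join(words[3:])
        return f"type_text({json.dumps(words[1])}, {json.dumps(text)});"
    raise ValueError(f"unrecognized step: {step!r}")


def translate_flow(flow):
    """Translate a multi-line flow, skipping blank lines."""
    steps = [line.strip() for line in flow.splitlines() if line.strip()]
    return "\n".join(translate_step(step) for step in steps)


flow = """
https://example.com
type search-box as github
click search-button
"""
print(translate_flow(flow))
```

Using `json.dumps` for the arguments keeps quotes and special characters in the step safely escaped inside the generated JavaScript string literals.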
With those other tools, by contrast, you still need to do your own programming and coding.
Also, I have a question about AI Singapore. Is it supporting people from all over the world to come and build with it?
Right now, we are mostly supporting the local Singapore ecosystem, to build up capabilities locally first.
So it's a sort of English-like language, and you can write in your own language. I wonder if it could be useful for people who are not able to use a keyboard or a mouse so easily. Have you tried putting NLP in, or a sort of microphone in front of TagUI?
Okay, yeah, that's a good question. In fact, I designed it so that you can actually run TagUI with a conversational type of syntax. There's not a lot of uptake on this right now, but I imagine it could be used by people who want voice control, with Alexa or Google Home and so on. It's possible. I haven't come across an actual use case, somebody coming up to me saying, "I want to develop this." But yes, there is already a framework that supports this type of NLP-based execution. Any other questions?
The automation in the demo is for both web and desktop, right?
Yes.
So what about mobile devices?
For mobile devices, you can use the visual automation. Say you run an emulator on your desktop; then you can interact with everything through the emulator. It can also automate desktop apps, for example Outlook, through visual recognition. Instead of using an element identifier, you give it an image; it looks for the best match regardless of the screen resolution, and then clicks on it or reads text from it.
So behind the scenes, for mobile devices or these applications, how does it get the elements on the UI?
Oh, okay. For mobile devices, it will be through visual recognition.
Based on an image of a text box or a button, it tries to find the best fit for that element and interacts with it.
Is it matching on text or on image?
Image, correct. And to get text, the image is captured and OCR is used to convert it into text. It's all built in.
What about multiple languages? My app comes in six different languages. Must I write a different script for each one?
Oh, okay, I get what you mean. Actually, you can use a common reference language. It's like a DSL, a domain-specific language, which you write in plain English. These are very standard UI interaction descriptions, like click, tap, save, load, and download. So it covers all these languages.
Okay, sure. That would be very useful for automated tests for our app. How does it compare with some other platforms we tried, like BlueStacks? You know BlueStacks, right?
No, I don't know BlueStacks. But if you are doing test automation, I recommend taking a look at Cypress.
Okay.
I think they are going to be a leader in this field.
Okay, I will take a look.
No more questions? Then we thank you so much, Ken. Appreciate it. Next up, yeah.