Hello everyone. Thank you for coming. We are very excited to share our research at DEF CON. Our presentation today is about the security of smart speakers. We found and exploited some vulnerabilities to attack smart speakers such as the Xiaomi AI speaker and the Amazon Echo.
First, please let me introduce my teammates and myself. Li Yuxiang is a security researcher at Tencent Blade Team. He has found several vulnerabilities in Android, and he was a speaker at HITB. He worked with us to complete all the work of the whole study. Unfortunately, he could not be here today. And this is my teammate, Qian Wenxiang. He is also a security researcher at Tencent Blade Team. He is on the top 100 white hat list of MSRC, and he wrote a book called "White Hat to Talk About Web Browser Security". My name is Wu Huiyu, and you can call me Nicky. I'm a security researcher at Tencent Blade Team. I am also a bug hunter, a winner of GeekPwn, and a speaker at HITB and POC. Here we would like to thank some people, especially Nicky and GMAXP. They gave us a lot of support.
Next, let me introduce Tencent Blade Team. Our team comes from the Tencent Security Platform Department. We are now focusing on security research on AI, IoT devices, and mobile devices. In the past two years, we have found more than 17 security vulnerabilities for many companies, such as Google and Apple. You can contact us at blade.tencent.com.
Before we start our presentation, let's take a brief look at the outline. First, we will give an introduction to smart speakers. Then we will talk about the attack surface of smart speakers. Then we will share the details of the remote attack on the Xiaomi AI speaker and how we broke into the Amazon Echo. At last, we will summarize our research.
Smart speakers have been the most popular smart home devices in the past two years. Amazon, Google, Apple and some Chinese vendors have released their own smart speaker products.
The Amazon Echo family is the most popular smart speaker line on the market. It has more than 30 million users, and many people are very interested in its security, so we chose it as a research target. Another research target is the Xiaomi AI speaker. It is very popular in China because it can control many smart devices in the Xiaomi ecosystem.
Next, let's briefly analyze the attack surface of a smart speaker. The smart speaker architecture mainly includes three parts. The first part is the smart speaker device. The second is the cloud server. And the last is the mobile phone application. The attack surface of this architecture contains many parts. The first part is the hardware interfaces and the network of the smart speaker device. The second part is the security of the mobile phone application. The third part is the security of the cloud server. And the last part is the security of the communication protocols between them.
Then I will talk about how we exploited multiple vulnerabilities to remotely attack the Xiaomi AI speaker. This includes five parts. The Xiaomi AI speaker has a built-in system based on OpenWrt. Its SSH service is disabled. The firmware of the speaker can be downloaded easily with an HTTP request, but we cannot replace the firmware through a man-in-the-middle attack because of its security mechanism. In addition, the speaker opens some network ports, and port 54321 is the communication port of the MIIO protocol. The MIIO protocol is used to configure and control smart home devices made by Xiaomi. It is an encrypted binary protocol. After analyzing the firmware of the Xiaomi AI speaker, we found that the data included in a MIIO packet is processed into ubus commands and executed by the speaker with root permission. So, if we can simulate a Xiaomi smart device communicating with a Xiaomi AI speaker, we can use this protocol to execute a command as root, which is a root command execution vulnerability on the LAN.
In order to exploit this vulnerability, we first need to establish a connection with the speaker and pass the authentication of the MIIO protocol. This requires us to obtain an AES key for the connection. We call this key the token. The token is a 16-byte string. To get this token, we need to use the Mi Home application to rebind the speaker to the attacker's account, then extract the token from the application's database. We found an unauthorized access vulnerability in a web interface, so we only need the device ID to unbind any speaker device. And we found that the device ID can be obtained by sniffing on the LAN.
After getting the token, we can use some tools to connect to the Xiaomi AI speaker over the MIIO protocol. Then we can send some ubus commands to the speaker. The first command modifies the Dropbear configuration file. The second command turns off Dropbear password authentication. Finally, we need to start Dropbear. After executing these ubus commands, we can successfully access SSH on the speaker and log in without a password. That means we have obtained root permission on the speaker.
In addition, we found another vulnerability in a program called message agent, which is used for MQTT communication between the speaker and the cloud servers. When a user controls the speaker's functions from a mobile phone, the application communicates with the cloud server first. Then the cloud server sends the device ID and the command to the message agent, and the message agent executes the command. We found two special web interfaces. One is remote ubus, which can call the ubus service remotely. The other is ubus remote OTA. The commands sent by calling this interface are executed using the system function, so this is a remote system command execution vulnerability. Here are the two pieces of vulnerable code, analyzed using IDA Pro. The first finally calls the ubus service, and the second finally invokes the system function.
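The risk of handing a remotely supplied value to the system function can be illustrated with a small sketch. This is not the speaker's code: the command name `ota_fetch` and both helper functions are hypothetical, and the example only builds command strings without running them.

```python
import shlex


def build_ota_command_unsafe(url: str) -> str:
    # Hypothetical handler: the caller-supplied value is pasted straight
    # into a shell command line, as a system()-style call would see it.
    return "ota_fetch " + url


def build_ota_command_quoted(url: str) -> str:
    # Same hypothetical command, but the value is quoted so a shell
    # would treat it as a single argument.
    return "ota_fetch " + shlex.quote(url)


if __name__ == "__main__":
    payload = "http://example.com/fw.bin; id"
    # In the unsafe form, a shell would read ';' as a command separator
    # and run 'id' as a second command.
    print(build_ota_command_unsafe(payload))
    # In the quoted form, the whole payload stays one argument.
    print(build_ota_command_quoted(payload))
```

Quoting is only one possible mitigation; passing an argument list to the process API instead of a shell string avoids the shell entirely.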
Since these two vulnerable web interfaces require identity authentication, if we want to exploit this vulnerability, we have to get the cookie of the user who is bound to the speaker. In further research, we found two XSS vulnerabilities in account.xiaomi.com, which allow us to remotely obtain the cookies of a large number of Mi AI speaker users. So now we can exploit these vulnerabilities to complete the attack chain and remotely obtain root permission on the Mi AI speaker.
Now let's take a brief look at the first demo video. This video demonstrates the two vulnerabilities we mentioned. The first part of the video demonstrates connecting to the speaker via SSH on the LAN and making it play a piece of audio. Because the Xiaomi AI speaker doesn't speak English, we made it say a Chinese phrase meaning "winner winner, chicken dinner". In the second part of the video, the victim's smart speaker is remotely controlled by the attacker after the victim clicks the URL.
Okay. Now we have finished the section on the Xiaomi AI speaker. Let's talk about how to break into the Amazon Echo. This includes six parts. I will quickly talk about the first four parts, and the exploit details and the demo video will be presented by my teammate.
First, let's have a brief look at the Amazon Echo. In the newly released second-generation Echo family, all of the Amazon Echo devices use similar hardware and systems, so we chose the Echo Dot as our test device. The Amazon Echo Dot has Fire OS built in, which is actually a system based on Android. It turns on SELinux and ASLR and locks the bootloader. It has a USB interface, but it can only be used for charging. We also found some network ports by scanning. At the beginning, we tried to get the firmware for a long time, but we got nothing. So we chose to start with the hardware and extract the firmware directly from the flash chip.
In order to extract the firmware from the flash chip, we need to prepare a lot of hardware tools, which include a hot air gun, a soldering iron, reballing tools, and so on. Okay, this is the second demo video. This video shows how to desolder the chip from the PCB in six minutes and then reball the chip. To save time, we doubled the video speed. The most important skill is to choose a suitable temperature and be careful.
After we desolder and reball the chip, we need to choose a suitable chip adapter according to the chip's data sheet. The Amazon Echo Dot uses an eMCP chip in a BGA221 package, for which we can easily buy an adapter on Taobao or eBay. We can also buy a universal eMMC reader and connect the adapter to the USB reader device, so that we can easily read and write the firmware content in the eMCP chip. And this is the disk partition information extracted from the flash chip. It contains many parts, such as the preloader, bootloader, boot image, system, and so on.
When we got the firmware, we had another important thing to do. That is to modify the firmware to create an Amazon Echo Dot with root permission. This can help us quickly debug some vulnerabilities. Because the Amazon Echo Dot turns on SELinux and locks the bootloader, we cannot directly modify the boot image to get root. We need to turn off SELinux and then add the superuser binary file to the system. Then we need to enable the system's ADB function. We add these operations to a shell script that starts automatically, so that every time the system boots, we can connect to the Echo Dot through its USB interface and get root permission.
And this is the third video, which shows how to solder the chip. After completing this modification, we need to resolder the flash chip back onto the PCB. This video demonstrates how to resolder the chip back onto the PCB in three minutes. To save time, we also doubled the video speed.
After we complete these operations, we have got a rooted Echo Dot. Okay, my part is over. Thank you for listening. Now please welcome my teammate Qian Wenxiang.
Okay. Hello everyone. Now I'm going to talk about how to turn the Echo into an eavesdropping device on the software side. Earlier my partner gave a great talk about how to hack into the Echo device on the hardware side. The physical hack is very important. Without it, exploiting would be much more difficult for me.
So how many steps does it take to hack into a device? First, the attacker sends the exploit, and then the victim connects to the hacker. Next, the hacker does whatever he or she wants to do. Yes, the steps are simple, but on a well-protected system things get a little complicated. Since we already have the firmware and a debug environment, we are able to check the restrictions that Amazon has set up for us.
So first, please allow me to introduce the protections we need to bypass. There are three firewalls or firewall-like things. The system uses iptables to allow only a few ports to accept connections. SELinux is also enabled. We managed to find a binary with high privileges to bypass it. This binary is the vulnerable program shown in the picture, and it has a web server activated. To communicate with it, we must pass a client-authenticated TLS handshake. That means we must get the certificates and other things. But those certs are obtained through cloud synchronization. In other words, we need to get the cloud synchronization information of other devices. An attacker is always happy to see that a web server is available. That explains why we chose this binary as our target.
In the next few slides, please allow me to introduce some background information so we can go through these things more clearly. The binary is whad, the whole home audio daemon. It is a huge binary that runs as root, has network access, and is able to record voice.
The most exciting part is that it opens a web server. The HTTPS server runs on port 55443 and accepts control commands. But things do not go as we would wish. The HTTPS server uses a client-authenticated TLS handshake. That means we must have a server certificate, a client certificate and a private key to communicate with it. That sounds difficult, but we also noticed that a physically rooted device can pass the check and communicate with other devices. So the information must be stored somewhere. By reading the libcurl documentation, we know we can extract all certificates and the private key from libcurl's negotiate function on a rooted device.
To do the trick, the first thing we need to do is to bind our rooted device to the victim's account. I'll explain why we do this later. By auditing Amazon's website, we found two XSS vulnerabilities, but both need two steps of user action, so we decided not to use them. Still, the XSS here is fatal. You can steal private data or control the device with a cookie obtained through XSS, because the Alexa dashboard lacks modern protections such as CSP, which is Content Security Policy, and HttpOnly protection. Yes, by using the XSS, you can get the same result as what we'll talk about later.
But we also found another method, which is the quickest and easiest way for us. That is to spoof an Amazon website. We found that every time we log into Alexa, there is an OpenID login page, and there is also a redirect parameter in the URL. By modifying that parameter, the page will redirect to any domain that is a subdomain of amazon.com over HTTPS. Since we want to mimic an Amazon website and we don't want to mess with the HTTPS certificate, we would like to have an HTTPS downgrade redirect. Luckily, we found the site associate redirect.amazon.com. Its validation rule allows a downgrade to HTTP.
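The difference between a redirect rule that only checks the host and one that also pins the scheme can be sketched as follows. This is a hedged illustration, not Amazon's validation logic: `example.com` stands in for the vendor's root domain, and `weak_rule` and `strict_rule` are hypothetical names.

```python
from urllib.parse import urlsplit

TRUSTED_ROOT = "example.com"  # stand-in for the vendor's root domain (assumption)


def host_is_trusted(host):
    # Accept the root domain itself or any of its subdomains.
    host = (host or "").lower()
    return host == TRUSTED_ROOT or host.endswith("." + TRUSTED_ROOT)


def weak_rule(url):
    # Only the host is checked, so an http:// target on a trusted
    # subdomain is accepted and the redirect can leave HTTPS.
    return host_is_trusted(urlsplit(url).hostname)


def strict_rule(url):
    # The scheme is pinned to https in addition to the host check.
    parts = urlsplit(url)
    return parts.scheme == "https" and host_is_trusted(parts.hostname)


if __name__ == "__main__":
    target = "http://internal.example.com/landing"
    print(weak_rule(target), strict_rule(target))
```

Under the weak rule the downgrade target passes; under the strict rule it is rejected, while look-alike hosts such as `example.com.evil.test` fail both.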
That redirect site also has some vulnerabilities that allow redirecting to other sites that do not belong to Amazon and leaking some tokens from OpenID. But for now the only thing we need is the downgrade.
Okay, we want to spoof an Amazon website inside the victim's LAN, but there are two preconditions. The first is that the attacker needs to join the victim's LAN. The second is that we need to find an Amazon domain which resolves to a local address, so that an attacker can join the LAN with that IP address. Both are not hard to satisfy. You can get a list of subdomains from the Google Transparency Report. Then you can disable DHCP and use a static IP address to join the LAN. We found that app service.amazon.com resolves to a local address. Suppose an attacker joins the victim's LAN with that address and then starts a web server. When the victim tries to visit app service.amazon.com in his or her browser, the victim is actually visiting the hacker's website. Also, because it shares the root domain amazon.com, the cookie will be sent to the attacker automatically by the browser.
To sum up, first the attacker joins the LAN. The victim logs in from the Alexa login page with the redirect parameter set to associate redirect.amazon.com, and when the victim finishes the login, the page redirects to associate redirect.amazon.com. Then this site downgrades and redirects to app service.amazon.com. This domain resolves to the attacker's computer, and the user finally visits the attacker's website with the cookie sent to it. Then the attacker binds his device to the victim's account using the cookie. And finally, we can communicate with the victim's other devices.
So the first problem is solved. We have got a device that can pass the first check. Then we will use that device to extract the certificates and private keys from the negotiate function.
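Why a browser hands the cookie to a sibling subdomain can be sketched with the domain-matching idea from the cookie specification. This is a simplified model under stated assumptions: it ignores paths, IP-address hosts and SameSite, the hosts use `example.com` as a placeholder, and the talk does not say which attributes the real cookies carried.

```python
def domain_match(request_host, cookie_domain):
    # Simplified domain matching: the host equals the cookie's domain
    # or is a subdomain of it.
    host = request_host.lower().rstrip(".")
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


def cookie_is_sent(scheme, request_host, cookie):
    # A cookie marked Secure is never attached to a plain-HTTP request.
    if cookie.get("secure") and scheme != "https":
        return False
    return domain_match(request_host, cookie["domain"])


if __name__ == "__main__":
    # Hypothetical cookie scoped to the whole root domain, without Secure.
    session = {"domain": ".example.com", "secure": False}
    print(cookie_is_sent("http", "internal.example.com", session))
```

In this model, a cookie scoped to the root domain and lacking the Secure attribute is sent to any subdomain over plain HTTP, which is the condition the spoofed site relies on.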
But first, I'd like to show you a simple picture of the architecture, which shows how whad gets the device info when it starts and why we need to bind our device to the victim's account first. If you have many devices in your account, they are grouped into a cluster automatically. When LXRD starts, it obtains information from the Amazon server, and when whad starts, it gets that information from LXRD. When a device wants to update something, LXRD notifies the Amazon server, and the server later notifies all other devices in that cluster to synchronize.
Because the key changes whenever whad starts, and to automate the exploitation later, we chose to patch whad. whad periodically sends an HTTPS request to other devices to know whether they are still online. By replacing the negotiate function with assembly code written by us, we can dump the certificates and keys to local files. It is simple and crude, so we don't need to crack the complex algorithm. Since everything takes place on the attacker's device, to simplify the environment we disabled all protections on the physically hacked device. The code is a little complicated, so we are not going to talk about it now. You may check our GitHub page to get the full code in assembly. In other words, we dump those things to three files and use them later in the attack.
Now we have dealt with the client authentication problem. Every time before we perform the attack, we run the patched whad to get the certs and private key. Then we can go to the last step, which is to break the vulnerable program whad on the victim's device. Since we are going to attack it, there must be vulnerabilities to exploit. So let's go back to the binary auditing. We audited almost every binary, and we found that the binaries written by Amazon themselves are secure by design. But we also noticed that the Echo uses very old versions of third-party libraries. They are all nearly four years old.
You can see the picture. It shows they are using some code from the year 2014. So although Amazon tries to apply security patches to them, there are still many n-day and zero-day vulnerabilities. They are a gold mine. Okay, it's time to dig into the code. That is, to attack the web server and get control of whad.
How would you feel auditing code written four years ago? Maybe a little relaxed, I guess, because old code with poor tests often leads to serious problems. It took me a week to find the treasure, but when we first found the exploitable function, nobody in whad called it. Luckily, Amazon updated the binary two months later, and we found that a lot of functions were now referencing this function. The root cause of the vulnerability is that the library fails a condition check, and thus a lot of vulnerabilities happen in sequence.
Let's take a look at the questionable code. First, you can see that the content length is a user-supplied value. civetweb gets the value from the HTTP header and converts it into an integer. atoi accepts negative numbers as input and returns a signed integer. What I don't quite understand is why they convert the value from a signed integer to unsigned here. If the variable is unsigned, the check "if content length is greater than zero" is actually equal to "if content length is not zero". So maybe the unsigned here is a typo. Next, the if check: negative one equals the biggest unsigned int, so we also pass the check, and then the number plus one is again an integer overflow. The result is zero. malloc(0) is valid because the Echo system uses dlmalloc as the malloc algorithm. As the manual says, even if the input is zero, it still allocates a small buffer for you to write in. Next, in the mg_read call, there's a heap buffer overflow. We'll talk about this soon. And there's a minor one: setting post_data[content_length] to zero writes a zero at the negative one position, leaving the string not zero-terminated.
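The content-length arithmetic walked through above can be traced with a short model. This is a sketch under assumptions, not the library's code: it emulates 32-bit unsigned arithmetic with a mask, and `c_atoi` and `trace_content_length` are hypothetical helper names.

```python
MASK32 = 0xFFFFFFFF  # emulate a 32-bit unsigned int


def c_atoi(text):
    # Minimal atoi-like parser: optional sign, then decimal digits,
    # stopping at the first non-digit.
    s = text.strip()
    sign, i = 1, 0
    if s and s[0] in "+-":
        sign = -1 if s[0] == "-" else 1
        i = 1
    value = 0
    while i < len(s) and s[i] in "0123456789":
        value = value * 10 + int(s[i])
        i += 1
    return sign * value


def trace_content_length(header_value):
    signed = c_atoi(header_value)
    as_unsigned = signed & MASK32               # signed -> unsigned conversion
    passes_check = as_unsigned > 0              # for unsigned, same as != 0
    alloc_request = (as_unsigned + 1) & MASK32  # "length + 1" wraps around
    return as_unsigned, passes_check, alloc_request


if __name__ == "__main__":
    # A Content-Length of -1 becomes the largest 32-bit value, passes the
    # "> 0" check, and the "+ 1" wraps the allocation request to zero.
    print(trace_content_length("-1"))
    print(trace_content_length("10"))
```

With 32-bit pointers, indexing the buffer at the wrapped length lands one byte before its start, which corresponds to the stray zero write mentioned in the talk.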
That missing terminator is a potential information leak, because this function is used to get a parameter's value. So it's just like chess: one bad move might lead to a total loss.
Okay, let's go back to the heap buffer overflow. In dlmalloc, malloc(0) allocates 16 bytes. That means eight bytes of metadata plus an eight-byte buffer which we can write our data to. In mg_read, the data read from the HTTP request is written into the buffer. The good thing is that this function fixes the input length. So if we give a huge number as we did, it fixes that length to the real length of the data remaining in the socket buffer. Then it copies the data from the socket to the malloc buffer. So if we post a string longer than eight bytes, there will be a heap buffer overflow.
Do you remember that the size passed to malloc is controlled by the user? We can send a content length to control it. If we don't send the full HTTP request, by omitting the terminating carriage return and line feed, the malloc heap buffer will remain in memory without being freed. When we need to free the buffer, we can simply send the remaining carriage return and line feed. Then the connection is closed and the buffer is freed. By the way, mg_read will write anything to the buffer, including zero characters, so it is very convenient for us to exploit.
Since we can control the content of the heap, the thing we wanted most was a vulnerability to bypass ASLR. That would be good for us to do the heap buffer overwrite and heap spray later. First, let's talk about the heap spray part. If we want to exploit this, we must control an area to put our shellcode in. Anonymous memory is a good place. A large heap allocation request causes malloc to use mmap anonymous memory. This is controlled by the mmap threshold variable of malloc.
Although there are a hundred threads running in the background, we are the only one who wants to allocate such a huge amount of memory. Because of the mmap algorithm, the address starts from a predictable range even when ASLR is enabled. You may check the article in our references on the last page of the slides for the details. In our case, we have got an address that has a good chance of being allocated. We got this value by just running the program again and again, so it is an empirical value.
So after we have got a buffer to put our shellcode in, we also need an information leak to do the rest. An IoT device is different from a desktop application. There's no screen for us to see the results. So if the leaked data is sent to us through the network, that will be great. Finally, an information leak in libcurl's FTP connection handling proved to be exploitable. By the way, this is an n-day vulnerability. libcurl doesn't give a POC, but we reproduced the problem from libcurl's patch and test cases. We saw that to trigger this vulnerability, we need to reuse a curl handle. That means we need to use the same handle to connect to the same FTP server at least twice.
Okay, let's go back to the program logic of whad. We found a control command named download audio. Normally it downloads only a single file and then the curl handle is closed. But we dug deeper into the code and found that if the extension is PLS, it parses the PLS file and uses the same curl object to download every file in it. From the second connection on, curl reuses the FTP handle and triggers the vulnerability. The picture shows the details of the file. We use a PHP redirect to bypass the protocol restriction of whad. Also, if the PLS is downloaded successfully, whad will use the cache and will not accept the same request twice.
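The handle-reuse condition described above depends on a playlist that names the same server more than once. A minimal sketch of reading a PLS playlist and spotting such hosts follows; the playlist content and host names are made-up placeholders, and the parsing uses Python's `configparser` rather than whatever parser whad uses.

```python
import configparser
from urllib.parse import urlsplit

# Hypothetical playlist: two entries on the same FTP host (placeholder names).
PLS_TEXT = """[playlist]
NumberOfEntries=2
File1=ftp://ftp.example.test/a.mp3
File2=ftp://ftp.example.test/b.mp3
"""


def playlist_entries(text):
    # PLS is an INI-style format, so configparser can read it.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names case-sensitive
    parser.read_string(text)
    section = parser["playlist"]
    count = section.getint("NumberOfEntries")
    return [section["File%d" % n] for n in range(1, count + 1)]


def hosts_seen_at_least_twice(entries):
    # Count how often each host appears; a repeat means a second
    # transfer to the same server with the same client object.
    counts = {}
    for url in entries:
        host = urlsplit(url).hostname
        counts[host] = counts.get(host, 0) + 1
    return sorted(h for h, n in counts.items() if n >= 2)


if __name__ == "__main__":
    entries = playlist_entries(PLS_TEXT)
    print(entries)
    print(hosts_seen_at_least_twice(entries))
```

This only models the playlist side; whether a given client actually reuses its connection for the second entry depends on that client's implementation.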
Because of that cache, if curl's FTP function fails to leak any bytes, whad will not accept our download request again unless it is restarted, and we don't want to see this happen. So we checked the code branches and found that if one of the URLs points to a file that doesn't exist, there will be no cache. So we can send the same request as many times as we want until we leak an address. We used a Python script to automatically adjust the payload, and finally we found that with size 103 we reuse a freed heap area which contains an address pointing to a function of curl. Based on this address, we can calculate the load address of curl and, furthermore, the address of every library. Because ld.so loads the needed libraries in sequence, you can simply calculate the address of the next or previous library by adding or subtracting the length of the adjacent SO library.
So we have everything we need. How do we execute the code? The web server is powered by OpenSSL. An SSL object is created when a request comes in. If you happen to read the source code, you will find there are many function pointers in that object. When libcivetweb wants to respond with something to the client, SSL_write is called. So all we need to do is overwrite the SSL_write pointer. And to simplify, we found a fast way to trigger SSL_write. It is to send a malformed HTTP version. Well, this is code in the older version of libcivetweb. This code only works safely on Linux. You may try and see what happens if you compile it in Visual Studio.
In summary, we have got six attacking primitives. The first one is to restart the program. We can cause an exception in whad, so in case we can't leak the address or it is stuck for a long time, we can restart the program and give it another try. The other five primitives are also listed here. So now all we need to do is combine them to get remote code execution. So now it's time to pwn.
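The address arithmetic mentioned above, from one leaked pointer to library load addresses, can be sketched with plain integer math. All numbers below are invented for illustration, the helper names are hypothetical, and the direction in which neighbouring libraries sit is left as a caller-supplied sign because the talk only says "plus or minus".

```python
PAGE = 0x1000  # 4 KiB pages (assumption)


def page_round_up(length):
    # Mappings occupy whole pages, so round a length up to a page boundary.
    return (length + PAGE - 1) & ~(PAGE - 1)


def library_base(leaked_pointer, symbol_offset):
    # A leaked pointer to a known function, minus that function's offset
    # inside the library file, gives the library's load address.
    base = leaked_pointer - symbol_offset
    if base % PAGE != 0:
        raise ValueError("base is not page-aligned; wrong symbol or offset?")
    return base


def neighbour_base(known_base, mapped_length, direction):
    # direction is +1 or -1: libraries loaded in sequence sit next to each
    # other, so a neighbour is one mapped length away.
    return known_base + direction * page_round_up(mapped_length)


if __name__ == "__main__":
    leaked = 0xB6E12345        # made-up leaked function pointer
    func_offset = 0x12345      # made-up offset of that function in the file
    base = library_base(leaked, func_offset)
    print(hex(base))
    print(hex(neighbour_base(base, 0x53210, -1)))
```

A function's runtime address then follows as the library base plus that function's offset in the file.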
We have to say that the exploit only succeeds with some probability. Another reason is that in the last step, the memory condition is likely to fall into a race condition problem. We use one connection to overwrite the SSL object of another connection. If anything in the object being overwritten is called before the overwrite is done, it will fail. Or if any background thread triggers a SIGSEGV, that's a fail too. For a four-byte testing gadget, we have a 14% chance of setting the PC value to it. But the real-life gadget is six times longer, and the success rate goes down to about 8%. But the good news is that whad respawns automatically after a crash. So we can run the exploit again and again. That is to entrust the hack to time. The average time to success is about 30 minutes in our laboratory, with about 10 tries.
Once we can control the PC, we need one last thing to start eavesdropping. That means shellcode. We use the function offset from the library plus the leaked library load address to get the function address in memory. We added two handlers, for SIGSEGV and SIGABRT, with an infinite loop in our shellcode to prevent any background threads from crashing the process. We have also nulled the length in the metadata of the memory page where the shellcode is placed, to prevent this page from being freed. We use the audio recorder class to record the voice and send it to the attacker over TCP. The voice is recorded in PCM format. The shellcode can be found on our GitHub page too.
And finally, you can see we have dealt with every problem. whad has now turned into an eavesdropping program. It is eavesdropping in the background and sending all voice data to the attacker. Do you want to watch the video? Of course, we have prepared a demo video which shows the whole story. This is a normal Echo Dot. The left part is the attacker's server and the right part is the exploit script.
When the exploit succeeds, the victim connects to us, and you can see the log shown on the left.
You don't need to worry about this. The vulnerabilities we found have all been reported to the developers and were fixed in June 2018. Amazon has already automatically updated Echo devices with security patches, and the vulnerabilities we mentioned are all fixed. You can find the code and contact information on our GitHub page.
Last, a few tips from our experience. First, to hack an IoT device, you need to get the firmware first, and it is good to master all kinds of soldering and firmware extraction methods. Web plus binary vulnerabilities often equal remote code execution. And the most important thing is to be patient. Your hard work will finally pay off. So thank you for listening, and thanks to my partner too. You do have time.