Hello everyone, thanks for coming. My name is Alina, and I'm an Identity and Access Management Consultant, and I'm speaking today with Alex. He's a Staff Security Intelligence Engineer at Lookout. We would like to start our talk with a small disclaimer. This talk is not based on any particular vulnerability for any specific vendor. It only presents a general approach to using automation with a known vector of reconnaissance. Okay, let's begin.
Remote offices are becoming more and more popular. People attend meetings from home, from other buildings, or even from other countries. So, in order to accommodate our modern life and remote offices, almost every meeting invite now has conference line details in it. Which means that every aspect of company life is discussed over the phone. You have vendor evaluation meetings, interviews happening over the phone, and even C-level executives presenting financial reports. However, the larger the audience, the less we are concerned about the attendees. For example, you have a company-wide town hall, or a training session, which can easily be joined by 100 or 200 attendees. Obviously, no one is going to check that every calling number is a legitimate user and has enough permissions to listen to the information discussed.
This was already noticed as a nice way to obtain some information, and there has been some research done in this area. One piece of it was done by Martin, who is actually speaking today at Recon Village at 4:50. But there is not a lot of research on what you can do with the information you find.
Okay, so how did I come up with this idea?
I used to work as a business analyst, and one of my duties was to schedule meetings. Meetings with my clients, meetings with vendors, meetings with stakeholders, tons of meetings. And everyone was asking, Alina, can you add a conference line, please? So since I was constantly sharing meeting IDs and opening the lines, at one point I had a very clear idea that a meeting ID is just a random sequence of nine digits, sometimes 10 or 11. Later in the year, I heard that AWS had released a speech recognition service called Transcribe, and it's available for everyone to use. It's one of the AWS services. There was also a viral video about Google Assistant making calls on your behalf, calling different numbers and trying to act as a normal user. So one year ago at DEF CON 26, this idea came to my mind. If we have tools for automated dialing in, and we have tools for speech recognition, and a meeting ID is just a random sequence of digits, can we get something cool out of it? Can we build a solution that will act as a user and call different numbers? So that's what we are going to try to see today.
So let's see how a conference line works. For example, I have a company X and I need a conferencing solution for my company. So I go to a vendor and I purchase licenses. The vendor assigns a fixed phone line to my company, and now my employees can add conference details to their meetings. For example, I have a meeting at 10 a.m. and I add conference details to it. Then I schedule another meeting at 11 a.m. and add details as well. As we can see, the dialing number is the same for both meetings, and the meeting ID is just random digits associated with this dialing number. So not bad. In order to join, you need to find out the dialing number, the meeting ID, and an optional participant ID. But let's take it a step further and look at something called international numbers.
Because we live in Canada, we always try to call Canadian numbers. If you go to your conferencing provider's website, there will be something called numbers for international users, or international numbers, and you will see that for each country the vendor provides one or a couple of fixed numbers. So it means that if I'm in Canada, I just need one dialing number for all Canadian calls. Which is not bad. So that narrows down what we need to find even further. We can get even luckier if we get something called a personal meeting ID. Basically, it means that all meetings that I schedule will have the same meeting ID associated with me. It's very convenient if you don't want to set up new meeting details every time, or if I have my details saved somewhere and just copy-paste them. It's also handy if you join a meeting every day and you want something fast and easy. Or if I schedule, let's say, a recurring series of meetings, maybe daily stand-ups, and I schedule it as a series, then by default my conferencing vendor will also assign one meeting ID to the whole series.
So let's see how a user can join a call once they get the details from me. There are two general ways. The first one is using your laptop, where you click on the link. What happens in that scenario? Basically, your laptop makes an API call to the server, and the conference ID is passed inside that API call. It's a seamless user experience, you don't need to do anything. Or, let's say, if you're attending remotely, and people often do it from the car, you can call from your regular phone. What you need to do in that case is enter a number on your phone and play a sequence of signals. Those signals are called dual-tone multi-frequency (DTMF) signals. It's a special way to dial in a number from your phone.
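To make the DTMF idea above concrete, here is a minimal Python sketch of how a key sequence can be turned into audio. It uses the standard DTMF keypad frequencies, where each key is the sum of one row tone and one column tone. The sample rate, the tone and gap durations, the function names and the example digit string are all assumptions made for this illustration, and this is not the Java program used in the talk.

```python
import io
import math
import struct
import wave

# Standard DTMF keypad: each key combines one row tone and one column tone (Hz).
ROW_FREQS = (697, 770, 852, 941)
COL_FREQS = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")

DTMF = {
    key: (ROW_FREQS[r], COL_FREQS[c])
    for r, row in enumerate(KEYPAD)
    for c, key in enumerate(row)
}

SAMPLE_RATE = 8000  # assumption: telephone-quality audio is enough here


def tone(key, duration=0.1):
    """Return 16-bit samples for one key: two sine waves added together."""
    low, high = DTMF[key]
    count = int(SAMPLE_RATE * duration)
    return [
        int(16000 * (math.sin(2 * math.pi * low * i / SAMPLE_RATE)
                     + math.sin(2 * math.pi * high * i / SAMPLE_RATE)))
        for i in range(count)
    ]


def dtmf_wav(sequence, duration=0.1, gap=0.1):
    """Encode a key sequence as mono 16-bit WAV bytes, with silence between keys."""
    silence = [0] * int(SAMPLE_RATE * gap)
    samples = []
    for key in sequence:
        samples += tone(key, duration) + silence
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return buf.getvalue()


# Made-up digit string, only for illustration.
audio = dtmf_wav("123456789#")
```

The returned bytes can be written to a `.wav` file and played back. Real phone systems may expect different tone lengths or levels, so the durations here would need tuning.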
So basically, you enter a number and then play a sequence of certain audio signals. So, now we know how a user can join a call. We set the following goals for our project. We were thinking, can we build a solution that will take a meeting ID, call into the meeting, listen to the information said, write it to a file or to some sort of storage, and have it transcribed for our future use? We also wanted a solution that is relatively easy to build and cheap to run. Okay, at this moment I'll pass it to Alex for the technical side of this talk.
Okay, thank you very much, Alina. Hi everyone. So as Alina mentioned, when she came up with this idea, she already had certain AWS technology in mind. Given that I already had some prior experience using it, that became our platform of choice. As you might know, AWS has a huge number of different services to offer, so let's take a look at the few that were important for our project. First of all, Amazon Connect. Amazon Connect is a cloud-based contact center that is designed for customer service purposes. However, we used it to make outbound calls. Connect also has some nice features that we'll discuss a bit later. Then we used Amazon Kinesis Video Streams to stream our call so that we can retrieve the audio file at a later point. Next, we used Amazon Transcribe, which Alina mentioned, to take the audio file we obtained and have it transcribed into readable text. We also used the AWS Lambda service, with which you can execute any kind of function you need without a dedicated server. We used that a lot. Other than that, we used Amazon Cognito for user authentication, Amazon Simple Storage Service (S3), API Gateway to make our calls more secure, and DynamoDB. So, given the requirements that Alina has mentioned, let's take a look at the architectural diagram of the solution.
Although the whole process can be fully automated, for the purposes of this talk we've split it into manually invocable steps so that it's easier to follow and understand. As I said, we first start by logging into the solution using the Amazon Cognito hosted UI. Once logged in, we get back the authentication token that we will be using from there on in every API call to other services. As Alina mentioned, in order to join a meeting we need to have the DTMF signal sequence. To generate the audio file that we will be playing, corresponding to the conference ID, I've written a Java program that I've uploaded as a Lambda function. So I call that Lambda function, pass it the conference ID, and get the audio file generated. Once generated, it is stored in an S3 bucket. However, I need to pull it locally for the next step.
One of the nice features of Amazon Connect is the ability to play prompts. By design, it can be used, for example, to play some announcement or advertisement to the customer, or to guide them through different menu options if it's a customer service call. We use this option to play our audio file. However, Connect doesn't have a nice API for uploading prompts, and it doesn't have good integration with other services, for example with S3 buckets. To upload an audio file to be played, you actually need to go to the web page and upload the file there. Obviously that doesn't scale, because if you want to join multiple meetings at a time, you won't be able to go and upload files one by one. So what we do in that case is make one signed call to Connect, grab the token and the cookie from that request, and with those we can craft our own requests and upload as many audio files to play as we want.
So, once the file is uploaded, we get back the GUID of the prompt to be played. We keep a note of it, and it will be used later. At this step, we have all the preparation done and we can start the call. We start by invoking this Lambda function, and we pass it all the conference information: the phone number, the GUID of the prompt to play, and the conference ID. What it does is create an entry in DynamoDB and start the Connect flow. So this gray area here is what's happening inside the flow of the Amazon Connect service. First of all, it starts the Kinesis streaming, so all of the audio that is heard on the call will be passed to the Kinesis stream. Then it executes two more Lambda functions. One will look up the GUID of the prompt to play from DynamoDB. The other one will update our DynamoDB entry with the information about the stream we are streaming to, that is, the fragment number and the stream name.
Finally, we need to check whether the call has ended or not. Again, unfortunately, Connect does not provide a nice API to do that, so the best thing we can do is make a regular signed request to the service and then parse the HTML response to see if it contains the termination timestamp. If it does, we know the call has ended; otherwise we just need to wait. The last step is to extract the audio file from Kinesis and upload it to an S3 bucket. Once we do that, we need to convert it to text using the Transcribe service.
So now that we've covered that, let's take a look at the real demo. In this demo, in order to simulate a typical meeting conversation, we are going to use the sensitivity training scene from The Office. So just imagine that we were lucky to find the conference number and conference ID to join the meeting.
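The "check the status, and wait if it is not done yet" pattern described above can be sketched as a small polling helper in Python. This is an illustration under stated assumptions, not the code from the talk: `poll_until` and `transcript_uri` are hypothetical names, and the intervals are arbitrary. The status fields follow the shape returned by the boto3 Transcribe client's `get_transcription_job` call, and the client is passed in so the logic can be tried without AWS access. The same helper would fit any other status check that returns `None` until the work is finished.

```python
import time


def poll_until(check, interval=5.0, timeout=900.0, sleep=time.sleep):
    """Call check() until it returns something other than None, or give up."""
    waited = 0.0
    while True:
        result = check()
        if result is not None:
            return result
        if waited >= timeout:
            raise TimeoutError("gave up after %.0f seconds" % waited)
        sleep(interval)
        waited += interval


def transcript_uri(client, job_name):
    """Return the transcript location once the job has completed, else None."""
    response = client.get_transcription_job(TranscriptionJobName=job_name)
    job = response["TranscriptionJob"]
    status = job["TranscriptionJobStatus"]
    if status == "FAILED":
        raise RuntimeError(job.get("FailureReason", "transcription failed"))
    if status == "COMPLETED":
        return job["Transcript"]["TranscriptFileUri"]
    return None  # QUEUED or IN_PROGRESS: keep waiting
```

With boto3 installed and credentials configured, usage might look like `poll_until(lambda: transcript_uri(boto3.client("transcribe"), "my-job"))`, where the job name is a placeholder.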
What we did in fact is take a different laptop, create the meeting there, and then try to join the meeting with our tool. So let's see what happens. As you can see, we authenticate first and we enter the meeting information, the phone number and the conference ID. As you will see, the conference ID ends with two pound signs. The first pound marks the end of the meeting ID, and the second indicates that no PIN is required. We click submit, and then we generate the DTMF sequence for the conference ID. I also forgot to mention that on the right-hand side you can see some logs that my server is producing, so you can get a better idea of what's going on behind the scenes. So the file is generated, I can download it, I upload it, and you can see it's uploaded successfully, so we are ready to proceed to making the call. We start the call, and as you can see on the right side, we are sending the contact ID and the prompt GUID. Then all we need to do is check the status of the call. This part is a little bit fast-forwarded. We recorded about two minutes' worth of audio, and obviously we won't be waiting and looking at the same screen for two minutes. At some point we were lucky to check and see that the call had finished.
So now we call the next Lambda function to extract the audio. As you can see on the right side, we are passing the fragment number and the stream name to my Lambda function, which is also a Java program, and it pulls the audio from Kinesis and saves it to an S3 bucket. This one I didn't fast-forward, so it took maybe 30 seconds. Now that that is done, we initiate the transcription job for the call. Luckily, this one has a nice API, so at any point in time you can check whether the job has finished or not, and if it has, you can just download the data.
That data is the already transcribed text; otherwise you just need to wait longer. So again, this was the longest thing we had to wait for. It probably took us about seven minutes, I'd say. So yeah, as you can see, it's still in progress, but soon we can just click fetch again and see that it's completed. So we click download, and here you are. Basically, this is the text from the meeting. I'm not sure if the transcribed text is visible from the back rows, but first it welcomes you to the provider, then it asks you to enter your meeting ID followed by the pound sign. Then it says that you are in the meeting now, and it says that there is another participant in the meeting. After that, you can see the actual text of the sensitivity training that Michael was providing to his employees.
As you've seen in this demo, the only parameters that we had to provide were the phone number and the conference ID. For the rest of the steps, we were basically checking the current status of the flow, and if it was in the desired state, we just pressed the next button to continue the flow. So basically it's very easy to automate, as long as you have the conference ID and the phone number.
So how much did it cost us? Well, it's hard to say, because we didn't really pay anything for it. We were still within the AWS free tier. However, if you refer to the official web page and check the prices of the different services we used, the two most expensive ones would be Amazon Connect and Amazon Transcribe. You can see the prices on the slide. To put it into context, say you want to record five hours' worth of audio in one day and have it transcribed into text. You're going to pay something like $15. Obviously, there are always improvement and tuning opportunities.
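As a quick back-of-the-envelope check of the figure quoted above, the short Python sketch below derives the combined per-minute rate implied by "about $15 for five hours" and scales it to other durations. It uses only the numbers mentioned in the talk. The linear scaling and the function name `estimate_cost` are assumptions made for this illustration, and actual pricing depends on the services, the region and the free tier.

```python
# Figures quoted in the talk: about $15 for five hours of recorded, transcribed audio.
QUOTED_COST_USD = 15.0
QUOTED_HOURS = 5.0

# Implied combined rate (calling plus transcription) per minute of audio.
rate_per_minute = QUOTED_COST_USD / (QUOTED_HOURS * 60)


def estimate_cost(hours):
    """Scale the quoted estimate linearly; real pricing may not scale this way."""
    return rate_per_minute * hours * 60


print(f"implied rate: ${rate_per_minute:.2f} per minute")
print(f"estimated cost for 40 hours: ${estimate_cost(40):.2f}")
```

Under that assumption, the quoted figure works out to roughly five cents per minute of audio.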
With regard to this project, I can see that the following can be done to make it even more interesting or to develop it further. Since we were only working on this as a POC, we used one phone number. However, you can claim as many as you need in Amazon Connect and then run your attacks in parallel. It would also be useful to keep more information in the database, for example, which conference IDs were real, which meetings you were able to join, and maybe how many participants there were. Also, if at some point you identify that the same speaker was present at the same time on the same conference ID, you may sometimes assume that it was their own personal meeting ID. Finally, you can always add more services to the pipeline. For example, you can add Amazon Comprehend to get a better idea of what the discussion was about and what the key topics in the meeting were. You can also pass the data to SageMaker, where you might have your own machine learning models deployed and can process the data as you wish. Now I'm giving it back to Alina for some final remarks.
Thanks, Alex. Okay, and of course, we need to finish this talk by giving you some advice on how to protect yourself from this type of reconnaissance. The most useful advice would be to hire employees who speak with an accent, so the text is very difficult to transcribe. But to be serious, just go to the settings of your conferencing provider and check what's available there. Obviously, require a meeting password for your sensitive discussions. There is a second option, which is to generate and require passwords for participants joining by phone. That would be a good one. Maybe generate a new meeting ID for special meetings. If you have a series of C-level discussions, try to generate a new meeting ID for each of them. And identifying guest participants in the meeting would also be a useful one.
Yeah, so just go to the settings page and try to be aware of this type of issue. Okay, thanks everyone for coming, and you can find us on LinkedIn. I think we have one or two minutes for questions. No questions? Okay. Yeah.
Yeah, those numbers belong to the meeting providers. So basically, for every company that is using that provider, you would call that same phone number to join their meetings. Just a single one per provider. I always do, sorry.
Actually, that's what we wanted to skip in this presentation for legal purposes, let's say. But if you Google the name of the company you want and add those numbers, you will be able to find some on Google. I did a couple of tests. If you put in the name of a bank or a big company, you will be able to find them, because people always give them out. Let's say there is a day for co-ops or for interns, for students, or maybe there is a warden's day going on between different companies. So you can find them. We just didn't want to include them because they were very specific to some companies. Thanks everyone.