Right now, we have a first-time DEF CON speaker, Franz Payer. He's going to speak to you about exploiting music streaming with JavaScript. It's his first time at DEF CON. Please give him a big round of applause.

I hope you guys are all awake. So, I'm Franz Payer, programmer at Tactical Network Solutions, and I'm going to go over exploiting music streaming with JavaScript.

So, a couple of acknowledgments before I start. I'd like to thank Zachary Cutlip and Craig Heffner for all the help and support, and my employer, Tactical Network Solutions, for letting me learn about security without going to jail, which is great. Special thanks to Ronald Jenkees, who is an independent artist, who has given me permission to use his music in this presentation so I don't get sued by the RIAA. Speaking of which, I'd like to thank the EFF for helping me address issues with the DMCA and the CFAA. However, the decision was made to not release the original tool, which I had planned to. That was a Google Chrome extension, which would showcase all the different exploits and vulnerabilities which I'm going to cover today. But I will be releasing an alternative tool, which I'll get into in more detail later. I also want to state that the opinions and views expressed are mine and not my employer's.

So, what am I going to be talking about? Well, I'm going to give you some background information on what my project is and what I've done, so you can have a context of, like, my approach and how I did it. Then I'll go over the music streaming basics, so you guys have an understanding of how it works, its limitations today, and what you're going to be seeing. And then I'm going to go over my security investigation process, so kind of taking you from the beginning to the end, from research to exploitation. And hopefully by the end of this, you'll have a pretty good grasp on how you can do this by yourself. And then I'll go over an exploit demo, assuming everything works out all right.
And if I have any time afterwards, I'll talk about my new alternative extension, which I'm going to be releasing. And I'll take questions at the end if we have any time.

So the end goal. Well, originally I had planned to release a Google Chrome extension, which would have all the different exploits which I'm going to show you. And the way this would work is that it would mimic the music player whenever possible, so whenever I was smart enough to figure out how to reverse engineer the code and generate the requests the way they did. Otherwise, I would just log whenever I saw some MP3 flying by, and I look at the syntax and I match that syntax, and every time something that matches that syntax flies by, I can go get that song. And so the end result is that you have something that sits in the background, and every time you listen to a song, you can download it. But I'm not going to be releasing that.

So what am I going to be releasing? Well, it's an alternative, which is not exactly the same. It's a forensics tool, not an exploitation tool. And what it does is it duplicates requests that it sees flying by, and it caches them in your RAM. And this is helpful for, like, a hex dump analysis afterwards. So if you are into doing malware analysis without wanting to put it on your hard drive, you can do it entirely from your browser now. And this is also helpful if you want to see exactly what's being loaded into your browser.

So here's the wall of shame, a bunch of different services which I found vulnerabilities in. Some of these have made fixes, some of them haven't. Most of them haven't. So we have Pandora, Aimini, SoundCloud, Grooveshark, Jango, Playlist.com, and Hrex. So quite a big list.

All right, so what is streaming? Well, Wikipedia defines it as a way to constantly receive and present data that's being delivered by a provider. So from a developer's point of view, this means that you're going to be receiving data in a really long stream.
And as soon as you get the first piece, you can start processing it and displaying it to the viewer. From an attacker's point of view, this means that you're going to constantly be receiving data. And at the very end, you'll have everything you need to reconstruct the file, whether it's a song or whatever. And it's only a matter of capturing the data pieces.

And once you do that, there are two major technical roadblocks that prevent you from playing it back. You have reassembly of the pieces: typically, in the way the internet works, when you send data, it's not always received in the same order in which it was sent. And so this could be more difficult depending on what type of protocol they use. And if you see any encryption, that's probably going to be stopping you, because usually if you want to break encryption, it's not to get music, it's to get someone's password. So those are going to be your major roadblocks.

So the protocols which I've seen: typically when you have some sort of desktop application like Spotify or Pandora One, they will use a custom TCP protocol. And this makes it incredibly difficult to reassemble because A, it's either not documented or B, it's proprietary and you don't know how it works. So this is probably going to stop you from getting your tracks. However, I've noticed that some services like Last.fm use HTTP or HTTPS because they don't want to write their own custom protocol. And this is typically what you see in browser-based applications. So the regular Pandora app, SoundCloud, these guys are going to be using HTTP and HTTPS. And this is because they don't want to do extra coding. The browser does it all for you, why would you want to do it yourself? And if you're an attacker or hacker or whatever, this means that you can use the browser too. Now you don't have to worry about reassembly and decryption. So this is why I targeted these, because they're extremely easy to go after.
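To make the reassembly roadblock concrete, here is a minimal sketch of putting captured pieces back in order. It assumes each piece carries a sequence number, which is a hypothetical chunk shape chosen for illustration and not the format of any particular service. With HTTP the browser does this work for you, which is the point made above.

```javascript
// Hypothetical chunk shape (an assumption): { seq: number, data: Uint8Array }.
function reassemble(chunks) {
  // Pieces may arrive out of order, so sort a copy by sequence number.
  const ordered = [...chunks].sort((a, b) => a.seq - b.seq);

  // Refuse to build a file that has a hole or a duplicate in it.
  ordered.forEach((chunk, i) => {
    if (chunk.seq !== i) throw new Error(`missing piece ${i}`);
  });

  // Concatenate the payloads into one buffer.
  const total = ordered.reduce((n, c) => n + c.data.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of ordered) {
    out.set(chunk.data, offset);
    offset += chunk.data.length;
  }
  return out;
}

const pieces = [
  { seq: 2, data: Uint8Array.of(5, 6) },
  { seq: 0, data: Uint8Array.of(1, 2) },
  { seq: 1, data: Uint8Array.of(3, 4) },
];
console.log(Array.from(reassemble(pieces)));
```

With an undocumented custom TCP protocol, the hard part is that you do not know where the sequence number lives or whether one exists at all.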
And there are two different types of streaming which I kind of named myself. There is static streaming where you will have one URL per song. So you'll have to usually reference it by a file name and you have to know the directory and everything. And this is different from a dynamic one where you have one page and depending on what parameters you send it, you get back a different file. So if you can see here, we have a stream.php page and depending on what key you send it, you get back a different file. So it's one to one versus one to many. And this is important to keep track of when you're doing analysis. So there are two major types of music players. The most common one is Flash. This is a majority of the web players you'll see. However, they may still use JavaScript. I've seen a lot of people be really lazy and they will actually use JavaScript to pull back the data and then just pass it on to the Flash. So all the Flash does is play back. And this is typically because you have Flash libraries that are made to play back music but not necessarily customized to work with your interface. So this is how they get around it. However, some services you're going to have to decompile them. And I don't know any Flash myself so I skipped anything that did this. But it's important to know that if you are successful at decompiling the Flash and you want to exploit it using your Chrome extension with JavaScript, it runs in a separate environment due to security issues. So if there's some kind of secret key baked into the Flash, you're not going to get it from JavaScript. And then the other one you'll see is HTML5, which is kind of experimental right now because not all the browsers have full support for it yet. And you typically see this in mobile-based applications because Flash is losing support there. And this is entirely in JavaScript. So no decompiling or anything but it's minified most of the time, which means that it's obfuscated and it's really hard to read. 
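The static versus dynamic distinction can be sketched as a small classifier. This is only an illustration: the extension list, the example.com URLs, and the rule "query parameters mean dynamic" are my assumptions, while stream.php and the key parameter come from the example in the talk.

```javascript
// Assumed list of audio file extensions; extend as needed.
const AUDIO_EXTENSIONS = [".mp3", ".m4a", ".ogg", ".wav"];

function classifyStreamUrl(rawUrl) {
  const url = new URL(rawUrl);
  const path = url.pathname.toLowerCase();

  // Static: one URL per song, referenced by directory and file name.
  if (AUDIO_EXTENSIONS.some((ext) => path.endsWith(ext))) return "static";

  // Dynamic: one page, and the parameters select which file comes back.
  if ([...url.searchParams.keys()].length > 0) return "dynamic";

  return "unknown";
}

console.log(classifyStreamUrl("http://example.com/music/artist/song.mp3"));
console.log(classifyStreamUrl("http://example.com/stream.php?key=abc123"));
```

The first call is the one-to-one case and the second is the one-to-many case.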
So where's the vulnerability? Well, I already went over how the browser does all the major work for you. So what do you have to do? Well, there are two ways of going about this. I mentioned this earlier.

You can copy the requests by kind of just telling based on the syntax of the URL. And this is typically pretty easy. You know, you look at one URL and you're like, all right, there's a file name there and this is the structure, and you can easily write a regular expression to do this. However, this can be suspicious: if they're doing some kind of server-side logging and they see two identical requests coming in within milliseconds of each other, they're probably like, huh, this isn't normal activity. Why is this happening? But I haven't really seen this being an issue in terms of any red flags being thrown up. But this can be limiting. I found services where they have one-time-use tokens. So you use a token to stream your music, and after it's been used, it's no longer valid. So by the time your second request gets there, it's not valid and you don't get anything back.

And the way you get around this is by generating the requests yourself. So this is going to be a little bit more difficult. Sometimes you can tell based on the syntax of the URL what variables are needed and how you get them. Other times you have to reverse engineer the code and figure out what they're doing. But if you are successful, you get past the limitation there, and it's undetectable when there are sessions. So if they have sessions, then it looks from the server side like two requests are coming in from two different people at the same IP address and they just happen to be listening to the same song.

All right. So how do you go about doing this? Well, it's important to keep in mind that you've got to do breadth before depth. You don't want to dig yourself into the first thing you see and waste two hours and figure out that's the wrong thing.
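The "match the syntax of the URL" idea can be sketched with one regular expression over a list of observed URLs. The URL shape here (an /audio/ directory followed by a numeric ID and .mp3) is entirely hypothetical; in practice you would write the pattern from whatever structure you actually see in the traffic.

```javascript
// Hypothetical URL shape (an assumption): http://<host>/audio/<numeric id>.mp3
const SONG_URL = /^https?:\/\/[^/]+\/audio\/(\d+)\.mp3(?:\?.*)?$/;

// Return the observed URLs that match the pattern, with the captured ID.
function findSongUrls(observedUrls) {
  return observedUrls
    .map((url) => ({ url, match: SONG_URL.exec(url) }))
    .filter(({ match }) => match !== null)
    .map(({ url, match }) => ({ url, id: match[1] }));
}

const seen = [
  "http://example.com/index.html",
  "http://cdn.example.com/audio/12345.mp3",
  "http://example.com/images/cover.png",
];
console.log(findSongUrls(seen));
```

A pattern like this only recognizes requests that already happened, which is why it breaks down against one-time-use tokens.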
You want to keep track of all your possible options and then take the path of least resistance. And you want to remember breadth before depth. Okay, yeah, I did that. Okay.

Once you keep that in mind, you want to locate the music file in the network traffic. So you can do this in Chrome by opening up the developer console and then going to the Network tab, and you can see all the traffic flying by. And there's going to be a lot of traffic, so you want to filter based on XHR traffic and possibly sort by type. And the reason why you want to do this is because typically when music is loaded in a streaming service, especially an internet radio service, the music isn't loaded when the page is sent to you originally. Like with Pandora, the songs are actually loaded after the page has been loaded. This is because they want to have time to look at your recommendations and figure out what song to give you next. Also, they don't know how many songs you're going to be listening to before you go away. So they have to load this after the fact. And this is done through Ajax, which is going to be showing up as XHR traffic. And you can also sort by type, like looking for audio files, because that's probably what you're going to be finding.

Once you find the actual request, you want to inspect any parameters in that request. So headers, any kind of parameters that they send in the URL, stuff like that. Then you want to find out where those values come from. And there are many different locations where these values are going to come from. You want to go from the easiest to the hardest. The first place, the easiest, is the page URL. Sometimes the song ID is in the URL of the page you're on and you can just use that to get the song. After that, you might want to look at the page source, do Control-F, look for the name of the parameter, and you might be able to find it.
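Filtering the traffic by type can also be done offline. The sketch below assumes you have exported the Network tab contents as a HAR file (an assumption on my part, since the talk filters interactively) and keeps only the entries whose response MIME type starts with audio/. The sample object and its URLs are made up.

```javascript
// Keep only entries whose response looks like audio.
function audioRequests(har) {
  return har.log.entries
    .filter((e) => (e.response.content.mimeType || "").startsWith("audio/"))
    .map((e) => ({ url: e.request.url, mimeType: e.response.content.mimeType }));
}

// Tiny made-up HAR fragment with only the fields used above.
const sampleHar = {
  log: {
    entries: [
      { request: { url: "http://example.com/app.js" },
        response: { content: { mimeType: "application/javascript" } } },
      { request: { url: "http://example.com/stream.php?key=abc123" },
        response: { content: { mimeType: "audio/mpeg" } } },
    ],
  },
};
console.log(audioRequests(sampleHar));
```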
Then you might want to look at local storage, possibly cookies as well, because I've seen with services like Grooveshark, if you have a playlist or something, they will send the whole thing to you at once, so you don't have to keep making requests to find out what the next song is. And then at the very end, you want to look at the JavaScript, because that's going to be hard to read and you're going to have to figure out what someone else's code is doing. And when you have everything, you can attempt to replicate the request. So, kind of based on the text of the request you've seen as your example, take the parameters you have and generate the same thing.

So the first target is Aimini. This is a really great first target. They're a Flash-based service, but they use JavaScript to load, and they have almost no security. I was able to exploit these guys without looking at any code. So this is the page with the network traffic. I've circled the Network tab, and at the very bottom you can see that we have an audio slash mp3 file, which is actually what we're looking for. So if you wanted to take the easy way out, you could actually right-click this and open it in a new tab and then download it that way. So this is the cheap way out. Okay. However, I was trying to show you guys how to automate this with JavaScript. So we're going to do more inspection.

So looking at the actual request, we see there's actually only one parameter, and I took out all the other headers because those are the standard headers that your browser sends. But we have this FID, and I'm going to go out on a hunch and say FID stands for file ID, because they typically name things like this. So now we look for the FID. So the first place to look is going to be in the URL. And sure enough, it's in the URL. You're like, great. We have everything we needed. Now how do I duplicate what they did? So you go and look back at what they did.
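The easiest-to-hardest search order (page URL, page source, local storage, cookies, and only then the JavaScript) can be sketched as a lookup over a snapshot of the page. The snapshot object is a hypothetical shape for illustration; in a browser you could fill it from location.href, document.documentElement.outerHTML, localStorage, and document.cookie.

```javascript
// Report the first (easiest) place where a parameter name or value shows up.
// `page` is a hypothetical snapshot: { url, html, localStorage, cookies }.
function locateValue(needle, page) {
  const sources = [
    ["page URL", page.url],
    ["page source", page.html],
    ["local storage", JSON.stringify(page.localStorage)],
    ["cookies", page.cookies],
  ];
  const hit = sources.find(([, text]) => text.includes(needle));
  return hit ? hit[0] : "not found: read the JavaScript";
}

// Made-up snapshot for demonstration.
const snapshot = {
  url: "http://example.com/listen?fid=abcd1234",
  html: "<html><body>player</body></html>",
  localStorage: { playlist: '[{"songId":42}]' },
  cookies: "session=xyz",
};
console.log(locateValue("abcd1234", snapshot));
console.log(locateValue("songId", snapshot));
console.log(locateValue("uuid", snapshot));
```

Stopping at the first hit is the breadth-before-depth rule in miniature: you only open the minified JavaScript when every cheaper source has come up empty.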
And you go and look back at the original request and you can see that they actually have this weird subdomain thing going on. And I found out by deduction that the first four characters of the FID are the first four characters of that subdomain in reverse order. So it wasn't very difficult to figure that one out. And you can easily replicate this using JavaScript, but I'm not allowed to show you guys any exploit code.

So our next target is Grooveshark. And this is quite a step up. I'm going to show these guys because they have HTML5, and I was kind of wondering how that would play out in terms of difficulty. And they use several factors of authentication, and the JavaScript is minified, so it's hard to read. It's not for the faint of heart. So you want to make sure that you keep track of what you're doing the whole time. Know what parameters you have, what you're looking for, and what your next target is, so you don't get lost, because it can get confusing.

So I'm going to tell you guys that you do want to have a JavaScript beautifier, because it makes the glob on the left look like the glob on the right. And while you still have names like underscore underscore p, at least you have proper spacing and you can read functions. So that's great.

And here I skipped the network request itself, and I've highlighted the URL they have there, and all they have is stream key as what you send to them. So off the bat you think, hey, this is pretty easy, I need one parameter, which is the stream key. So that's what I'm going to look for. I also looked at all the traffic to see what was coming, and I found this more.php file, and there is a get stream key from song ID method. The method I highlighted because that changes all the time, but we know what it is because it says up there. So we only need four parameters. I say only because it actually is easier than it seems.
And while I was looking at more.php, I found this get communication token method, which uses a secret key, and I like secrets. So I'm going to keep this in mind.

So what do we need? Well, from the very beginning we know that as soon as we get the stream key we can get the song, and we know that to get the stream key we need to call more.php with this get stream key from song ID Ex method, and we need to pass it these four parameters. And more.php has a secret key, so I'm interested.

So this is the JavaScript, and this is what I get, and in the very first line you see this window.gs.tpl, and I'm guessing gs stands for Grooveshark. So I'm like, cool, they're storing stuff in the JavaScript environment. Let's see what they have. I find this window.gs.config, which has the session ID. So we're like, all right, that's one down. What else is there? There's actually this model, which turns out to be the entire playlist which you have saved in memory, and every single song in this playlist has an ID, which is the song ID. So right off the bat we were able to find two of the parameters we needed just by looking at the very first line of the JavaScript file. So not bad, but we still need the rest of the parameters.

So this is kind of keeping track of everything. I search for the UUID because it was easier, and sure enough I find a function that takes no parameters, which is good news because now I don't have to find any more parameters. I can just copy this function, and every time I need a new UUID I just call this function. So this is an easy copy and paste.

So now we're left with the token, and this is where it gets a little bit more challenging. So I do a find and get this f dot header dot token, which turns out to be the token being put into the header of the request that we saw. So looking at it, there's a bunch of stuff that we need.
So going top down, like you should read code, there's this r dot last randomizer, which is equal to o, and remember what I said about functions that take no parameters: we can just copy and paste them, and sure enough, that's the function right there. So that's taken care of. We need this r dot rev token. So I do a Control-F for rev token and I find rev token, which is equal to n. And n is equal to GUI flubber, which is the secret key which they hoped no one would find. Unfortunately, they shouldn't have put it right on top. Okay.

So now we really need the current token, because, if you recall, method was just the method we were calling the URL with, and we had that documented. So now we just need the current token. So I just searched for any instances of token, because I couldn't find any instances of current token. And yeah, this is where the secret key comes in, because get communication token returns the token that we need for this request. And so now we're on a hunt for the secret key, which Control-F shows is the hex MD5 of the session ID, which we found on the very first step. So we already have everything that we need.

And to just recap, we needed the stream key, and we got that by finding these four variables, which were just in the JavaScript, and the secret key was needed to get the token. So we have everything, and with this information you can generate the request. But I can't show you any exploit code, so we're just going to go straight to a demo.

All right. So this is Jango.com. They are a very small music streaming service, and it's like an internet radio station. So the first thing I'm going to do is open up the developer tools and go to the Network tab. And you want to do this before you actually play the song, because if you do it after you play the song, you're not going to see it in the network traffic because it's already been loaded. So it's important to keep that in mind. So it plays, which is good. And I'm going to filter based on XHR traffic down here.
And at the very bottom we have this audio slash MPEG file, which turns out to be what we're looking for. So I'm going to click on it and look for anything that we might need. And it turns out that this is a static streaming website. So we just need this file name. And it's important to keep in mind that here we have this weird directory thing going on, which is also the first six characters of the song ID. So if we need that, we have it right there.

And I don't really know where to look for this file ID, because it doesn't have a specific parameter name in front of it, so I can't do a Control-F. I'm just going to look at the other traffic that we filtered on, because there are only four other requests, so it's not going to take us very long. And you can see here that this responds with a bunch of JavaScript. And it has a song ID here, but it doesn't actually correspond to the 08, 06 one that we have, so that's a dead end right there. But if you look here, this page actually returns the URL without the file name. The whole URL. So this is our target.

And if you look at the headers, they take quite a few parameters. So we have first time, which is equal to 1. That's true. So that's going to be a binary flag. You can probably lie on that if you want to. An SID, which I'm not really sure what it is. A version number, which is probably going to be the same every time. SUW, which I'm not sure what it is either. And CB, which I'm also not sure about. But at least now we know what names we're looking for.

And at this point I noticed, and I'm glad that Chrome has this, that they have this Initiator column, which tells you exactly what script and what line made this request. So if we click on this, it will actually take us to this line. Which, if you notice, is minified JavaScript. So you're not going to be reading this very easily. And I've gone ahead and beautified it, so now we can actually inspect the code.
So if you recall, the URL back here was the streams URL. So we're going to look for any instance of streams. And sure enough, it takes us straight to the line that creates the request. And we have this underscore JM station ID and some parameters which are set right above it. And so we can see first time, set to one. We have this SID, which is apparently the session ID. We have the version number here. SUW apparently stands for whether the sign-up window is visible or not. And then CB here is apparently the date and time. So we have everything we need, and I already went ahead and wrote a one-line JavaScript which will generate this for us. So this spits out the URL that was generating the new song locations. So I'm going to copy that.

And it turns out they actually patched the service three days before I came to DEF CON. And this is freaking me out. Apparently they're doing some weird thing with checking your session, but I found out that to get around it you can just refresh the radio station and then it will work. So here we have the next song that would be playing. And if we actually just keep refreshing this, it will give us a different song every time. So we can actually get every single song in their music library.

But I did want to show you guys an exploit through my Chrome extension. So although I'm not going to be releasing it, I can show you guys what it looks like. So as you saw, there was a pop-up, and I clicked on it, and it takes us to my Chrome extension, and you can select the song and then hit download. And it's right there. So not very difficult.

So that was the Chrome extension, but I did say that I was going to be releasing this alternative tool. So what is this tool? Well, I'm tentatively calling it Browser Shark until I get sued. But I already bought the domain name, so I'm good.
Basically what it's going to do is, if you recall, earlier I mentioned that I had a method where I would copy any URL that matched the syntax of the song and then go retrieve it myself. Well, I decided, wouldn't it be cool if I could record all my traffic going through and then cache that in the browser? And what happens is now I can just go to Google and all my traffic will show up right here. And what you can do with this is you can actually analyze the hex of the request to make sure that you're not getting any malware, and it's nice enough to tell you what type of file it is. So if they're lying to you, you can tell. And like I said before, this would be really cool with forensics and stuff like that. And I'm planning on doing more coding so you can do a little bit more with the hex editor. So this is a tool that I'm going to be releasing, and I'll have a location to download it at the end of my PowerPoint.

So, things I learned. Downloading music is inconvenient. I found that after I had music I didn't know what to do with it, so actually now I honestly just use Spotify because I don't like having to deal with files. But services were fairly easy to exploit. I think with all the different services which I listed at the very beginning, I found exploits in them in three days total for all of them, and the hardest one was Grooveshark, which took me a whole day. Pandora actually was surprisingly easy. And it's actually impossible to completely secure streaming. Inherently, at some point you're going to have your music on my computer, and I own my computer, and even if you use encryption, you have to decrypt it so you can play it back, and at that point you could copy the files. So inherently you can't protect streaming.
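The "tell you what type of file it is" step of the forensics tool can be approximated by checking well-known magic bytes at the start of a response body. This is a minimal sketch of that idea and not the tool's actual implementation; the signature table is a small, deliberately incomplete sample, and the function names are my own.

```javascript
// A few well-known file signatures (magic bytes). Incomplete on purpose.
const SIGNATURES = [
  { type: "audio/mpeg (ID3 tag)", bytes: [0x49, 0x44, 0x33] },      // "ID3"
  { type: "audio/ogg", bytes: [0x4f, 0x67, 0x67, 0x53] },           // "OggS"
  { type: "image/png", bytes: [0x89, 0x50, 0x4e, 0x47] },           // "\x89PNG"
  { type: "application/zip", bytes: [0x50, 0x4b, 0x03, 0x04] },     // "PK\x03\x04"
  { type: "Windows executable", bytes: [0x4d, 0x5a] },              // "MZ"
];

// Guess the real type from the first bytes, ignoring what the server claims.
function sniffType(data) {
  const hit = SIGNATURES.find(({ bytes }) => bytes.every((b, i) => data[i] === b));
  return hit ? hit.type : "unknown";
}

// Render the first few bytes as hex for manual inspection.
function hexDump(data, limit = 16) {
  return Array.from(data.slice(0, limit), (b) => b.toString(16).padStart(2, "0")).join(" ");
}

// A body served as "audio/mpeg" that actually starts with an executable header.
const body = Uint8Array.of(0x4d, 0x5a, 0x90, 0x00);
console.log(hexDump(body), "->", sniffType(body));
```

If the sniffed type disagrees with the Content-Type the server sent, that is the "they're lying to you" case.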
And some things you should know. People have bad security. This is a shocker. And some people will patch their code, others will not. This is the beast of security, this is just the way it works. And the same web traffic logging will work with video streaming services too. Some of them, not all of them. People always ask me if Netflix will work. No, Netflix will not work. But if you go to some sketchy Chinese music streaming websites, I'm pretty sure this would work as well. But that's a topic for another day.

So I did a case study. Originally, the very first target I found, which, if you guys aren't familiar with it, is a British music streaming service. And I found the vulnerability and I emailed them. I got no response. I made this Chrome extension. I got no response. But apparently they were able to fix it without my help, so good on them. And these are some things I noticed after they fixed it.

They secured it heavily. They capped the bandwidth to match the playback speed, so if you wanted to download the whole music library, it wouldn't be possible, because it would take as long as it would take to play back all that music, which is years. So that's a good way to prevent people from stealing all your music. They also have one-time-use tokens. Like I said earlier, once your first request is made, your second request is no longer valid, so you can't get the music. I also tried to do this really weird, sketchy thing where I would make sure that my requests would get there at almost the exact same time. I got, like, 10 seconds on my fake stream, but then it would cut out, because they only allow one stream at a time. So that was pretty good, and I couldn't exploit this. And if you wanted to, it would take a huge amount of time. It really wouldn't be worth it. They have hundreds of lines of obfuscated code, and the bandwidth cap makes it so you can't really.

So, some mitigations using current technology. The one-time-use tokens are definitely the best way to stop people from, like I showed you before, right-clicking and opening a new tab and then saving it, because the second request won't work. I've also seen people use RTMPE streams. So this is Adobe's protocol which they use, and the E stands for encrypted. I just want everyone to know that just because your protocol has an E in it doesn't mean it's encrypted. I've seen many services which have used regular RTMP non-encrypted traffic and just put the E in the protocol here. And also returning the song in pieces really helps as well. So SoundCloud actually did this a couple days before DEF CON, but they named all the pieces in numerical order, so that doesn't make it any more difficult for me to put them back together. And for future-proofing, you can take a look at the HTML5 audio tag with DRM support. These guys from Virginia Tech wrote a paper on it. I haven't really looked much into it because I know inherently that nothing's going to work, but if someone's interested, that's there.

And so these are the references. I actually have uploaded the Browser Shark thing to the Google Play store, whatever, so that's the long URL. I made it a bitly, if you trust me enough to click on it. I assure you it's the same link. I'm also putting this project on GitHub because I want it to be open source. It's empty right now, but I'm going to be putting stuff on there in the next couple days, because I don't trust DEF CON Wi-Fi. And then there's my blog. I sometimes put interesting stuff on there, sometimes don't, no guarantees. And that's the paper and the JavaScript beautifier. So there's my contact information if you want to talk to me, and I'll take any questions now if anyone has any.

Yes? Okay, so the question is how I dealt with renaming in the exploit extension. What I did was I wrote a script which would hook on to the page, and I would use jQuery to take the file name and then the artist from the page itself, because they provide that information so the user knows. So I actually spent a good, like, several hours trying to synchronize all the songs together, but that's how I did it.

Yes? Because I don't like losing all my money to lawsuits. Apparently the way it's written is I can get sued for $20,000 per count of trafficking, and they have millions of songs, which would turn out to billions of dollars, which I don't have.

Yes? I tried doing this with Spotify. This guy in, I think, Norway or something did the same thing I was doing but happened to release the code for it several months before I gave this talk, so they fixed it. So it does not work with Spotify.

Yes? There is no DRM in any of these. I think Pandora did a better job of that. Some of them allow users to upload their own songs, and you will get really weird ads as your cover art. So it just depends on the service.

Yes? There is no DRM in any of these. These are just straight, like, you can play it back. Like, as you saw, I was playing it back in my music player when I downloaded it.

Any other questions? All right, cool.
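The one-time-use token mitigation from the talk can be sketched from the service's side as follows. This is a minimal in-memory illustration under my own assumptions: the class and method names are hypothetical, and the token generator uses Math.random only to keep the sketch short, where a real service would need a cryptographically secure source plus expiry.

```javascript
// Hypothetical in-memory store: each token is valid for exactly one stream request.
class OneTimeTokens {
  constructor() {
    this.live = new Map(); // token -> songId
    this.counter = 0;
  }

  // Hand out a fresh token when the player asks to stream a song.
  issue(songId) {
    // Sketch only: not a secure random source.
    const token = `${++this.counter}-${Math.random().toString(36).slice(2)}`;
    this.live.set(token, songId);
    return token;
  }

  // The first redemption succeeds and burns the token; any replay gets null.
  redeem(token) {
    if (!this.live.has(token)) return null;
    const songId = this.live.get(token);
    this.live.delete(token);
    return songId;
  }
}

const tokens = new OneTimeTokens();
const t = tokens.issue("song-42");
console.log(tokens.redeem(t)); // the player's own request
console.log(tokens.redeem(t)); // a copied second request
```

The second redeem call models the right-click-and-open-in-a-new-tab request, which arrives after the token has already been spent.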