We are happy to introduce Andrew Logan at DEF CON 30. His parents are here too, I want to mention his parents, and we are happy to have them. He will be talking about ghost helicopters, so please give a big round of applause for Andrew Logan.
Thank you all so much. It is an honor, and I'm so thrilled to be here at DEF CON, my first DEF CON. So I want to talk to you about tracking the military ghost helicopters of Washington, D.C. And I'll start by taking you on a tour of Washington, D.C., where you can see the Army National Guard practicing their rotor wash techniques over peaceful protesters, or the Park Police practicing the same techniques on ice skaters on the National Mall. If you want to spend a quiet day inside, well, if it's around inauguration, you can enjoy the Department of Energy sweeping for background radiation at 150 feet above your dwelling, back and forth, all day. And if that wasn't enough, we also have a top secret Army mission, and the Night Vision and Electronic Sensors Directorate is based in nearby Fort Belvoir, so they stay pretty busy even at night. These are just a few of the helicopter operations over the District, and so began my helicopter obsession. I was kept up late at night, finding little historical information the following day but finding a lot of Twitter users with the same questions, who could not identify a helicopter if they saw one.
When I looked at libraries that would help me work with aircraft data and Twitter, Python was an obvious choice. There are many flavors of Python, as you know, but as a novice coder, I decided to go with bad Python. Some of the tenets of bad Python: use an outdated version. God, I shouldn't tell you guys this. A command-line editor is always good. Don't consistently name your variables and functions, leave your API keys in every file, and use an absolute minimum of white space. This does help when you're debugging on your phone.
And with that, I built a basic ADS-B bot, not dissimilar from the famous ElonJet, originally using free data from the OpenSky API and mapping with Matplotlib. Now my wife has a master's in geography, and when I asked her what she thought of this map on the left, she said, looks good. She's a very generous person. Now we're using the Google Maps Static API and custom icons, and we use data from ADS-B Exchange, which is much better. I highly recommend ADS-B Exchange. Anybody in the audience know them? They don't filter military, and they don't filter private operators for profit.
And a little background on ADS-B, if you don't know: it's a transponder protocol that has been required in many of the busier airspaces since early 2020, and it's part of the FAA's NextGen plan to pack more flights into smaller spaces. It stands for Automatic Dependent Surveillance-Broadcast. Automatic in that it does not require the pilot's intervention. Dependent on a number of sensors. Surveillance in that it reports those sensor readings. Broadcast in that it transmits them unencrypted on 978 or 1090 megahertz for reception by ground stations like us. Well, it's actually intended for reception by the tower and rebroadcast to other planes, but we can receive it.
So at the bottom we have a chart on the evolution of transponder codes and their capabilities, as taken from this DOD and Government Accountability Office report on the risks to the military of using these transponders. So it's no surprise that in this report you'll also find the Code of Federal Regulations exemption from transponder use for agencies' missions for national defense, homeland security, intelligence, and law enforcement purposes. And I asked the Government Accountability Office directly, and this is how I got on my first watch list: who makes the decision as to whether a flight is sensitive or not? And they told me the FAA defers to the agency to make that designation.
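To make the point about unencrypted broadcasts concrete, here is a minimal sketch of reading the header fields of a single 1090 MHz extended squitter message. It is not the bot's code. The function name `parse_mode_s_header` is made up for illustration, the sample message is one commonly used in ADS-B decoding tutorials, and the sketch ignores the 978 MHz UAT format and does no parity checking.

```python
def parse_mode_s_header(hex_msg: str) -> dict:
    """Split a 112-bit Mode S extended squitter into its header fields."""
    raw = bytes.fromhex(hex_msg)
    if len(raw) != 14:
        raise ValueError("expected 28 hex characters (112 bits)")
    return {
        # The first 5 bits are the downlink format; 17 means ADS-B.
        "downlink_format": raw[0] >> 3,
        # The next 3 bits are the transponder capability field.
        "capability": raw[0] & 0b111,
        # Bytes 2 to 4 are the 24-bit ICAO address of the aircraft.
        "icao": raw[1:4].hex().upper(),
        # The first 5 bits of the payload are the message type code.
        "type_code": raw[4] >> 3,
    }


# Sample message often used in decoding tutorials (an assumption, not bot data).
sample = "8D4840D6202CC371C32CE0576098"
header = parse_mode_s_header(sample)
print(header)
```

Nothing in the message needs a key to read, which is why any ground station with a cheap receiver can identify the aircraft by its ICAO address.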
So as you can imagine, with the military exempting itself from transponders, it leaves very little for flight trackers in the DC area. What we normally get is in the triangle: medical, news gathering, state and county police, and Park Police. Everything else is a ghost helicopter for the purposes of this discussion. And so our flight data pyramid emerges, with ADS-B at the top with the highest data quality, and only in the DC area is it true that this provides the lowest data quantity when we're talking about helicopters. And it didn't feel right having a pyramid about government surveillance without this, so you have that.
Now, we can start to chip away at this pretty easily by looking at the President's daily schedule, which is public. We could scrape this with Beautiful Soup or Selenium, but they let you download a CSV, so we just do that every morning at 8 a.m., parse for relevant terms like Joint Base Andrews, where the helicopter flies to, and then convert those into cron jobs that tweet later in the day. And voila, you have automatic presidential arrivals and departures, unless they take the motorcade, which does happen. Now, this type of tracking of what's publicly announced is also useful for things like military night at the local ballpark.
And when you're surveilling the military, before we go any further, it's good to set some ground rules. In researching this, I relied on a talk by the National Geospatial Programs Office on how they make the decision to redact data from public data sets. So they consider first, is this data potentially useful to an adversary? And second, is this data unique? Then they weigh the risks and benefits of redacting that information from the public. They note that efforts to safeguard useful information that is readily available through open sources or observation are unlikely to reduce vulnerabilities. On the other side, we have the basic tenets of good journalism.
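The schedule-to-cron step described above might look something like the following sketch. Everything specific here is an assumption made for illustration: the CSV column names `time` and `description`, the timestamp format, the sample rows, and the `tweet_event.py` script are all invented, since the talk does not show the real file layout or code.

```python
import csv
import io
import shlex
from datetime import datetime

# Terms worth tweeting about; the talk gives Joint Base Andrews as an example.
KEYWORDS = ("joint base andrews",)


def schedule_to_cron(csv_text: str, command: str = "python3 tweet_event.py") -> list:
    """Turn matching schedule rows into crontab lines (hypothetical columns)."""
    cron_lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        description = row["description"]
        if not any(term in description.lower() for term in KEYWORDS):
            continue
        when = datetime.strptime(row["time"], "%Y-%m-%d %H:%M")
        # Crontab field order: minute, hour, day of month, month, day of week.
        cron_lines.append(
            f"{when.minute} {when.hour} {when.day} {when.month} * "
            f"{command} {shlex.quote(description)}"
        )
    return cron_lines


# Invented sample rows, only to show the shape of the input.
SAMPLE = """time,description
2022-08-12 10:00,The President receives the daily briefing
2022-08-12 14:35,The President departs the White House en route to Joint Base Andrews
"""

cron_lines = schedule_to_cron(SAMPLE)
for line in cron_lines:
    print(line)
```

Only the second row matches, so the sketch emits a single crontab line scheduled for 14:35 on that date. A real version would also need to remove yesterday's entries and handle schedule changes.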
I try not to do things automatically, ADS-B bot aside; to provide proof of user-specific interest whenever possible, thus establishing the public interest; and to source good data and link to primary sources whenever possible. A 1997 court case defined what a journalist is for the purposes of journalistic shields, and our use case is a little different, but I find it informative. So they say a journalist is engaged with the intent to use material sought, gathered or received to disseminate information to the public, and that such intent existed at the inception of the news gathering process. All of this is to say, as long as there is no smoky back room discussion where we're trading helicopter secrets with no intent of surfacing this information to the public, I should be within my First Amendment rights for the duration of this talk.
And I do a lot of reporting on helicopters and often the military. These are two of the most infamous, or famous, operators; they're the most popular operators in the DC area. The Air Force First Helicopter Squadron is responsible for VIP evacuations, but is well known for flying very low and making emergency landings on local playgrounds in these Vietnam-era Hueys. And on the right, we have the Army 12th Aviation Battalion, which is responsible for transporting three- and four-star generals to and from the Pentagon, as well as the aforementioned night flights.
Now, I want to try an exercise at this point. The event on the left was a diversity recruiting event, and we recently saw the operator on the right doing an honorary military flyover at Arlington Cemetery. I'd like you to raise your hand if you believe either of those would require an exemption from transponder use as a sensitive flight. Oh my God. Okay, that's what I thought too, but no transponder on either of those. So it would be easy to look at what we're doing so far on Twitter and say, well, you have an enormous data problem if you can't see these aircraft.
But I wanted to approach this differently. I thought, if we have hundreds or thousands of people looking at our Twitter feed on their phone, which is sensor-rich and has a high-quality camera, and we could just get them to square their hips and raise the phone above their head and crane their neck back, well, we could solve the data problem. And it might actually have health benefits too. So in October 2020, we started CopterSpotter, which is a platform for DC residents to self-report helicopter activity over their location. They do this with a couple of pieces of data: the time, date, and geolocation provided by Twitter, and either a photo, a video, or the type of helicopter if they know it.
Now, this is a typical copter spot on the right, and you all could be forgiven for thinking that this is a Black Hawk helicopter carrying armaments over a civilian population. These are, in fact, auxiliary fuel tanks, but it is a common concern amongst DC residents. And our bot retweets a spot like this with a map, a corresponding icon, and probable operators. In this case, for a Black Hawk helicopter, there are several. And this creates kind of a rolling feed of helicopter activity throughout the area.
And I'm sure you all would say, that's great, but how do you get people to actually do this? Well, you make it a game. And here's a recent leaderboard. Our users frequently joke that one day this could be tradable as copter coin, a pseudo-currency that can only be earned by taking photos of helicopters. It puts it much more in the tangible realm compared to a lot of cryptocurrencies.
Quickly, I want to tell you how Twitter handles location tags now. It used to be that you could tag a specific latitude and longitude on a map. Since 2019, they've instead favored Foursquare locations, which give you a bounding box, by default your city, which is too large for our purposes.
So we ask users to tag a specific park or building, and then we make sure that its bounding box is no wider than two miles in either dimension. Here is just one of a few problematic locations, a park where the bounding box far exceeds the park's bounds. And once you have all of this location data, you can start to solve some of those problems. This is the CopterSpotter map by my colleague Sam Reese, and it lets you filter our dataset by time, date, and operator. So you can find out who that helicopter was over you last night, last week, or last month, even if it was the military. This also lets you submit directly to our dataset via a form, in case you don't want to give your Twitter location to whoever our Twitter overlord may be by the time you're all watching this on video.
So residents really took this project to heart over the pandemic. One of our spotters, Cassie, told us that it made her feel like she was included in a group that she didn't even know existed. And I recently found out how that felt here at my first DEF CON, so thank you all. Don Beyer, one of the members of the Congressional Quiet Skies Caucus, recognized my efforts, and we even got a spot from a guy in a Black Hawk helicopter. That's pretty cool. Not everyone was so excited, but when someone calls you an anarchist and says the FBI needs to be watching you because of what your project is doing, everyone in this room knows you're on the right track.
And we collected over 10,000 copter spots in 2021. At the top you can see the small portion that would normally be available with ADS-B data. The gray area is the unknowns that remained after we looked at the photos and tried to correlate operators to all of them. And then 50% of our data is military operators. We do acknowledge a shortcoming with our dataset, which is that the spots are voluntarily submitted, so they do not represent a measured random sample. However, you can reasonably expect that the actual number of events is higher than what is being reported.
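The two-mile bounding box rule mentioned earlier can be sketched as a simple great-circle check. This is an illustration, not the project's code: the function names, the argument order, and the example coordinates (rough boxes around a park-sized area and around the whole District) are assumptions.

```python
import math

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in statute miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def box_is_precise_enough(west, south, east, north, max_miles=2.0):
    """True if the box is at most max_miles wide and at most max_miles tall."""
    mid_lat = (south + north) / 2
    width = haversine_miles(mid_lat, west, mid_lat, east)   # east-west extent
    height = haversine_miles(south, west, north, west)      # north-south extent
    return width <= max_miles and height <= max_miles


# Illustrative coordinates only: a park-sized box and a city-sized box.
park_ok = box_is_precise_enough(-77.05, 38.885, -77.03, 38.895)
city_ok = box_is_precise_enough(-77.12, 38.79, -76.91, 39.00)
print(park_ok, city_ok)
```

A spot tagged with the city-sized box would be rejected, or the user asked to tag something more specific, while the park-sized box passes.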
We did notice a drop-off, however, over several months. Unique users were going down and unknown spots were going down as well. So people were getting better at identifying helicopters, but we were serving more of a core group of spotters. And I wanted this to engage the whole DC area and serve them, and I looked for ways to use the existing data to that end. And of course, once you have thousands of photos of helicopters submitted from the user's perspective, the next step's pretty obvious. You create a computer vision program to automatically identify the helicopters. These are two V-22 Ospreys, part of the Marine Presidential Squadron, and our computer vision program identified these with 73.9% confidence when selecting from 19 distinct operators. If the user also includes location, we add them to the dataset and map it, as we did with this Coast Guard MH-65, part of the Blackjack Flight Interdiction Program that keeps small planes out of the NCR.
This is done with the aid of Roboflow, which is a great startup and has been a great partner for us. They help beginners and small teams get started with computer vision, they have amazing tools for collaborative annotation, and they can even train and deploy your model nearly code-free. They also have Roboflow Universe, a repository of open source models that you can use for your next project, and our data is included. So I look forward to seeing all of your helicopter identification programs soon.
So with that, there's one box missing from our flight data pyramid here, and it has to do with the fact that visual identification is not that usable at night. And so we looked for someone who has more data that we could access, and we started capturing radio calls. Now, we thought about doing this with LiveATC, but it turns out separating by silence is not quite the same as squelch on a radio if you're trying to get individual calls. So instead, we deployed a Raspberry Pi on a volunteer's balcony near DCA.
He knows what we're doing. And we use the excellent open source app RTLSDR-Airband to capture individual radio calls, upload them to an FTP server, and tweet them from a sub-account called UFOs of DC. They're unidentified, but you get to identify them, and someone did below, noting the call signs, waypoints, routes and zones that were mentioned in the radio call. Now, the main account picks this up, builds a map of the described features, adds operator information, and then uses FFmpeg to tie the original audio file to this image to create a video, in what I think is a really powerful tool for citizen journalists to create stories from a difficult-to-use data set.
Quickly, I wanna show you guys our Raspberry Pi. This was a very early version. We've since upgraded to an Airspy HF+. And the main thing I learned is that you cannot trust the Raspberry Pi's built-in Wi-Fi at all. So instead, we use a GL.iNet router, and the GoodCloud interface gives you access a layer above the Pi, and you can use something like this Shelly command-line-accessible switch, accessed from the router, to hard reset the Pi in the event that it crashes. And with the completion of that box, we've completed our flight data pyramid. Yay.
Where do we go from here? Well, I'd like to look at connecting disparate data sources, whether that is the DC police tweeting an incident report that includes location and tying that to ADS-B data to say this helicopter's above you right now and this is what they're looking for. Similarly, I'd like to look at news gathering helicopters and scrape local news sites just for the trending stories and identify aerial photos with computer vision, so that we can say this is likely what they're over your house for. We're elevating helicopter air-to-air radio directly to the main feed as well. This is often two news gathering helicopters talking about how they're covering a story, or two police agencies talking about an active pursuit.
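The step of tying a radio call's audio to a still map image can be sketched as below. The talk only says FFmpeg is used; the specific flags, the file names, and the helper names `build_ffmpeg_command` and `render_video` are assumptions, and this is one common way to do it rather than the project's actual command.

```python
import subprocess


def build_ffmpeg_command(image_path: str, audio_path: str, output_path: str) -> list:
    """Build an ffmpeg call that shows one still image for the audio's duration."""
    return [
        "ffmpeg", "-y",                    # overwrite the output if it exists
        "-loop", "1", "-i", image_path,    # repeat the still image as video frames
        "-i", audio_path,                  # the captured radio call
        "-c:v", "libx264", "-tune", "stillimage",
        "-pix_fmt", "yuv420p",             # widely compatible; needs even image dimensions
        "-c:a", "aac",
        "-shortest",                       # stop when the audio ends
        output_path,
    ]


def render_video(image_path: str, audio_path: str, output_path: str) -> None:
    """Run ffmpeg; raises CalledProcessError if encoding fails."""
    subprocess.run(build_ffmpeg_command(image_path, audio_path, output_path), check=True)


# Hypothetical file names, shown without running ffmpeg.
cmd = build_ffmpeg_command("call_map.png", "call.mp3", "call.mp4")
print(" ".join(cmd))
```

Building the argument list separately from running it makes the command easy to log and check before handing it to `subprocess.run`.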
And I'm interested in noise monitoring. The DCA airport has a noise monitoring program, but it's only along the river and it's not publicly accessible via API. But I'm really curious to ask you guys: how would you apply something like this to crowdsource a difficult-to-use data set, or give citizen journalists the power to make stories from that data? And we have a demonstration this weekend. If you go to Helicopters of LV on Twitter, you will see the ADS-B bot in action, mostly covering sightseeing helicopters over the Las Vegas region, but you can tweet your photos at it for automatic identification. And I'm gonna hang out at 5 p.m. right outside Caesars by the sign, and if you wanna come do five minutes of copter spotting, say hello, maybe look at code and give me your ideas, I would love to hear them. Thank you all so much.