Where Did This Code Come From? Discovering the Provenance of Program Binaries

Uploaded on May 2, 2011

Google Tech Talk (more info below)
April 22, 2011

Presented by Nathan Rosenblum, UW-Madison

ABSTRACT

Where did this binary come from? How was it compiled? What language did the programmer choose? Who wrote this code? These questions rarely occur to most computer users, but for analysts working in forensics, reverse engineering, and software theft, they are of paramount importance. The provenance of a program binary --- the specific process through which an idea is transformed into executable code --- can provide valuable insight, yet it is in the very domains where such information would be most useful that it is least likely to be available. At the University of Wisconsin, we have investigated techniques to recover these provenance details from program binaries, filling in the gaps in the production process. Provenance recovery occupies the intersection of program analysis, security, and statistical machine learning research; in this talk, I will describe probabilistic models of provenance in the context of compiler toolchain identification and both closed- and open-world solutions to the difficult task of program authorship attribution: picking out stylistic characteristics of executable code that reveal the identity of the programmer. Our work integrates a range of machine learning techniques, from support vector machines to conditional random fields to metric learning and large-margin clustering. I will discuss how we leverage large-scale computing resources to solve scaling problems in model training and inference, and how our work on provenance recovery creates opportunities for research into the social structures of the underground malware economy.

Nathan Rosenblum is a doctoral candidate in the Computer Sciences department at the University of Wisconsin-Madison, under the supervision of Barton Miller. His research interests include systems, security, program analysis and machine learning, particularly when these areas collide. Nathan's current work focuses on discovering characteristics of programmer style in executable machine code. He sometimes remembers fondly the world outside of his office.
