Large image databases and small codes for object recognition

Uploaded on May 15, 2008

Google Tech Talks
May 8, 2008

ABSTRACT

With the advent of the Internet, billions of images are now freely available online and constitute a dense sampling of the visual world. Using a variety of non-parametric methods, we explore this world with the aid of a large dataset of 79,302,017 images collected from the Web. Motivated by psychophysical results showing the remarkable tolerance of the human visual system to degradations in image resolution, the images in the dataset are stored as 32x32 color images. Each image is loosely labeled with one of the 75,062 non-abstract nouns in English, as listed in the WordNet lexical database. Hence the image database gives comprehensive coverage of all object categories and scenes. The semantic information from WordNet can be used in conjunction with nearest-neighbor methods to perform object classification over a range of semantic levels, minimizing the effects of labeling noise. For certain classes that are particularly prevalent in the dataset, such as people, we are able to demonstrate a recognition performance comparable to class-specific Viola-Jones style detectors.
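As a rough illustration of classifying "over a range of semantic levels", the sketch below lets each nearest neighbor vote for its own noisy label and for every ancestor of that label in a hierarchy. It is not the speaker's implementation: the tiny `PARENT` tree, the 4-value vectors (standing in for 32x32 color images), the sum-of-squared-differences distance, and all function names are assumptions made for the example.

```python
from collections import Counter

# Hypothetical toy hierarchy (child -> parent). The talk uses WordNet instead.
PARENT = {
    "siamese": "cat", "tabby": "cat", "beagle": "dog",
    "cat": "animal", "dog": "animal",
    "sedan": "car", "car": "vehicle",
}

def ancestors(label):
    """Return the label followed by all of its ancestors up to the root."""
    chain = [label]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def ssd(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def classify(query, dataset, k, candidates):
    """Pick the candidate node with the most votes from the k nearest neighbors.

    Each neighbor votes for its own label and for every ancestor, so labels
    that disagree at the leaf level can still agree higher up the tree.
    """
    neighbors = sorted(dataset, key=lambda item: ssd(query, item[0]))[:k]
    votes = Counter()
    for _, label in neighbors:
        votes.update(ancestors(label))
    best = max(candidates, key=lambda c: votes[c])
    return best, votes

# Toy data: (feature vector, loosely assigned leaf label).
dataset = [
    ([0.9, 0.8, 0.1, 0.1], "siamese"),
    ([0.8, 0.9, 0.2, 0.1], "tabby"),
    ([0.7, 0.7, 0.3, 0.2], "beagle"),
    ([0.1, 0.1, 0.9, 0.8], "sedan"),
    ([0.2, 0.1, 0.8, 0.9], "sedan"),
]
query = [0.85, 0.85, 0.15, 0.1]

coarse, votes = classify(query, dataset, 3, ["animal", "vehicle"])
fine, _ = classify(query, dataset, 3, ["cat", "dog"])
print(coarse, fine, votes["animal"])
```

In this toy run the three nearest neighbors carry three different leaf labels, yet all three agree at the "animal" level, which is the sense in which voting higher in the hierarchy can absorb labeling noise.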

In the second part of the talk, we present efficient image search and scene matching techniques that are not only fast, but also require very little memory, enabling their use on standard hardware or even on handheld devices. Our approach uses the Semantic Hashing idea of Salakhutdinov and Hinton, based on Restricted Boltzmann Machines, to convert the Gist descriptor (a real-valued vector that describes orientation energies at different scales and orientations within an image) to a compact binary code, with a few hundred bits per image. Using our scheme, it is possible to perform real-time searches on our Internet image database using a single large PC and obtain recognition results comparable to the full descriptor. Using our codes on high-quality labeled images from the LabelMe database gives surprisingly powerful recognition results using simple nearest-neighbor techniques.
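To make the memory and speed argument concrete, here is a minimal sketch of searching with compact binary codes. It deliberately swaps in random-hyperplane sign hashing for the RBM-based Semantic Hashing described in the talk, so it shows only the search side (XOR plus bit counting), not how the codes are learned. The 256-bit code length, the one-byte-per-channel storage, the 8-value descriptors, and all names are assumptions chosen for illustration.

```python
import random

N_IMAGES = 79_302_017                 # dataset size quoted in the abstract
RAW_BYTES = N_IMAGES * 32 * 32 * 3    # assumes 1 byte per color channel
CODE_BITS = 256                       # assumed; the abstract says "a few hundred bits"
CODE_BYTES = N_IMAGES * CODE_BITS // 8

def make_projections(dim, n_bits, seed=0):
    """Random hyperplanes: a stand-in for a learned encoder, not an RBM."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def encode(vec, projections):
    """Pack one sign bit per projection into a single Python int."""
    code = 0
    for row in projections:
        dot = sum(w * x for w, x in zip(row, vec))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code

def hamming(a, b):
    """Number of differing bits between two codes."""
    return bin(a ^ b).count("1")

def search(query_code, codes, k):
    """Indices of the k stored codes closest to the query in Hamming distance."""
    order = sorted(range(len(codes)), key=lambda i: hamming(query_code, codes[i]))
    return order[:k]

# Toy descriptors standing in for Gist vectors.
rng = random.Random(1)
descriptors = [[rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(50)]
projections = make_projections(dim=8, n_bits=32)
codes = [encode(d, projections) for d in descriptors]

top = search(codes[7], codes, k=5)
print(RAW_BYTES / 1e9, CODE_BYTES / 1e9, top)
```

Under these assumptions the raw 32x32 pixels occupy roughly 244 GB, while 256-bit codes for the same images occupy roughly 2.5 GB, which is small enough to hold in memory on a single machine.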

This talk will be taped

Speaker: Rob Fergus
Rob Fergus is an Assistant Professor of Computer Science at the Courant Institute of Mathematical Sciences, New York University. Originally from the UK, he has an undergraduate degree in Electrical Engineering from the University of Cambridge. He then did a Master's in Electrical Engineering with Prof. Pietro Perona at Caltech, before completing a PhD with Prof. Andrew Zisserman at the University of Oxford. Before coming to NYU, he spent two years as a post-doc in the Computer Science and Artificial Intelligence Lab (CSAIL) at MIT, working with Prof. William Freeman.
