Rating is available when the video has been rented.
This feature is not available right now. Please try again later.
Published on May 15, 2016
Saurabh Gupta: Scene Understanding from RGB-D Images
In this talk, I will talk about detailed scene understanding from RGB-D images. We approach this problem by studying central computer vision problems like bottom-up grouping, object detection, instance segmentation, pose estimation in context of RGB-D images, and finally aligning CAD models to objects in the scene. This results in a detailed output which goes beyond what most current computer vision algorithms produce, and is useful for real world applications like perceptual robotics, and augmented reality. A central question in this work is how to learn good features for depth images in view of the fact that labeled RGB-D datasets are much smaller than labeled RGB datasets (such as ImageNet) typically used for feature learning. To this end I will describe our technique called "cross-modal distillation" which allows us to leverage easily available annotations on RGB images to learn representations on depth images. In addition, I will also briefly talk about some work on vision and language that I did on an internship at Microsoft Research.