Published on Jun 18, 2015
The tremendous recent progress on the ImageNet challenge has demonstrated the power of deep learning for 2D object recognition. In this talk, I will discuss how to go one step further and enable deep learning to embrace the full complexity of visual understanding. First, going beyond 2D, we propose to learn a 3D shape representation of objects using a convolutional deep belief network, which enables 3D shape completion and RGB-D object recognition. Second, going beyond a single object, we propose a holistic approach that feeds a scene image containing many objects to a network and obtains a significant performance improvement over the state of the art in scene recognition. As an application, I will demonstrate how to use this holistic approach for autonomous driving, replacing the typical huge, long, and complex pipeline with a single end-to-end network.
Jianxiong Xiao is an Assistant Professor in the Department of Computer Science at Princeton University. He received his Ph.D. from the Computer Science and Artificial Intelligence Laboratory (CSAIL) at the Massachusetts Institute of Technology (MIT). His research interests are in computer vision, with a focus on data-driven scene understanding. He is motivated by the goal of building computer systems that automatically understand visual scenes, both inferring the semantics (e.g. SUN Database) and extracting 3D structure (e.g. Big Museum). His work has received the Best Student Paper Award at the European Conference on Computer Vision (ECCV) in 2012 and the Google Research Best Papers Award for 2012, and has appeared in the popular press in the United States. Jianxiong was awarded the Google U.S./Canada Fellowship in Computer Vision in 2012, the MIT CSW Best Research Award in 2011, and two Google Research Awards, in 2014 and 2015. More information can be found at http://vision.princeton.edu.