Looking at People

Uploaded on Sep 9, 2008

Google Tech Talks
September 8, 2008

ABSTRACT

There is a great need for programs that can describe what people are doing from video. This is difficult to do, because it is hard to identify and track people in video sequences, because we have no canonical vocabulary for describing what people are doing, and because phenomena such as aspect and individual variation greatly affect the appearance of what people are doing. Recent work in kinematic tracking has produced methods that can report the kinematic configuration of the body fairly accurately and fully automatically.

The problem of vocabulary is more difficult. I will discuss a generative activity model that allows activities to be assembled from a set of distinct spatial and temporal components. The models themselves are learned from labelled motion capture data and are assembled in a way that makes it possible to learn very complex finite automata without estimating large numbers of parameters. The advantage of such a model is that one can search videos for examples of activities specified with a simple query language, without possessing any example of the activity sought. In this case, aspect is dealt with by explicit 3D reasoning.
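
To make the compositional idea concrete, here is a minimal sketch, not the speaker's actual model. It assumes two hypothetical per-body-part automata (`LEGS`, `ARMS`) with invented state names, combines them into a joint automaton, and checks a simple query path against it. The real models are generative and learned from motion capture data; this sketch is deterministic and hand-written, and only illustrates why composition keeps the number of specified transitions small while the joint automaton grows large.

```python
from itertools import product

# Hypothetical per-body-part automata (names and transitions are invented
# for illustration): each maps a state to the set of allowed next states.
LEGS = {
    "stand": {"stand", "walk"},
    "walk": {"walk", "stand", "run"},
    "run": {"run", "walk"},
}
ARMS = {
    "rest": {"rest", "wave"},
    "wave": {"wave", "rest"},
}


def compose(*parts):
    """Build the product automaton in which each part moves independently."""
    joint = {}
    for state in product(*(part.keys() for part in parts)):
        next_sets = [part[s] for part, s in zip(parts, state)]
        joint[state] = set(product(*next_sets))
    return joint


def transition_count(automaton):
    """Count the transitions in an automaton."""
    return sum(len(next_states) for next_states in automaton.values())


def accepts(automaton, path):
    """Return True if every consecutive pair of states in path is allowed."""
    if not path or path[0] not in automaton:
        return False
    return all(b in automaton[a] for a, b in zip(path, path[1:]))


joint = compose(LEGS, ARMS)
specified = transition_count(LEGS) + transition_count(ARMS)
print(len(joint), "joint states;", specified, "transitions specified;",
      transition_count(joint), "joint transitions")

# A query is just a sequence of joint states, written without any example video.
query = [("stand", "rest"), ("walk", "wave"), ("run", "wave")]
print(accepts(joint, query))
```

The component automata here specify 11 transitions in total, while the composed automaton has 6 states and 28 transitions; the gap widens quickly as more body parts or states are added.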

An alternative strategy for dealing with aspect and individual variation is to build discriminative methods applied to appearance features. The difficulty here is that activities look different when seen from different directions. I will describe recent methods that make it possible to transfer models: that is, to learn a model of an activity from one view, then recognize it in a completely different view.

Speaker: David Forsyth
David Forsyth holds a BSc and an MSc in Electrical Engineering from the University of the Witwatersrand, Johannesburg, and an MA and D.Phil from Oxford University. He is currently a full professor at U. Illinois Urbana-Champaign, having served 10 years on the faculty at UC Berkeley. He has published over 100 papers on computer vision, computer graphics and machine learning. He served as program co-chair for IEEE Computer Vision and Pattern Recognition in 2000, general co-chair for CVPR 2006, program co-chair for ECCV 2008, and is a regular member of the program committee of all major international conferences on computer vision. He has received best paper awards at the International Conference on Computer Vision and at the European Conference on Computer Vision, and an IEEE Technical Achievement award.
His recent textbook, "Computer Vision: A Modern Approach" (joint with J. Ponce and published by Prentice Hall) is now widely adopted as a course text.

Category:

Science & Technology

License:

Standard YouTube License

All Comments (5)

  • Maybe one could establish compatibility of moving body segments by modelling the dynamics. Understanding how the body parts are exerting force and torque on each other is a well understood problem in robotics. I don't know how difficult it would be to apply it to the vision domain though.

  • fucking hell

  • oh yeah, that's just what we need.

  • amazing talk!
