This video shows a visual system developed to recognize arm gestures.
Also, I show the environment used to extract motion and posture features to
characterize gestures.
This system was part of my PhD thesis. It is based on OpenCV's
Camshift, although
it was modified to use an RGB (3-channel) histogram. The approach
starts by detecting the face of a user in front of the video camera using the cascade
face detector implemented in OpenCV.
Based on the work of Rehg and Jones,
I constructed "general" skin and non-skin probability functions, P(RGB|skin) and
P(RGB|non-skin),
by sampling more than 30 people under different environments with 4 video cameras
(around 2 million and 20 million pixels, respectively).
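As a rough illustration, a color probability function of this kind can be built as a normalized RGB histogram. The sketch below is not the original implementation: the bin count (32 per channel), the helper names, and the toy pixels are all assumptions made for the example.

```python
from collections import Counter

BINS = 32  # assumed bins per channel; the original bin count is not stated


def quantize(pixel, bins=BINS):
    """Map an 8-bit (R, G, B) pixel to its histogram bin (bins should divide 256)."""
    step = 256 // bins
    return tuple(channel // step for channel in pixel)


def build_color_model(pixels, bins=BINS):
    """Return a dict mapping each occupied bin to its relative frequency."""
    counts = Counter(quantize(p, bins) for p in pixels)
    total = sum(counts.values())
    return {bin_: n / total for bin_, n in counts.items()}


def probability(model, pixel, bins=BINS):
    """Look up P(RGB | class) for a pixel; colors never sampled get 0."""
    return model.get(quantize(pixel, bins), 0.0)


# Toy data standing in for the labeled skin pixels.
skin_pixels = [(220, 170, 150), (222, 172, 149), (200, 150, 130), (90, 60, 50)]
skin_model = build_color_model(skin_pixels)
```

The same function would be called a second time on the non-skin samples to obtain P(RGB|non-skin).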
We use anthropometric values to approximate the position of the user's right hand,
and then segment the hand by classifying
skin pixels using the rule:
P(RGB|skin) > P(RGB|non-skin).
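The decision rule above can be sketched as a per-pixel comparison that produces a binary mask. The lookup tables, the `lookup` helper, and the tiny 2x2 image are hypothetical values chosen only to show the rule; they are not data from the thesis.

```python
def is_skin(pixel, p_skin, p_non_skin):
    """Apply the rule P(RGB|skin) > P(RGB|non-skin) to one pixel."""
    return p_skin(pixel) > p_non_skin(pixel)


def segment(image, p_skin, p_non_skin):
    """Return a binary mask (list of rows): 1 where the pixel is classified as skin."""
    return [[1 if is_skin(px, p_skin, p_non_skin) else 0 for px in row]
            for row in image]


def lookup(table):
    """Wrap a dict of color -> probability as a function; unseen colors give 0."""
    return lambda pixel: table.get(pixel, 0.0)


# Hypothetical probability tables for three colors.
SKIN = {(220, 170, 150): 0.6, (120, 120, 120): 0.1}
NON_SKIN = {(220, 170, 150): 0.05, (120, 120, 120): 0.3, (30, 90, 200): 0.4}

image = [
    [(220, 170, 150), (30, 90, 200)],
    [(120, 120, 120), (220, 170, 150)],
]
mask = segment(image, lookup(SKIN), lookup(NON_SKIN))
```

Because the comparison is strict, a color that was never seen in either class (probability 0 in both tables) is labeled non-skin in this sketch.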
Unfortunately, due to the extreme lighting conditions of our environment, the colors
perceived by
the camera (Sony EVI-D30) had low saturation and the system was not able to
classify the skin of the hand.
I then decided to take some color samples from the user's face and combine
this information with
the general color function previously constructed. I tested different fusion rules
based on the classic expected value P(RGB|skin) = [V_g * P_g(RGB|skin)] + [V_p *
P_p(RGB|skin)], where P_g() is the general function, P_p() is the function built
from the user's face samples, and V_g + V_p = 1.
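To make the weighted rule concrete, here is a minimal sketch of the mixture with two different weight choices. The color labels and probabilities are invented for the example, and the function name `mix` is hypothetical.

```python
def mix(p_general, p_personal, v_g):
    """Weighted combination V_g * P_g + V_p * P_p, with V_p = 1 - V_g."""
    if not 0.0 <= v_g <= 1.0:
        raise ValueError("v_g must lie in [0, 1]")
    v_p = 1.0 - v_g
    colors = set(p_general) | set(p_personal)
    return {c: v_g * p_general.get(c, 0.0) + v_p * p_personal.get(c, 0.0)
            for c in colors}


# Invented skin models over three color bins, named for readability.
general = {"pale": 0.2, "tan": 0.7, "gray": 0.1}
personal = {"pale": 0.9, "gray": 0.1}

balanced = mix(general, personal, v_g=0.5)        # pale ~0.55, tan ~0.35
mostly_general = mix(general, personal, v_g=0.9)  # pale ~0.27, tan ~0.63
```

The result is still a valid distribution, but which color dominates depends entirely on the weights: "pale" wins with V_g = 0.5 and "tan" wins with V_g = 0.9.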
Unfortunately, this function failed, probably because of the need to select
good values for the parameters V_g and V_p.
To solve this problem I used the following rule to combine both histograms:
P_f(RGB|skin) = P_g(RGB|skin) * P_u(RGB|skin),
where P_g() is the probability function of the "general" color histogram
previously constructed and
P_u() is the function built from the colors taken from the user's face. This function
detected the skin well by applying:
P_f(RGB|skin) > P_f(RGB|non-skin).
P_f(RGB|non-skin) is constructed by fusing P_g(RGB|non-skin), the general
non-skin function, with
P_u(RGB|non-skin), built from colors taken from the user's torso.
An intuitive explanation of the behaviour of this rule is that it increases
the probability of the pixels on which both histograms agree,
while making less probable those on which they don't.
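A small numeric sketch of the product rule may help. All probabilities below are invented, the function names are hypothetical, and the products are left unnormalized, which is an assumption (the description does not say whether a normalization step was applied).

```python
def fuse(p_general, p_user):
    """Element-wise product P_g * P_u over the colors present in either model."""
    colors = set(p_general) | set(p_user)
    return {c: p_general.get(c, 0.0) * p_user.get(c, 0.0) for c in colors}


def classify(color, skin_g, skin_u, non_g, non_u):
    """Return True when P_f(color|skin) > P_f(color|non-skin)."""
    p_skin = fuse(skin_g, skin_u).get(color, 0.0)
    p_non = fuse(non_g, non_u).get(color, 0.0)
    return p_skin > p_non


# Invented models: "washed" is a low-saturation skin tone, "wall" a background color.
skin_general = {"washed": 0.05, "wall": 0.30}
non_general = {"washed": 0.10, "wall": 0.20}
skin_user = {"washed": 0.60}                  # sampled from the user's face
non_user = {"washed": 0.10, "wall": 0.50}     # sampled from the user's torso

general_only = skin_general["washed"] > non_general["washed"]
fused = classify("washed", skin_general, skin_user, non_general, non_user)
wall = classify("wall", skin_general, skin_user, non_general, non_user)
```

With these numbers the general models alone reject the washed-out color (0.05 < 0.10), while the fused models accept it (0.03 > 0.01). The "wall" color, which the face samples never contain, drops to a fused skin probability of 0 and is rejected.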
Here you can read more about this work and its application to gesture
recognition:
Visual Recognition of Similar Gestures
International Conference on Pattern Recognition, 2006.
H.H. Aviles-Arriaga, L.E. Sucar, C.E. Mendoza
Available at: http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1699081
Any comments and suggestions are welcome.
hector_hugo_aviles@hotmail.com
Are you recognizing the hand gestures? Is the tracking rectangle only looking for the gesturing hand?
girishthegreat 4 years ago
Hello. This video does not show the hand gestures we are considering, only the visual system. You can see them in this other video instead, named
Teleoperation of a mobile robot using gestures.
The yellow window tracks skin color only, which is why it can follow the left hand too.
I found this video interesting to present because people working with the visual system for the first time seemed to perceive it as if they were wearing a glove.
hectorhugoaviles 4 years ago