A test of the Helsinki University of Technology (TKK) WorkPartner robot and its Spatial Information Interface. The user asks where an object is located, and the robot replies using speech, a virtual environment model (shown top left) and the robot's camera image (shown top right).
The ball is recognized using OpenCV ellipse fitting. The speech commands are recognized using the CMU Sphinx-2 speech recognition system. The virtual environment model uses the SimPartner software, which is based on the Open Dynamics Engine (ODE). The speech responses are generated using the Festival speech synthesis system.
The scenario steps are:
1. The user gives a spoken request: "Where is ITEM?". The ITEMs used here are ball, box and hammer.
2. The robot responds with one of: a) "ITEM is in LOCATION." b) "I don't know where is ITEM?" c) "I did not understand, can you repeat?"
3. The robot points out the ITEM in the camera image and in the virtual environment model.
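The request/response logic in steps 1 and 2 can be sketched roughly as follows. This is only an illustration and not the WorkPartner software: it works on text that has already been recognized, and the function name `answer`, the `world_model` dictionary and the location names are all hypothetical. The pointing in step 3 is not covered.

```python
import re

# Items used in the scenario described above.
KNOWN_ITEMS = {"ball", "box", "hammer"}

# Matches "Where is ITEM?" with an optional "the" and an optional question mark.
REQUEST = re.compile(r"\s*where\s+is\s+(?:the\s+)?(\w+)\s*\??\s*", re.IGNORECASE)


def answer(utterance, world_model):
    """Return one of the three replies from step 2.

    world_model is a hypothetical dict mapping an item name to a location
    name; an item missing from it counts as "location unknown".
    """
    match = REQUEST.fullmatch(utterance)
    if match is None or match.group(1).lower() not in KNOWN_ITEMS:
        return "I did not understand, can you repeat?"

    item = match.group(1).lower()
    location = world_model.get(item)
    if location is None:
        return f"I don't know where is {item}?"
    return f"{item} is in {location}."


if __name__ == "__main__":
    # Hypothetical world state, for illustration only.
    world = {"ball": "corridor"}
    for request in ("Where is ball?", "Where is the hammer?", "Open the door"):
        print(request, "->", answer(request, world))
```

Keeping the three replies as fixed templates means the same strings can be passed unchanged to a speech synthesizer.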
More info:
http://automation.tkk.fi/SpacePartner
@pithikoulis Our software philosophy is to have everything as plug-and-play as possible in Linux. So the speech recognition software also had to be available in the common Debian/Ubuntu package repositories (no manual downloading/compiling required). Without this requirement, Sphinx4 would definitely have been the choice!
yebbey 11 months ago
Why do you use sphinx2 instead of sphinx4?
pithikoulis 11 months ago