More information on each package is available in the corresponding module's folder.
- Kinect2StreamsRecorder
- Records all Kinect V2 streams at 30 fps.
- Gesture and Action Recognition Module
- The Gesture Recognition module recognizes the gestures and actions performed by children, both in real time and in offline tasks.
- Distant speech recognition module
- The Distant Speech Recognition module performs real-time, continuous recognition of children's speech; it can also be used in offline tasks.
- AudioVisual Diarization module
- The audiovisual diarization module receives input from the speaker localization module and the Kinect API to both improve localization accuracy and perform speaker diarization.
- Speaker localization module
- The speaker localization module estimates the position/angle of a speaker so that the robot can face him/her while talking.
- 3D Object Tracking Module
- The Object Tracking module estimates the 6-DoF poses (positions and orientations) of a number of movable objects that the children are expected to interact with.
- Wizard-of-Oz
- Wizard-of-Oz interface that lets the system operate in three modes: autonomous, semi-autonomous, and manual.
- Visual Emotion Recognition module
- The visual emotion recognition module recognizes a child's emotion from his/her body posture and facial expressions.
- Engagement Detection Module
- Visual child engagement estimation.
- Speech-based Emotion recognition
- The speech-based emotion recognition module recognizes a child's emotion from his/her speech.
- Text-based emotion recognition
- Emotion recognition based on the child's speech (text).
- Text-based cognitive state recognition
- Cognitive state recognition based on the child's speech (text).
- Object Detection
- Visual object recognition based on the TensorFlow API.
- Multimodal Semantic Networks
- Takes multimodal cues as input and outputs a list of the most similar objects and a multimodal representation for each object.
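
To illustrate what "a list of the most similar objects" could look like in practice, here is a minimal sketch that ranks objects by cosine similarity between a query vector and stored object vectors. This is not the module's actual implementation: the function names (`cosine`, `most_similar`), the object names, the vector values, and the choice of cosine similarity are all assumptions made for illustration. See the module's folder for the real interface.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(query, objects, top_k=3):
    """Return the top_k (name, score) pairs, highest similarity first."""
    scored = [(name, cosine(query, vec)) for name, vec in objects.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical multimodal representations, one vector per object.
objects = {
    "ball": [0.9, 0.1, 0.0],
    "cube": [0.1, 0.9, 0.0],
    "car": [0.7, 0.2, 0.6],
}

# Hypothetical query vector built from the observed multimodal cues.
query = [1.0, 0.0, 0.0]
ranking = most_similar(query, objects, top_k=2)
print(ranking)
```

With these made-up vectors, `ball` ranks first and `car` second, since their vectors point most nearly in the same direction as the query.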