Research Internship: Multimodal HCI

Keywords: Eye-Gaze, Speech Recognition, Machine Translation, Mutual Disambiguation
Duration: May 21 to July 13, 2013

Guides: Srinivas Bangalore, Principal Researcher, AT&T Labs, New Jersey
              Michael Carl, Associate Professor, CRITT, Copenhagen Business School

Description: During my third-year summer internship at the Center for Research and Innovation in Translation and Translation Technology (CRITT), Copenhagen Business School, Denmark, I worked on the project Speech and Eye-Tracking Enabled Computer Assisted Translation (SEECAT) under the guidance of Michael Carl and Srinivas Bangalore. Of the four teams, I worked in the Eye-Gaze team with two other members.

Our research paper on this work, titled 'Mutual Disambiguation of Eye Gaze and Speech for Sight Translation and Reading', was accepted at the workshop on Eye Gaze in Intelligent Human Machine Interaction: Gaze in Multimodal Interaction, ICMI 2013, Sydney, Australia. I also presented a poster on this work at TechEvince, the annual research exhibition of IIT Guwahati.

Poster presented during TechEvince




Video presentation at the ICMI conference
In the first part of the project, we worked on a new approach to word-gaze fixation mapping that better handles the noise caused by system errors (drift, etc.) during gaze tracking. In the second part, we studied how the two modalities, speech and eye gaze, can be used to reduce each other's errors in the context of sight translation and reading. To collect raw data, we first conducted English reading experiments with 6 participants and sight translation experiments with 4 participants, translating from English into Hindi, Spanish, Danish and Italian. A finite-state machine approach was used to integrate the lattices of the two modalities. In the reading task, we obtained a significant improvement in both eye-gaze F-measure and speech word accuracy.
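For readers unfamiliar with the two evaluation measures named above, here is a minimal Python sketch of how they are commonly computed. It uses the standard textbook definitions (F-measure as the harmonic mean of precision and recall, word accuracy as one minus the word-level edit-distance error rate) and is not the project's evaluation code. The function names and the representation of a fixation mapping as (fixation index, word index) pairs are assumptions made for illustration.

```python
def f_measure(predicted_pairs, gold_pairs):
    """F-measure of a fixation-to-word mapping against a gold mapping.

    Each mapping is an iterable of (fixation_index, word_index) pairs;
    this pair representation is an assumption for illustration.
    """
    predicted, gold = set(predicted_pairs), set(gold_pairs)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def word_accuracy(reference, hypothesis):
    """Word accuracy = (N - S - D - I) / N, with N the reference length.

    S + D + I is taken as the word-level Levenshtein distance between
    the reference transcript and the recognizer hypothesis.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    errors = prev[-1]
    return (len(ref) - errors) / len(ref)
```

Note that word accuracy defined this way can be negative when the hypothesis contains many insertions, which is the usual behaviour of this measure in speech recognition evaluation.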



During the first 3 weeks of the internship, we had guest lectures and practical sessions on SMT, ASR, GUI and eye tracking by experts and researchers in these fields, such as Philipp Koehn. Over the course of the internship I also gained exposure to tools such as Moses, Sphinx, Translog, OpenFst, LaTeX, GIZA++ and PuTTY, and to programming languages such as Python and R. I worked extensively with the Tobii T60 eye tracker.
Work-Flow