Conveying Routes: Multimodal Generation and Spatial Intelligence in Embodied Conversational Agents

Abstract

Creating an embodied conversational agent (ECA) capable of conveying routes requires understanding how to present spatial information effectively and naturally. When giving directions, a person draws on multiple modalities (e.g., speech, gestures, and reference to a map), and it is important to know precisely how these modalities are coordinated. With an understanding of how humans present spatial information when giving directions, it becomes possible to create an ECA with similar capabilities. Two empirical studies were carried out to observe natural human-to-human direction-giving interactions. From the results, a direction-giving model was created and then implemented in the MACK (Media Lab Autonomous Conversational Kiosk) system.
