Avatar for Tutoring
- Francis Quek, Professor, Center for HCI
- David McNeill, Emeritus Professor, Psychology, U of Chicago
- Yonca Haciahmetoglu
- Vishwas Kulkarni
- Yang Shi
Note: Past Project
Beyond the Talking Head and Animated Icon: Behaviorally Situated Avatars for Tutoring
One-on-one tutoring and instruction is a critical component of teaching and learning. Widespread access to such tutoring will facilitate the vision of universally available educational opportunities. In the highly spatial and contextually rich interaction between tutor and student, a key to facilitating learning is to provide a sense of situatedness between the ‘tutor’ and student. This situatedness is both temporal and spatial. Spatial situatedness is essential for deictic references into both the interlocutor’s space and any artifact of interaction (e.g. a graphical illustration of the tutorial content). This is especially critical in disciplines of study such as engineering, science and mathematics where space, diagrams and graphical representations are of crucial importance. The temporal relationship between gesture and gaze entities and their lexical affiliates in speech (along with prosodic information) provides information on the intentive focus of the tutorial discourse and the relationship between ‘foreground’ and ‘background’ information. We believe that these are important in assisting the student to grasp the substance of the tutorial. We posit that avataric embodiment will assist the student in the interaction. If the student has access only to a disembodied flying cursor, she will lose track of it the moment her attention is not fixed on the cursor. She will then have to search for the deictic point, expending valuable cognitive and attentional resources. Having access to the avatar’s body will help in directing the student’s attention and cueing her to the spatial presentation of the material. This will permit the student to devote more cognitive resources to the substance of the tutorial and not to the pragmatic task of trying to follow the disembodied cursor. This technology will facilitate broader access to one-on-one tutoring by permitting the tutor and student to be at different locations.
Such distribution will aid in realizing ‘on-demand’ tutoring, with a distributed tutor pool addressing the needs of a distributed pool of students as the need for help arises. Hence, the proposed technology enhances both the tutor’s reach and the student’s ability to acquire understanding.
Coupling rigorous engineering and psycholinguistic research, we propose to develop a behaviorally correct avatar for an on-line tutor. The goal is to facilitate an anywhere-tutor and anywhere-student system that requires modest computation and bandwidth resources at both ends. The theory of multimodal human communication suggests that we can provide such a conduit by capturing the tutor’s gestural intent on an LCD tablet in conjunction with her synchronized speech. Our understanding of human gesture, speech, and gaze behavior is applied to reconstruct a fully embodied tutor avatar. The avatar will be situated both temporally with respect to speech production and spatially with respect to the state of the student’s interaction space. The temporal situatedness is grounded in the ‘growth point’ concept, which advances a model of how multimodal conversational behavior coheres and is produced. Our hyperphrase model facilitates the scripting of tutorial conversational behavior into cohesive packages. The insight that the gestural beat serves as the basic temporal pulse of gestures that may be given fuller representational form across gestural spaces informs our approach of using pen-based interaction to provide the temporal synchrony of an animated avatar’s behavior with co-produced speech. We propose a system architecture for realizing these principles within a framework that permits both synchronous interaction and the scripting of discourse fragments. Our system employs a set of behavior templates that realize conversational hyperphrases. A library of such templates provides for various kinds of deictic gestures and a repertoire of gestures for tutorial exchange. A template is fired when the specifications gleaned from the tutor’s pen gestures and the state of the student’s environment satisfy its executing predicate.
As the selected behavioral entities proceed through the system, the template’s variables are bound to various temporal and spatial anchors to produce spatially and temporally situated behavior. We propose a set of experiments to test our system and determine its efficacy and realism.
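The template-firing step described above can be illustrated with a minimal Python sketch. It is not the project's implementation: every name in it (PenGesture, StudentState, BehaviorTemplate, Behavior, region_at, fire) is hypothetical, and the two example templates and the rectangle-based model of the student's interaction space are assumptions made only for illustration. A template fires when its predicate holds for the tutor's pen gesture and the student's state, and the resulting behavior is bound to a spatial anchor (a region on the student's display) and a temporal anchor (a time on the speech timeline).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (x0, y0, x1, y1), normalized to [0, 1]


@dataclass(frozen=True)
class PenGesture:
    """Hypothetical summary of one tutor pen event on the tablet."""
    kind: str                      # assumed labels, e.g. "tap" or "circle"
    position: Tuple[float, float]  # normalized tablet coordinates
    time: float                    # seconds on the speech timeline


@dataclass(frozen=True)
class StudentState:
    """Hypothetical snapshot of the student's interaction space."""
    visible_regions: Dict[str, Rect]  # region name -> bounding rectangle


@dataclass(frozen=True)
class BehaviorTemplate:
    """A named behavior guarded by an executing predicate."""
    name: str
    predicate: Callable[[PenGesture, StudentState], bool]


@dataclass(frozen=True)
class Behavior:
    """A fired template with its variables bound to anchors."""
    template: str
    target_region: str  # spatial anchor
    start_time: float   # temporal anchor


def region_at(pos: Tuple[float, float], state: StudentState) -> Optional[str]:
    """Return the name of the first visible region containing pos, if any."""
    x, y = pos
    for name, (x0, y0, x1, y1) in state.visible_regions.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return name
    return None


def fire(templates: List[BehaviorTemplate], gesture: PenGesture,
         state: StudentState) -> List[Behavior]:
    """Fire every template whose predicate is satisfied and bind its anchors."""
    region = region_at(gesture.position, state)
    if region is None:
        return []  # no spatial anchor available on the student's side
    return [Behavior(t.name, region, gesture.time)
            for t in templates if t.predicate(gesture, state)]


# Two illustrative deictic templates (assumed, not taken from the project).
TEMPLATES = [
    BehaviorTemplate("point_at", lambda g, s: g.kind == "tap"),
    BehaviorTemplate("encircle", lambda g, s: g.kind == "circle"),
]

state = StudentState({"diagram": (0.0, 0.0, 0.5, 0.5)})
fired = fire(TEMPLATES, PenGesture("tap", (0.25, 0.25), 1.5), state)
```

In this sketch, a tap inside the region named "diagram" yields a single point_at behavior anchored to that region at the time of the pen event; a pen event outside every visible region fires nothing.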
We see this technology as being complementary to the ‘virtual classroom’ or ‘lecture/presenter’ paradigm that has received ongoing research attention. In synchronous mode, our technology will facilitate access to live ‘distance tutoring’. Functioning in a ‘scripted mode’, the technology permits the preparation of tutorial material off-line to be used by multiple students. Transitioning between both modes, a live tutor may utilize prepared material and provide instant clarification and intimate real-time instruction to ensure proper understanding of the material.
This project is partially supported by the National Science Foundation ITR Program: Quek, F. (P.I.), David McNeill (Psych/Linguistics, U. Chicago–Co-PI) “Beyond the Talking Head and Animated Icon: Behaviorally Situated Avatars for Tutoring,” National Science Foundation, Information Technology Research NSF-ITR Program, NSF-IIS-0219875; and by ATR Japan: Quek, F. (P.I.) “Minimally Driven Avatars for Bunshin Interaction,” ATR, Japan, March 2000 – February 2001.