Agent-based Gesture Tracking


Human Tracking using Agent-Based Architecture

Project Team

  • Dr. Francis Quek
  • Dr. Yong Cao, Assistant Professor, Dept. of Computer Science
  • Seung-In Park, Graduated PhD Student
  • Xiao Lin
  • Chao Peng, Graduated PhD Student
  • Bing Fang, Graduated PhD Student
  • Liguang Xie, Graduated PhD Student
  • Pak-Kiu Chung, Graduated MS Student
  • Yannick Verdie, Graduated MS Student

Note: This is a recently completed project that is currently dormant.

Project Overview

Agent-based approaches have a long history, dating back to the 1970s, when they were first introduced in the context of distributed artificial intelligence [Durfee,1989]. Since then, they have been applied in a wide variety of domains, from business to game simulations to computer vision architectures [Inverno,2001; Luck,2004; Bryll,2005]. Key aspects of agent-based systems are the agent system architecture, inter-agent communication, and agent design [Wooldridge,1995; Ferber,1999; Stone,2000; Luck,2004; Sterling,2005]. One way to think about agent-based approaches is that they provide a means of modeling that distributes the complexity of a particular problem domain across a set of agents, thereby limiting the complexity within each agent. The solution is then divided across the agents and the communication among them. Agent-based systems have often been compared to standard object-oriented approaches, which also decompose complexity through encapsulation within objects and object interfaces. The key difference is that agents are active and autonomous entities, so agent-based systems distribute both the processing and the representation, whereas object-based systems distribute the representation while keeping the processing centralized.
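The contrast between passive objects and active agents can be illustrated with a minimal Python sketch. It is not taken from the project's code: the names CounterObject, CounterAgent, and run_demo are hypothetical, and a thread with a message queue is only one assumed way of giving an agent its own thread of control.

```python
import queue
import threading


class CounterObject:
    """Passive object: it holds state but only works when a caller invokes it."""

    def __init__(self):
        self.total = 0

    def add(self, n):
        self.total += n


class CounterAgent(threading.Thread):
    """Active agent (hypothetical): owns its state and its own thread of control."""

    def __init__(self):
        super().__init__()
        self.inbox = queue.Queue()  # other parties communicate by message only
        self.total = 0

    def run(self):
        # The agent drives its own processing loop; nobody calls into its state.
        while True:
            msg = self.inbox.get()
            if msg is None:  # sentinel message asking the agent to stop
                break
            self.total += msg


def run_demo():
    # Object-based: the central caller drives every step of the processing.
    obj = CounterObject()
    for n in (1, 2, 3):
        obj.add(n)

    # Agent-based: the caller only sends messages; the agent does the work.
    agent = CounterAgent()
    agent.start()
    for n in (1, 2, 3):
        agent.inbox.put(n)
    agent.inbox.put(None)
    agent.join()
    return obj.total, agent.total
```

Both variants compute the same result. The difference lies in where the control flow lives: in the caller for the object, and inside the agent itself for the agent.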

Agent-based approaches have been applied in computer vision and human-body tracking [Cheng,1995; Luckenhaus,1997; Shang,1999; Saptharishi,2000; Infantino,2002; Zhou,2003; Shakshuki,2003; Bryll,2005]. The typical approach is to model the processing components as agents (e.g., edge detection agent, feature matching agent, color segmentation agent, labeling agent), or to assign different areas in the visual field to specific agents, or to model the entire user interface as an 'interface agent'. Luckenhaus and Eckstein describe a novel application of agents to support parallel processing, in which generic agents apply a set of operators to solve the image processing task. The agents essentially serve as computing resource schedulers. Hence, in a more general sense, the agents represent the processing operators they schedule.

We also introduced our design in [Bryll,2005; Fang,2008]. Our approach departs from the process-centric model, in which agents are bound to specific processes, and introduces a model in which agents are bound to the objects or sub-objects being recognized or tracked. This decomposition by object rather than by process gives our approach greater flexibility, since the agents can apply very specific operations and features and can be adapted to a variety of sensing environments. In this, we follow Bryll et al. The difference is that [Bryll,2005] takes a rigid bottom-up approach in which features (blobs) become labeled as specific body parts, and coalitions of body parts are constructed that satisfy certain relational constraints. The problem with this approach is that if the low-level features are erroneous (e.g., blobs are merged or fragmented), the agents are unable to recover from the error. In the approach described here, the 'body-part agents' are abstractly defined, and they seek 'evidence' for their existence and location by examining the low-level features at their disposal. Hence, specific knowledge is encoded within each agent, and the knowledge about the entire body being tracked is encoded in a 'body agent' that models the relationship among the body parts.
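The division of knowledge between body-part agents and a body agent can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the system described in the papers: the Blob representation, the scoring formula, the max_reach constraint, and all class and parameter names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Blob:
    """A low-level image feature (hypothetical representation)."""
    x: float
    y: float
    area: float


class BodyPartAgent:
    """Abstract body part that seeks evidence for itself among the blobs."""

    def __init__(self, name, expected_area, last_pos):
        self.name = name
        self.expected_area = expected_area  # part-specific knowledge
        self.last_pos = last_pos            # last known (x, y) position

    def score(self, blob):
        # Assumed heuristic: prefer blobs near the last position and of
        # roughly the expected size. Higher scores are better.
        dist = math.hypot(blob.x - self.last_pos[0], blob.y - self.last_pos[1])
        size_err = abs(blob.area - self.expected_area) / self.expected_area
        return 1.0 / (1.0 + dist / 50.0 + size_err)

    def propose(self, blobs):
        # Rank all candidates instead of committing to a single label.
        return sorted(blobs, key=self.score, reverse=True)


class BodyAgent:
    """Holds knowledge about how the body parts relate to each other."""

    def __init__(self, head, hand, max_reach):
        self.head = head
        self.hand = hand
        self.max_reach = max_reach  # assumed head-to-hand distance limit

    def consistent(self, head_blob, hand_blob):
        d = math.hypot(head_blob.x - hand_blob.x, head_blob.y - hand_blob.y)
        return d <= self.max_reach

    def track(self, blobs):
        # Walk the ranked proposals until the relational constraint holds,
        # so a misleading top candidate can be rejected and replaced.
        for h in self.head.propose(blobs):
            for k in self.hand.propose(blobs):
                if h is not k and self.consistent(h, k):
                    self.head.last_pos = (h.x, h.y)
                    self.hand.last_pos = (k.x, k.y)
                    return {self.head.name: h, self.hand.name: k}
        return None
```

In this sketch each body-part agent only ranks evidence, and the body agent accepts the first pair of candidates that fits its model of the body. If a hand agent's best-scoring blob lies farther from the head than max_reach, that candidate is discarded and the next one is tried, which is the kind of recovery a fixed bottom-up labeling does not allow.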


This research applies an agent-based architecture to vision-based gesture tracking. Our first agent-based framework was presented in our paper Hand Tracking Using Agent-Based Framework. In that work, we prototyped the agent-based system as a hierarchical structure, which gave us a new way to think about tracking tasks. The paper received the “Best External Paper Award” at the 5th International Conference on Visual Information Engineering (VIE 08).

Building on the success at VIE, we further extended our agent-based tracking system. Tsai’s calibration algorithm was introduced as one of the agents in our system, and a 3D skeleton model is employed to track in 3D space. We succeeded in reconstructing 3D animation of full-body and half-body human motion gestures.

Since this is also Bing's PhD dissertation topic, he is in charge of updating the status of this project.


  • Bing Fang, Pak-Kiu Chung and Francis Quek, Hand Tracking Using Agent-Based Framework, The 5th International Conference on Visual Information Engineering, July, 2008, Xi’an, China. (Awarded as Best External Paper)
  • Bing Fang, Liguang Xie, Pak-Kiu Chung, Yong Cao and Francis Quek, Full Body Tracking Using An Agent-Based Architecture, Applied Imagery Pattern Recognition 2008, Dec. 2008, Washington, D.C., USA.
  • Bing Fang, Liguang Xie, Seung In Park, Chao Peng, Yong Cao and Francis Quek, Upper Body Tracking and 3D Gesture Reconstruction Using Agent-Based Architecture, submitted to IET 09.


This research has been partially supported by NSF grants “Embodied Communication: Vivid Interaction with History and Literature,” IIS-0624701, “Interacting with the Embodied Mind,” CRI-0551610, and “Embodiment Awareness, Mathematics Discourse and the Blind,” NSF-IIS-0451843.