Conferences in Research and Practice in Information Technology

Toward Adaptive Information Fusion in Multimodal Systems

Oviatt, S.

    Techniques for information fusion are at the heart of multimodal system design. To develop new user-adaptive approaches to multimodal fusion, our lab has investigated the stability and basis of the major individual differences documented in users' multimodal integration patterns. In this talk, I summarized the following findings:

1. There are large individual differences in users' dominant speech-and-pen multimodal integration pattern, such that individual users can be classified as either simultaneous or sequential integrators (Oviatt, 1999; Oviatt et al., 2003).
2. A user's dominant integration pattern can be identified almost immediately (i.e., upon first interaction with the computer), and it remains highly consistent over a session (Oviatt et al., 2003; Oviatt et al., 2005b).
3. A user's dominant integration pattern also remains stable across the lifespan (Oviatt et al., 2003; Oviatt et al., 2005b).
4. A user's dominant integration pattern is highly resistant to change, even under strong selective reinforcement or explicit instructions to switch patterns (Oviatt et al., 2003; Oviatt et al., 2005a).
5. When users encounter cognitive load (e.g., due to increasing task difficulty or system recognition errors), their dominant multimodal integration pattern entrenches, or becomes 'hypertimed' (Oviatt et al., 2003; Oviatt et al., 2004).
6. Users' distinctive integration patterns appear to derive from enduring differences in basic reflective-impulsive cognitive style (Oviatt et al., 2005b).

In this talk, I also discussed recent work in our lab that combines empirical user modeling with machine learning techniques to learn users' multimodal integration patterns. This work emphasizes establishing user-adaptive temporal thresholds for time-critical multimodal systems, rather than the fixed temporal thresholds that are the current state of the art. Estimates indicate that adopting user-defined thresholds can reduce system delays to just 44% of their current levels, with correspondingly substantial reductions in system recognition errors. Ongoing research in our group is exploring which machine learning techniques and models provide the best acceleration and generalization of learned multimodal integration patterns, the most reliable signal and information fusion, and the greatest overall improvement in multimodal interpretation. We are currently developing a three-tiered user-adaptive model that adapts a multimodal system's temporal thresholds on-line during fusion, using the user's habitual integration pattern as prior knowledge. Implications of this research for the design of next-generation adaptive multimodal systems with substantially improved performance characteristics were discussed.
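The three-tiered model itself is not specified in this abstract, so the following is a minimal illustrative sketch only: a hypothetical Python class showing one way a fuser might replace a fixed temporal threshold with a user-adaptive one, classifying a user as a simultaneous or sequential integrator from observed inter-modality lags and deriving a per-user wait threshold, with the learned pattern serving as prior knowledge. Every name, parameter value, and classification rule below is an assumption for illustration, not the model described in the talk.

# Illustrative sketch only: a minimal, hypothetical adaptive-threshold fuser.
# Assumes each multimodal command supplies the time gap (in seconds) between
# the end of the first modality (e.g., pen) and the start of the second
# (e.g., speech); negative or zero gaps mean the signals overlapped.

from dataclasses import dataclass, field
from statistics import mean

@dataclass
class AdaptiveFusionThreshold:
    """Learns a per-user temporal threshold for multimodal fusion.

    A fixed default threshold is used until enough observations arrive;
    the user's habitual integration pattern then acts as prior knowledge
    for the adapted threshold.
    """
    default_threshold: float = 2.0   # assumed fixed system-wide wait, in seconds
    margin: float = 0.5              # assumed safety margin above observed lags
    min_observations: int = 3
    lags: list = field(default_factory=list)

    def observe(self, inter_modal_lag: float) -> None:
        """Record the lag between the two input modalities for one command."""
        self.lags.append(inter_modal_lag)

    @property
    def pattern(self) -> str:
        """Classify the user's dominant integration pattern."""
        if len(self.lags) < self.min_observations:
            return "unknown"
        # Overlapping signals (lag <= 0) suggest a simultaneous integrator.
        simultaneous = sum(1 for lag in self.lags if lag <= 0)
        return "simultaneous" if simultaneous > len(self.lags) / 2 else "sequential"

    def threshold(self) -> float:
        """Return how long the fuser should wait for a second modality."""
        if self.pattern == "unknown":
            return self.default_threshold
        if self.pattern == "simultaneous":
            return self.margin  # overlapped input arrives almost at once
        # Sequential integrators: wait slightly longer than their typical lag.
        positive = [lag for lag in self.lags if lag > 0]
        return mean(positive) + self.margin if positive else self.default_threshold

if __name__ == "__main__":
    model = AdaptiveFusionThreshold()
    for lag in (1.1, 0.9, 1.4):   # hypothetical lags from a sequential user
        model.observe(lag)
    print(model.pattern)       # -> sequential
    print(model.threshold())   # -> ~1.63 s instead of the fixed 2.0 s default

Under this sketch, the fixed 2.0 s default stands in for the current state-of-the-art system-wide threshold, and the sequential user in the demo ends up with an adapted wait of roughly 1.6 s; both values are arbitrary placeholders rather than figures from the research.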
Cite as: Oviatt, S. (2005). Toward Adaptive Information Fusion in Multimodal Systems. In Proc. NICTA-HCSNet Multimodal User Interaction Workshop (MMUI 2005), Sydney, Australia. CRPIT, Vol. 57. Chen, F. and Epps, J., Eds. ACS, p. 3.