Researching Usability Design and Evaluation Guidelines
for Augmented Reality (AR) Systems

Results



The usability design and evaluation guidelines are presented below in a structured format, grouped by topic.

VE Users and User Tasks
At the root of all insightful usability evaluations are application-specific, representative user tasks. It is the meticulous examination of user performance and satisfaction, of physical device support, and of software facilities supporting users' cognitive organization of these tasks that not only exposes critical usability problems, but also promises the most notable improvements when addressed. The growing interest in user-centered design has placed well-deserved emphasis on user requirements and task analysis within the interaction development cycle [Hix and Hartson, 1993]. Indeed, "the entire interaction development cycle is becoming centered around the evaluation of users performing tasks" [Hix and Hartson, 1993]. As a result, the user task has become a basic element in usability engineering.

Guidelines: VE Users

Take into account user experience (i.e., support both expert and novice users), users' physical abilities (e.g., handedness), and users' technical aptitudes (e.g., orientation, spatial visualization, and spatial memory). Support users with varying degrees of domain knowledge.
[Egan, 1988] [Darken and Sibert, 1995] [Stoakley et al., 1995] [Stanney, 1995]

When designing collaborative environments, support social interaction among users (e.g., group communication, role-play, informal interaction) and cooperative task performance (e.g., facilitate social organization, construction, and execution of plans).
[Benford, 1996] [Malone and Crowston, 1990] [Waters et al., 1997]

Take into account the number and locations of potential users.
[experience and observation]

For testbed AR environments (i.e., those used for research purposes), calibration methods should be subject-specific; that is, the calibration should account for individual differences.
[Summers et al., 1999]

For testbed AR environments, calibration methods should provide no additional or residual cues that subjects might exploit.
[Summers et al., 1999]

Provide (access to) information about other collaborative users even when they are physically occluded or "remotely" located.
[Feiner, 1999]

In social/collaborative environments, support face-to-face communication by presenting visual information within a user's field of view that would otherwise require the user to look away. Avoid interaction techniques (and devices) that require a noticeable portion of a user's attention.
[Feiner, 1999]

Guidelines: VE User Tasks

Design interaction mechanisms and methods to support user performance of serial tasks and task sequences. Support concurrent task execution and user multiprocessing.
[experience and observation]

Provide stepwise, subtask refinement, including the ability to undo and go "back" when navigating information spaces.
[experience and observation]

Guidelines: Navigation and Locomotion

Support appropriate types of user navigation (e.g., naive search, primed search, exploration), and facilitate user acquisition of survey knowledge (e.g., maintain a consistent spatial layout).
[Darken and Sibert, 1995] [Darken and Sibert, 1996] [Lynch, 1960]

When augmenting landscape and terrain layout, consider the organizational principles of [Darken and Sibert, 1995].

When appropriate, include spatial labels, landmarks, and a compass.
[Bennett et al., 1996] [Darken and Sibert, 1995] [Darken and Sibert, 1996]

Provide information so that users can always answer the questions: Where am I now? What is my current attitude and orientation? Where do I want to go? How do I travel there?
[Wickens and Baker, 1995]
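As a concrete illustration of this last guideline, the sketch below (Python; the function name, 2D coordinates, and compass conventions are illustrative assumptions, not from the source) derives answers to all four questions from a tracked position and heading:

    import math

    def nav_readout(pos, yaw_deg, goal):
        """Answers to the four navigation questions, given tracked
        position (x, y in meters), heading (yaw in degrees, 0 = north,
        clockwise), and a goal position."""
        dx, dy = goal[0] - pos[0], goal[1] - pos[1]
        distance = math.hypot(dx, dy)
        bearing = math.degrees(math.atan2(dx, dy)) % 360   # compass bearing to goal
        turn = (bearing - yaw_deg + 180) % 360 - 180       # signed turn to face it
        return {
            "where_am_i": pos,                             # Where am I now?
            "orientation": yaw_deg,                        # What is my orientation?
            "where_to_go": goal,                           # Where do I want to go?
            "how_to_travel": f"turn {turn:+.0f} deg, then {distance:.1f} m ahead",
        }

    print(nav_readout(pos=(10.0, 5.0), yaw_deg=90.0, goal=(30.0, 25.0)))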

Guidelines: Object Selection

Strive for body-centered interaction. Support multimodal interaction.
[Brooks et al., 1990] [Davies, 1996] [Slater et al., 1995b] [Wickens and Baker, 1995]

Use non-direct manipulation means (such as query-based selection) when selection criteria are temporal, descriptive, or relational.
[experience and observation]

Strive for high frame rates and low latency to assist users in three-dimensional target acquisition.
[Ware and Balakrishnan, 1994] [Richard et al., 1996]

Provide accurate depiction of location and orientation of graphics and text.
[Wickens and Baker, 1995]

Guidelines: Object Manipulation

Support two-handed interaction (especially for manipulation-based tasks). For two-handed manipulation tasks, assign the dominant hand to fine-grained manipulation relative to the non-dominant hand.
[Guiard, 1987] [Hauptmann, 1989] [Hinckley et al., 1994a]

The Virtual Model
Consider the vast amount of naturally occurring information we are able to perceive via our senses. As living creatures, we instinctively use this information, interpreting it to create a mental picture, or model, of the world around us. Users of VEs rely on system-generated information, along with other information, such as past experience, to shape their cognitive models. Users also interact within such system-generated information spaces, so that the information flow is essentially bi-directional. We term the abstract, device-independent body of information and interaction the "virtual model." The virtual model defines all information that users perceive, interpret, interact with, alter, and, most importantly, work in.

Guidelines: User Representation and Presentation

For AR-based social environments (e.g., games), allow users to create, present, and customize private and group-wide information.
[Szalavari, Eckstein and Gervautz, 1998]

For AR-based social environments (e.g., games), provide equal access to "public" information.
[Szalavari, Eckstein and Gervautz, 1998]

In collaborative environments, allow users to share tracking information about themselves (e.g., gesture-based information) with others. Allow users to control the presentation of both themselves and others (e.g., to facilitate graceful degradation).
[Feiner, 1999]

Guidelines: VE Agent Representation and Presentation

Include agents that are relevant to user tasks and goals. Organize multiple agents according to user tasks and goals.
[Ishizaki, 1996] [Trias et al., 1996]

Allow agent behavior to dynamically adapt, depending upon context, user activity, etc. Represent interactions among agents and users (rules of engagement) in a semantically consistent, easily visualizable manner.
[Trias et al., 1996]

Guidelines: Virtual Surrounding and Setting

Support significant occlusion-based visual cues to the user by maintaining proper occlusion between real and virtual objects.
[Wloka and Anderson, 1995]

When possible, determine occlusion dynamically, in real time (i.e., at every graphics frame).
[Wloka and Anderson, 1995]
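A minimal sketch of what per-frame real/virtual occlusion amounts to, assuming a depth estimate of the real scene is available each frame (e.g., from a depth sensor or a registered scene model); all names here are illustrative, not from the source:

    import numpy as np

    def composite_with_occlusion(camera_rgb, real_depth, virtual_rgb, virtual_depth):
        """Per-pixel occlusion between real and virtual imagery.
        camera_rgb    : (H, W, 3) video frame of the real scene
        real_depth    : (H, W) estimated real-scene depth, in meters
        virtual_rgb   : (H, W, 3) rendered virtual objects
        virtual_depth : (H, W) virtual depth, np.inf where no object"""
        # A virtual pixel is drawn only where it is nearer than the real surface.
        virtual_in_front = virtual_depth < real_depth
        out = camera_rgb.copy()
        out[virtual_in_front] = virtual_rgb[virtual_in_front]
        return out

    # Example: a virtual object at 2 m stays hidden behind a real wall at 1 m.
    h, w = 4, 4
    frame = composite_with_occlusion(
        np.zeros((h, w, 3), np.uint8),        # camera image
        np.full((h, w), 1.0),                 # real wall at 1 m
        np.full((h, w, 3), 255, np.uint8),    # white virtual object
        np.full((h, w), 2.0),                 # virtual object at 2 m
    )
    assert frame.max() == 0                   # wall correctly occludes it

Running this comparison every graphics frame is the dynamic, real-time case the guideline describes; on GPU hardware the same per-pixel comparison is performed by the ordinary depth test.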

When presenting inherently 2D information, consider employing 2D text and graphics of the sort supported by current window systems.
[Feiner et al., 1993]

In collaborative environments, support customized views (including individual markers, icons, and annotations) that can be either shared or kept private.
[Fuhrmann, Loffelmann, Schmalstieg, 1997]

To avoid display clutter in collaborative environments, allow users to control the type and extent of visual information (per participant) presented.
[Fuhrmann, Loffelmann, Schmalstieg, 1997]

Optimize stereoscopic visual perception by ensuring that left- and right-eye images contain minimal vertical disparities. Minimize lag between creation of left- and right-eye frames.
[Drascic and Milgram, 1996]
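One standard way to keep vertical disparity near zero is to render both eyes with parallel view directions and asymmetric (off-axis) frusta, rather than rotating ("toeing in") the cameras. Below is a sketch of the per-eye near-plane bounds; the function name and the example dimensions are illustrative assumptions:

    def eye_frusta(ipd, screen_w, screen_h, screen_dist, near):
        """Asymmetric near-plane bounds (left, right, bottom, top) per eye,
        for a glFrustum-style projection onto a shared screen plane.
        Shifting the frustum instead of rotating the camera leaves the
        vertical bounds identical for both eyes, so the two images
        contain no vertical disparity."""
        scale = near / screen_dist
        half_w, half_h = screen_w / 2, screen_h / 2
        frusta = {}
        for eye, shift in (("left", +ipd / 2), ("right", -ipd / 2)):
            frusta[eye] = (scale * (-half_w + shift),   # left edge
                           scale * (half_w + shift),    # right edge
                           scale * (-half_h),           # bottom (same both eyes)
                           scale * (half_h))            # top (same both eyes)
        return frusta

    print(eye_frusta(ipd=0.064, screen_w=0.52, screen_h=0.32,
                     screen_dist=0.70, near=0.10))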

Guidelines: VE System and Application Information

Use progressive disclosure for information-rich interfaces. Pay close attention to the visual, aural, and haptic organization of presentation (e.g., eliminate unnecessary information, minimize overall and local density, group related information, and emphasize information related to user tasks). Strive to maintain interface consistency across applications.
[Hix and Hartson, 1993]

Language and labeling for commands should clearly and concisely reflect meaning. System messages should be worded in a clear, constructive manner so as to encourage user engagement (as opposed to user alienation).
[Hix and Hartson, 1993]

For large environments, include a navigational grid and/or a navigational map. When implementing maps, consider the map design principles of [Darken and Sibert, 1995].
[Darken and Sibert, 1995]

Present domain-specific data in a clear, unobtrusive manner such that the information is tightly coupled to the environment and vice versa. Strive for unique, powerful presentation of application-specific data, providing insight not possible through other presentation means.
[Bowman et al., 1996]

VE User Interface Input Mechanisms
Engaging in a VE or any other computer-based system implies that some form of dialog exists between user and computer. This dialog, however, is typically not like our natural model of dialog, namely exchanging spoken words to convey meaning. Instead, the dialog is typically orchestrated through input devices on the user's end, and highlighted, animated displays on the computer's end [Card et al., 1990]. As with any dialog, syntax and content of expression are critical to mutual understanding. An investigation of VE input devices, their characteristics, and their use may yield a clearer comprehension of the dialog between users and VEs.

Guidelines: Tracking User Location and Orientation

Assess the extent to which degrees of freedom are integrable and separable within the context of representative user tasks. Eliminate extraneous degrees of freedom by implementing only those dimensions which users perceive as being related to given tasks. Multiple (integral) degrees-of-freedom input is well-suited for coarse positioning tasks, but not for tasks which require precision.
[Hinckley et al., 1994a] [Jacob et al., 1994] [Zhai and Milgram, 1993b]

When assessing appropriate tracking technology relative to user tasks, consider the working volume, the desired range of motion, the accuracy and precision required, and the likelihood of tracker occlusion.
[Applewhite, 1991] [Azuma, 1997] [Waldrop et al., 1995] [Strickland et al., 1994] [Sowizral and Barnes, 1993]

Calibration requirements for AR tracking systems should include:

  1. calibration methods which are statistically robust,
  2. a variety of calibration approaches for different circumstances, and
  3. metrology equipment that is sufficiently accurate and convenient to use.

[Hollerbach and Wampler, 1996] [Summers et al., 1999]
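As one example of a statistically grounded calibration step, the sketch below fits a least-squares rigid transform (the Kabsch/Procrustes solution) between tracker reports and independently measured reference points. The names are illustrative, not from the cited work, and the residuals provide a simple robustness check for outlier samples:

    import numpy as np

    def rigid_calibration(tracker_pts, measured_pts):
        """Least-squares rigid transform (R, t) mapping tracker reports
        onto measured reference points; both inputs are (N, 3) arrays
        of corresponding points."""
        ct, cm = tracker_pts.mean(axis=0), measured_pts.mean(axis=0)
        H = (tracker_pts - ct).T @ (measured_pts - cm)     # cross-covariance
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))             # guard against reflections
        R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
        t = cm - R @ ct
        residuals = np.linalg.norm(tracker_pts @ R.T + t - measured_pts, axis=1)
        return R, t, residuals   # large residuals flag suspect samples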

For testbed AR environments (i.e., those used for research purposes), calibration methods should be independent; that is, separate parts of the entire calibration should not rely on each other.
[Summers et al., 1999]

Relative latency is a source of misregistration and should be reduced.
[Jacobs, Livingston and State, 1997]

Devices should be both spatially and temporally registered (this supports effective integration of user interaction devices, which may vary in type, accuracy, bandwidth, dynamics, and frequency).
[Jacobs, Livingston and State, 1997]

Match the number of degrees of freedom (in the physical device and interaction techniques) to the inherent nature of the task. For example, menu selection is a 2D task and as such should not require a device or interaction technique with more than two degrees of freedom.
[experience and observation]

Consider applying a Kalman filter to head-tracking data to smooth the motion and decrease lag.
[Feiner et al., 1993]
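A minimal single-channel sketch of such a filter follows; a real head tracker would filter position and orientation jointly, and the noise parameters here are illustrative guesses rather than values from the cited work:

    import numpy as np

    def kalman_smooth(measurements, dt, q=1e-3, r=1e-2):
        """Constant-velocity Kalman filter over one head-pose channel
        (e.g., yaw in degrees). q models process noise, r tracker jitter."""
        F = np.array([[1.0, dt], [0.0, 1.0]])              # state transition
        H = np.array([[1.0, 0.0]])                         # observe angle only
        Q = q * np.array([[dt**3 / 3, dt**2 / 2],
                          [dt**2 / 2, dt]])
        R = np.array([[r]])
        x = np.array([[measurements[0]], [0.0]])           # [angle, angular velocity]
        P = np.eye(2)
        smoothed = []
        for z in measurements:
            x, P = F @ x, F @ P @ F.T + Q                  # predict
            K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain
            x = x + K @ (np.array([[z]]) - H @ x)          # update
            P = (np.eye(2) - K @ H) @ P
            smoothed.append(float(x[0, 0]))
        return smoothed

The same state can also be propagated forward by the expected render latency (another multiply by F, with dt set to that latency) to predict head pose at display time, which is how the filter decreases apparent lag rather than merely smoothing.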

Trackers should be accurate to a small fraction of a degree in orientation and a few millimeters in position.
[Azuma, 1993]

In head-tracked AR systems, errors in measured head orientation usually cause larger registration offsets (errors) than object orientation errors do.
[Azuma, 1993]
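The asymmetry is easy to quantify: for orientation errors, lateral misregistration grows with viewing distance, whereas position errors do not. A quick check (Python; function name illustrative):

    import math

    def misregistration_mm(angular_error_deg, distance_m):
        """Lateral offset of a virtual object at distance_m caused by a
        head-orientation error of angular_error_deg."""
        return 1000 * distance_m * math.tan(math.radians(angular_error_deg))

    # A 0.5 degree orientation error shifts an object 1 m away by ~8.7 mm,
    # already exceeding a few-millimeter position-error budget.
    print(f"{misregistration_mm(0.5, 1.0):.1f} mm")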

Minimize the combined latency of the tracker and the graphics engine.
[Azuma, 1993]

Tracking systems (whether a single tracking technology or a hybrid) should work at long ranges (i.e., support mobile users).
[Azuma, 1993]

Minimize dynamic errors (maximize dynamic registration) by 1) reducing system lag, 2) reducing apparent lag, 3) matching temporal streams (with video-based systems), and 4) predicting future locations.
[Azuma, 1997]

Guidelines: Data Gloves and Gesture Recognition

Allow gestures to be defined by users incrementally, with the option to change or edit gestures on the fly. Avoid gestures in abstract 3D space; instead, use relative gesturing.
[Su and Furuta, 1994] [Hinckley et al., 1994a]

Guidelines: Speech Recognition and Natural Language

Strive for seamless integration of annotation; provide quick, efficient, and unobtrusive means to record and play back annotations. Allow users to edit, remove, and extract or save annotations.
[Verlinden et al., 1993] [Harmon et al., 1996]

VE User Interface Presentation Components
VEs rely on specialized hardware to "present" information to users. Note that the terms "present" and "presentation" imply much more than simply a visual context: all the senses can be used in VE user interface presentation. The following guidelines address devices and components used to support presentation. A system's presentation components may affect a user's cognitive processes (among many others) and, subsequently, usability.

Guidelines: Visual Feedback -- Graphical Presentation

Timing and responsiveness of an AR system are crucial elements (e.g., they affect user performance).
[Mynatt et al., 1997]

Strive for consistency among the various visual (and other sensory) cues which are used to infer information about the combined virtual and real world.
[Drascic and Milgram, 1996]

For stereoscopic applications, employ headsets that support adjustable interpupillary distances (IPD) between approximately 45 mm and 75 mm.
[Drascic and Milgram, 1996]

Allow the user to optimize the visual display (e.g., support user-controlled and preset illuminance and contrast levels).
[Drascic and Milgram, 1996]

Ensure that the wearable display is sufficiently comfortable and optically transparent for the user.
[Feiner, 1999]

Minimize static errors by isolating and evaluating 1) optical distortion, 2) errors in the tracking system(s), 3) mechanical misalignments, and 4) incorrect viewing parameters (e.g., field of view, tracker-to-eye position and orientation, interpupillary distance).
[Azuma, 1997]



Joseph L. Gabbard, Systems Research Center
Copyright (c) 2001 Virginia Tech