Mercator

Providing Access to Graphical User Interfaces for Computer Users Who Are Blind


Mercator provides access to X Windows applications for people who are blind by transforming the graphical interface into an interactive auditory interface. The project was begun in 1991 by the Multimedia Computing Group at the Graphics, Visualization & Usability Center at Georgia Tech.

Information on the X-Agent discussions and the RAP protocol is available here.


Related GVU Technical Reports:

Technical Report GIT-GVU-92-05: Mynatt, E. D., and Edwards, W. K. "The Mercator Environment: A Nonvisual Interface to the X Window System," February, 1992. Click HERE for PostScript version.

Technical Report GIT-GVU-92-28: Mynatt, E. D., and Edwards, W. K. "New Metaphors for Nonvisual Interfaces," 1992. Click HERE for PostScript version.

Other Publications:

Mynatt, E.D. and Weber, G., "Nonvisual Presentation of Graphical User Interfaces: Contrasting Two Approaches," in the Proceedings of the 1994 ACM Conference on Human Factors in Computing Systems (CHI'94), Boston, MA, April 24-28, 1994. Click HERE for ASCII version. Click HERE for PostScript version.

Mynatt, E.D. "Auditory Presentation of Graphical User Interfaces," in Kramer, G. (ed) Auditory Display: Sonification, Audification and Auditory Interfaces, Santa Fe. Addison-Wesley: Reading MA., 1994. Click HERE for ASCII version. Click HERE for PostScript version.

Mynatt, E. and Edwards, W. K., "Mapping GUIs to Auditory Interfaces," in the Proceedings of ACM Symposium on User Interface Software and Technology (UIST), 1992. Click HERE for ASCII version. Click HERE for PostScript version.

Edwards, W. K. and Rodriguez, T., "Runtime Translation of X Interfaces to Support Visually-Impaired Users," in the Proceedings of the 7th Annual X Technical Conference, Boston, MA, January 8-20, 1993. Click HERE for ASCII version. Click HERE for PostScript version.

Edwards, W. K., Mynatt, E., and Rodriguez, T., "The Mercator Project: A Nonvisual Interface to the X Window System," in The X Resource, Sebastopol, CA. Issue #7, 1993. Click HERE for ASCII version. Click HERE for PostScript version.

Mynatt, E. and Edwards, W. K., "New Metaphors for Nonvisual Interfaces," book chapter to appear in Extraordinary Human-Computer Interaction, Edwards, A. (ed.), Addison Wesley, due 1994. Click HERE for PostScript version.

Edwards, W. K., Mynatt, E. D., and Stockton, K. "Providing Access to Graphical User Interfaces--Not Graphical Screens," in Proceedings of ACM Conference on Assistive and Enabling Technologies (ASSETS), Marina Del Rey, CA, November, 1994. Click HERE for ASCII version. Click HERE for PostScript version.

Edwards, W. K. and Mynatt, E. D., "An Architecture for Transforming Graphical Interfaces," in Proceedings of ACM Conference on User Interface Software and Technology (UIST), Marina Del Rey, CA, November, 1994. Click HERE for PostScript version.


Research Abstract

One important breakthrough in human-computer interfaces is the development of graphical user interfaces. These interfaces provide graphical representations for system objects such as disks and files, interface objects such as buttons and scrollbars, and computing concepts such as multi-tasking. Unfortunately, these graphical user interfaces, or GUIs, disenfranchise a percentage of the computing population. Presently, many graphical user interfaces are all but completely inaccessible for computer users who are blind or severely visually-disabled.

Our work on this project began with a simple question: how could we provide access to X Windows applications for blind computer users? Historically, blind computer users had little trouble accessing standard ASCII terminals. The line-oriented textual output displayed on the screen is stored in the computer's framebuffer. An access program simply copies the contents of the framebuffer to a speech synthesizer, a Braille terminal or a Braille printer. In contrast, the contents of the framebuffer for a graphical interface are simply pixel values. To provide access to GUIs, it is necessary to intercept application output before it reaches the screen. This intercepted application output becomes the basis for an off-screen model of the application interface. The information in the off-screen model is then used to create alternative, accessible interfaces.
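The contrast between the two kinds of framebuffer can be illustrated with a minimal sketch. The character-cell screen, the function name and the pixel values below are all assumptions made for illustration; they are not taken from Mercator or from any real access program.

```python
# Hypothetical character-cell screen: each row is a fixed-width string,
# standing in for the text held in an ASCII terminal's framebuffer.
def screen_to_utterances(rows):
    """Return the non-blank lines of a text screen, in reading order."""
    return [row.rstrip() for row in rows if row.strip()]


text_screen = [
    "$ ls          ",
    "notes.txt     ",
    "              ",
]

# These strings could be handed directly to a speech synthesizer or a
# Braille device, because the framebuffer already contains characters.
utterances = screen_to_utterances(text_screen)

# A graphical framebuffer holds only pixel intensities. There are no
# characters to copy, so the same approach recovers nothing meaningful.
pixel_screen = [[0, 255, 255, 0], [0, 255, 0, 0]]
```

In the graphical case the information about what was drawn has to be captured earlier, before it is reduced to pixels.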

The typical scenario for providing access to a graphical interface is as follows: While an unmodified graphical application is running, an outside agent collects information about the application interface by watching objects drawn to the screen and by monitoring the application behavior. This outside agent (or screen reader) then translates the graphical interface into an auditory and/or tactile interface. Not only does the screen reader translate the graphical presentation into a nonvisual presentation, but it often also provides different user input mechanisms which are more appropriate for the new interface.

The goal of this work, called the Mercator Project, is to provide transparent access to X Windows applications for computer users who are blind or severely visually-impaired. In order to achieve this goal, we needed to solve two major problems. First, in order to provide transparent access to applications, we needed to provide a framework which would allow us to monitor, model and translate graphical interfaces of X Windows applications without modifying the applications. Second, given these application models, we needed to support a methodology for translating graphical interfaces into nonvisual interfaces. This methodology should mimic the advantages of GUIs in a nonvisual presentation.

GUI Models

The de facto standard graphical user interface for Unix environments is the X Window System. X Windows is based on a client-server architecture where X applications communicate with a display server over a network protocol. This protocol is the lowest layer of the X hierarchy. Xlib and the Xt Intrinsics provide two programming interfaces to the X protocol. Xlib establishes the concept of events and provides support for drawing graphics and text. The Xt Intrinsics establishes the concept of widgets or programmable interface objects and provides a basic set of widgets. Most people who develop X Windows applications use X toolkits such as Motif or Athena. These toolkits build on top of the Xt Intrinsics and provide many generic interface objects or widgets.

The Mercator architecture captures and models application GUIs while the graphical application is executing. We use multiple strategies to gather information about the application GUI, and to interface with the application. First, we use a protocol to communicate with the underlying X libraries upon which the application is based. This protocol is implemented by extending the Xt Intrinsics. We use the protocol to obtain high-level information (widget level) about the application interface. We also use a hook into the Xlib layer to monitor low-level interface information (X packets) which may not be expressed in terms of widgets. Through our interaction with the Disability Action Committee on X (see below), we have worked with the X Consortium to extend the standard X Window System to include our access methods. From these two sources of information, we create an off-screen model of the application GUI based on the windows and widgets used by the application. We are then able to create an auditory presentation of the off-screen model, as well as substitute, via the XTest extension, user keyboard input for mouse input expected by the application.
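As a rough illustration of what an off-screen model might look like, the sketch below keeps a tree of widgets that is updated from create, change and destroy notifications. It is a simplified assumption, not the Mercator implementation: the class names, method names and fields are all hypothetical, and the notifications stand in for whatever the Xt-level protocol and the Xlib hook actually deliver.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    """One widget in the off-screen model (all field names are illustrative)."""
    widget_id: int
    widget_class: str          # e.g. "PushButton" or "Menu"
    label: str = ""
    sensitive: bool = True     # False corresponds to a grayed-out widget
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


class OffScreenModel:
    """Mirror of an application's widget tree, updated from notifications."""

    def __init__(self) -> None:
        self.nodes: Dict[int, Node] = {}
        self.root: Optional[Node] = None

    def on_create(self, widget_id, widget_class, parent_id=None, label=""):
        parent = self.nodes.get(parent_id) if parent_id is not None else None
        node = Node(widget_id, widget_class, label, parent=parent)
        self.nodes[widget_id] = node
        if parent is None:
            self.root = node
        else:
            parent.children.append(node)
        return node

    def on_change(self, widget_id, **attrs):
        node = self.nodes[widget_id]
        for name, value in attrs.items():
            setattr(node, name, value)

    def on_destroy(self, widget_id):
        node = self.nodes.pop(widget_id)
        # Remove the whole subtree; iterate over a copy because each
        # recursive call detaches the child from node.children.
        for child in list(node.children):
            self.on_destroy(child.widget_id)
        if node.parent is not None:
            node.parent.children.remove(node)
        elif self.root is node:
            self.root = None


# Hypothetical notification stream for a small application window.
model = OffScreenModel()
model.on_create(1, "Shell", label="Editor")
model.on_create(2, "PushButton", parent_id=1, label="Save")
model.on_change(2, sensitive=False)
```

Keeping the model keyed by widget identifier means later notifications can update a single node without rescanning the interface, and the tree shape is available for producing a nonvisual presentation.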

We are coordinating our current design efforts with the Disability Action Committee on X (DACX) which is directed by Trace Research and Development Center. This committee is made up of Unix workstation vendors (Sun, DEC, IBM), researchers (e.g. Trace, our group at Georgia Tech), commercial access vendors (e.g. Berkeley Systems), the X Consortium, and other interested parties. The goal of the committee is to design and implement standard access solutions to X Windows for people with various motor and sensory impairments. The X Consortium controls the yearly updates to the X Windows distribution. Bob Scheifler, the director of the X Consortium, has been working with the group to include various access solutions in the standard X distribution. Contact information for the DACX committee is included at the end of this document.

Audio GUIs

The primary interface design question addressed in this work is: given a model for a graphical application interface, what corresponding interface should we present to blind computer users? Our work has examined the trade-offs between tactile and auditory presentations, as well as the degree to which Mercator should mimic the existing visual interface.

Mercator interfaces are made up of auditory interface components which are related to graphical interface components such as menus, buttons, dialog boxes and so on. In addition to synthesized speech, auditory icons are used to identify the auditory interface components and auditory filters are used to convey attributes of those components. For example, a text-entry field could be represented by the sound of an old-fashioned typewriter and a low pass (muffling) filter could convey that the field is unavailable, i.e. grayed out in a graphical interface. The label for the field could also be read by the speech synthesizer.
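The mapping described above can be sketched as a small lookup. Only the typewriter icon and the low-pass filter come from the example in the text; the other icon names, the dictionary layout and the function name are placeholders assumed for illustration.

```python
# Illustrative mapping only: the typewriter icon and the low-pass filter come
# from the example in the text; every other name here is a placeholder.
AUDITORY_ICONS = {
    "TextField": "typewriter",
    "PushButton": "button-click",   # placeholder icon name
}


def describe(widget_class, label, sensitive=True):
    """Return the audio cues for one interface component as a plain dict."""
    return {
        "icon": AUDITORY_ICONS.get(widget_class, "generic"),
        "filters": [] if sensitive else ["low-pass"],
        "speech": label,
    }


# An unavailable (grayed-out) text-entry field labeled "Name".
cue = describe("TextField", "Name", sensitive=False)
```

Separating the icon (what kind of component this is) from the filter (what state it is in) lets one sound identify the component while attributes are layered on top of it.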

Mercator provides a separate navigation method based on the hierarchy of the interface to replace the visual, spatial-oriented mouse navigation used in GUIs. We are also exploring the use of 3D spatialized sound to mimic the advantages of spatial organization used in graphical computing environments.
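Hierarchy-based navigation of this kind can be sketched as a cursor that moves between parent, child and sibling nodes. This is a minimal sketch under assumptions: the tree layout, the class name and the four movement operations are hypothetical, and the text does not specify which keys or commands Mercator actually uses.

```python
class TreeCursor:
    """Keyboard-style cursor over a nested (label, children) tree."""

    def __init__(self, tree):
        self.tree = tree
        self.path = []          # child indices from the root to the current node

    def _node(self, path):
        node = self.tree
        for index in path:
            node = node[1][index]
        return node

    @property
    def label(self):
        return self._node(self.path)[0]

    def down(self):
        """Move to the first child, if there is one."""
        if self._node(self.path)[1]:
            self.path.append(0)
        return self.label

    def up(self):
        """Move to the parent, if not already at the root."""
        if self.path:
            self.path.pop()
        return self.label

    def sibling(self, step):
        """Move to the next (+1) or previous (-1) sibling, if it exists."""
        if self.path:
            siblings = self._node(self.path[:-1])[1]
            index = self.path[-1] + step
            if 0 <= index < len(siblings):
                self.path[-1] = index
        return self.label


# Hypothetical interface hierarchy: a window with two menus.
ui = ("Editor", [("File", [("Open", []), ("Save", [])]), ("Edit", [])])
cursor = TreeCursor(ui)
visited = [
    cursor.down(),        # Editor -> File
    cursor.down(),        # File -> Open
    cursor.sibling(+1),   # Open -> Save
    cursor.up(),          # Save -> File
    cursor.sibling(+1),   # File -> Edit
]
```

Each move returns the label of the component reached, which is the point at which a screen reader could speak the label or play the matching auditory icon.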

Commercialization - Sonic X

Since the beginning of this project, the primary goal of this work has been to support the production of a commercial screen reader for X Windows. This work has not been conducted for research's sake alone, but with the intent of significantly affecting the accessibility of X Windows for people who are blind. Georgia Tech is currently working on locating support for creating a commercial version of Mercator (Sonic X) for accessing Motif applications.

Project Members and Sponsorship

The Mercator project is a joint effort by the Georgia Tech Multimedia Computing Group (a part of the Graphics, Visualization, and Usability Center) and the Center for Rehabilitation Technology. This work has been sponsored by the NASA Marshall Space Flight Center (Research Grant NAG8-194) and Sun Microsystems Laboratories, Inc.

The team is made up of the following people: