Information on the X-Agent discussions and the RAP protocol is available here.
Technical Report GIT-GVU-92-28: Mynatt, E. D., and Edwards, W. K. "New Metaphors for Nonvisual Interfaces," 1992. Click HERE for PostScript version.
Mynatt, E. D. "Auditory Presentation of Graphical User Interfaces," in Kramer, G. (ed.) Auditory Display: Sonification, Audification and Auditory Interfaces, Santa Fe. Addison-Wesley: Reading, MA, 1994. Click HERE for ASCII version. Click HERE for PostScript version.
Mynatt, E. and Edwards, W. K., "Mapping GUIs to Auditory Interfaces," in the Proceedings of ACM Symposium on User Interface Software and Technology (UIST), 1992. Click HERE for ASCII version. Click HERE for PostScript version.
Edwards, W. K. and Rodriguez, T., "Runtime Translation of X Interfaces to Support Visually-Impaired Users," in the Proceedings of the 7th Annual X Technical Conference, Boston, MA, January 8-20, 1993. Click HERE for ASCII version. Click HERE for PostScript version.
Edwards, W. K., Mynatt, E., and Rodriguez, T., "The Mercator Project: A Nonvisual Interface to the X Window System," in The X Resource, Sebastopol, CA. Issue #7, 1993. Click HERE for ASCII version. Click HERE for PostScript version.
Mynatt, E. and Edwards, W. K., "New Metaphors for Nonvisual Interfaces," book chapter to appear in Extraordinary Human-Computer Interaction, Edwards, A. (ed.), Addison Wesley, due 1994. Click HERE for PostScript version.
Edwards, W. K., Mynatt, E. D., and Stockton, K. "Providing Access to Graphical User Interfaces--Not Graphical Screens," in Proceedings of ACM Conference on Assistive and Enabling Technologies (ASSETS), Marina Del Rey, CA, November, 1994. Click HERE for ASCII version. Click HERE for PostScript version.
Edwards, W. K. and Mynatt, E. D. "An Architecture for Transforming Graphical Interfaces," in Proceedings of ACM Conference on User Interface Software and Technology (UIST), Marina Del Rey, CA, November, 1994. Click HERE for PostScript version.
Our work on this project began with a simple question: how could we provide access to X Windows applications for blind computer users? Historically, blind computer users had little trouble accessing standard ASCII terminals. The line-oriented textual output displayed on the screen is stored in the computer's framebuffer, so an access program simply copies the contents of the framebuffer to a speech synthesizer, a Braille terminal or a Braille printer. In contrast, the contents of the framebuffer for a graphical interface are simply pixel values, which carry no textual or structural information. To provide access to GUIs, it is therefore necessary to intercept application output before it reaches the screen. This intercepted application output becomes the basis for an off-screen model of the application interface. The information in the off-screen model is then used to create alternative, accessible interfaces.
The typical scenario for providing access to a graphical interface is as follows. While an unmodified graphical application is running, an outside agent collects information about the application interface by watching objects drawn to the screen and by monitoring the application's behavior. This outside agent (or screen reader) then translates the graphical interface into an auditory and/or tactile interface. The screen reader not only translates the graphical presentation into a nonvisual presentation, but often also provides different user input mechanisms that are more appropriate for the new interface.
The goal of this work, called the Mercator Project, is to provide transparent access to X Windows applications for computer users who are blind or severely visually impaired. To achieve this goal, we needed to solve two major problems. First, to make access transparent, we needed a framework that would allow us to monitor, model and translate the graphical interfaces of X Windows applications without modifying the applications. Second, given these application models, we needed to support a methodology for translating graphical interfaces into nonvisual interfaces. This methodology should mimic the advantages of GUIs in a nonvisual presentation.
The Mercator architecture captures and models application GUIs while the graphical application is executing. We use multiple strategies to gather information about the application GUI and to interface with the application. First, we use a protocol to communicate with the underlying X libraries upon which the application is based. This protocol is implemented by extending the Xt Intrinsics. We use the protocol to obtain high-level (widget-level) information about the application interface. Second, we use a hook into the Xlib layer to monitor low-level interface information (X packets) that may not be expressed in terms of widgets. Through our interaction with the Disability Access Committee on X (see below), we have worked with the X Consortium to extend the standard X Window System to include our access methods. From these two sources of information, we create an off-screen model of the application GUI based on the windows and widgets used by the application. We are then able to create an auditory presentation of the off-screen model and, via the XTest extension, to substitute user keyboard input for the mouse input expected by the application.
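The off-screen model described above can be pictured as a tree of widget records that is updated as notifications arrive from the toolkit. The following is a minimal sketch under stated assumptions: the class names, method names and notification shapes (`widget_created`, `attribute_changed`) are hypothetical illustrations and are not the actual Mercator data structures or protocol messages.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class WidgetNode:
    """One widget in a hypothetical off-screen model."""
    name: str
    widget_class: str  # e.g. "PushButton" or "TextField"
    attributes: Dict[str, object] = field(default_factory=dict)
    children: List["WidgetNode"] = field(default_factory=list)
    # Excluded from repr so printing a node does not recurse through its parent.
    parent: Optional["WidgetNode"] = field(default=None, repr=False)


class OffScreenModel:
    """Mirrors an application's widget hierarchy without drawing anything."""

    def __init__(self, root: WidgetNode) -> None:
        self.root = root
        self._by_name = {root.name: root}

    def widget_created(self, parent_name: str, name: str,
                       widget_class: str, **attributes: object) -> WidgetNode:
        # Assumed notification: the toolkit reports a new widget and its parent.
        parent = self._by_name[parent_name]
        node = WidgetNode(name, widget_class, dict(attributes), parent=parent)
        parent.children.append(node)
        self._by_name[name] = node
        return node

    def attribute_changed(self, name: str, key: str, value: object) -> None:
        # Assumed notification: one attribute of an existing widget changed.
        self._by_name[name].attributes[key] = value

    def find(self, name: str) -> WidgetNode:
        return self._by_name[name]


if __name__ == "__main__":
    model = OffScreenModel(WidgetNode("shell", "ApplicationShell"))
    model.widget_created("shell", "ok", "PushButton", label="OK", sensitive=True)
    model.attribute_changed("ok", "sensitive", False)
    print(model.find("ok"))
```

Keeping the model indexed by widget name is a convenience of this sketch; it lets a later notification update a widget without searching the tree.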
We are coordinating our current design efforts with the Disability Access Committee on X (DACX), which is directed by the Trace Research and Development Center. This committee is made up of Unix workstation vendors (Sun, DEC, IBM), researchers (e.g. Trace, our group at Tech), commercial access vendors (e.g. Berkeley Systems), the X Consortium, and other interested parties. The goal of the committee is to design and implement standard access solutions to X Windows for people with various motor and sensory impairments. The X Consortium controls the yearly updates to the X Windows distribution. Bob Scheifler, the director of the X Consortium, has been working with the group to include various access solutions in the standard X distribution. Contact information for the DACX committee is included at the end of this document.
Mercator interfaces are made up of auditory interface components that correspond to graphical interface components such as menus, buttons, dialog boxes and so on. In addition to synthesized speech, auditory icons are used to identify the auditory interface components, and auditory filters are used to convey attributes of those components. For example, a text-entry field could be represented by the sound of an old-fashioned typewriter, and a low-pass (muffling) filter could convey that the field is unavailable, i.e. grayed out in a graphical interface. The label for the field could also be read by the speech synthesizer.
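The mapping from a graphical component to its auditory presentation can be sketched as a small lookup: the component type selects an auditory icon, an attribute selects a filter, and the label is handed to the speech synthesizer. This is only an illustration of the idea. The table contents, the sound names and the `present` function are assumptions made for the example, not Mercator's actual sound set or API.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical table of auditory icons, keyed by widget class.
AUDITORY_ICONS = {
    "TextField": "typewriter",
    "PushButton": "button_click",
    "Menu": "menu_open",
}


@dataclass
class AuditoryCue:
    icon: Optional[str]   # sound that identifies the kind of component
    filters: List[str]    # filters that convey the component's attributes
    speech: Optional[str] # text passed to the speech synthesizer


def present(widget_class: str, label: Optional[str] = None,
            sensitive: bool = True) -> AuditoryCue:
    """Build the cue for one component; unavailable components are muffled."""
    filters = [] if sensitive else ["low_pass"]
    return AuditoryCue(AUDITORY_ICONS.get(widget_class), filters, label)


if __name__ == "__main__":
    # A grayed-out text field labelled "Name".
    print(present("TextField", label="Name", sensitive=False))
```

Separating the icon, the filters and the speech text keeps each channel independent, so an attribute such as availability can change without affecting how the component type is identified.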
Mercator provides a separate navigation method, based on the hierarchy of the interface, to replace the visual, spatially oriented mouse navigation used in GUIs. We are also exploring the use of 3D spatialized sound to mimic the advantages of the spatial organization used in graphical computing environments.
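Hierarchy-based navigation can be illustrated with a cursor that moves between a component, its parent, its children and its siblings, rather than across screen coordinates. The sketch below assumes a simple tree of named components; the `Node` and `Navigator` classes and the four movement names are hypothetical and say nothing about which keys Mercator actually binds to each movement.

```python
from typing import Iterable, List, Optional


class Node:
    """A named interface component with an ordered list of children."""

    def __init__(self, name: str, children: Iterable["Node"] = ()) -> None:
        self.name = name
        self.parent: Optional["Node"] = None
        self.children: List["Node"] = list(children)
        for child in self.children:
            child.parent = self


class Navigator:
    """Moves a cursor through the component tree; invalid moves stay in place."""

    def __init__(self, root: Node) -> None:
        self.current = root

    def up(self) -> str:
        if self.current.parent is not None:
            self.current = self.current.parent
        return self.current.name

    def down(self) -> str:
        if self.current.children:
            self.current = self.current.children[0]
        return self.current.name

    def next_sibling(self) -> str:
        return self._sibling(+1)

    def previous_sibling(self) -> str:
        return self._sibling(-1)

    def _sibling(self, offset: int) -> str:
        parent = self.current.parent
        if parent is not None:
            siblings = parent.children
            index = siblings.index(self.current) + offset
            if 0 <= index < len(siblings):
                self.current = siblings[index]
        return self.current.name


if __name__ == "__main__":
    window = Node("window", [
        Node("File menu", [Node("Open"), Node("Quit")]),
        Node("Edit menu"),
    ])
    nav = Navigator(window)
    for move in (nav.down, nav.down, nav.next_sibling, nav.up, nav.next_sibling):
        print(move())
```

Each movement returns the name of the component the cursor lands on, which is the text a screen reader would speak after the move.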
The team is made up of the following people:
The Multimedia Computing Group at Georgia Tech has been involved in a number of audio-related projects, including NetAudio, a networked audio server that provides a number of audio effects. This server is supported by Mercator. Also, the Spatialized Sound research effort has produced a system which can implement synthetic "three-dimensional" (localized) sound sources using software on UNIX workstations.
The Enabling Technologies group at Sun Microsystems Laboratories works on developing access technologies for the open systems marketplace. Sun Microsystems Laboratories is a sponsor of our research.