GVU Technical Report Number:
GIT-GVU-97-16
Title:
Characterizing World Wide Web Ecologies
Authors:
James E. Pitkow
Abstract:
One of the fastest growing sources of information today is the World Wide
Web (WWW), having grown from only fifty sources of information in January
of 1993 to over a half million four years later. The exponential growth of
information within the Web has created an overabundance of information
and a poverty of human attention, with users citing the inability to
navigate and find relevant information on the Web as one of the biggest
problems facing the Web today. The primary goal of the research
presented here is to put forth new techniques and models that can be
used to help efficiently manage people's attentional processes when
dealing with large, unstructured, heterogeneous information
environments. The primary model is based upon the desirability of items
on the Web. This research searches for lawful patterns of structure,
content, and use. Methods are developed to exploit these patterns to
organize and optimize users' information foraging and sense-making
activities. These enhancements rely on prediction, categorization, and
allocation of attention. Several methods are explored for inducing
categorical structures for the WWW. Some of these enhancements involve
clustering in a high-dimensional space of content, use, and structural
features. Others derive from cocitation analysis methods used in the
study of scientific communities. Users would also be aided by retrieval
mechanisms that predict and return the WWW pages most likely to be needed,
given the page(s) the user is currently attending to. The approach of
this research uses a spreading activation mechanism to predict the
needed, relevant information, computed using past usage patterns, degree
of shared content, and WWW hyperlink structure.
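
The spreading activation idea described above can be illustrated with a
minimal sketch. This is not the report's implementation: the function
names (combine, spread_activation, predict), the linear mixing weights,
the decay factor, and the fixed number of steps are all assumptions made
for illustration. Each page-to-page strength is assumed to be a weighted
sum of three signals (past usage, shared content, hyperlink structure),
and activation is pumped outward from the pages the user is attending to.

```python
def combine(usage, content, links, w_usage, w_content, w_links):
    """Mix three page-to-page strength maps into one (assumed linear mix).

    Each argument is a dict: page -> {neighbor: strength}.
    """
    combined = {}
    for source_map, weight in ((usage, w_usage), (content, w_content), (links, w_links)):
        for page, neighbors in source_map.items():
            row = combined.setdefault(page, {})
            for neighbor, strength in neighbors.items():
                row[neighbor] = row.get(neighbor, 0.0) + weight * strength
    return combined


def spread_activation(weights, sources, decay=0.5, steps=3):
    """Propagate activation from source pages along weighted associations.

    weights: page -> {neighbor: strength}
    sources: page -> initial activation (pages the user is attending to)
    decay and steps are illustrative parameters, not values from the report.
    """
    activation = dict(sources)
    frontier = dict(sources)
    for _ in range(steps):
        next_frontier = {}
        for page, energy in frontier.items():
            for neighbor, strength in weights.get(page, {}).items():
                flow = energy * strength * decay
                next_frontier[neighbor] = next_frontier.get(neighbor, 0.0) + flow
        # Accumulate what arrived this step, then spread it further next step.
        for page, flow in next_frontier.items():
            activation[page] = activation.get(page, 0.0) + flow
        frontier = next_frontier
    return activation


def predict(weights, sources, k=5, decay=0.5, steps=3):
    """Return up to k non-source pages ranked by accumulated activation."""
    activation = spread_activation(weights, sources, decay, steps)
    candidates = [p for p in activation if p not in sources]
    candidates.sort(key=lambda p: (-activation[p], p))
    return candidates[:k]
```

Under these assumptions, calling predict(combined, {"current_page": 1.0})
would return the pages most strongly associated with the page being
viewed. A linear mix is only one simple way to combine the three signals.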
Keywords:
World Wide Web, statistical analysis, categorization, clustering,
modeling, log file analysis
You can access this technical report via:
PDF
Postscript
 