Rapid and effective access to information in large electronic
documentation systems can be facilitated if information relevant in
an individual
user's context can be automatically supplied to this user. However
most of this knowledge on contextual relevance is not found within
the contents of documents,
it is rather established incrementally by users during information
access. We propose a new model for interactively learning
contextual relevance during
information retrieval, and incrementally adapting retrieved
information to individual user profiles. The model, called a relevance
network, records the
relevance of references based on user feedback for specific queries
and user profiles. It also generalizes such knowledge to later derive
relevant
references for similar queries and profiles. The relevance network
lets users filter information by context of relevance. Compared to
other approaches,
it does not require any a priori knowledge nor training. More
importantly, our approach to adaptivity is user-centered. It
facilitates acceptance and
understanding by users by giving them shared control over the
adaptation without disturbing their primary task. Users easily
control when to adapt and
when to use the adapted system. Lastly, the model is independent
of the particular application used to access information, and supports
sharing of
adaptations among users.
As the amount of available electronic information is dramatically increasing, the ability for rapid and effective access to information in large electronic information systems is becoming critical. Access can be facilitated if information relevant to a particular user's needs can be automatically supplied to the user. The difficulty lies in computing the relevance of information: relevance does not only depend on the information content, but also on users' problem solving context (Boy 1990). Context typically includes particular user preferences, tasks being performed, or problems being solved. Knowledge on contextual information relevance is thus specific to individual users, and evolves dynamically with users' experience. Therefore, it is very difficult to acquire a priori for indexing purpose. We believe that contextual knowledge can only be acquired during information access. Current information retrieval and hypermedia tools do not offer any mechanism (besides simple annotations and bookmark facilities) to let users add this knowledge to their documents, and reuse it in similar contexts over time.
This paper proposes a new model for interactively learning contextual relevance during information retrieval, and incrementally adapting retrieved information to individual user profiles. This model is based on the metaphor of a computerized personal librarian assistant which remembers which information was relevant to particular users under various contexts. It learns over time from its interaction with users, and automatically retrieves and ranks the information based on its contextual relevance.
In the following section, we present some of the limitations of
previous approaches to adaptive interactive systems from the point
of
view of learning contextual relevance for information access. We
also compare our approach to related work on adaptive information
retrieval and hypertext.
In the third section, we give an overview of our approach to user-
centered adaptive information retrieval based on a network of
contextual information relevance.
Details on how this relevance network is automatically built, updated
and used to adapt retrieval, are presented in the fourth section. The
fifth section of
this paper presents future directions for this work.
Research on adaptive interactive systems is a very active field, especially in the domain of adaptive user interfaces (Browne et al. 1990; Norcio & Stanley 1989; Schneider-Hufschmidt et al. 1993). It is widely accepted that adaptivity requires to maintain a user model embedded in the system (Murray 1987). The canonical user modeling approach is a knowledge-intensive method that requires a priori encoding of a cognitive user model, as well as a domain model (Sullivan & Tyler 1991). The canonical user model is usually normative, i.e., based on stereotyped classes of users (Rich 1989), and prescriptive, i.e., user tasks or intents are deduced from observed actions (Kobsa 1991). These models are costly and difficult to acquire, and cannot easily be updated or customized during use (Benyon & Murray 1993). Adaptation is usually performed automatically without any user control; and given the complexity of the underlying models, the behavior of the adaptive system is difficult to explain to users, without disturbing their primary task.
Other approaches to adaptive user interfaces rely on individual and descriptive user models, which are dynamically learned by observing user's behavior (Crow & Smith 1993). They are used for the automatic detection of patterns of use, for example, to learn repetitive sequences of actions (Cypher 1991), or to record user preferences based on frequency of use. Adaptation is performed automatically or suggested by the system (Thomas & Krogsaeter 1993). These methods need to collect a large amount of data before the models can be useful. Moreover, users have no control over when to initiate adaptations. At best, they can only approve or ignore suggested adaptations.
In the information retrieval domain, methods for adapting semantic indexing and hypertext links have been investigated by several authors. Most methods rely on a specialized retrieval structure (semantic index) whose parameters get tuned by use, but require a large amount of a priori knowledge (Frisse & Cousin 1989). On the other hand, connectionist approaches require an extended training period to attain a state of fertility (Belew 1989). Other hypertext methods utilize "learning by observation" techniques to automatically learn patterns of navigation based on frequency of nodes accesses (Kibby & Mayes 1989; Monk 1989). In most cases, the user has no or limited control over the adaptive mechanism.
Our work is also related to previous approaches based on user
relevance feedback (Frisse et al. 1991; Salton 1989), but extends
them by capturing relevance information over time and across
several searches, instead of refining the formulation of a single
query. In a similar manner, (Baudin et al. 1993) also rely on user
feedback during retrieval to acquire and refine conceptual indices,
but the method used to update and derive relevance measures is
quite simple, and directly modifies indices used for retrieval. Thus,
the user has no control on whether to use the adapted indices or not,
and cannot share personal adaptations with other users. A similar
approach to adaptive hypertext navigation is also proposed in
(Kaplan et al. 1993). Their model memorizes user preferences about
relevance between topics in an associative matrix. User controls
when to adapt by reordering the list of topics retrieved by the
system. But giving ranking feedback requires more user
involvement than direct relevance feedback, and it is not clear how
well this model scales up to large documents.
We propose a new method for adapting information access to individual users profiles based on contextual information relevance. Since we cannot rely on a priori knowledge, we chose to use an individual and descriptive user model to capture the contextual relevance of information for individual users profiles. Our user model contains tasks and users names to describe profiles, and a relevance network incrementally built from user feedback. Unlike most information retrieval methods, the model we propose here is not aimed at establishing the underlying information structure of the application domain to facilitate retrieval. Instead, our objective is to build a hypermedia indexing structure of contextual relevance pertaining to specific user needs, based on previous usage and feedback information, and to support sharing of adaptations among users. Our adaptive method utilizes a compositional relevance network to learn contextual relevance information in a rapid, cost- effective, and incremental manner. It does not rely on any a priori knowledge. Since the cost of providing feedback has to be balanced by its reusability, this method is appropriate for single users who regularly use a large set of documents, and for multiple users who work collaboratively on the same set of documents.
The model is intended to work in conjunction with other hypertext or information retrieval systems, which provide non- adapted information access and navigation (Figure 1). Therefore, we designed the model to be independent of the particular application used to access information. Information is retrieved by clicking on a hypertext link or by executing an explicit query. Non-adapted retrieval is performed by an hypertext system using stored hypertext links, or by a search engine using stored indices. Adapted retrieval is performed by our adaptive system using knowledge stored in the relevance network, pertaining both to links and indices contextual relevance.
Furthermore, our adaptive process is user-centered. Analysis of current research indicates that the most promising approaches are those which give the user more control over the adaptation process (Kuhme, et al., 1992). However, tools for controlling and modifying the adaptive capability might introduce tangential tasks for the user, and add another layer of complexity. The objective should be to develop user-centered adaptive techniques which provide both user control, and ease of control and use. We present an example of such a user-centered approach to adaptivity. It facilitates acceptance and understanding by users, by giving them shared control over the adaptation without disturbing their primary task. Users control when to adapt and when to use the adapted system with a non intrusive user interface.
Users control when to initiate adaptation by providing direct feedback on the relevance of retrieved information (Figure 1, "thumb up and down" buttons). Users also control when to use the adapted system: the method provides a user-defined filter to retrieve and rank information based on its relevance in context. By turning off this filter, users can, however, ignore previous adaptations and access all the existing information at any time (Figure 1, "ignore or reuse previous feedback").
Figure 1: Adaptive hypertext and information retrieval overview
In a typical information retrieval interaction with our adaptive system, the user first specifies the current context, by selecting her/his user profile among several existing profiles. This user profile information is used to personalized future relevance feedback. It contains information about the user and tasks being performed (e.g., "Nathalie Mathe, astronaut, Hubble repairs, on orbit"). It can be changed by the user at any time.
The user then executes a query to retrieve a set of references relevant for this query under the current context (e.g., find all references containing the word "Telescope", similar to reference "2.1.4 Solar Array", and with the bookmark "free floating"). Based on previous feedback from the user, the adaptive system retrieves and displays in the Hit List window a ranked list of references that were found relevant for the same or similar queries under the same or similar user profiles (Figure 2 - Hits box). Using the filtering capability at the bottom of the Hit List window (Figure 2), the user can dynamically view various list of references relevant for the same query, but under various set of tasks and/or for different users (1). The system maintains a separate relevance network for each user, as well as a system network which integrates feedback information from all users. This supports knowledge sharing among users, by letting them access tailoring done by others on the same documents. Users can also ignore adaptations and display all existing references at any time (i.e., full list of hypertext links, or full list of search results using non-adapted retrieval).
In looking for information relevant to her/his query, the user accesses the content of some of the references retrieved with the network, as well as other references accessed using non-adapted retrieval. When the user finds some reference containing part of the information she/he was looking for, she/he marks it as relevant by giving positive feedback to the system (Figure 2 - "Success" button). Users can customize any reference, and not only these previously retrieved from the network. This prevents the system from narrowing its suggested list of references over time. The adaptive system memorizes in the relevance network that this reference was found relevant under the current user profile, and generalizes this relevance information for future queries.
A simplified version of this adaptive information retrieval mechanism was implemented within the Computer Integrated Documentation (CID) hypertext system (Boy 1990; Mathe & Boy 1992), and tested with real documentation users (Mathe 1993). Testing results were used to refine user requirements for the new compositional relevance network, which we are now integrating within CID. Moreover we did a qualitative evaluation of the ease of use and acceptance by users of the adaptive mechanism. Users gave relevance feedback each time they found some piece of information they were looking for, in order to mark it for future use. This let them build personalized views of documents over time. The required amount of feedback was minimal (a button click) and didn't disturb their task. Users felt they were the ones teaching the system, as they could choose whether or not to continually augment and refine the intelligence of the adaptive system. In the following section, we describe the structure of the relevance network, its learning method based on user feedback, and its information retrieval method. We finish the section with a discussion of the efficiency and computational cost of the relevance network.
Figure 2: Hit List window displays list of retrieved references
relevant for the selected user profile (at the bottom). Success and
Failure buttons are used to give relevance feedback on an individual
reference.
We describe our approach, called a compositional relevance
network, to model user preferences on contextual information
relevance. The network memorizes information on the relevance of
references based on user feedback for specific queries and profiles.
It also generalizes such information to facilitate future retrievals for
similar queries and profiles. Lastly, the network serves as a user-
oriented contextual filter for adaptive information retrieval.
A relevance network records measures of relevance of output nodes with respect to input nodes. For information retrieval purpose, an output node corresponds to a reference, which can be a document, a chapter, a section, a paragraph, or any marked location within a document. There are two types of input nodes: basic and composite nodes. A basic input node corresponds to a descriptor. A descriptor can be a keyword in the index, a sequence of words in the text, or a user-defined task in a user profile. A descriptor can also be a reference, in order to retrieve other related references (this corresponds to associative hypertext links). A composite input node corresponds to a combination of descriptors representing a complex user query. We generalize the notion of a user query to include keywords, and/or user profile descriptors (as opposed to keywords only, in most retrieval systems). Composites nodes are described in more details in the next section.
Figure 3 shows an example of a simple relevance network with only basic input nodes. Nodes in the top layer represent output nodes. Nodes in the lower layer represent input nodes. A user query is interpreted as an input activation pattern by the relevance network. A boolean activation value of an input node denotes whether the corresponding descriptor is a member of the current query. An activation value on an output node denotes the relevance of the corresponding reference, conditioned by the current user query encoded in the input layer.
Figure 3: An example of a simple relevance network
Associated with each connection from an input node to an output node is a relevance measure between the corresponding descriptor and reference. A network is initially empty (2). As a user specifies queries and provides positive or negative feedback on the relevance of retrieved references, input and output nodes that do not exist yet in the network are created, and relevance measures associated with the connections are adjusted accordingly. A relevance measure, in its simplest form, is defined as the relative frequency of positive user feedback for a reference given a descriptor. It is an unbiased estimate of the probability that a reference is relevant to a descriptor. More formally, each relevance measure is maintained as a fraction, defined as the quotient of the number of positive feedback, S, and the number of total feedback, N. That is, a relevance measure Rij of a reference j with respect to a descriptor i is
Maintaining the total number of feedback in the denominator facilitates an accurate recording of both the relevance of the reference and the sampling precision of such relevance. Composite Nodes The relevance model described thus far records only relevance information based on single descriptors. User preferences for particular composite queries cannot be saved, and non-trivial relations between references and descriptors cannot be encoded. To retain better information from user feedback, the adaptive relevance network accommodates composite nodes in the input layer, as illustrated in Figure 4. A new composite node, corresponding to a user query, is added to the relevance network when a user provides feedback upon retrieval (3). It is assumed that the relevance information associated with a composite node is more specific than the information associated with its nested subset nodes or its basic input nodes (corresponding to its descriptors). Therefore, during retrieval, a user query is first matched against highest level composite nodes, rather than lower level nodes.
Figure 4: Input layer of a relevance network with composite nodes. Arrows denote subset relations. Output nodes and relevance connections are not displayed.
The use of composite nodes enables the system to encode
reference relevance for specific user queries, as well as to derive
generalized relevance measures from previous queries. We first
described how user queries and associated reference relevance
measures are memorized into the network upon user feedback. This
information is generalized to enable the derivation of relevant
references for future new queries. Lastly, we described how the
network is used to retrieve relevant references for a query, and the
associated cost of computation.
When a user provides positive or negative feedback for a reference given the current query, this relevance information is memorized and generalized accordingly in the network. Four cases of memorization are possible depending on whether the reference was in the list of retrieved references from the network (4), and on whether the user query was already memorized in the network. They are described in the table below.
In order both to memorize exact user feedback on specific queries, and to derive relevance measures for new queries in the future, nodes which are more general than the user query inherit feedback: relevance measures from all proper query subsets including basic input nodes are updated. We describe below how relevance measures are updated or initialized.
Updating Relevance Measures. As mentioned above, relevance measures from an input node can be adjusted either through direct user feedback from a query of exact match to that node, or through feedback propagated from its superset composite nodes. Direct feedback provides more accurate information pertaining to the node than inherited feedback. To compromise between customization and generalization, a weight constant C (integer > 0) is added to the relevance feedback adjustment: if C = 1, inherited feedback is as important as direct feedback; and the relative importance of inherited feedback decreases when C increases. Relevance measures are updated as follows:
Initializing Relevance Measures.
If the reference was not retrieved from the network (cases 3 and 4),
a new connection between an input node and the reference is
established and initialized with Sold = 0, Nold = K, with K an integer >
0. K correspond to an initial negative bias. Such a negative bias is
justified with the assumption that any reference that can be
retrieved from the network is by default more useful than
references not recommended by the network. An additional benefit
incurred by the use of negative bias is that the network has better
resolution for positive feedback. If the reference was retrieved by
derivation from subset nodes (case 2), the relevance measure of the
new composite node is initialized to the derived relevance measure
of this reference, as described in the next section.
In response to a query, the network retrieves and displays a list of references with the highest relevance measures. We propose a retrieval algorithm which makes use of information on nodes most related to the user query. It then aggregates such information in a way least biased by the availability of various query subsets, and the information redundancy among them. Relevance measures are collected only from the top-level subsets closest to the query. Equal importance is given to all query descriptors. And for each descriptor, relevance measures from composite nodes are pooled together according to their statistical accuracy. This retrieval algorithm is illustrated on Figure 5, and detailed in the following paragraphs.
Figure 5: An example of retrieval by derivation from subset query nodes for one reference
Retrieval by exact match. Upon presentation of a query, if a composite node corresponding to the query already exists in the network, the relevance measures from that node to associated references are directly used to generate a ranked list of relevant references.
Retrieval by derivation. If a matching composite node does not exist, relevance measures from other input nodes, corresponding to proper subsets of the query, are used to derive a list of relevant references. For a relevance network with only basic input nodes, and under the simplistic assumptions of equal importance and independence among descriptors of a query, the measure of relevance of a reference with respect to a query can be estimated by taking the product of the relevance measures of that reference with respect to the descriptors of the query. While such assumptions are far from realistic, better general purpose derivation rules cannot be easily obtained without semantic knowledge of the query descriptors, or without statistical information of reference relevance.
We propose a new retrieval heuristic for deriving more accurate relevance information for a composite query, with no a priori knowledge. With the presence of composite nodes corresponding to subsets of the query, relevance information is derived only from the top-level subsets (5), i.e., the ones not nested within other subsets. However, derived relevance measures for a query cannot be appropriately obtained by simply taking the product of top-level subsets relevance measures. Top-level subsets of a query can be of different sizes, and are not necessarily disjoint. References connected from one composite node may not be connected from another. Multiplicative measures, therefore, can be biased toward certain query components. We propose a heuristic derivation algorithm intended to provide impartial relevance estimations. For simplicity, we describe the relevance derivation algorithm for a single reference, hence subscript reference indices are neglected in the formulae that follow.
For each individual descriptor in a query, a pooled estimate of relevance measure is obtained from the top-level subsets which contain that descriptor. Let CompQ denote the set of descriptors in the user query Q, and Stop the set of top-level subsets of Q. The pooled estimate of relevance for a descriptor di in CompQ is
An estimate of the compositional relevance for the query Q is then computed as the nth root of the product of the pooled estimates of all descriptors, where n is the number of descriptors. The derived composite relevance measures are then compared for all references and ordered to generate the suggested list of references (6).
If the algorithm were applied to a composite node of exact match to a query, it would derive the exact relevance measures carried directly on the connections from that composite node. In a query derivation where no subset composite node is present in the network, the algorithm degenerates to simple multiplicative derivation over basic input nodes. When applied to a query with disjoint top-level subsets, the algorithm is equivalent to taking the root of the product of all top-level subsets, each to the power of its own size. An example is shown in Figure 6.
Figure 6: An example of composite derivation with disjoint top-
level subsets
The composite relevance derivation algorithm requires a search
for all proper query subsets, which computationally is exponential
with respect to the size of the query. For pragmatic information
retrieval purpose, the number of descriptors in a query is usually
small. Furthermore, the cost of computation is also linearly bound by
the total number of composite nodes in the network. The size of the
network grows as new composites nodes and new relevance
connections are added upon user feedback. The size of the relevance
network therefore can have significant impact on the performance of
retrieval. Thus, it is important to keep the relevance network under
a manageable size to ensure timely retrieval and reasonable space
usage. This is dealt with in two ways. First, a node cutting
procedure is used to recycle composite nodes when the network size
has reached a specified limit. The nodes to be replaced are those
with least impact to the system's pattern of retrievals. Cutting
composite nodes not only maintains the size of the relevance
network, but also facilitates better generalization. Secondly, a
connection trimming procedure is used to delete connections whose
relevance measure is less than a given threshold.
An important future extension to the model will be to create new composite nodes by automatically extracting common intersections of user queries. Conversely, the cutting of composite nodes is entirely usage based and also warrants further exploration. A different direction for future work is to explore reference dependency among composite nodes. Only subset relations are currently used in the derivation of composite relevance.
In order to enable sharing of adaptations between a large number of users, it would be useful to have a common vocabulary for describing user profiles. This could be done by acquiring a domain model and enforcing a fixed vocabulary on users; or we could learn this common vocabulary by applying machine learning techniques to the set of user profiles defined by users after a period of time.
We are currently designing simulation protocols to
evaluate the precision and robustness of a relevance network
depending on various user behaviors and various documents types.
Proving the value of an adaptive system requires testing in an
operational environment over an extended period of time. We plan
to integrate our adaptive mechanism within the Hyperman hypertext
system which will be used by Space Shuttle flight controllers in
Mission Control, at NASA Johnson Space Control Center.
We have presented a user-centered approach to adaptive
hypertext and information retrieval, based on a new subjective
information relevance model. The ability of our model to
automatically acquire and reuse the context in which previous
information was found relevant is unique. As opposed to other
approaches, our model employs a simple adaptive algorithm
embedded in a dynamic indexing architecture based on user
feedback. It does not require any a priori specialized index
structure, nor any a priori statistical knowledge or computation.
While the feedback-based relevance adjustment algorithm tunes the
reference relevance measures, the usage-based composition process
maintains a useful set of structural components in the network. A
relevance network can adapt to specific user needs, or it can
generalize over multi-user information requirements. The relevance
network model provides a general purpose filtering tool pertaining to
user needs without tackling content-specific semantic information.
Yet, the model can be used in conjunction with standard statistical
models to provide user-centered adaptivity. Users control when to
initiate adaptation and when to use the adapted system. Controlling
is done with minimum additional interaction, using a non intrusive
user interface seamlessly integrated with their information retrieval
task. Since the model is independent of the particular application
used to access information, it supports sharing of adaptations among
users. For example, a novice can access the information relevance
model of an experienced user, and use it to filter information under
similar contexts, or use it as a starting point for his/her own
adaptations.
Thanks to Guy Boy, who initiated this project, and first proposed
the concept of context-sensitive indexing and retrieval. Thanks to
Josh Rabinowitz and Bharathi Raghavan for their help in developing
CID. Thanks to Celeste Plaisance for her support in evaluating CID,
and to our test users Dave Van Pelt and Chelo Picardal. Thanks to
Lise Getoor and John Allen for useful comments on this paper.