Title: Return Results from RetrieveDocuments()
Date Prepared: April 30, 1996
Date Needed:
Priority: Routine
Document Affected: Design
Paragraphs Affected: 3.1, 4.1, 7.1.4
References:

Change Required:

Alter the RetrieveDocuments() operation to return a Collection of Documents which have a nil RawData component. These Documents will have a BaseDocument attribute which references another Document. Make a corresponding alteration to the RetrieveQueries() operation. This requires five changes to the architecture:

1) A change to the return value of RetrieveDocuments().
2) A change to the return value of RetrieveQueries().
3) A change to Documents to allow them to have a nil RawData component.
4) A change to Documents to specify the behavior of the BaseDocument attribute.
5) An addition of a CreateDocumentReferenceFromSpecification() operation to allow the construction of a DocumentReference without needing the referenced Document present.

Specific Recommendation:

In paragraph 3.1, add the following operation to DocumentReference:

    CreateDocumentReferenceFromSpecification(ReferencedCollection: string, ReferencedDocumentId: string): DocumentReference

    creates a new DocumentReference with CollectionName equal to ReferencedCollection and DocumentId equal to ReferencedDocumentId.

In paragraph 4.1, add the following paragraph to the description of Documents:

    A Document may have a BaseDocument Attribute, which identifies a Document from which this Document was derived. Those Tipster operations that use a Document's RawData component should, if the RawData component is nil, use instead the RawData component of the Document's BaseDocument. Such operations include DocumentCollectionIndex.Augment, QueryCollectionIndex.RetrieveQueries, Document.Annotate, Document.WriteSgml, and Collection.AnnotateCollection. If the Document's RawData is nil and there is no BaseDocument, then these operations should ignore the Document.
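The BaseDocument fallback rule and the new DocumentReference constructor can be illustrated with a minimal Python sketch. It is not part of the proposed text. The class layouts, the snake_case names, the in-memory `store` dictionary, and the helper `effective_raw_data` are all hypothetical, and the sketch assumes a single hop: if the BaseDocument itself has nil RawData, the Document is treated as having no usable RawData, a case the proposal does not address.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class DocumentReference:
    collection_name: str
    document_id: str


def create_document_reference_from_specification(
    referenced_collection: str, referenced_document_id: str
) -> DocumentReference:
    """Build a reference from names alone; the referenced Document is not needed."""
    return DocumentReference(referenced_collection, referenced_document_id)


@dataclass
class Document:
    document_id: str
    raw_data: Optional[bytes] = None  # "ByteSequence OR nil"
    attributes: Dict[str, object] = field(default_factory=dict)


# Hypothetical stand-in for persistent Collections:
# (collection name, document id) -> Document
Store = Dict[Tuple[str, str], Document]


def effective_raw_data(doc: Document, store: Store) -> Optional[bytes]:
    """Return the RawData an operation should use, or None to ignore the Document."""
    if doc.raw_data is not None:
        return doc.raw_data
    ref = doc.attributes.get("BaseDocument")
    if not isinstance(ref, DocumentReference):
        return None  # nil RawData and no BaseDocument: ignore the Document
    base = store.get((ref.collection_name, ref.document_id))
    # Assumption: only one level of indirection is followed.
    return None if base is None else base.raw_data
```

An operation such as Document.Annotate would, under this sketch, call `effective_raw_data` first and skip the Document when the result is `None`.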
In paragraph 4.1, modify the declaration of the RawData component of a Document:

    RawData: ByteSequence OR nil

Append the following to the description of the Annotations component:

    Annotations may contain information about a Document related to the current Document. The knowledge of the relationship between Documents must be maintained by the application (possibly using Attributes).

Alter the third parameter of CreateDocument to read:

    RawData: ByteSequence OR nil

In paragraph 7.1.4, replace the description of the Augment() operation of DocumentCollectionIndex with the following text:

    adds all the Documents in Collection to the DocumentCollectionIndex. If a Document in Collection has an attribute with name "BaseDocument", the DocumentCollectionIndex uses the Attribute's value as the "BaseDocument" value when the Document is returned as a result of a search.

Replace the description of the RetrieveDocuments() operation of DocumentCollectionIndex with the following text:

    returns a collection of Documents (of maximal length NumberToRetrieve) corresponding to those Documents in the DocumentCollectionIndexes that are most closely related to the DetectionNeed from which the RetrievalQuery is derived. Each Document in the returned Collection will have a nil RawData component and an Attribute with name "BaseDocument" and, as value, a DocumentReference identifying the corresponding Document in the DocumentCollectionIndex.

Replace the description of the RetrieveQueries() operation of QueryCollectionIndex with the following text:

    returns the Collection of DetectionNeeds (of maximal length NumberToRetrieve) corresponding to those DetectionNeeds which are most closely related to Document. Each DetectionNeed in the returned Collection will have a nil RawData component and an Attribute with name "BaseDocument" and, as value, a DocumentReference identifying the corresponding DetectionNeed in the QueryCollectionIndex.
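The revised Augment() and RetrieveDocuments() descriptions can be illustrated with a toy Python sketch. Everything here is an assumption made for illustration: the class `ToyDocumentCollectionIndex`, the term-overlap scoring (a placeholder for whatever ranking a real index uses), passing the query as a list of terms rather than a RetrievalQuery, and the `result-N` ids given to returned Documents. For brevity, Augment in this sketch skips Documents whose RawData is nil instead of following their BaseDocument.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass(frozen=True)
class DocumentReference:
    collection_name: str
    document_id: str


@dataclass
class Document:
    document_id: str
    raw_data: Optional[bytes] = None
    attributes: Dict[str, object] = field(default_factory=dict)


class ToyDocumentCollectionIndex:
    def __init__(self) -> None:
        # Each entry pairs the reference to hand back with the indexed terms.
        self._entries: List[Tuple[DocumentReference, Set[str]]] = []

    def augment(self, collection_name: str, documents: List[Document]) -> None:
        for doc in documents:
            if doc.raw_data is None:
                continue  # simplification: BaseDocument fallback omitted here
            # Prefer an existing "BaseDocument" attribute; otherwise refer
            # to the Document as stored in the Collection being added.
            ref = doc.attributes.get("BaseDocument")
            if not isinstance(ref, DocumentReference):
                ref = DocumentReference(collection_name, doc.document_id)
            terms = set(doc.raw_data.decode("utf-8", errors="ignore").lower().split())
            self._entries.append((ref, terms))

    def retrieve_documents(
        self, query_terms: List[str], number_to_retrieve: int
    ) -> List[Document]:
        wanted = {t.lower() for t in query_terms}
        scored = [(len(wanted & terms), ref) for ref, terms in self._entries]
        # Stable sort: ties keep the order in which Documents were indexed.
        ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: -s[0])
        # No stored Document is fetched: each result is an empty Document
        # carrying only a reference back to the indexed one.
        return [
            Document(f"result-{i}", None, {"BaseDocument": ref})
            for i, (_, ref) in enumerate(ranked[:number_to_retrieve])
        ]
```

The point of the sketch is the last step: building the result list touches only references held by the index, so no Document has to be read back from storage to answer a query.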
Reason for Proposed Change:

There are two primary reasons for this proposal. The first relates to side-effects of the proposed change to the Document copy operations. The second relates to efficiency concerns in the QueryCollectionIndex and DocumentCollectionIndex.

Under a proposed change to the Document copy operations, the Document Id can no longer be used to maintain the relationship between different versions of the same Document kept in different Collections. Therefore, some other mechanism is required to map the results of a search to the Documents originally added to the DocumentCollectionIndex. This RFC proposes such a mechanism.

Currently, adding a Document to a Collection requires having the Document at hand. Thus, to construct its results Collection, a DocumentCollectionIndex must first retrieve all Documents that are found by a search. Retrieving a Document is a relatively expensive operation, possibly requiring file I/O. The proposed change eliminates the need for the DocumentCollectionIndex to retrieve the Documents: instead, the DocumentCollectionIndex only needs to construct empty Documents that reference the retrieved Document, an operation requiring no additional I/O. Similar arguments motivate the change to RetrieveQueries().

The change to allow a nil RawData component to Documents allows the returned Documents to be as light-weight as possible, thus minimizing the processing required of the DocumentCollectionIndex.

The addition of the CreateDocumentReferenceFromSpecification() operation provides the DocumentCollectionIndex with a mechanism to create the required DocumentReference without having the referenced Document at hand.

Change Evaluation:
Applications Affected:

Evaluator:
Organization:
Name:
Phone Number:
Date:

Change Requested By:
Organization: Logicon, Inc.
Name: Joseph Dzikiewicz
Phone Number: (703) 486-3500 x2227
Date: April 30, 1996

CCB Action: Approved  Disapproved  Hold
Name:
Title:
Date:

Ac Action: Approved  Disapproved
Name:
Title:
Date: