US 7,398,196 B1
Method and apparatus for summarizing multiple documents using a subsumption model
Weiquan Liu, Beijing (China); and Joe F Zhou, Beijing (China)
Assigned to Intel Corporation, Santa Clara, Calif. (US)
Appl. No. 10/18,517 PCT Filed Sep. 07, 2000, PCT No. PCT/CN00/00265 § 371(c)(1), (2), (4) Date Aug. 19, 2002, PCT Pub. No. WO02/21324, PCT Pub. Date Mar. 14, 2002.
Int. Cl. G06F 17/20 (2006.01); G06F 17/21 (2006.01)
U.S. Cl. 704—1 [704/9; 715/277]
15 Claims
1. A computer-implemented method comprising:
parsing a plurality of paragraphs in a plurality of computer documents stored on a computer-readable medium, each document
with one or more of the paragraphs;
selecting paragraphs from the documents through a subsuming relation calculation including,
creating a link from terms in each paragraph to identical terms in substantially all of the other paragraphs, wherein terms
include noun phrases, verb phrases or entity names,
counting for each paragraph the number of links from the terms in the paragraph to the terms in other paragraphs,
denoting for each paragraph the number of links counted for that paragraph as the significant score of that paragraph,
ranking the paragraphs by the significant score,
selecting paragraphs based on the ranking, wherein paragraphs in the ranking that subsume the highest number of other paragraphs
are selected prior to other paragraphs in the ranking, and wherein a first paragraph subsumes a second paragraph if all noun
phrases, verb phrases, and entity names contained in the second paragraph are also contained in the first paragraph;
aggregating the selected paragraphs into a summary and outputting the summary.
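
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes term extraction (noun phrases, verb phrases, entity names) has already been done and that each paragraph is represented as a set of term strings. The function names, the max_paragraphs cutoff, and the use of the significance score as a tie-breaker among paragraphs that subsume equally many others are all hypothetical choices made for illustration.

```python
from typing import List, Set


def significance_scores(term_sets: List[Set[str]]) -> List[int]:
    """Count, for each paragraph, the links from its terms to identical
    terms in every other paragraph (one link per shared term per paragraph)."""
    return [
        sum(len(terms & other) for j, other in enumerate(term_sets) if j != i)
        for i, terms in enumerate(term_sets)
    ]


def subsumes(first: Set[str], second: Set[str]) -> bool:
    """A first paragraph subsumes a second if every term of the second
    also appears in the first."""
    return second <= first


def subsumption_counts(term_sets: List[Set[str]]) -> List[int]:
    """Count how many other paragraphs each paragraph subsumes."""
    return [
        sum(1 for j, other in enumerate(term_sets)
            if j != i and subsumes(terms, other))
        for i, terms in enumerate(term_sets)
    ]


def select_paragraphs(term_sets: List[Set[str]], max_paragraphs: int) -> List[int]:
    """Return indices of selected paragraphs. Paragraphs that subsume the
    most others come first; the significance score breaks ties (assumption)."""
    scores = significance_scores(term_sets)
    counts = subsumption_counts(term_sets)
    order = sorted(
        range(len(term_sets)),
        key=lambda i: (counts[i], scores[i]),
        reverse=True,
    )
    return order[:max_paragraphs]


def summarize(paragraphs: List[str], term_sets: List[Set[str]],
              max_paragraphs: int = 2) -> str:
    """Aggregate the selected paragraphs into a summary string."""
    chosen = select_paragraphs(term_sets, max_paragraphs)
    return "\n".join(paragraphs[i] for i in chosen)
```

Note that under this set-based reading a paragraph with no extracted terms is subsumed by every other paragraph, so a real system would likely filter such paragraphs out before scoring.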