[BKMA] info about OKAPI
Johnson, Calvin (NIH/CIT) [E]
johnson at mail.nih.gov
Thu Jan 18 19:46:03 EST 2007
Alex,
Thanks.
Is there some way that you could "condense" the equations to make them
more readable, perhaps using MS Word? Then resend to the group.
Jigar, after you receive the MS Word file from Alex, could you convert
to PDF and post to the portal?
Calvin
-----Original Message-----
From: Wang, Alex (NIH/CIT) [E]
Sent: Wednesday, January 17, 2007 10:45 AM
To: bkma at dcb.cit.nih.gov
Subject: [BKMA] info about OKAPI
OKAPI BM25 Ranking of FREETEXT
Rank = SUM[Terms in Query] w * ( ( ( k1 + 1 ) * tf ) / ( K + tf ) )
* ( ( ( k3 + 1 ) * qtf ) / ( k3 + qtf ) )
Where:
w is the Robertson-Sparck Jones weight.
In simplified form, w is defined as:
w = log10 ( ( ( r + 0.5 ) * ( N - R + r + 0.5 ) ) / ( ( R - r + 0.5 ) *
( n - r + 0.5 ) ) )
N is the number of indexed rows for the property being queried.
n is the number of rows containing the word.
K is ( k1 * ( ( 1 - b ) + ( b * dl / avdl ) ) ).
dl is the property length, in word occurrences.
avdl is the average length of the property being queried, in word
occurrences.
k1, b, and k3 are the constants 1.2, 0.75, and 8.0, respectively.
tf is the frequency of the word in the queried property in a specific
row.
qtf is the frequency of the term in the query.
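The definitions above can be sketched in a few lines of Python. This is only an illustration of the formula as quoted, not the search engine's actual code. The function names are made up for this example. The message does not define r and R; the sketch assumes they are relevance-feedback counts and defaults both to 0, which reduces w to log10 ( ( N + 0.5 ) / ( n + 0.5 ) ).

```python
import math

# Constants as quoted in the message.
K1, B, K3 = 1.2, 0.75, 8.0


def rsj_weight(N, n, r=0, R=0):
    """Weight w exactly as written in the message.

    Assumption: r and R (undefined in the message) default to 0.
    """
    return math.log10(
        ((r + 0.5) * (N - R + r + 0.5)) / ((R - r + 0.5) * (n - r + 0.5))
    )


def term_score(tf, qtf, dl, avdl, N, n, r=0, R=0):
    """Contribution of one query term to the rank of one row."""
    K = K1 * ((1 - B) + (B * dl / avdl))
    w = rsj_weight(N, n, r, R)
    return w * (((K1 + 1) * tf) / (K + tf)) * (((K3 + 1) * qtf) / (K3 + qtf))


def bm25_rank(terms, dl, avdl, N):
    """Sum the per-term scores; terms is an iterable of (tf, qtf, n) tuples."""
    return sum(term_score(tf, qtf, dl, avdl, N, n) for tf, qtf, n in terms)
```

For a row of average length (dl equal to avdl) and a term with tf = 1 and qtf = 1, both fractions equal 1, so the term's contribution is just w.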
FREETEXT ranking is based on the OKAPI BM25 ranking formula. FREETEXT
queries will add words to the query via inflectional generation
(inflected forms of the original query words); these words are treated
as separate words with no special relationship to the words from which
they were generated. Synonyms generated from the Thesaurus feature are
treated as separate, equally weighted terms. Each word in the query
contributes to the rank.
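A rough sketch of the paragraph above, in Python: inflected forms are added to the query as independent terms, and every term contributes to the total. The inflection table and both function names are hypothetical, made up for illustration; a real engine generates the forms itself, and score_term stands in for whatever per-term scoring is used.

```python
from collections import Counter

# Hypothetical inflection table, for illustration only.
INFLECTIONS = {"run": ["runs", "running", "ran"]}


def expand_query(words):
    """Return the query term frequency (qtf) of each term after expansion.

    Inflected forms are added as separate terms with no link back to
    the word they were generated from.
    """
    expanded = list(words)
    for word in words:
        expanded.extend(INFLECTIONS.get(word, []))
    return Counter(expanded)


def total_rank(qtf_by_term, score_term):
    """Sum the scores of all terms; score_term(term, qtf) is caller-supplied."""
    return sum(score_term(term, qtf) for term, qtf in qtf_by_term.items())
```

With this table, the query "run fast" expands to five separate terms, each scored on its own.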
_______________________________________________
BKMA mailing list
BKMA at dcb.cit.nih.gov
http://dcb.cit.nih.gov/mailman/listinfo/bkma