# scores documents delimited by <DOCNO> lines from TREC data
# for fourth 10 questions on TREC list
# 920527-29. 0601, 0602, 0603
#
# usage: awk -f TRECscore.Q.31-40.awk
# typically will want to do something like:
#   zcat ws1/l9~/= .1 | tr a-z A-Z | awk -f TRECscore.Q.31-40.awk >Q.31-40.TRECscores.out
#   (maybe prefixed by nohup)
# then typically will want to do something like:
#   sort -n +1 TRECscores.out | tail -1000 >Q31.best
# etc....
#
# this program reads in format:

# a <DOCNO> line starts a new document: print the maximum scores
# recorded for the previous document, then reset all state
/<DOCNO>/ {
    printf("%-10s %5d %5d %5d %5d %5d %5d %5d %5d %5d %5d\n",
        docno, m1, m2, m3, m4, m5, m6, m7, m8, m9, m10);
    docno = $2;
    m1 = 0; m2 = 0; m3 = 0; m4 = 0; m5 = 0;
    m6 = 0; m7 = 0; m8 = 0; m9 = 0; m10 = 0;
    s1a = 0; s1b = 0; s1c = 0;
    s2a = 0; s2b = 0;
    s3a = 0; s3b = 0; s3c = 0;
    s4a = 0; s4b = 0;
    s5a = 0; s5b = 0; s5c = 0; s5d = 0;
    s6a = 0; s6b = 0; s6c = 0;
    s7a = 0; s7b = 0;
    s8a = 0; s8b = 0; s8c = 0;
    s9a = 0; s9b = 0;
    s10a = 0; s10b = 0;
}

# for each topic: a line matching a term group adds 5 to that group's
# score, the topic score is the product of its group scores, and every
# group score decays by a factor of .9 per input line, so the groups
# must match close together to produce a high score

# topic 031 --- advantages of OS/2
/OS\/2/ { s1a += 5; }
/ADVANTAG|STRENGTH/ { s1b += 5; }
/WINDOWS|X.WINDOWS|DOS/ { s1c += 5; }
{
    s1 = s1a * s1b * s1c;
    if (s1 > m1) m1 = s1;
    s1a *= .9; s1b *= .9; s1c *= .9;
}

# topic 032 --- outsourcing computer work
/CONTRACT.OUT|OUTSOURCING/ { s2a += 5; }
/COMPUTER|DATA|NETWORK/ { s2b += 5; }
{
    s2 = s2a * s2b;
    if (s2 > m2) m2 = s2;
    s2a *= .9; s2b *= .9;
}

# topic 033 --- companies capable of producing document management systems
/DOCUMENT/ { s3a += 5; }
/MANAGEMENT|PROCESSING|AUTOMATION|OCR|OPTICAL CHARACTER RECOGNI/ { s3b += 5; }
/COMPANY|CORP|CO\.|INC\.|LTD\.|INCORPORATED/ { s3c += 5; }
{
    s3 = s3a * s3b * s3c;
    if (s3 > m3) m3 = s3;
    s3a *= .9; s3b *= .9;