Comparison of total time to process each file type in "tika_1_6" vs. "tika_1_8_SNAPSHOT"

DETECTED_CONTENT_TYPE_A | ELAPSED_TIME_MILLIS_A | NUM_FILES (A) | ELAPSED_TIME_MILLIS_B | NUM_FILES (B)
end-functional polystyrene, interdiffusion, neutron reflectometry, surface, thin film , Diffusion, Reflectometry, Thin Films; c | 15 | 1 | 10 | 1
Public Affairs Officer, USAID/WBG | 204 | 1 | 14 | 1
This USAID/Timor-Leste page describes the programmatic activities of USAID in Timor-Leste. | 18 | 1 | 7 | 1
application/fits | 5213 | 86 | 2893 | 86
application/gzip | 93064 | 1596 | 69685 | 1596
application/msword | 3087245 | 16265 | 1919531 | 16265
application/msword2 | 577 | 18 | 169 | 18
application/octet-stream | 12355 | 391 | 6773 | 379
application/pdf | 20129359 | 52024 | 18159038 | 52024
application/postscript | 102060 | 2484 | 76858 | 2484
application/rdf+xml | 610 | 25 | 567 | 25
application/rss+xml | 3620 | 14 | 2093 | 14
application/rtf | 15494 | 246 | 10441 | 246
application/vnd.framemaker | 115 | 3 | 64 | 3
application/vnd.google-earth.kml+xml | 898 | 42 | 563 | 42
application/vnd.ms-excel | 1183461 | 7875 | 752968 | 7759
application/vnd.ms-powerpoint | 4745267 | 12063 | 2952413 | 12063
application/vnd.ms-xpsdocument | 328 | 1 | 123 | 1
application/vnd.openxmlformats-officedocument.presentationml.presentation | 44515 | 35 | 30891 | 35
application/vnd.openxmlformats-officedocument.presentationml.slideshow | 1593 | 1 | 289 | 1
application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | 551 | 2 | 160 | 2
application/vnd.openxmlformats-officedocument.wordprocessingml.document | 5569 | 18 | 8617 | 18
application/vnd.sun.xml.impress | 37 | 1 | 65 | 1
application/x-123 | 64 | 3 | 46 | 3
application/x-bibtex-text-file | 156 | 6 | 45 | 6
application/x-elc | 15 | 1 | 45 | 1
application/x-executable | 19 | 1 | 6 | 1
application/x-iso9660-image | 251 | 1 | 20 | 1
application/x-shockwave-flash | 3066 | 78 | 2782 | 78
application/x-stuffit | 8 | 1 | 19 | 1
application/x-tex | 478 | 26 | 370 | 26
application/x-tika-msworks-spreadsheet | 143 | 2 | 442 | 2
application/xhtml+xml | 55181 | 1454 | 40476 | 1454
application/xhtml+xml; charset=UTF-8 | 6 | 1 | 4 | 1
application/xhtml+xml; charset=iso-8859-1 | 589 | 11 | 313 | 11
application/xhtml+xml; charset=utf-8 | 556 | 25 | 956 | 25
application/xml | 61585 | 1993 | 42710 | 1993
application/xml; charset=UTF-8 | 11937 | 353 | 8573 | 353
application/zip | 2268 | 49 | 1413 | 49
image/g3fax | 8 | 1 | 14 | 1
image/gif | 115437 | 4103 | 75548 | 4103
image/jpeg | 566079 | 14908 | 396322 | 14908
image/png | 18781 | 728 | 15930 | 728
image/tiff | 1109 | 5 | 234 | 5
image/x-ms-bmp | 408 | 16 | 450 | 16
image/x-portable-bitmap | 39 | 2 | 436 | 2
image/x-portable-pixmap | 3 | 1 | 14 | 1
message/rfc822 | 14536 | 463 | 13273 | 463
model/vnd.dwf | 425 | 16 | 365 | 16
noindex | 20 | 1 | 36 | 1
text-html; charset=Windows-1252 | 150 | 5 | 47 | 5
text/css | 127 | 3 | 32 | 3
text/css; charset=ISO-8859-1 | 229 | 10 | 144 | 10
text/css; charset=iso-8859-1 | 286 | 4 | 92 | 4
text/html | 12132 | 321 | 9964 | 321
text/html   charset=iso-8859-1 | 35 | 1 | 10 | 1
text/html charset=ISO-8859-1 | 69 | 6 | 59 | 6
text/html' charset=iso-8859-1 | 241 | 9 | 182 | 9
text/html+xml; charset=UTF-8 | 24 | 1 | 19 | 1
text/html/ charset=iso-8859-1 | 8 | 1 | 9 | 1
text/html; chaobjrset=windows-1252 | 874 | 2 | 775 | 2
text/html; charset= | 954 | 24 | 527 | 24
text/html; charset="iso-8859\<sup\>-1\<\/sup\>" | 6 | 1 | 8 | 1
text/html; charset=0 | 6 | 1 | 10 | 1
text/html; charset=10646 | 1787 | 45 | 1663 | 45
text/html; charset=8859-1 | 171 | 6 | 65 | 6
text/html; charset=EUC-JP | 212 | 3 | 94 | 3
text/html; charset=GB18030 | 720 | 23 | 345 | 23
text/html; charset=IBM437 | 365 | 9 | 200 | 9
text/html; charset=IBM500 | 125 | 1 | 28 | 1
text/html; charset=IBM855 | 14 | 1 | 12 | 1
text/html; charset=IBM866 | 30 | 2 | 47 | 2
text/html; charset=ISO-2022-JP | 121 | 1 | 17 | 1
text/html; charset=ISO-8859-1 | 530429 | 11552 | 392536 | 11552
text/html; charset=ISO-8859-15 | 67 | 2 | 60 | 2
text/html; charset=ISO-8859-9 | 19 | 1 | 11 | 1
text/html; charset=KOI8-R | 4 | 1 | 12 | 1
text/html; charset=Shift_JIS | 20 | 2 | 30 | 2
text/html; charset=US-ASCII | 583 | 22 | 419 | 22
text/html; charset=UTF-16 | 905 | 17 | 334 | 17
text/html; charset=UTF-16LE | 37 | 1 | 22 | 1
text/html; charset=UTF-32LE | 44 | 1 | 49 | 1
text/html; charset=UTF-8 | 315660 | 8404 | 225290 | 8404
text/html; charset=WINDOWS-1251 | 36 | 3 | 24 | 3
text/html; charset=WINDOWS-1252 | 329 | 3 | 290 | 3
text/html; charset=Windows-1252 | 2474 | 76 | 1630 | 76
text/html; charset=big5 | 65 | 2 | 68 | 2
text/html; charset=csVISCII | 14 | 1 | 23 | 1
text/html; charset=euc-kr | 23 | 1 | 15 | 1
text/html; charset=gb2312 | 171 | 5 | 118 | 5
text/html; charset=iso-10646 | 15 | 1 | 8 | 1
text/html; charset=iso-2022-jp | 28 | 1 | 14 | 1
text/html; charset=iso-8859-1 | 488036 | 13233 | 368787 | 13233
text/html; charset=iso-8859-15 | 327 | 2 | 62 | 2
text/html; charset=iso-8859-1; macromedia dreamweaver 4.0= | 26 | 1 | 18 | 1
text/html; charset=iso-8859-2 | 136 | 4 | 105 | 4
text/html; charset=iso8859-1 | 152 | 7 | 159 | 7
text/html; charset=iso_8859_1 | 51 | 3 | 23 | 3
text/html; charset=ks_c_5601-1987 | 19 | 1 | 8 | 1
text/html; charset=macintosh | 1843 | 19 | 563 | 19
text/html; charset=shift_jis | 152 | 2 | 113 | 2
text/html; charset=unicode | 139 | 2 | 61 | 2
text/html; charset=us-ascii | 9096 | 227 | 5637 | 227
text/html; charset=utf-8 | 121752 | 3268 | 90980 | 3268
text/html; charset=windows-1250 | 243 | 11 | 130 | 11
text/html; charset=windows-1251 | 239 | 10 | 385 | 10
text/html; charset=windows-1252 | 596980 | 13654 | 474532 | 13654
text/html; charset=windows-1254 | 33 | 2 | 22 | 2
text/html; charset=windows-1256 | 215 | 5 | 124 | 5
text/html; charset=x-mac-roman | 12 | 1 | 39 | 1
text/html; iso-8859-1= | 108 | 3 | 47 | 3
text/html; set=iso-8859-1 | 59 | 2 | 29 | 2
text/plain; charset=EUC-KR | 217 | 6 | 160 | 7
text/plain; charset=GB18030 | 2195 | 73 | 1943 | 73
text/plain; charset=IBM500 | 50 | 1 | 44 | 1
text/plain; charset=IBM855 | 38 | 3 | 10 | 3
text/plain; charset=IBM866 | 5 | 1 | 12 | 1
text/plain; charset=ISO-2022-JP | 272 | 2 | 163 | 2
text/plain; charset=ISO-8859-1 | 647624 | 18086 | 453834 | 18087
text/plain; charset=ISO-8859-15 | 699 | 17 | 290 | 17
text/plain; charset=ISO-8859-5 | 155 | 5 | 143 | 5
text/plain; charset=KOI8-R | 238 | 6 | 300 | 6
text/plain; charset=Shift_JIS | 150 | 5 | 159 | 5
text/plain; charset=UTF-8 | 1957 | 79 | 1590 | 79
text/plain; charset=windows-1250 | 1098 | 7 | 433 | 7
text/plain; charset=windows-1251 | 19 | 1 | 23 | 1
text/plain; charset=windows-1252 | 476933 | 13151 | 336078 | 13155
text/plain; charset=windows-1253 | 166 | 1 | 38 | 1
text/plain; charset=windows-1255 | 264 | 3 | 63 | 3
text/x-java-source | 11144 | 53 | 4724 | 53
text; charset=ISO-8859-1 | 212 | 10 | 184 | 10
texthtml; charset=is0-8859-1 | 152 | 5 | 161 | 5
video/x-ms-wmv | 18 | 2 | 22 | 2
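The totals above are easier to compare once they are normalized. The following is a minimal sketch, not part of the original report, that takes three rows copied from the table and derives the mean time per file in each run and the percent change in total elapsed time from A to B. The function name `summarize` and the choice of rows are illustrative assumptions.

```python
# Rows copied from the table above:
# (content type, total ms in A, files in A, total ms in B, files in B)
rows = [
    ("application/pdf", 20129359, 52024, 18159038, 52024),
    ("application/msword", 3087245, 16265, 1919531, 16265),
    ("application/vnd.openxmlformats-officedocument.wordprocessingml.document",
     5569, 18, 8617, 18),
]


def summarize(row):
    """Return (type, mean ms/file in A, mean ms/file in B, % change in total time).

    `summarize` is a hypothetical helper written for this illustration.
    """
    ctype, ms_a, n_a, ms_b, n_b = row
    per_file_a = ms_a / n_a
    per_file_b = ms_b / n_b
    # Negative means run B spent less total time on this type than run A.
    pct_change = 100.0 * (ms_b - ms_a) / ms_a
    return ctype, per_file_a, per_file_b, pct_change


for ctype, a, b, pct in map(summarize, rows):
    print(f"{ctype}: {a:.1f} -> {b:.1f} ms/file ({pct:+.1f}% total time)")
```

Rows backed by only one or two files (many of the malformed charset variants) are probably too small a sample to read much into, so a per-file figure is most meaningful for the high-volume types.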




select millis_A.DETECTED_CONTENT_TYPE_A,
       millis_A.ELAPSED_TIME_MILLIS_A,
       detected_types_A.NUM_FILES,
       millis_B.ELAPSED_TIME_MILLIS_B,
       detected_types_B.NUM_FILES
from millis_A
join millis_B
  on millis_A.DETECTED_CONTENT_TYPE_A = millis_B.DETECTED_CONTENT_TYPE_B
join detected_types_A
  on millis_B.DETECTED_CONTENT_TYPE_B = detected_types_A.DETECTED_CONTENT_TYPE_A
join detected_types_B
  on millis_B.DETECTED_CONTENT_TYPE_B = detected_types_B.DETECTED_CONTENT_TYPE_B
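
To see how the query above assembles each table row, here is a small sketch that runs it against an in-memory SQLite database. The report does not say which database engine or table definitions were used, so the engine, the column types, and the `example/only-in-a` row are assumptions made for illustration; the two real rows are copied from the table.

```python
import sqlite3

# Assumed schemas: only the table and column names come from the query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE millis_A (DETECTED_CONTENT_TYPE_A TEXT, ELAPSED_TIME_MILLIS_A INTEGER);
CREATE TABLE millis_B (DETECTED_CONTENT_TYPE_B TEXT, ELAPSED_TIME_MILLIS_B INTEGER);
CREATE TABLE detected_types_A (DETECTED_CONTENT_TYPE_A TEXT, NUM_FILES INTEGER);
CREATE TABLE detected_types_B (DETECTED_CONTENT_TYPE_B TEXT, NUM_FILES INTEGER);
""")

# Two rows taken from the report, plus one hypothetical type seen only in run A.
conn.executemany("INSERT INTO millis_A VALUES (?, ?)", [
    ("application/fits", 5213),
    ("application/octet-stream", 12355),
    ("example/only-in-a", 99),  # hypothetical, not in the report
])
conn.executemany("INSERT INTO detected_types_A VALUES (?, ?)", [
    ("application/fits", 86),
    ("application/octet-stream", 391),
    ("example/only-in-a", 1),  # hypothetical, not in the report
])
conn.executemany("INSERT INTO millis_B VALUES (?, ?)", [
    ("application/fits", 2893),
    ("application/octet-stream", 6773),
])
conn.executemany("INSERT INTO detected_types_B VALUES (?, ?)", [
    ("application/fits", 86),
    ("application/octet-stream", 379),
])

QUERY = """
select millis_A.DETECTED_CONTENT_TYPE_A, millis_A.ELAPSED_TIME_MILLIS_A,
       detected_types_A.NUM_FILES, millis_B.ELAPSED_TIME_MILLIS_B,
       detected_types_B.NUM_FILES
from millis_A
join millis_B on millis_A.DETECTED_CONTENT_TYPE_A = millis_B.DETECTED_CONTENT_TYPE_B
join detected_types_A on millis_B.DETECTED_CONTENT_TYPE_B = detected_types_A.DETECTED_CONTENT_TYPE_A
join detected_types_B on millis_B.DETECTED_CONTENT_TYPE_B = detected_types_B.DETECTED_CONTENT_TYPE_B
"""

# Sort in Python because the query itself has no ORDER BY clause.
result = sorted(conn.execute(QUERY).fetchall())
for row in result:
    print(row)
```

Because every join is an inner join on the exact content-type string, a type that appears in only one of the two runs is dropped from the result, which is why the hypothetical `example/only-in-a` row does not come back.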