Location of text file containing seed URLs.
c:/temp/seeds.txt
Number of fetcher threads. Equivalent to Nutch's fetcher.threads.fetch.
10
The maximum number of pipelined HTTP GETs to perform per HTTP connection. Defaults to 5.
5
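A minimal sketch of what a per-connection pipelining limit implies, assuming the fetcher simply groups the URLs queued for one host into batches of at most that size. The function name `pipeline_batches` and the batching strategy are hypothetical and are not taken from the crawler itself.

```python
from typing import Iterator, List, Sequence


def pipeline_batches(urls: Sequence[str], max_pipelined: int = 5) -> Iterator[List[str]]:
    """Yield groups of URLs, each small enough to send over one connection.

    Hypothetical helper: the real fetcher may schedule requests differently.
    """
    if max_pipelined < 1:
        raise ValueError("max_pipelined must be at least 1")
    for start in range(0, len(urls), max_pipelined):
        # Each slice holds at most max_pipelined URLs.
        yield list(urls[start:start + max_pipelined])


paths = [f"/page{i}" for i in range(12)]
for batch in pipeline_batches(paths):
    print(len(batch), batch[0])
```

With 12 queued URLs and the limit of 5, this produces batches of 5, 5 and 2.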
If true, when files/directories exist, they will be overwritten.
true
Segment directory.
c:/temp/sitesearch/segment
c:/temp/sitesearch/contentseen
If true, when files/directories exist, they will be overwritten.
true
500
Scope determining which URLs get persisted.
Returned when no filters explicitly allow or reject.
true
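The fallback described above can be sketched as follows. This is an illustration only: it assumes each filter answers allow, reject, or "no opinion", and that the first explicit answer wins. The names `in_scope` and `reject_pdf` are hypothetical, and the crawler's actual filter ordering may differ.

```python
from typing import Callable, Iterable, Optional

# A filter returns True (allow), False (reject), or None (no opinion).
UrlFilter = Callable[[str], Optional[bool]]


def in_scope(url: str, filters: Iterable[UrlFilter], default: bool) -> bool:
    """Return the first explicit verdict, or the default if no filter decides."""
    for url_filter in filters:
        verdict = url_filter(url)
        if verdict is not None:
            return verdict
    # No filter explicitly allowed or rejected the URL.
    return default


def reject_pdf(url: str) -> Optional[bool]:
    """Hypothetical filter: reject PDFs, stay silent on everything else."""
    return False if url.endswith(".pdf") else None


print(in_scope("http://example.com/a.pdf", [reject_pdf], default=True))
print(in_scope("http://example.com/a.html", [reject_pdf], default=True))
```

With a default of true, a URL that no filter mentions is kept. With a default of false it would be dropped.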
Scope determining which URLs get added to the fetchlist.
false
20
Scope determining which URLs get parsed.
Returned when no filters explicitly allow or reject.
true
Interval between consecutive accesses to the same host = time taken to download the request * waitFactor.
0
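The interval formula can be worked through in a few lines. This is a sketch that assumes the value 0 above is the configured waitFactor and that the download time is measured in seconds. The function name `politeness_delay` is hypothetical.

```python
def politeness_delay(download_seconds: float, wait_factor: float) -> float:
    """Seconds to wait before the next request to the same host."""
    if download_seconds < 0 or wait_factor < 0:
        raise ValueError("download time and wait factor must be non-negative")
    return download_seconds * wait_factor


# A waitFactor of 0 imposes no delay, however slow the host was.
print(politeness_delay(1.5, 0))  # 0.0
# A waitFactor of 2 waits twice as long as the last download took.
print(politeness_delay(1.5, 2))  # 3.0
```

Because the delay scales with the observed download time, slow hosts are contacted less often than fast ones whenever waitFactor is greater than 0.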
c:/temp/sitesearch/fetchedurls
If true, when files/directories exist, they will be overwritten.
true
500