java.lang.Object
  org.supermind.crawl.FileCrawlSeedSource
public class FileCrawlSeedSource extends java.lang.Object implements CrawlSeedSource
Seeds a crawl from a file. This implementation loads all seeds into memory, so it is unsuitable for seed files too large to fit in memory.
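To make the memory trade-off concrete, here is a minimal sketch of the load-everything approach. It is not this class's source: `SeedFileSketch`, its `loadAll` helper, the one-URL-per-line file format, the skipping of blank lines, and the use of plain `String` in place of `SeedURL` are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, not the org.supermind.crawl implementation.
class SeedFileSketch {

    // Reads every line of the file into a list before returning.
    // Assumes one seed URL per line and skips blank lines (both assumptions).
    static List<String> loadAll(Path file) throws IOException {
        List<String> seeds = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                String trimmed = line.trim();
                if (!trimmed.isEmpty()) {
                    seeds.add(trimmed);
                }
            }
        }
        // The whole file is now held in memory, so usage grows with file size.
        return seeds;
    }
}
```

Because the list holds every entry at once, memory use grows linearly with the number of seeds. A streaming source that reads lines on demand would avoid this, at the cost of losing cheap access by index.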
Field Summary | |
---|---|
protected java.io.BufferedReader | reader |
protected java.util.ArrayList<SeedURL> | seeds |
Constructor Summary | |
---|---|
FileCrawlSeedSource(java.lang.String file) | |
Method Summary | |
---|---|
void | close() - Close resources. |
SeedURL | getSeedURL(int index) - Get seed URL corresponding to an index. |
java.util.Iterator<SeedURL> | getSeedURLs() - Get iterator of seed URLs. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected java.io.BufferedReader reader
protected java.util.ArrayList<SeedURL> seeds
Constructor Detail |
---|
public FileCrawlSeedSource(java.lang.String file) throws java.io.IOException
Throws: java.io.IOException
Method Detail |
---|
public void close() throws java.io.IOException
Description copied from interface: CrawlSeedSource
Close resources.
Specified by: close in interface CrawlSeedSource
Throws: java.io.IOException
public SeedURL getSeedURL(int index)
Description copied from interface: CrawlSeedSource
Get seed URL corresponding to an index.
Specified by: getSeedURL in interface CrawlSeedSource
public java.util.Iterator<SeedURL> getSeedURLs() throws java.io.IOException
Description copied from interface: CrawlSeedSource
Get iterator of seed URLs.
Specified by: getSeedURLs in interface CrawlSeedSource
Throws: java.io.IOException
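The three methods above suggest a calling pattern: iterate over the seeds, then release the source. The following is a sketch of that pattern only. The nested `SeedSource` interface is a stand-in that mirrors the documented signatures, with `String` replacing `SeedURL`. The `inMemory` and `collectAndClose` helpers are hypothetical, and zero-based indexing for `getSeedURL` is an assumption the documentation does not confirm.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch; none of these names come from org.supermind.crawl.
class SeedSourceUsageSketch {

    // Stand-in mirroring the three documented method signatures.
    interface SeedSource {
        void close() throws IOException;
        String getSeedURL(int index);
        Iterator<String> getSeedURLs() throws IOException;
    }

    // In-memory implementation for illustration; index is assumed zero-based.
    static SeedSource inMemory(List<String> urls) {
        final List<String> seeds = new ArrayList<>(urls);
        return new SeedSource() {
            @Override public void close() { /* nothing to release here */ }
            @Override public String getSeedURL(int index) { return seeds.get(index); }
            @Override public Iterator<String> getSeedURLs() { return seeds.iterator(); }
        };
    }

    // Drains the iterator and closes the source even if iteration fails.
    static List<String> collectAndClose(SeedSource source) throws IOException {
        List<String> out = new ArrayList<>();
        try {
            Iterator<String> it = source.getSeedURLs();
            while (it.hasNext()) {
                out.add(it.next());
            }
        } finally {
            source.close();
        }
        return out;
    }
}
```

Calling `close()` in a `finally` block matters for a file-backed source, since both `getSeedURLs()` and `close()` are declared to throw `java.io.IOException` and the underlying reader should be released either way.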