public interface FetchList
A data structure for managing the list of URLs to fetch. Roughly equivalent to Mercator's Frontier. This class is NOT thread-safe. Each FetcherThread is supposed to have its own fetchlist.
Method Summary | | |
---|---|---|
void | close() | Release resources. |
boolean | contains(java.net.URL url) | Does the fetchlist contain this url? |
int | getCurrentSize() | Total number of URLs this fetchlist currently contains. |
void | init() | Initialize resources. |
HostQueue | next() | Get next HostQueue. |
void | queue(ScheduledURL parent, java.net.URL url) | Add a ScheduledURL to the fetchlist. |
void | release(HostQueue hostQueue, int popped, long timeTaken) | Release HostQueue from use. |
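
The method names in the summary suggest a lifecycle of init(), then repeated next() and release() pairs, then close(). The following is a minimal sketch of that reading, not the crawler's actual code. ScheduledURL and HostQueue are hypothetical stand-ins, SingleHostFetchList and FetchLoopSketch are invented for illustration, and two behaviours are assumed: next() returns null when nothing is available, and pop() returns null when a host queue is empty.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in; the real ScheduledURL is not documented on this page.
class ScheduledURL {
    final URL url;
    ScheduledURL(URL url) { this.url = url; }
}

// Hypothetical stand-in; only pop() is mentioned on this page, and its
// return type (ScheduledURL, null when empty) is an assumption.
class HostQueue {
    private final Deque<ScheduledURL> urls = new ArrayDeque<>();
    void add(ScheduledURL u) { urls.add(u); }
    ScheduledURL pop() { return urls.poll(); }
}

// Signatures as listed in the method summary.
interface FetchList {
    void close();
    boolean contains(URL url);
    int getCurrentSize();
    void init();
    HostQueue next();
    void queue(ScheduledURL parent, URL url);
    void release(HostQueue hostQueue, int popped, long timeTaken);
}

// Toy in-memory implementation with one host queue, just enough to drive the loop.
class SingleHostFetchList implements FetchList {
    // String keys avoid URL.equals(), which may resolve host names.
    private final Set<String> known = new HashSet<>();
    private HostQueue hostQueue;
    private int size;
    private boolean inUse;

    public void init() { hostQueue = new HostQueue(); size = 0; inUse = false; }
    public void close() { hostQueue = null; }
    public boolean contains(URL url) { return known.contains(url.toExternalForm()); }
    public int getCurrentSize() { return size; }

    // Assumption: null means "nothing available right now".
    public HostQueue next() {
        if (inUse || size == 0) return null;
        inUse = true;
        return hostQueue;
    }

    public void queue(ScheduledURL parent, URL url) {
        known.add(url.toExternalForm());
        hostQueue.add(new ScheduledURL(url));
        size++;
    }

    // Assumption: the size is adjusted when the host queue is handed back.
    public void release(HostQueue hq, int popped, long timeTaken) {
        size -= popped;
        inUse = false;
    }
}

class FetchLoopSketch {
    static URL url(String s) {
        try { return URI.create(s).toURL(); }
        catch (MalformedURLException e) { throw new IllegalArgumentException(e); }
    }

    // Takes host queues until none is available, releasing each one after popping.
    static int drain(FetchList list) {
        int fetched = 0;
        for (HostQueue hq = list.next(); hq != null; hq = list.next()) {
            long start = System.currentTimeMillis();
            int popped = 0;
            while (hq.pop() != null) popped++;  // a real fetcher would download here
            list.release(hq, popped, System.currentTimeMillis() - start);
            fetched += popped;
        }
        return fetched;
    }

    static int demo() {
        FetchList list = new SingleHostFetchList();
        list.init();
        try {
            list.queue(null, url("http://example.com/a"));
            list.queue(null, url("http://example.com/b"));
            return drain(list);
        } finally {
            list.close();
        }
    }
}
```

Calling FetchLoopSketch.demo() queues two URLs and drains them in a single next()/release() round. The try/finally keeps close() paired with init() even if a fetch fails.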
Method Detail |
---|
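void close()
Release resources.

boolean contains(java.net.URL url)
Does the fetchlist contain this url?
Parameters:
url -

int getCurrentSize()
Total number of URLs this fetchlist currently contains. ... release(org.supermind.crawl.HostQueue, int, long) ...d.

void init()
Initialize resources.

HostQueue next()
Get next HostQueue.
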
void queue(ScheduledURL parent, java.net.URL url)
Add a ScheduledURL to the fetchlist. Multiple threads may call this method, so implementing classes must synchronize access accordingly.
Parameters:
parent - originating url
url - url to queue

void release(HostQueue hostQueue, int popped, long timeTaken)
Release HostQueue from use. ... HostQueue.pop() completes.
Parameters:
hostQueue -
popped - number of URLs popped from the queue
timeTaken - total time taken to download the popped urls
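
The note on queue() says that several threads may call it and that implementations must synchronize access. One way to meet that requirement is sketched below, under stated assumptions. SynchronizedQueueSketch is a hypothetical class, not part of the crawler, and it mirrors only queue(), contains() and getCurrentSize(). The parent argument is typed as Object and ignored so that the example stays self-contained.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: guards the shared collections with one private lock.
class SynchronizedQueueSketch {
    private final Object lock = new Object();
    private final Deque<String> pending = new ArrayDeque<>();
    // String keys avoid URL.equals(), which may resolve host names.
    private final Set<String> known = new HashSet<>();

    // Mirrors queue(parent, url); the parent is ignored in this sketch.
    void queue(Object parent, URL url) {
        String key = url.toExternalForm();
        synchronized (lock) {
            known.add(key);
            pending.add(key);
        }
    }

    boolean contains(URL url) {
        synchronized (lock) {
            return known.contains(url.toExternalForm());
        }
    }

    int getCurrentSize() {
        synchronized (lock) {
            return pending.size();
        }
    }

    static URL url(String s) {
        try { return URI.create(s).toURL(); }
        catch (MalformedURLException e) { throw new IllegalArgumentException(e); }
    }

    // Four threads each queue 250 distinct URLs into the same list.
    static int demo() {
        SynchronizedQueueSketch list = new SynchronizedQueueSketch();
        Thread[] workers = new Thread[4];
        for (int t = 0; t < workers.length; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < 250; i++) {
                    list.queue(null, url("http://example.com/" + id + "/" + i));
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return list.getCurrentSize();
    }
}
```

A private lock object is used instead of synchronized methods so that outside code cannot contend on the same monitor. Either approach would satisfy the requirement as stated.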