org.supermind.crawl
Class InMemoryFetchedURLs

java.lang.Object
  extended by org.supermind.crawl.InMemoryFetchedURLs
All Implemented Interfaces:
FetchedURLs

public class InMemoryFetchedURLs
extends java.lang.Object
implements FetchedURLs

Stores fetched URLs in an in-memory HashSet. Not recommended for large crawls, since every fetched URL is held in memory.
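As a rough illustration of the HashSet-backed approach described above, the following is a minimal sketch and not the actual source of this class. The class name SketchFetchedUrls, the size() helper, and the choice to compare a java.net.URL by its string form are all assumptions made for the example.

```java
import java.net.URL;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in, NOT the org.supermind.crawl implementation.
class SketchFetchedUrls {
    // Every fetched URL is kept in memory as a string, so memory use
    // grows with the number of distinct URLs in the crawl.
    private final Set<String> fetched = new HashSet<>();

    // Record a URL as fetched; inserting the same URL twice has no extra effect.
    void insert(String url) {
        fetched.add(url);
    }

    // Has this URL string already been recorded?
    boolean contains(String url) {
        return fetched.contains(url);
    }

    // Assumption: a java.net.URL is looked up by its string form.
    boolean contains(URL url) {
        return fetched.contains(url.toString());
    }

    // Helper for the example only: number of distinct URLs stored.
    int size() {
        return fetched.size();
    }

    public static void main(String[] args) {
        SketchFetchedUrls store = new SketchFetchedUrls();
        store.insert("http://example.com/a");
        System.out.println(store.contains("http://example.com/a"));
        System.out.println(store.contains("http://example.com/b"));
    }
}
```

Because the set only grows, a crawl that visits many distinct URLs will keep all of them on the heap, which is the likely reason for the caution about large crawls.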


Field Summary
 
Fields inherited from interface org.supermind.crawl.FetchedURLs
LOG
 
Constructor Summary
InMemoryFetchedURLs()
           
 
Method Summary
 void close()
           
 boolean contains(java.lang.String url)
           
 boolean contains(java.net.URL url)
          Has the URL already been fetched?
 ScheduledURL get(long id)
          Get a persisted URL.
 void init()
           
 void insert(ScheduledURL url, org.apache.nutch.protocol.ProtocolOutput output)
          Insert a fetched URL.
 void insert(java.lang.String url)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

InMemoryFetchedURLs

public InMemoryFetchedURLs()
Method Detail

close

public void close()
           throws java.io.IOException
Specified by:
close in interface FetchedURLs
Throws:
java.io.IOException

contains

public boolean contains(java.lang.String url)

contains

public boolean contains(java.net.URL url)
Description copied from interface: FetchedURLs
Has the URL already been fetched?

Specified by:
contains in interface FetchedURLs
Returns:
true if the URL has already been fetched, false otherwise

get

public ScheduledURL get(long id)
Description copied from interface: FetchedURLs
Get a persisted URL. (optional operation)

Specified by:
get in interface FetchedURLs
Parameters:
id - ScheduledURL's id
Returns:
the ScheduledURL, or null if it does not exist
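Since get is documented as an optional operation that may return null, callers should check the result before using it. The sketch below shows one way to make that explicit on the caller side. SketchLookup, put, and find are hypothetical names, and a plain String stands in for ScheduledURL so the example is self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for a store whose get(long) may return null.
// A String is used in place of ScheduledURL for illustration only.
class SketchLookup {
    private final Map<Long, String> byId = new HashMap<>();

    // Example-only helper to populate the store.
    void put(long id, String url) {
        byId.put(id, url);
    }

    // Mirrors the documented contract: the value, or null if it does not exist.
    String get(long id) {
        return byId.get(id);
    }

    // Caller-side wrapper that makes the "may be absent" case explicit.
    Optional<String> find(long id) {
        return Optional.ofNullable(get(id));
    }

    public static void main(String[] args) {
        SketchLookup store = new SketchLookup();
        store.put(1L, "http://example.com/a");
        System.out.println(store.find(1L).orElse("(not found)"));
        System.out.println(store.find(2L).orElse("(not found)"));
    }
}
```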

init

public void init()
          throws java.io.IOException
Specified by:
init in interface FetchedURLs
Throws:
java.io.IOException

insert

public void insert(ScheduledURL url,
                   org.apache.nutch.protocol.ProtocolOutput output)
Description copied from interface: FetchedURLs
Insert a fetched URL.

Specified by:
insert in interface FetchedURLs
Parameters:
url - the ScheduledURL that was fetched
output - the protocol output of the fetch

insert

public void insert(java.lang.String url)