org.supermind.crawl.http
Class Http

java.lang.Object
  extended by org.supermind.crawl.http.Http
All Implemented Interfaces:
org.apache.nutch.protocol.Protocol

public class Http
extends java.lang.Object
implements org.apache.nutch.protocol.Protocol

Takes advantage of HTTP/1.1 features such as persistent connections and request pipelining to improve performance when multiple URLs must be fetched from the same host.
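
To illustrate what request pipelining means on the wire, the following is a minimal sketch and not this class's actual implementation. It builds several HTTP/1.1 GET requests back to back so they could be written to one persistent connection before any response is read. The class name `PipelineSketch`, the method `buildPipelinedRequests`, and the host `example.com` are assumptions introduced for illustration only.

```java
import java.util.Arrays;
import java.util.List;

public class PipelineSketch {

    /**
     * Hypothetical helper: concatenates one GET request per path into a
     * single buffer. HTTP/1.1 connections are persistent by default, so
     * no Connection header is needed to keep the socket open.
     */
    static String buildPipelinedRequests(String host, List<String> paths) {
        StringBuilder sb = new StringBuilder();
        for (String path : paths) {
            sb.append("GET ").append(path).append(" HTTP/1.1\r\n")
              .append("Host: ").append(host).append("\r\n")
              .append("\r\n"); // blank line terminates each request
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two requests for the same (assumed) host, ready to be sent together.
        String requests = buildPipelinedRequests(
                "example.com", Arrays.asList("/a.html", "/b.html"));
        System.out.print(requests);
    }
}
```

A pipelining client would write this buffer to a single socket and then read the responses in the same order the requests were sent.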


Field Summary
(package private) static java.lang.String AGENT_STRING
           
(package private) static int BUFFER_SIZE
           
static java.util.logging.Logger LOG
           
(package private) static int MAX_CONTENT
           
(package private) static int MAX_DELAYS
           
(package private) static int MAX_REDIRECTS
           
(package private) static boolean PROXY
           
(package private) static java.lang.String PROXY_HOST
           
(package private) static int PROXY_PORT
           
(package private) static long SERVER_DELAY
           
(package private) static int TIMEOUT
           
 
Fields inherited from interface org.apache.nutch.protocol.Protocol
X_POINT_ID
 
Constructor Summary
Http()
           
 
Method Summary
 org.apache.nutch.protocol.ProtocolOutput getProtocolOutput(org.apache.nutch.pagedb.FetchListEntry fle)
           
 org.apache.nutch.protocol.ProtocolOutput[] getProtocolOutput(ScheduledURL[] pages)
           
 org.apache.nutch.protocol.ProtocolOutput getProtocolOutput(java.lang.String urlString)
           
 org.apache.nutch.protocol.ProtocolOutput[] getProtocolOutput(java.lang.String[] urlstrings)
           
static void main(java.lang.String[] args)
          For debugging.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

AGENT_STRING

static java.lang.String AGENT_STRING

BUFFER_SIZE

static final int BUFFER_SIZE
See Also:
Constant Field Values

LOG

public static final java.util.logging.Logger LOG

MAX_CONTENT

static int MAX_CONTENT

MAX_DELAYS

static int MAX_DELAYS

MAX_REDIRECTS

static final int MAX_REDIRECTS

PROXY

static boolean PROXY

PROXY_HOST

static java.lang.String PROXY_HOST

PROXY_PORT

static int PROXY_PORT

SERVER_DELAY

static long SERVER_DELAY

TIMEOUT

static int TIMEOUT

Constructor Detail

Http

public Http()

Method Detail

getProtocolOutput

public org.apache.nutch.protocol.ProtocolOutput getProtocolOutput(org.apache.nutch.pagedb.FetchListEntry fle)
Specified by:
getProtocolOutput in interface org.apache.nutch.protocol.Protocol

getProtocolOutput

public org.apache.nutch.protocol.ProtocolOutput[] getProtocolOutput(ScheduledURL[] pages)

getProtocolOutput

public org.apache.nutch.protocol.ProtocolOutput getProtocolOutput(java.lang.String urlString)
Specified by:
getProtocolOutput in interface org.apache.nutch.protocol.Protocol

getProtocolOutput

public org.apache.nutch.protocol.ProtocolOutput[] getProtocolOutput(java.lang.String[] urlstrings)
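
Since the class description ties the performance benefit to fetching several URLs from the same host, a caller might group URLs by host before passing an array to this overload. The sketch below shows one way to do that grouping. It is an assumption about usage, not documented behavior of this method, and the names `HostGrouping` and `groupByHost` are hypothetical.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class HostGrouping {

    /**
     * Hypothetical helper: groups URL strings by lower-cased host,
     * preserving the order in which hosts first appear.
     */
    static Map<String, List<String>> groupByHost(String[] urls) {
        Map<String, List<String>> byHost = new LinkedHashMap<>();
        for (String url : urls) {
            String host = URI.create(url).getHost();
            // URLs without a parsable host are collected under the empty key.
            String key = (host == null) ? "" : host.toLowerCase(Locale.ROOT);
            byHost.computeIfAbsent(key, k -> new ArrayList<>()).add(url);
        }
        return byHost;
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://a.example/1", "http://b.example/x", "http://a.example/2"
        };
        // Each value could then be converted to a String[] for a batch fetch.
        System.out.println(groupByHost(urls));
    }
}
```

Each resulting list could be converted with `toArray(new String[0])` and passed as the `urlstrings` argument.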

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
For debugging.

Throws:
java.lang.Exception