org.supermind.crawl.http
Class Http
java.lang.Object
org.supermind.crawl.http.Http
- All Implemented Interfaces:
- org.apache.nutch.protocol.Protocol
public class Http
- extends java.lang.Object
- implements org.apache.nutch.protocol.Protocol
Takes advantage of HTTP/1.1 features such as persistent connections and
request pipelining to improve performance when multiple URLs must be
fetched from the same host.
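To illustrate what request pipelining over a persistent connection looks like on the wire, here is a minimal sketch. It is not taken from this class: `PipelineSketch` and `buildPipelinedRequests` are hypothetical names, and the sketch only builds the request text that a client could write to a single socket before reading any response. It assumes plain GET requests with no bodies.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of org.supermind.crawl.http. */
final class PipelineSketch {

    /**
     * Builds one HTTP/1.1 GET request per path for a single host and
     * concatenates them, so they can be sent back to back on one connection.
     */
    static String buildPipelinedRequests(String host, List<String> paths) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < paths.size(); i++) {
            boolean last = (i == paths.size() - 1);
            sb.append("GET ").append(paths.get(i)).append(" HTTP/1.1\r\n");
            // The Host header is mandatory in HTTP/1.1.
            sb.append("Host: ").append(host).append("\r\n");
            // Connections are persistent by default in HTTP/1.1, so only the
            // final request asks the server to close the connection.
            if (last) {
                sb.append("Connection: close\r\n");
            }
            // A blank line terminates each request's header section.
            sb.append("\r\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(
            buildPipelinedRequests("example.com", Arrays.asList("/a.html", "/b.html")));
    }
}
```

A server that supports pipelining returns the responses in the same order as the requests, which is why the saving applies only to URLs on the same host.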
Fields inherited from interface org.apache.nutch.protocol.Protocol
X_POINT_ID
Constructor Summary
Http()
Method Summary
org.apache.nutch.protocol.ProtocolOutput getProtocolOutput(org.apache.nutch.pagedb.FetchListEntry fle)
org.apache.nutch.protocol.ProtocolOutput[] getProtocolOutput(ScheduledURL[] pages)
org.apache.nutch.protocol.ProtocolOutput getProtocolOutput(java.lang.String urlString)
org.apache.nutch.protocol.ProtocolOutput[] getProtocolOutput(java.lang.String[] urlstrings)
static void main(java.lang.String[] args)
- For debugging.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Field Detail
AGENT_STRING
static java.lang.String AGENT_STRING
BUFFER_SIZE
static final int BUFFER_SIZE
- See Also:
- Constant Field Values
LOG
public static final java.util.logging.Logger LOG
MAX_CONTENT
static int MAX_CONTENT
MAX_DELAYS
static int MAX_DELAYS
MAX_REDIRECTS
static final int MAX_REDIRECTS
PROXY
static boolean PROXY
PROXY_HOST
static java.lang.String PROXY_HOST
PROXY_PORT
static int PROXY_PORT
SERVER_DELAY
static long SERVER_DELAY
TIMEOUT
static int TIMEOUT
Constructor Detail
Http
public Http()
Method Detail
getProtocolOutput
public org.apache.nutch.protocol.ProtocolOutput getProtocolOutput(org.apache.nutch.pagedb.FetchListEntry fle)
- Specified by:
getProtocolOutput
in interface org.apache.nutch.protocol.Protocol
getProtocolOutput
public org.apache.nutch.protocol.ProtocolOutput[] getProtocolOutput(ScheduledURL[] pages)
getProtocolOutput
public org.apache.nutch.protocol.ProtocolOutput getProtocolOutput(java.lang.String urlString)
- Specified by:
getProtocolOutput
in interface org.apache.nutch.protocol.Protocol
getProtocolOutput
public org.apache.nutch.protocol.ProtocolOutput[] getProtocolOutput(java.lang.String[] urlstrings)
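Because the class description says the benefit comes from fetching several URLs from the same host, a caller may want to batch URLs by host before using the array overload. The sketch below is one way to do that under stated assumptions: `HostGrouping` and `groupByHost` are hypothetical names, and this page does not say whether `Http` already groups or reorders its input internally.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper for illustration; not part of org.supermind.crawl.http. */
final class HostGrouping {

    /**
     * Groups URL strings by lower-cased host name, preserving the order in
     * which hosts and URLs were first seen. URLs without a host are skipped;
     * malformed URLs make URI.create throw IllegalArgumentException.
     */
    static Map<String, List<String>> groupByHost(String[] urls) {
        Map<String, List<String>> byHost = new LinkedHashMap<>();
        for (String url : urls) {
            String host = URI.create(url).getHost();
            if (host == null) {
                continue; // e.g. a relative reference such as "/index.html"
            }
            // Host names are case-insensitive, so normalise before grouping.
            String key = host.toLowerCase(Locale.ROOT);
            byHost.computeIfAbsent(key, k -> new ArrayList<>()).add(url);
        }
        return byHost;
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://a.example/1", "http://b.example/x", "http://A.example/2"
        };
        for (Map.Entry<String, List<String>> e : groupByHost(urls).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Each group can then be converted with `toArray(new String[0])` and passed to `getProtocolOutput(java.lang.String[] urlstrings)`, so that every call carries URLs for a single host.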
main
public static void main(java.lang.String[] args)
throws java.lang.Exception
- For debugging.
- Throws:
java.lang.Exception