org.supermind.crawl
Class FetcherOutput
java.lang.Object
org.supermind.crawl.FetcherOutput
- All Implemented Interfaces:
- org.apache.nutch.io.Writable
public final class FetcherOutput
- extends java.lang.Object
- implements org.apache.nutch.io.Writable
An entry in the fetcher's output. This includes all of the fetcher output
except the raw and stripped versions of the content, which are placed in
separate files.
Note by John Xing: As of 20041022, the -noParsing option is available
in Fetcher.java. It changes fetcher behavior, and this class was
modified accordingly.
See Fetcher.java and ParseSegment.java for details.
Methods inherited from class java.lang.Object
clone, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
DIR_NAME
public static final java.lang.String DIR_NAME
- See Also:
- Constant Field Values
DIR_NAME_NP
public static final java.lang.String DIR_NAME_NP
- See Also:
- Constant Field Values
DONE_NAME
public static final java.lang.String DONE_NAME
- See Also:
- Constant Field Values
ERROR_NAME
public static final java.lang.String ERROR_NAME
- See Also:
- Constant Field Values
FetcherOutput
public FetcherOutput()
FetcherOutput
public FetcherOutput(ScheduledURL scheduledURL,
org.apache.nutch.io.MD5Hash md5Hash,
org.apache.nutch.protocol.ProtocolStatus protocolStatus)
equals
public boolean equals(java.lang.Object o)
- Overrides:
equals
in class java.lang.Object
getFetchDate
public long getFetchDate()
getMD5Hash
public org.apache.nutch.io.MD5Hash getMD5Hash()
getProtocolStatus
public org.apache.nutch.protocol.ProtocolStatus getProtocolStatus()
getScheduledURL
public ScheduledURL getScheduledURL()
getUrl
public java.lang.String getUrl()
getVersion
public byte getVersion()
read
public static FetcherOutput read(java.io.DataInput in)
throws java.io.IOException
- Throws:
java.io.IOException
readFields
public final void readFields(java.io.DataInput in)
throws java.io.IOException
- Specified by:
readFields
in interface org.apache.nutch.io.Writable
- Throws:
java.io.IOException
setFetchDate
public void setFetchDate(long fetchDate)
setProtocolStatus
public void setProtocolStatus(org.apache.nutch.protocol.ProtocolStatus protocolStatus)
toString
public java.lang.String toString()
- Overrides:
toString
in class java.lang.Object
write
public final void write(java.io.DataOutput out)
throws java.io.IOException
- Specified by:
write
in interface org.apache.nutch.io.Writable
- Throws:
java.io.IOException
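Example (illustrative only): the write, readFields, and static read methods above follow the usual Writable serialization pattern. The sketch below shows that pattern as a round trip through a byte buffer. It does not use FetcherOutput itself: WritableSketch and FetchRecordSketch are hypothetical stand-ins declared locally so the code compiles with only the JDK, and the field set, version byte, and wire layout are assumptions rather than FetcherOutput's actual format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for org.apache.nutch.io.Writable, declared locally
// so the sketch compiles without Nutch on the classpath.
interface WritableSketch {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical record. The fields and their order on the wire are
// assumptions for illustration, not FetcherOutput's real layout.
final class FetchRecordSketch implements WritableSketch {
    private static final byte VERSION = 1; // assumed version byte

    private String url = "";
    private long fetchDate;

    // No-arg constructor so read() can create an empty instance to fill.
    FetchRecordSketch() {}

    FetchRecordSketch(String url, long fetchDate) {
        this.url = url;
        this.fetchDate = fetchDate;
    }

    String getUrl() { return url; }

    long getFetchDate() { return fetchDate; }

    // Serialize the fields in a fixed order, version byte first.
    @Override
    public void write(DataOutput out) throws IOException {
        out.writeByte(VERSION);
        out.writeUTF(url);
        out.writeLong(fetchDate);
    }

    // Read the fields back in exactly the order write() emitted them.
    @Override
    public void readFields(DataInput in) throws IOException {
        byte version = in.readByte();
        if (version != VERSION) {
            throw new IOException("unexpected version: " + version);
        }
        url = in.readUTF();
        fetchDate = in.readLong();
    }

    // Static factory in the style of read(DataInput): construct, then fill.
    static FetchRecordSketch read(DataInput in) throws IOException {
        FetchRecordSketch record = new FetchRecordSketch();
        record.readFields(in);
        return record;
    }

    // Convenience helper: serialize a record to an in-memory byte array.
    static byte[] toBytes(FetchRecordSketch record) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            record.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Convenience helper: rebuild a record from bytes produced by toBytes().
    static FetchRecordSketch fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Arbitrary example values.
        FetchRecordSketch original =
            new FetchRecordSketch("http://example.com/", 1098403200000L);
        FetchRecordSketch copy = fromBytes(toBytes(original));
        System.out.println(copy.getUrl() + " " + copy.getFetchDate());
    }
}
```

The point of the sketch is that readFields must consume fields in the same order and with the same types that write produced them, and that a static read helper is simply a no-argument construction followed by readFields. Whether FetcherOutput follows this layout in detail should be checked against its source.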