java.lang.Object
  org.supermind.crawl.scope.OneExternalLinkFLFilter
public class OneExternalLinkFLFilter extends Object implements ScopeFilter<FetchListScope.Input>
Allows a URL if its parent page has the same host as its seed. In effect, this allows every URL on the seed's host, plus URLs that are one link away from that host.
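The rule above can be sketched in standalone Java. This is only an illustration of the host comparison, not the library's implementation: the class name `OneExternalLinkSketch`, the `decide` method, and the numeric constant values are hypothetical, and returning `REJECT` (rather than `ABSTAIN`) when the hosts differ is an assumption that this page does not confirm.

```java
import java.net.URI;

/** Standalone sketch of the "one external link" rule; not the library's code. */
final class OneExternalLinkSketch {
    // Hypothetical stand-ins for the ScopeFilter constants; the real values are not listed on this page.
    static final int ABSTAIN = 0;
    static final int ALLOW = 1;
    static final int REJECT = -1;

    /**
     * Allows a URL when the page that linked to it (its parent) is on the seed's host.
     * The candidate URL itself is never inspected, which is why a single off-host link gets through.
     */
    static int decide(String parentUrl, String seedUrl) {
        String parentHost = URI.create(parentUrl).getHost();
        String seedHost = URI.create(seedUrl).getHost();
        if (parentHost == null || seedHost == null) {
            return ABSTAIN; // assumption: no opinion when a host cannot be parsed
        }
        // Assumption: a parent on a different host leads to REJECT.
        return parentHost.equalsIgnoreCase(seedHost) ? ALLOW : REJECT;
    }

    public static void main(String[] args) {
        String seed = "http://example.com/";
        // A link discovered on a seed-host page: the parent is on the seed's host.
        System.out.println(decide("http://example.com/a.html", seed));
        // A link discovered on an off-host page: the parent is not on the seed's host.
        System.out.println(decide("http://other.org/page.html", seed));
    }
}
```

Because only the parent's host is compared, a link from `example.com` to any other site passes, while links found on that other site do not; this is the "one link outside" behaviour described above.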
Field Summary |
---|
Fields inherited from interface org.supermind.crawl.scope.ScopeFilter |
---|
ABSTAIN, ALLOW, REJECT |
Constructor Summary | |
---|---|
OneExternalLinkFLFilter() | |
Method Summary | |
---|---|
int | filter(FetchListScope.Input input): Filter the input. |
void | setSeedSource(CrawlSeedSource seedSource): Set CrawlSeedSource. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public OneExternalLinkFLFilter()
Method Detail |
---|
public int filter(FetchListScope.Input input)
Description copied from interface: ScopeFilter
Filter the input. Possible return values are ScopeFilter.ALLOW, ScopeFilter.REJECT and ScopeFilter.ABSTAIN.
Specified by:
filter in interface ScopeFilter<FetchListScope.Input>
public void setSeedSource(CrawlSeedSource seedSource)
Set CrawlSeedSource. Note: the CrawlSeedSource implementation must support random access by seed id.
Parameters:
seedSource - the CrawlSeedSource to use
See Also:
CrawlSeedSource.getSeedURL(int)
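To illustrate what "random access by seed id" might look like, here is a minimal list-backed sketch. It does not implement the real CrawlSeedSource interface, whose full contract is not shown on this page; the class `IndexedSeedsSketch` and its methods `add` and `seedUrl` are hypothetical names, and treating a seed id as a zero-based list index is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical seed store with constant-time lookup by id; not the library's CrawlSeedSource. */
final class IndexedSeedsSketch {
    private final List<String> seeds = new ArrayList<>();

    /** Registers a seed URL and returns its id (assumed here to be its list index). */
    int add(String url) {
        seeds.add(url);
        return seeds.size() - 1;
    }

    /** Looks up a seed URL by id without scanning the other seeds. */
    String seedUrl(int seedId) {
        if (seedId < 0 || seedId >= seeds.size()) {
            throw new IllegalArgumentException("Unknown seed id: " + seedId);
        }
        return seeds.get(seedId);
    }

    public static void main(String[] args) {
        IndexedSeedsSketch source = new IndexedSeedsSketch();
        int first = source.add("http://example.com/");
        int second = source.add("http://example.org/");
        // Either seed can be fetched directly by id, in any order.
        System.out.println(source.seedUrl(second));
        System.out.println(source.seedUrl(first));
    }
}
```

A filter that compares each parent's host with its seed's host needs this kind of lookup once per input, which is presumably why a source that can only be read sequentially would not be suitable.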