public class GenericSpoiledReferenceStrategizer extends Object implements ISpoiledReferenceStrategizer, IXMLConfigurable
Generic implementation of ISpoiledReferenceStrategizer that offers a simple mapping between the crawl state of references that have turned "bad" and the strategy to adopt for each.
Whenever a crawl state does not have a strategy associated, the fall-back strategy is used (default being DELETE).
The mappings defined by default are as follows:
Crawl state | Strategy |
---|---|
NOT_FOUND | DELETE |
BAD_STATUS | GRACE_ONCE |
ERROR | GRACE_ONCE |
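The lookup described above (explicit mapping first, fall-back strategy otherwise) can be sketched as follows. This is a minimal, self-contained illustration and not the library's implementation: the class `SpoiledStrategySketch`, its nested `Strategy` enum, and the use of plain strings for crawl states are hypothetical stand-ins for `GenericSpoiledReferenceStrategizer`, `SpoiledReferenceStrategy`, and `CrawlState`. The state name `UNMAPPED_STATE` is also made up for the demonstration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the strategizer; NOT the library class.
class SpoiledStrategySketch {

    // Hypothetical stand-in for SpoiledReferenceStrategy.
    public enum Strategy { DELETE, GRACE_ONCE, IGNORE }

    // Crawl states are modeled as plain strings in this sketch.
    private final Map<String, Strategy> mappings = new HashMap<>();

    // Fall-back strategy, DELETE by default as documented above.
    private Strategy fallbackStrategy = Strategy.DELETE;

    public SpoiledStrategySketch() {
        // Default mappings listed in the table above.
        mappings.put("NOT_FOUND", Strategy.DELETE);
        mappings.put("BAD_STATUS", Strategy.GRACE_ONCE);
        mappings.put("ERROR", Strategy.GRACE_ONCE);
    }

    public void addMapping(String state, Strategy strategy) {
        mappings.put(state, strategy);
    }

    public void setFallbackStrategy(Strategy fallbackStrategy) {
        this.fallbackStrategy = fallbackStrategy;
    }

    // Returns the mapped strategy, or the fall-back when the state is unmapped.
    public Strategy resolve(String state) {
        return mappings.getOrDefault(state, fallbackStrategy);
    }

    public static void main(String[] args) {
        SpoiledStrategySketch sketch = new SpoiledStrategySketch();
        System.out.println(sketch.resolve("BAD_STATUS"));
        // "UNMAPPED_STATE" has no mapping, so the fall-back applies.
        System.out.println(sketch.resolve("UNMAPPED_STATE"));
    }
}
```

In this sketch, adding a mapping for a state that already has one simply overrides it; whether the library behaves identically is an assumption.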
<spoiledReferenceStrategizer
    class="com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer"
    fallbackStrategy="[DELETE|GRACE_ONCE|IGNORE]">
  <mapping
      state="(any crawl state)"
      strategy="[DELETE|GRACE_ONCE|IGNORE]"/>
  (repeat mapping tag as needed)
</spoiledReferenceStrategizer>
<spoiledReferenceStrategizer
    class="GenericSpoiledReferenceStrategizer">
  <mapping
      state="NOT_FOUND"
      strategy="DELETE"/>
  <mapping
      state="BAD_STATUS"
      strategy="DELETE"/>
  <mapping
      state="ERROR"
      strategy="IGNORE"/>
</spoiledReferenceStrategizer>
The above example ignores (does nothing about) errors that occur while processing documents, and sends a deletion request for documents that are not found or that resulted in a bad status.
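To make the shape of this configuration concrete, the sketch below reads the `state` and `strategy` attributes of each `<mapping>` element into a map using only the JDK's DOM parser. It is an illustration under stated assumptions, not how `loadFromXML(XML xml)` is implemented: the class `MappingXmlSketch` and the method `readMappings` are hypothetical names, and values are kept as plain strings rather than `CrawlState` and `SpoiledReferenceStrategy` instances.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper; NOT part of the library.
class MappingXmlSketch {

    // Returns state -> strategy pairs in document order.
    public static Map<String, String> readMappings(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> mappings = new LinkedHashMap<>();
            NodeList nodes = doc.getElementsByTagName("mapping");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element mapping = (Element) nodes.item(i);
                mappings.put(mapping.getAttribute("state"),
                        mapping.getAttribute("strategy"));
            }
            return mappings;
        } catch (Exception e) {
            // Wrap parser errors so callers need no checked exceptions.
            throw new IllegalArgumentException("Invalid configuration XML", e);
        }
    }

    public static void main(String[] args) {
        // Same mappings as the example above (single quotes avoid escaping).
        String xml = "<spoiledReferenceStrategizer"
                + " class='GenericSpoiledReferenceStrategizer'>"
                + "<mapping state='NOT_FOUND' strategy='DELETE'/>"
                + "<mapping state='BAD_STATUS' strategy='DELETE'/>"
                + "<mapping state='ERROR' strategy='IGNORE'/>"
                + "</spoiledReferenceStrategizer>";
        System.out.println(readMappings(xml));
    }
}
```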
Modifier and Type | Field and Description |
---|---|
static SpoiledReferenceStrategy | DEFAULT_FALLBACK_STRATEGY |
Constructor and Description |
---|
GenericSpoiledReferenceStrategizer() |
Modifier and Type | Method and Description |
---|---|
void | addMapping(CrawlState state, SpoiledReferenceStrategy strategy) |
boolean | equals(Object other) |
SpoiledReferenceStrategy | getFallbackStrategy() |
int | hashCode() |
void | loadFromXML(XML xml) |
SpoiledReferenceStrategy | resolveSpoiledReferenceStrategy(String reference, CrawlState state) Establish which spoiled reference strategy to adopt. |
void | saveToXML(XML xml) |
void | setFallbackStrategy(SpoiledReferenceStrategy fallbackStrategy) |
String | toString() |
public static final SpoiledReferenceStrategy DEFAULT_FALLBACK_STRATEGY
public SpoiledReferenceStrategy resolveSpoiledReferenceStrategy(String reference, CrawlState state)
Description copied from interface: ISpoiledReferenceStrategizer
Establish which spoiled reference strategy to adopt.
Specified by: resolveSpoiledReferenceStrategy in interface ISpoiledReferenceStrategizer
Parameters:
reference - a document reference
state - the reference crawl state to evaluate
public SpoiledReferenceStrategy getFallbackStrategy()
public void setFallbackStrategy(SpoiledReferenceStrategy fallbackStrategy)
public void addMapping(CrawlState state, SpoiledReferenceStrategy strategy)
public void loadFromXML(XML xml)
Specified by: loadFromXML in interface IXMLConfigurable
public void saveToXML(XML xml)
Specified by: saveToXML in interface IXMLConfigurable
Copyright © 2014–2023 Norconex Inc. All rights reserved.