public class DeleteRejectedEventListener extends Object implements IEventListener<Event>, IXMLConfigurable
Provides the ability to send deletion requests to your configured committer(s) whenever a reference is rejected, regardless of whether it was encountered in a previous crawling session.
By default this listener will send deletion requests for all references associated with a CrawlerEvent name starting with REJECTED_. To avoid performance issues when dealing with too many deletion requests, it is recommended that you change this behavior to match exactly the events you are interested in with setEventMatcher(TextMatcher).
Keep limiting events to "rejected" ones to avoid unexpected results.
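To illustrate how narrowing the matcher reduces the number of deletion requests, here is a minimal sketch that uses only `java.util.regex` rather than the Norconex `TextMatcher` class. It assumes the default behaves like the regular expression `REJECTED_.*` applied to the whole event name. The class name `EventNameMatchSketch` is hypothetical, and `REJECTED_UNMODIFIED` and `DOCUMENT_IMPORTED` are used purely as illustrative event names.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.regex.Pattern;

final class EventNameMatchSketch {

    // Assumed stand-in for the default: any event name starting with REJECTED_.
    static final Predicate<String> DEFAULT_MATCH =
            Pattern.compile("REJECTED_.*").asMatchPredicate();

    // Narrowed alternative: only the two rejection events of interest.
    static final Predicate<String> NARROWED_MATCH =
            Pattern.compile("REJECTED_(NOTFOUND|FILTER)").asMatchPredicate();

    // Counts how many of the given event names would trigger a deletion request.
    static long countMatches(Predicate<String> matcher, List<String> eventNames) {
        return eventNames.stream().filter(matcher).count();
    }
}
```

With the four event names `REJECTED_NOTFOUND`, `REJECTED_FILTER`, `REJECTED_UNMODIFIED`, and `DOCUMENT_IMPORTED`, the default predicate accepts three and the narrowed one accepts two. Both stay within the "rejected" family, as recommended.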
This class tries to handle each reference for "rejected" events only once. To do so, it queues all such references and waits until normal crawler completion to send them. Waiting for completion also gives this class a chance to listen for deletion requests sent to your committer as part of the crawler's regular execution (typically on subsequent crawls). This helps ensure you do not get duplicate deletion requests for the same reference.
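The queue-then-send behavior can be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not the actual implementation: how the real class stores queued references is not described here, and the class and method names (`RejectedReferenceQueueSketch`, `onRejected`, `onCrawlerDeletion`, `drainOnCompletion`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class RejectedReferenceQueueSketch {

    // Insertion-ordered set: each rejected reference is queued at most once.
    private final Set<String> pending = new LinkedHashSet<>();

    // References the crawler itself already sent a deletion request for.
    private final Set<String> deletedByCrawler = new HashSet<>();

    // Called for each matching "rejected" event.
    // Returns true only if the reference was newly queued.
    boolean onRejected(String reference) {
        if (deletedByCrawler.contains(reference)) {
            return false;
        }
        return pending.add(reference);
    }

    // Called when a deletion request sent by the crawler is observed,
    // so that no second request is issued for the same reference.
    void onCrawlerDeletion(String reference) {
        deletedByCrawler.add(reference);
        pending.remove(reference);
    }

    // Called once on normal crawler completion: returns the references
    // still needing a deletion request and empties the queue.
    List<String> drainOnCompletion() {
        List<String> toDelete = new ArrayList<>(pending);
        pending.clear();
        return toDelete;
    }
}
```

Deferring the send until completion is what makes the de-duplication possible: a reference that is rejected and later deleted by the crawler in the same session is removed from the queue before anything is sent.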
Since several rejection events are triggered before documents are processed, we can't assume there is any metadata attached to rejected references. Be aware this can cause issues if you are using rules in your committer (e.g., to route requests) based on metadata.
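One way to guard a metadata-based rule against this is to give it an explicit fallback. The sketch below is hypothetical throughout: it does not use any committer API, and the `collection` field name, the `default-index` target, and the class name `MetadataRoutingSketch` are assumptions made only for illustration.

```java
import java.util.List;
import java.util.Map;

final class MetadataRoutingSketch {

    // Hypothetical target used when no routing metadata is available.
    static final String DEFAULT_TARGET = "default-index";

    // Picks a target from a hypothetical "collection" metadata field and
    // falls back to DEFAULT_TARGET when the field is missing or empty,
    // which may be the case for deletion requests of rejected references.
    static String route(Map<String, List<String>> metadata) {
        if (metadata == null) {
            return DEFAULT_TARGET;
        }
        List<String> values = metadata.get("collection");
        if (values == null || values.isEmpty()) {
            return DEFAULT_TARGET;
        }
        return values.get(0);
    }
}
```

Whether a fallback target is appropriate depends on your setup. The point is only that a rule relying on metadata should decide deliberately what happens when none is present.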
<listener
    class="com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener">
  <eventMatcher
      method="[basic|csv|wildcard|regex]"
      ignoreCase="[false|true]"
      ignoreDiacritic="[false|true]"
      partial="[false|true]">
    (event name-matching expression)
  </eventMatcher>
</listener>
<listener
    class="DeleteRejectedEventListener">
  <eventMatcher
      method="csv">
    REJECTED_NOTFOUND,REJECTED_FILTER
  </eventMatcher>
</listener>
The above example will send deletion requests whenever a reference is not found (e.g., a 404 response from a web server) or if it was filtered out by the crawler.
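As a rough illustration of what a comma-separated matcher like the one above does, the following sketch splits the expression on commas and accepts an event name only if it equals one of the resulting values. It is written against the JDK only. The trimming of whitespace and the exact, case-sensitive comparison are assumptions about the matching semantics, and the class name `CsvEventMatchSketch` is hypothetical.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

final class CsvEventMatchSketch {

    // Event names parsed from the comma-separated expression.
    private final Set<String> names;

    CsvEventMatchSketch(String csv) {
        this.names = Arrays.stream(csv.split(","))
                .map(String::trim)            // assumed: surrounding whitespace is ignored
                .filter(s -> !s.isEmpty())    // skip empty entries such as "A,,B"
                .collect(Collectors.toSet());
    }

    // True if the event name is exactly one of the listed names.
    boolean matches(String eventName) {
        return eventName != null && names.contains(eventName);
    }
}
```

Built from `REJECTED_NOTFOUND,REJECTED_FILTER`, this sketch accepts those two event names and rejects any other, including other `REJECTED_` events.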
Modifier and Type | Field and Description |
---|---|
static String | DEFAULT_FILENAME_PREFIX |
Constructor and Description |
---|
DeleteRejectedEventListener() |
Modifier and Type | Method and Description |
---|---|
void | accept(Event event) |
boolean | equals(Object other) |
TextMatcher | getEventMatcher() Gets the event matcher used to identify which events can trigger a deletion request. |
int | hashCode() |
void | loadFromXML(XML xml) |
void | saveToXML(XML xml) |
void | setEventMatcher(TextMatcher eventMatcher) Sets the event matcher used to identify which events can trigger a deletion request. |
String | toString() |
public static final String DEFAULT_FILENAME_PREFIX
public TextMatcher getEventMatcher()
Default: REJECTED_.* (event names starting with REJECTED_).
public void setEventMatcher(TextMatcher eventMatcher)
eventMatcher - event matcher
public void loadFromXML(XML xml)
Specified by: loadFromXML in interface IXMLConfigurable
public void saveToXML(XML xml)
Specified by: saveToXML in interface IXMLConfigurable
Copyright © 2014–2023 Norconex Inc. All rights reserved.