public class ReferenceFilter extends AbstractDocumentFilter
Accepts or rejects a document based on its reference (e.g. URL).
Can be used either as a pre-parse or a post-parse handler.
XML configuration usage:
<handler
    class="com.norconex.importer.handler.filter.impl.ReferenceFilter"
    onMatch="[include|exclude]">
  <!-- multiple "restrictTo" tags allowed (only one needs to match) -->
  <restrictTo>
    <fieldMatcher
        method="[basic|csv|wildcard|regex]"
        ignoreCase="[false|true]"
        ignoreDiacritic="[false|true]"
        partial="[false|true]">
      (field-matching expression)
    </fieldMatcher>
    <valueMatcher
        method="[basic|csv|wildcard|regex]"
        ignoreCase="[false|true]"
        ignoreDiacritic="[false|true]"
        partial="[false|true]">
      (value-matching expression)
    </valueMatcher>
  </restrictTo>
  <valueMatcher
      method="[basic|csv|wildcard|regex]"
      ignoreCase="[false|true]"
      ignoreDiacritic="[false|true]"
      partial="[false|true]">
    (expression of reference value to match)
  </valueMatcher>
</handler>
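The `ignoreCase` and `partial` attributes are not defined on this page. The sketch below shows one plausible reading for the `regex` method, using only `java.util.regex`. It assumes that `partial="true"` means the expression may match anywhere in the value, that `partial="false"` requires the whole value to match, and that `ignoreCase="true"` makes the comparison case-insensitive. `MatcherOptionsSketch` and `regexMatches` are hypothetical names and are not part of the Norconex API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatcherOptionsSketch {

    // Hypothetical helper, not part of the Norconex API.
    static boolean regexMatches(
            String regex, String value, boolean ignoreCase, boolean partial) {
        int flags = ignoreCase
                ? (Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE)
                : 0;
        Matcher m = Pattern.compile(regex, flags).matcher(value);
        // Assumption: partial=true looks for the expression anywhere in the
        // value, partial=false requires the entire value to match.
        return partial ? m.find() : m.matches();
    }

    public static void main(String[] args) {
        String ref = "https://example.com/Login/form.html";
        // Case-sensitive, partial: "Login" differs from "login".
        System.out.println(regexMatches("/login/", ref, false, true));
        // Case-insensitive, partial.
        System.out.println(regexMatches("/login/", ref, true, true));
        // Case-insensitive, whole value: "/login/" alone does not cover it.
        System.out.println(regexMatches("/login/", ref, true, false));
        // Case-insensitive, whole value, with wildcards on both sides.
        System.out.println(regexMatches(".*/login/.*", ref, true, false));
    }
}
```

Under these assumptions, a whole-value regular expression needs the leading and trailing `.*`, while a partial one does not.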
XML usage example:
<handler
    class="ReferenceFilter"
    onMatch="exclude">
  <valueMatcher
      method="regex">
    .*/login/.*
  </valueMatcher>
</handler>
The above example rejects documents having "/login/" in their reference.
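The accept/reject decision in this example can be illustrated with a small stand-alone sketch. It does not use the Norconex classes: `ReferenceFilterSketch`, its nested `OnMatch` enum, and `accepts` are hypothetical stand-ins, and the sketch models a single filter in isolation, assuming that `exclude` rejects matching references and `include` keeps only matching ones.

```java
import java.util.regex.Pattern;

class ReferenceFilterSketch {

    /** Hypothetical stand-in for the handler's onMatch attribute. */
    enum OnMatch { INCLUDE, EXCLUDE }

    private final Pattern pattern;
    private final OnMatch onMatch;

    ReferenceFilterSketch(String regex, OnMatch onMatch) {
        this.pattern = Pattern.compile(regex);
        this.onMatch = onMatch;
    }

    /** Returns true if a document with this reference would be kept. */
    boolean accepts(String reference) {
        // The whole reference must match the expression.
        boolean matched = pattern.matcher(reference).matches();
        return onMatch == OnMatch.INCLUDE ? matched : !matched;
    }

    public static void main(String[] args) {
        ReferenceFilterSketch filter =
                new ReferenceFilterSketch(".*/login/.*", OnMatch.EXCLUDE);
        // Contains "/login/", so it is rejected: prints false.
        System.out.println(filter.accepts("https://example.com/login/form.html"));
        // No "/login/" segment, so it is kept: prints true.
        System.out.println(filter.accepts("https://example.com/docs/index.html"));
    }
}
```

The URLs are placeholders chosen for illustration. How several filters combine in a real importer configuration is outside the scope of this sketch.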
Constructor and Description |
---|
ReferenceFilter() |
ReferenceFilter(TextMatcher textMatcher) |
Modifier and Type | Method and Description |
---|---|
boolean | equals(Object other) |
TextMatcher | getValueMatcher(): Gets the text matcher for field values. |
int | hashCode() |
protected boolean | isDocumentMatched(HandlerDoc doc, InputStream input, ParseState parseState) |
protected void | loadFilterFromXML(XML xml) |
protected void | saveFilterToXML(XML xml) |
void | setValueMatcher(TextMatcher valueMatcher): Sets the text matcher for field values. |
String | toString() |
Methods inherited from class AbstractDocumentFilter: acceptDocument, getOnMatch, loadHandlerFromXML, saveHandlerToXML, setOnMatch
Methods inherited from further up the class hierarchy: addRestriction, addRestriction, addRestrictions, clearRestrictions, detectCharsetIfBlank, getRestrictions, isApplicable, loadFromXML, removeRestriction, removeRestriction, saveToXML
public ReferenceFilter()
public ReferenceFilter(TextMatcher textMatcher)
public TextMatcher getValueMatcher()
public void setValueMatcher(TextMatcher valueMatcher)
Parameters: valueMatcher - text matcher
protected boolean isDocumentMatched(HandlerDoc doc, InputStream input, ParseState parseState) throws ImporterHandlerException
Specified by: isDocumentMatched in class AbstractDocumentFilter
Throws: ImporterHandlerException
protected void loadFilterFromXML(XML xml)
Specified by: loadFilterFromXML in class AbstractDocumentFilter
protected void saveFilterToXML(XML xml)
Specified by: saveFilterToXML in class AbstractDocumentFilter
public boolean equals(Object other)
Overrides: equals in class AbstractDocumentFilter
public int hashCode()
Overrides: hashCode in class AbstractDocumentFilter
public String toString()
Overrides: toString in class AbstractDocumentFilter
Copyright © 2009–2023 Norconex Inc. All rights reserved.