public class DateMetadataFilter extends AbstractDocumentFilter
Accepts or rejects a document based on the date value(s) of a metadata field, stored in a specified format. If multiple values are found for a field, only one of them needs to match for this filter to take effect. If the value cannot be parsed to a valid date, it is considered not to be matching.
To successfully parse a date, an optional date format can be specified, as per the formatting options found on SimpleDateFormat.
The default format when not specified is EPOCH (the difference, measured in
milliseconds, between the date and midnight, January 1, 1970).
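The parsing rule described above can be sketched as follows. This is a minimal illustration, not the filter's actual code: `MetadataDateParser` and `parse` are hypothetical names, and strict (non-lenient) parsing is an assumption. A blank format is treated as EPOCH milliseconds, and an unparsable value yields `null`, which stands for "not matching".

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Hypothetical helper for illustration only; not part of the Norconex API.
final class MetadataDateParser {

    /**
     * Parses a metadata value into a Date.
     * A null or blank format means the value holds EPOCH milliseconds.
     * Returns null when the value cannot be parsed ("not matching").
     */
    static Date parse(String value, String format) {
        if (value == null) {
            return null;
        }
        String v = value.trim();
        if (format == null || format.trim().isEmpty()) {
            try {
                return new Date(Long.parseLong(v));
            } catch (NumberFormatException e) {
                return null;
            }
        }
        try {
            SimpleDateFormat sdf = new SimpleDateFormat(format);
            // Assumption: reject impossible dates such as month 13.
            sdf.setLenient(false);
            return sdf.parse(v);
        } catch (ParseException | IllegalArgumentException e) {
            // Bad value or bad pattern: treat as "no date".
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("1433030400000", null));
        System.out.println(parse("2015-05-31", "yyyy-MM-dd"));
        System.out.println(parse("not a date", "yyyy-MM-dd"));
    }
}
```

When a field holds several values, each value would be parsed this way and the filter takes effect as soon as one parsed date satisfies the conditions.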
When adding a condition, you can specify a static date (i.e. a constant date value), or you can tell this filter to use a date relative to the current time. There is a distinction to be made between TODAY and NOW. TODAY is the current day without the hours, minutes, and seconds, whereas NOW is the current day with the hours, minutes, and seconds. You can also decide whether the current date is fixed (it does not change after being set for the first time) or refreshed on every call to reflect system date and time changes.
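The difference between TODAY and NOW, and between a fixed and a refreshed reference date, can be illustrated with a small sketch. The class `RelativeReference` and its method are hypothetical and use `java.time` for brevity; the filter itself works with `java.util.Date`, so this shows the concept rather than the implementation.

```java
import java.time.Clock;
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical illustration; not part of the Norconex API.
final class RelativeReference {
    private final boolean today;  // true = TODAY (midnight), false = NOW
    private final boolean fixed;  // true = resolve once, then reuse
    private LocalDateTime cached;

    RelativeReference(boolean today, boolean fixed) {
        this.today = today;
        this.fixed = fixed;
    }

    /** Returns the reference date as seen through the given clock. */
    LocalDateTime resolve(Clock clock) {
        if (fixed && cached != null) {
            return cached; // fixed: keep the first resolved value
        }
        LocalDateTime now = LocalDateTime.now(clock);
        // TODAY drops hours, minutes, and seconds; NOW keeps them.
        LocalDateTime value = today ? now.truncatedTo(ChronoUnit.DAYS) : now;
        if (fixed) {
            cached = value;
        }
        return value;
    }

    public static void main(String[] args) {
        Clock clock = Clock.systemDefaultZone();
        System.out.println("TODAY = " + new RelativeReference(true, true).resolve(clock));
        System.out.println("NOW   = " + new RelativeReference(false, false).resolve(clock));
    }
}
```

A fixed reference keeps long-running jobs consistent from the first document to the last, while a refreshed one follows the system clock across day boundaries.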
<filter class="com.norconex.importer.handler.filter.impl.DateMetadataFilter"
        onMatch="[include|exclude]"
        field="(name of metadata field to match)"
        format="(date format)" >
    <restrictTo caseSensitive="[false|true]"
            field="(name of header/metadata field name to match)">
        (regular expression of value to match)
    </restrictTo>
    <!-- multiple "restrictTo" tags allowed (only one needs to match) -->
    <!-- Use one or two (for ranges) conditions where:
         Possible operators are:
             gt -> greater than
             ge -> greater equal
             lt -> lower than
             le -> lower equal
             eq -> equals
         Condition date values use one of these formats:
             yyyy-MM-dd -> date (e.g. 2015-05-31)
             yyyy-MM-ddThh:mm:ss[.SSS] -> date and time with optional
                 milliseconds (e.g. 2015-05-31T22:44:15)
             TODAY[-+]9[YMDhms][*] -> the string "TODAY" (at 0:00:00) minus
                 or plus a number of years, months, days, hours, minutes,
                 or seconds (e.g. 1 week ago: TODAY-7D). * means TODAY can
                 change from one invocation to another to adjust to a
                 change of current day
             NOW[-+]9[YMDhms][*] -> the string "NOW" (at current time) minus
                 or plus a number of years, months, days, hours, minutes,
                 or seconds (e.g. 1 week ago: NOW-7D). * means NOW changes
                 from one invocation to another to adjust to the current
                 time.
    -->
    <condition operator="[gt|ge|lt|le|eq]" date="(a date)" />
</filter>
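To make the relative date syntax concrete, here is a hedged sketch of how an expression such as `TODAY-7D` or `NOW+2h*` could be broken into its parts. The regular expression is an assumption derived from the documented form `TODAY[-+]9[YMDhms][*]`, and `RelativeDateExpression` is a hypothetical class; the filter's own parser may accept or reject inputs differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for illustration; not the filter's implementation.
final class RelativeDateExpression {
    // Assumed grammar: keyword, then optionally sign, amount, unit, and "*".
    private static final Pattern SYNTAX = Pattern.compile(
            "^(TODAY|NOW)(?:([-+])(\\d+)([YMDhms])(\\*?))?$");

    final boolean today;   // true for TODAY, false for NOW
    final int amount;      // signed offset; 0 when no offset is given
    final char unit;       // one of Y M D h m s; '\0' when no offset
    final boolean dynamic; // true when the trailing "*" is present

    private RelativeDateExpression(
            boolean today, int amount, char unit, boolean dynamic) {
        this.today = today;
        this.amount = amount;
        this.unit = unit;
        this.dynamic = dynamic;
    }

    /** Returns the parsed expression, or null if the text is not relative. */
    static RelativeDateExpression parse(String text) {
        if (text == null) {
            return null;
        }
        Matcher m = SYNTAX.matcher(text.trim());
        if (!m.matches()) {
            return null; // e.g. a static date such as 2015-05-31
        }
        boolean today = "TODAY".equals(m.group(1));
        if (m.group(2) == null) {
            return new RelativeDateExpression(today, 0, '\0', false);
        }
        int amount = Integer.parseInt(m.group(3));
        if ("-".equals(m.group(2))) {
            amount = -amount;
        }
        return new RelativeDateExpression(
                today, amount, m.group(4).charAt(0), "*".equals(m.group(5)));
    }

    public static void main(String[] args) {
        RelativeDateExpression e = parse("NOW+2h*");
        System.out.println(e.today + " " + e.amount + " " + e.unit + " " + e.dynamic);
    }
}
```

Under this assumed grammar, units are case-sensitive (`M` is months, `m` is minutes), so an expression is only recognized when the unit letter matches one of `YMDhms` exactly.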
For example, let's say you want to keep only documents from the last seven days, not including today. The following would achieve that:
<filter class="com.norconex.importer.handler.filter.impl.DateMetadataFilter"
        onMatch="include" field="publish_date" >
    <condition operator="ge" date="TODAY-7D" />
    <condition operator="lt" date="TODAY" />
</filter>
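The range expressed by the two conditions above can be sketched in plain Java. This is an illustration under stated assumptions: `LastSevenDaysRange` and its `Op` enum are hypothetical stand-ins for the filter's operators, both conditions are assumed to be required together (as the range wording implies), and `java.time` types are used for readability.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

// Hypothetical illustration of a two-condition range; not the Norconex API.
final class LastSevenDaysRange {

    // Mirrors the documented operators gt, ge, lt, le, eq.
    enum Op {
        GT, GE, LT, LE, EQ;

        boolean test(LocalDateTime value, LocalDateTime bound) {
            int c = value.compareTo(bound);
            switch (this) {
                case GT: return c > 0;
                case GE: return c >= 0;
                case LT: return c < 0;
                case LE: return c <= 0;
                default: return c == 0; // EQ
            }
        }
    }

    /** True when value falls in [today - 7 days, today at 0:00:00). */
    static boolean matches(LocalDateTime value, LocalDate today) {
        LocalDateTime start = today.minusDays(7).atStartOfDay(); // ge bound
        LocalDateTime end = today.atStartOfDay();                // lt bound
        return Op.GE.test(value, start) && Op.LT.test(value, end);
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.now();
        System.out.println(matches(LocalDateTime.now().minusDays(3), today));
        System.out.println(matches(LocalDateTime.now(), today));
    }
}
```

Using `ge` on the lower bound and `lt` on the upper bound gives a half-open interval, which is what excludes any document dated today.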
Modifier and Type | Class and Description |
---|---|
static class | DateMetadataFilter.Condition |
static class | DateMetadataFilter.Operator |
static class | DateMetadataFilter.TimeUnit |
Constructor and Description |
---|
DateMetadataFilter() |
DateMetadataFilter(String field) |
DateMetadataFilter(String field, OnMatch onMatch) |
Modifier and Type | Method and Description |
---|---|
void | addCondition(DateMetadataFilter.Operator operator, Date date) |
void | addConditionFromNow(DateMetadataFilter.Operator operator, DateMetadataFilter.TimeUnit timeUnit, int value, boolean fixed) |
void | addConditionFromToday(DateMetadataFilter.Operator operator, DateMetadataFilter.TimeUnit timeUnit, int value, boolean fixed) |
boolean | equals(Object other) |
String | getField() |
String | getFormat() |
int | hashCode() |
protected boolean | isDocumentMatched(String reference, InputStream input, ImporterMetadata metadata, boolean parsed) |
protected void | loadFilterFromXML(org.apache.commons.configuration.XMLConfiguration xml) |
protected void | saveFilterToXML(EnhancedXMLStreamWriter writer) |
void | setField(String property) |
void | setFormat(String format) |
String | toString() |
Methods inherited from class AbstractDocumentFilter: acceptDocument, getOnMatch, loadHandlerFromXML, saveHandlerToXML, setOnMatch
Methods inherited from class AbstractImporterHandler: addRestriction, addRestriction, addRestrictions, clearRestrictions, detectCharsetIfBlank, getRestrictions, isApplicable, loadFromXML, removeRestriction, removeRestriction, saveToXML
public DateMetadataFilter()
public DateMetadataFilter(String field)
public String getField()
public void setField(String property)
public String getFormat()
public void setFormat(String format)
public void addCondition(DateMetadataFilter.Operator operator, Date date)
public void addConditionFromNow(DateMetadataFilter.Operator operator, DateMetadataFilter.TimeUnit timeUnit, int value, boolean fixed)
public void addConditionFromToday(DateMetadataFilter.Operator operator, DateMetadataFilter.TimeUnit timeUnit, int value, boolean fixed)
protected boolean isDocumentMatched(String reference, InputStream input, ImporterMetadata metadata, boolean parsed) throws ImporterHandlerException
Specified by: isDocumentMatched in class AbstractDocumentFilter
Throws: ImporterHandlerException
protected void loadFilterFromXML(org.apache.commons.configuration.XMLConfiguration xml) throws IOException
Specified by: loadFilterFromXML in class AbstractDocumentFilter
Throws: IOException
protected void saveFilterToXML(EnhancedXMLStreamWriter writer) throws XMLStreamException
Specified by: saveFilterToXML in class AbstractDocumentFilter
Throws: XMLStreamException
public boolean equals(Object other)
Overrides: equals in class AbstractDocumentFilter
public int hashCode()
Overrides: hashCode in class AbstractDocumentFilter
public String toString()
Overrides: toString in class AbstractDocumentFilter
Copyright © 2009–2021 Norconex Inc. All rights reserved.