Class CurrentDateTagger

  • All Implemented Interfaces:
    IXMLConfigurable, IImporterHandler, IDocumentTagger

    public class CurrentDateTagger
    extends AbstractDocumentTagger

    Adds the current UTC date, as reported by the host computer, to the specified field. If no field is specified, the date is added to document.importedDate.

    The default date format is EPOCH (the number of milliseconds elapsed between midnight, January 1, 1970 UTC and the current time). A custom date format can be specified with the format attribute, using the pattern syntax documented for SimpleDateFormat.
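
    The two formatting modes described above can be sketched in plain Java. This is only an illustration, not the tagger's actual implementation: the class DateFormatSketch and its format method are hypothetical names, and applying the UTC time zone to the pattern-based output is an assumption drawn from the description above.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class DateFormatSketch {

    // Hypothetical helper, not part of the Norconex API.
    // With no pattern, returns the EPOCH value in milliseconds;
    // otherwise formats the date with a SimpleDateFormat pattern.
    public static String format(Date date, String pattern, Locale locale) {
        if (pattern == null || pattern.isEmpty()) {
            return Long.toString(date.getTime());
        }
        SimpleDateFormat sdf = new SimpleDateFormat(pattern, locale);
        // Assumption: formatted output is expressed in UTC.
        sdf.setTimeZone(TimeZone.getTimeZone("UTC"));
        return sdf.format(date);
    }

    public static void main(String[] args) {
        Date now = new Date();
        // Default: milliseconds since January 1, 1970 UTC.
        System.out.println(format(now, null, Locale.US));
        // Custom pattern, as in the XML usage example.
        System.out.println(format(now, "yyyy-MM-dd HH:mm", Locale.US));
    }
}
```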

    Storing values in an existing field

    If a target field with the same name already exists for a document, values will be added to the end of the existing value list. It is possible to change this default behavior by supplying a PropertySetter.
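
    As a rough illustration of the default behavior, the following sketch models document metadata as a map of field names to value lists. It does not use the library's PropertySetter: the Mode enum and setValue method are hypothetical stand-ins that only contrast appending with one possible alternative (replacing).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FieldAppendSketch {

    // Hypothetical modes for illustration only; not the library's PropertySetter.
    public enum Mode { APPEND, REPLACE }

    // Stores a value in a multi-valued field according to the given mode.
    public static void setValue(Map<String, List<String>> metadata,
            String field, String value, Mode mode) {
        List<String> values =
                metadata.computeIfAbsent(field, k -> new ArrayList<>());
        if (mode == Mode.REPLACE) {
            // Discard existing values before storing the new one.
            values.clear();
        }
        // APPEND (the default described above) adds to the end of the list.
        values.add(value);
    }

    public static void main(String[] args) {
        Map<String, List<String>> metadata = new HashMap<>();
        setValue(metadata, "crawl_date", "2020-01-01 10:00", Mode.APPEND);
        setValue(metadata, "crawl_date", "2020-01-02 11:30", Mode.APPEND);
        // Both values are kept, in insertion order.
        System.out.println(metadata.get("crawl_date"));
    }
}
```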

    Can be used as either a pre-parse or post-parse handler.

    A locale can be specified for formatting dates. It is given as an ISO two-letter language code, optionally followed by an underscore and an ISO country code (e.g., "fr" for French, "fr_CA" for Canadian French). When no locale is specified, the default is "en_US" (US English).
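
    The locale syntax above maps naturally onto java.util.Locale. The sketch below shows one way such a string could be interpreted using only the JDK; LocaleSketch and parse are hypothetical names, and the tagger itself may resolve locales differently.

```java
import java.util.Locale;

public class LocaleSketch {

    // Hypothetical helper: converts "fr" or "fr_CA" into a Locale,
    // falling back to en_US when no value is given.
    public static Locale parse(String value) {
        if (value == null || value.trim().isEmpty()) {
            return Locale.US;
        }
        // Split into language and optional country at the first underscore.
        String[] parts = value.trim().split("_", 2);
        // Locale.Builder throws IllformedLocaleException on malformed codes.
        Locale.Builder builder = new Locale.Builder().setLanguage(parts[0]);
        if (parts.length > 1) {
            builder.setRegion(parts[1]);
        }
        return builder.build();
    }

    public static void main(String[] args) {
        System.out.println(parse("fr"));     // language only
        System.out.println(parse("fr_CA"));  // language and country
        System.out.println(parse(null));     // default locale
    }
}
```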

    XML configuration usage:

    
    <handler
        class="com.norconex.importer.handler.tagger.impl.CurrentDateTagger"
        toField="(target field)"
        format="(date format)"
        locale="(locale)">
      <!-- multiple "restrictTo" tags allowed (only one needs to match) -->
      <restrictTo>
        <fieldMatcher>(field-matching expression)</fieldMatcher>
        <valueMatcher>(value-matching expression)</valueMatcher>
      </restrictTo>
    </handler>

    XML usage example:

    
    <handler
        class="CurrentDateTagger"
        toField="crawl_date"
        format="yyyy-MM-dd HH:mm"/>

    The above will store the current date along with hours and minutes in a "crawl_date" field.

    Since:
    2.2.0
    Author:
    Pascal Essiembre