Class RegexReferenceFilter

  • All Implemented Interfaces:
    IXMLConfigurable, IDocumentFilter, IOnMatchFilter, IImporterHandler

    @Deprecated
    public class RegexReferenceFilter
    extends AbstractDocumentFilter
    Deprecated.
    Since 3.0.0, use ReferenceFilter instead.

    Accepts or rejects a document based on whether its reference (e.g. URL) matches a regular expression.

    XML configuration usage:

      <handler class="com.norconex.importer.handler.filter.impl.RegexReferenceFilter"
              onMatch="[include|exclude]"
              caseSensitive="[false|true]">
    
          <restrictTo caseSensitive="[false|true]"
                  field="(name of header/metadata field to match)">
              (regular expression of value to match)
          </restrictTo>
          <!-- multiple "restrictTo" tags allowed (only one needs to match) -->
    
          <regex>(regular expression of reference to match)</regex>
      </handler>
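
    The "only one needs to match" rule for restrictTo tags can be pictured with the
    minimal Java sketch below. It is not the library's code: the RestrictToSketch and
    Restriction names are hypothetical, metadata is modeled as a plain map, and two
    behaviors are assumptions (a handler with no restrictTo tags always applies, and
    the expression must match the whole field value).

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class RestrictToSketch {

    /** Hypothetical stand-in for one restrictTo tag: a field name plus a regex. */
    static final class Restriction {
        final String field;
        final Pattern valueRegex;

        Restriction(String field, Pattern valueRegex) {
            this.field = field;
            this.valueRegex = valueRegex;
        }
    }

    /**
     * Returns true when the handler should run for a document.
     * Assumption: with no restrictions configured, the handler always runs.
     */
    static boolean applies(Map<String, List<String>> metadata,
                           List<Restriction> restrictions) {
        if (restrictions.isEmpty()) {
            return true;
        }
        for (Restriction r : restrictions) {
            // A field may hold several values; any matching value is enough.
            for (String value : metadata.getOrDefault(r.field, List.of())) {
                // Assumption: the regex must match the entire field value.
                if (r.valueRegex.matcher(value).matches()) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Restriction> restrictions = List.of(
                new Restriction("content-type", Pattern.compile("text/html.*")),
                new Restriction("lang", Pattern.compile("en")));

        Map<String, List<String>> metadata = Map.of("lang", List.of("en"));

        // Only the second restriction matches, which is sufficient.
        System.out.println(applies(metadata, restrictions)); // prints "true"
    }
}
```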
     

    Can be used as either a pre-parse or a post-parse handler.

    Usage example:

    The following will reject documents having "/login/" in their reference.

      <handler class="RegexReferenceFilter" onMatch="exclude">
          <regex>.*/login/.*</regex>
      </handler>
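
    To see why the expression above is written with a leading and trailing ".*",
    the following standalone Java sketch reproduces the accept/reject decision using
    only java.util.regex. It is an illustration, not the filter's implementation:
    the ReferenceFilterSketch class and its methods are hypothetical, and it assumes
    the expression must match the entire reference rather than a substring.

```java
import java.util.regex.Pattern;

public class ReferenceFilterSketch {

    /** Compiles the expression, honoring a caseSensitive-style flag. */
    static Pattern compile(String regex, boolean caseSensitive) {
        return Pattern.compile(regex, caseSensitive ? 0 : Pattern.CASE_INSENSITIVE);
    }

    /**
     * Returns true when the document should be kept.
     * excludeOnMatch mirrors onMatch="exclude"; false mirrors onMatch="include".
     * Assumption: the regex must match the whole reference.
     */
    static boolean accepts(String reference, Pattern regex, boolean excludeOnMatch) {
        boolean matched = regex.matcher(reference).matches();
        return excludeOnMatch ? !matched : matched;
    }

    public static void main(String[] args) {
        Pattern login = compile(".*/login/.*", false);

        // Reference contains "/login/", so an exclude filter rejects it.
        System.out.println(
                accepts("https://example.com/login/form.html", login, true)); // prints "false"

        // No "/login/" segment, so the document is kept.
        System.out.println(
                accepts("https://example.com/products/index.html", login, true)); // prints "true"
    }
}
```

    Under the whole-reference assumption, a bare "/login/" expression would only match
    a reference that is exactly "/login/", which is why the example wraps it in ".*".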
     
    Since:
    2.7.0
    Author:
    Pascal Essiembre