Class StripBeforeTransformer

  • All Implemented Interfaces:
    IXMLConfigurable, IImporterHandler, IDocumentTransformer

    public class StripBeforeTransformer
    extends AbstractStringTransformer
    implements IXMLConfigurable

    Strips all content that precedes the first match of the given pattern. Set inclusive to true to also strip the matched text itself.

    This class can be used as a pre-parsing handler (text content types only) or as a post-parsing handler.

    XML configuration usage:

    
    <handler
        class="com.norconex.importer.handler.transformer.impl.StripBeforeTransformer"
        inclusive="[false|true]"
        maxReadSize="(max characters to read at once)"
        sourceCharset="(character encoding)">
      <!-- multiple "restrictTo" tags allowed (only one needs to match) -->
      <restrictTo>
        <fieldMatcher>(field-matching expression)</fieldMatcher>
        <valueMatcher>(value-matching expression)</valueMatcher>
      </restrictTo>
      <stripBeforeMatcher>
        (expression matching text up to which to strip)
      </stripBeforeMatcher>
    </handler>

    XML usage example:

    
    <handler
        class="StripBeforeTransformer"
        inclusive="true">
      <stripBeforeMatcher>
        <![CDATA[<!-- HEADER_END -->]]>
      </stripBeforeMatcher>
    </handler>

    The above example will strip all text up to and including this HTML comment: <!-- HEADER_END -->.
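
    The stripping behavior described above can be illustrated with a minimal sketch based on plain java.util.regex. It is not the Norconex implementation: the class StripBeforeSketch and the method stripBefore are hypothetical names used for illustration only, and leaving the text unchanged when nothing matches is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StripBeforeSketch {

    // Hypothetical helper, not part of the Norconex API.
    // Drops everything before the first match of the regex; when
    // inclusive is true, the matched text is dropped as well.
    static String stripBefore(String text, String regex, boolean inclusive) {
        Matcher m = Pattern.compile(regex).matcher(text);
        if (!m.find()) {
            // Assumption: content is left unchanged when nothing matches.
            return text;
        }
        return text.substring(inclusive ? m.end() : m.start());
    }

    public static void main(String[] args) {
        String html = "<header>menu</header><!-- HEADER_END --><p>Body</p>";
        // Quote the marker so it is matched literally rather than as a regex.
        String marker = Pattern.quote("<!-- HEADER_END -->");

        System.out.println(stripBefore(html, marker, true));
        // prints: <p>Body</p>
        System.out.println(stripBefore(html, marker, false));
        // prints: <!-- HEADER_END --><p>Body</p>
    }
}
```

    With inclusive set to true the marker comment is removed along with the header, which mirrors the XML usage example; with inclusive set to false the marker is kept at the start of the remaining text.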

    Author:
    Pascal Essiembre
    • Constructor Detail

      • StripBeforeTransformer

        public StripBeforeTransformer()