Class ImporterConfig

    • Field Detail

      • DEFAULT_TEMP_DIR_PATH

        public static final String DEFAULT_TEMP_DIR_PATH
      • DEFAULT_MAX_MEM_INSTANCE

        public static final long DEFAULT_MAX_MEM_INSTANCE
        100 MB.
      • DEFAULT_MAX_MEM_POOL

        public static final long DEFAULT_MAX_MEM_POOL
        1 GB.
    • Constructor Detail

      • ImporterConfig

        public ImporterConfig()
    • Method Detail

      • getParseErrorsSaveDir

        public Path getParseErrorsSaveDir()
        Gets the directory where files that generate parsing errors are saved. Default is null (error files are not stored).
        Returns:
        directory where to save error files
      • setParseErrorsSaveDir

        public void setParseErrorsSaveDir(Path parseErrorsSaveDir)
        Sets the directory where files that generate parsing errors are saved.
        Parameters:
        parseErrorsSaveDir - directory where to save error files
      • getPreParseConsumer

        public Consumer<HandlerContext> getPreParseConsumer()
        Gets the Consumer executed on documents before they are parsed.
        Returns:
        the document consumer
        Since:
        3.0.0
      • getPostParseConsumer

        public Consumer<HandlerContext> getPostParseConsumer()
        Gets the Consumer executed on documents after they are parsed.
        Returns:
        the document consumer
        Since:
        3.0.0
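
        The pre-parse and post-parse hooks are typed as Consumer, so several processing steps can be combined into one instance with Consumer.andThen. The following is only a sketch: Ctx is a hypothetical stand-in for HandlerContext so that the example runs without the Importer library, and the step names are illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ParseConsumerSketch {

    // Hypothetical stand-in for HandlerContext; the real class is not reproduced here.
    static final class Ctx {
        final List<String> log = new ArrayList<>();
    }

    // Builds a single Consumer that runs two steps in order via Consumer.andThen.
    static Consumer<Ctx> chain() {
        Consumer<Ctx> tagStep = ctx -> ctx.log.add("tag");
        Consumer<Ctx> filterStep = ctx -> ctx.log.add("filter");
        return tagStep.andThen(filterStep);
    }

    public static void main(String[] args) {
        Ctx ctx = new Ctx();
        chain().accept(ctx);
        System.out.println(ctx.log); // prints [tag, filter]
    }
}
```

        A Consumer composed this way could presumably be supplied wherever a single Consumer<HandlerContext> is expected, such as the setPreParseConsumer(Consumer) method named in the deprecation notes.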
      • setPreParseHandlers

        @Deprecated
        public void setPreParseHandlers(List<IImporterHandler> preParseHandlers)
        Deprecated.
        Since 3.0.0, use setPreParseConsumer(Consumer) instead
        Sets importer handlers to be executed on documents before they are parsed.
        Parameters:
        preParseHandlers - list of importer handlers
      • setPostParseHandlers

        @Deprecated
        public void setPostParseHandlers(List<IImporterHandler> postParseHandlers)
        Deprecated.
        Since 3.0.0, use setPostParseConsumer(Consumer) instead
        Sets importer handlers to be executed on documents after they are parsed.
        Parameters:
        postParseHandlers - list of importer handlers
      • getTempDir

        public Path getTempDir()

        Gets the temporary directory, whose files can safely be deleted by the OS or any other process when the Importer is not running. When not set, the importer uses the system temporary directory.

        This is only used when the Importer is launched directly from the command line or when importing documents via Importer.importDocument(ImporterRequest). Documents imported via Importer.importDocument(Doc) already have their temp/cache directory built in.

        Returns:
        path to temporary directory
      • setTempDir

        public void setTempDir(Path tempDir)

        Sets the temporary directory, whose files can safely be deleted by the OS or any other process when the Importer is not running. When not set, the importer uses the system temporary directory.

        This is only used when the Importer is launched directly from the command line or when importing documents via Importer.importDocument(ImporterRequest). Documents imported via Importer.importDocument(Doc) already have their temp/cache directory built in.

        Parameters:
        tempDir - path to temporary directory
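
        The "use the system temporary directory when not set" rule can be pictured as a simple null fallback. This is a sketch under assumptions, not the library's code: resolveTempDir is a hypothetical helper, and it assumes the system temporary directory is the one named by the java.io.tmpdir system property.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class TempDirSketch {

    // Hypothetical helper: returns the configured directory, or the JVM's
    // temporary directory (java.io.tmpdir) when none has been set.
    static Path resolveTempDir(Path configured) {
        if (configured != null) {
            return configured;
        }
        return Paths.get(System.getProperty("java.io.tmpdir"));
    }

    public static void main(String[] args) {
        // No directory configured: falls back to the system temp directory.
        System.out.println(resolveTempDir(null));
        // Directory configured: returned unchanged.
        System.out.println(resolveTempDir(Paths.get("importer-tmp")));
    }
}
```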
      • getMaxMemoryInstance

        public long getMaxMemoryInstance()

        Gets the maximum number of bytes used for memory caching of a single document being processed. Default is DEFAULT_MAX_MEM_INSTANCE.

        This is only used when the Importer is launched directly from the command line or when importing documents via Importer.importDocument(ImporterRequest). Documents imported via Importer.importDocument(Doc) already have their memory settings built in.

        Returns:
        max document memory cache size
        Since:
        3.0.0
      • setMaxMemoryInstance

        public void setMaxMemoryInstance(long maxMemoryInstance)

        Sets the maximum number of bytes used for memory caching of a single document being processed.

        This is only used when the Importer is launched directly from the command line or when importing documents via Importer.importDocument(ImporterRequest). Documents imported via Importer.importDocument(Doc) already have their memory settings built in.

        Parameters:
        maxMemoryInstance - max document memory cache size
        Since:
        3.0.0
      • getMaxMemoryPool

        public long getMaxMemoryPool()

        Gets the maximum number of bytes used for memory caching of data for all documents concurrently being processed. Default is DEFAULT_MAX_MEM_POOL.

        This is only used when the Importer is launched directly from the command line or when importing documents via Importer.importDocument(ImporterRequest). Documents imported via Importer.importDocument(Doc) already have their memory settings built in.

        Returns:
        max documents memory pool cache size
        Since:
        3.0.0
      • setMaxMemoryPool

        public void setMaxMemoryPool(long maxMemoryPool)

        Sets the maximum number of bytes used for memory caching of data for all documents concurrently being processed.

        This is only used when the Importer is launched directly from the command line or when importing documents via Importer.importDocument(ImporterRequest). Documents imported via Importer.importDocument(Doc) already have their memory settings built in.

        Parameters:
        maxMemoryPool - max documents memory pool cache size
        Since:
        3.0.0
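
        Both memory limits are expressed in bytes, so it can help to work out how the per-document limit relates to the pool limit. The sketch below is rough arithmetic only. It assumes binary units (1 MB = 1024 * 1024 bytes), which this page does not confirm, and fullyCachedDocs is a hypothetical helper, not part of the Importer API.

```java
public class MemoryLimitsSketch {

    // Assumed binary units; the library's own unit convention is not stated on this page.
    static final long INSTANCE_BYTES = 100L * 1024 * 1024;   // "100 MB" per document
    static final long POOL_BYTES = 1024L * 1024 * 1024;      // "1 GB" across all documents

    // Hypothetical helper: how many documents could each use their full
    // per-instance budget at the same time without exceeding the pool budget.
    static long fullyCachedDocs(long poolBytes, long instanceBytes) {
        if (instanceBytes <= 0) {
            throw new IllegalArgumentException("instanceBytes must be positive");
        }
        return poolBytes / instanceBytes;
    }

    public static void main(String[] args) {
        System.out.println(INSTANCE_BYTES);                              // prints 104857600
        System.out.println(POOL_BYTES);                                  // prints 1073741824
        System.out.println(fullyCachedDocs(POOL_BYTES, INSTANCE_BYTES)); // prints 10
    }
}
```

        With decimal units (1 MB = 1,000,000 bytes) the ratio is also 10, so the assumption about units does not change the result for the documented defaults.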
      • setMaxFileCacheSize

        @Deprecated
        public void setMaxFileCacheSize(long maxFileCacheSize)
        Deprecated.
        Since 3.0.0, use setMaxMemoryInstance(long).
        Parameters:
        maxFileCacheSize - byte amount
      • getMaxFilePoolCacheSize

        @Deprecated
        public long getMaxFilePoolCacheSize()
        Deprecated.
        Since 3.0.0, use getMaxMemoryPool().
        Returns:
        byte amount
      • setMaxFilePoolCacheSize

        @Deprecated
        public void setMaxFilePoolCacheSize(long maxFilePoolCacheSize)
        Deprecated.
        Since 3.0.0, use setMaxMemoryPool(long).
        Parameters:
        maxFilePoolCacheSize - byte amount
      • hashCode

        public int hashCode()
        Overrides:
        hashCode in class Object