public class ImporterConfig extends Object implements IXMLConfigurable
Modifier and Type | Field and Description |
---|---|
static long | DEFAULT_MAX_MEM_INSTANCE: 100 MB. |
static long | DEFAULT_MAX_MEM_POOL: 1 GB. |
static String | DEFAULT_TEMP_DIR_PATH |
Constructor and Description |
---|
ImporterConfig() |
Modifier and Type | Method and Description |
---|---|
boolean | equals(Object other) |
long | getMaxFileCacheSize(): Deprecated. Since 3.0.0, use getMaxMemoryInstance(). |
long | getMaxFilePoolCacheSize(): Deprecated. Since 3.0.0, use getMaxMemoryPool(). |
long | getMaxMemoryInstance(): Gets the maximum number of bytes used for memory caching of a single document being processed. |
long | getMaxMemoryPool(): Gets the maximum number of bytes used for memory caching of data for all documents concurrently being processed. |
Path | getParseErrorsSaveDir(): Gets the directory where files generating parsing errors will be saved. |
IDocumentParserFactory | getParserFactory() |
Consumer<HandlerContext> | getPostParseConsumer(): Gets the Consumer to be executed on documents after their parsing has occurred. |
List<IImporterHandler> | getPostParseHandlers(): Deprecated. Since 3.0.0, use getPostParseConsumer() instead. |
Consumer<HandlerContext> | getPreParseConsumer(): Gets the Consumer to be executed on documents before their parsing has occurred. |
List<IImporterHandler> | getPreParseHandlers(): Deprecated. Since 3.0.0, use getPreParseConsumer() instead. |
List<IImporterResponseProcessor> | getResponseProcessors() |
Path | getTempDir(): Gets the temporary directory where files can be deleted safely by the OS or any other processes when the Importer is not running. |
int | hashCode() |
void | loadFromXML(XML xml) |
void | saveToXML(XML xml) |
void | setMaxFileCacheSize(long maxFileCacheSize): Deprecated. Since 3.0.0, use setMaxMemoryInstance(long). |
void | setMaxFilePoolCacheSize(long maxFilePoolCacheSize): Deprecated. Since 3.0.0, use setMaxMemoryPool(long). |
void | setMaxMemoryInstance(long maxMemoryInstance): Sets the maximum number of bytes used for memory caching of a single document being processed. |
void | setMaxMemoryPool(long maxMemoryPool): Sets the maximum number of bytes used for memory caching of data for all documents concurrently being processed. |
void | setParseErrorsSaveDir(Path parseErrorsSaveDir): Sets the directory where files generating parsing errors will be saved. |
void | setParserFactory(IDocumentParserFactory parserFactory) |
void | setPostParseConsumer(Consumer<HandlerContext> consumer): Sets the Consumer to be executed on documents after their parsing has occurred. |
void | setPostParseHandlers(List<IImporterHandler> postParseHandlers): Deprecated. Since 3.0.0, use setPostParseConsumer(Consumer) instead. |
void | setPreParseConsumer(Consumer<HandlerContext> consumer): Sets the Consumer to be executed on documents before their parsing has occurred. |
void | setPreParseHandlers(List<IImporterHandler> preParseHandlers): Deprecated. Since 3.0.0, use setPreParseConsumer(Consumer) instead. |
void | setResponseProcessors(List<IImporterResponseProcessor> responseProcessors) |
void | setTempDir(Path tempDir): Sets the temporary directory where files can be deleted safely by the OS or any other processes when the Importer is not running. |
String | toString() |
public static final String DEFAULT_TEMP_DIR_PATH
public static final long DEFAULT_MAX_MEM_INSTANCE
public static final long DEFAULT_MAX_MEM_POOL
public IDocumentParserFactory getParserFactory()
public void setParserFactory(IDocumentParserFactory parserFactory)
public Path getParseErrorsSaveDir()
Gets the directory where files generating parsing errors will be saved. Default is null (not storing errors).
public void setParseErrorsSaveDir(Path parseErrorsSaveDir)
Sets the directory where files generating parsing errors will be saved.
parseErrorsSaveDir - directory where to save error files
public Consumer<HandlerContext> getPreParseConsumer()
Gets the Consumer to be executed on documents before their parsing has occurred.
public void setPreParseConsumer(Consumer<HandlerContext> consumer)
Sets the Consumer to be executed on documents before their parsing has occurred. The consumer will automatically be created when relying on XML configuration of handlers (IImporterHandler). XML configuration also offers extra XML tags to create basic "flow" for handler execution.
To programmatically set multiple consumers or take advantage of the many configurable IImporterHandler instances instead, you can use FunctionUtil.allConsumers(Consumer...) or HandlerConsumer.fromHandlers(IImporterHandler...) respectively to create a consumer.
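The idea of combining several consumers into one can be illustrated with plain java.util.function.Consumer chaining. The sketch below is not the library's FunctionUtil.allConsumers(Consumer...) implementation; it only assumes that such a helper runs each consumer in order on the same context. The Ctx class and the all(...) helper are hypothetical stand-ins introduced for illustration, so the example compiles without the Importer on the classpath.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class ConsumerChainSketch {

    // Hypothetical stand-in for HandlerContext (the real class is not reproduced here).
    static final class Ctx {
        final List<String> log = new ArrayList<>();
    }

    // Assumed behavior: run every non-null consumer, in order, on the same object.
    @SafeVarargs
    static <T> Consumer<T> all(Consumer<T>... consumers) {
        Consumer<T> chain = t -> { };
        for (Consumer<T> c : consumers) {
            if (c != null) {
                chain = chain.andThen(c);
            }
        }
        return chain;
    }

    // Builds two illustrative consumers, chains them, and returns what they recorded.
    static List<String> run() {
        Consumer<Ctx> tagger = ctx -> ctx.log.add("tag");
        Consumer<Ctx> filter = ctx -> ctx.log.add("filter");
        Ctx ctx = new Ctx();
        all(tagger, filter).accept(ctx);
        return ctx.log;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [tag, filter]
    }
}
```

With the real API, the single consumer produced this way would be the value passed to setPreParseConsumer(Consumer) or setPostParseConsumer(Consumer).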
consumer - the document consumer
public Consumer<HandlerContext> getPostParseConsumer()
Gets the Consumer to be executed on documents after their parsing has occurred.
public void setPostParseConsumer(Consumer<HandlerContext> consumer)
Sets the Consumer to be executed on documents after their parsing has occurred. The consumer will automatically be created when relying on XML configuration of handlers (IImporterHandler). XML configuration also offers extra XML tags to create basic "flow" for handler execution.
To programmatically set multiple consumers or take advantage of the many configurable IImporterHandler instances instead, you can use FunctionUtil.allConsumers(Consumer...) or HandlerConsumer.fromHandlers(IImporterHandler...) respectively to create a consumer.
consumer - the document consumer
@Deprecated public List<IImporterHandler> getPreParseHandlers()
Deprecated. Since 3.0.0, use getPreParseConsumer() instead.
@Deprecated public void setPreParseHandlers(List<IImporterHandler> preParseHandlers)
Deprecated. Since 3.0.0, use setPreParseConsumer(Consumer) instead.
preParseHandlers - list of importer handlers
@Deprecated public List<IImporterHandler> getPostParseHandlers()
Deprecated. Since 3.0.0, use getPostParseConsumer() instead.
@Deprecated public void setPostParseHandlers(List<IImporterHandler> postParseHandlers)
Deprecated. Since 3.0.0, use setPostParseConsumer(Consumer) instead.
postParseHandlers - list of importer handlers
public List<IImporterResponseProcessor> getResponseProcessors()
public void setResponseProcessors(List<IImporterResponseProcessor> responseProcessors)
public Path getTempDir()
Gets the temporary directory where files can be deleted safely by the OS or any other processes when the Importer is not running. When not set, the importer will use the system temporary directory.
This is only used when the Importer is launched directly from the command line or when importing documents via Importer.importDocument(ImporterRequest). Documents imported via Importer.importDocument(Doc) already have their temp/cache directory built in.
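The fallback described above ("when not set, the importer will use the system temporary directory") can be pictured with the short sketch below. It is an illustration only: the effectiveTempDir helper is hypothetical, and resolving the system temporary directory through the java.io.tmpdir system property is an assumption, not a statement about how the Importer does it internally.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class TempDirFallbackSketch {

    // Hypothetical helper: returns the configured directory, or the JVM's
    // system temporary directory when no directory was configured.
    static Path effectiveTempDir(Path configured) {
        if (configured != null) {
            return configured;
        }
        // Assumption: "system temporary directory" means the java.io.tmpdir property.
        return Paths.get(System.getProperty("java.io.tmpdir"));
    }

    public static void main(String[] args) {
        System.out.println(effectiveTempDir(null));
        System.out.println(effectiveTempDir(Paths.get("importer-tmp")));
    }
}
```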
public void setTempDir(Path tempDir)
Sets the temporary directory where files can be deleted safely by the OS or any other processes when the Importer is not running. When not set, the importer will use the system temporary directory.
This is only used when the Importer is launched directly from the command line or when importing documents via Importer.importDocument(ImporterRequest). Documents imported via Importer.importDocument(Doc) already have their temp/cache directory built in.
tempDir - path to temporary directory
public long getMaxMemoryInstance()
Gets the maximum number of bytes used for memory caching of a single document being processed. Default is DEFAULT_MAX_MEM_INSTANCE.
This is only used when the Importer is launched directly from the command line or when importing documents via Importer.importDocument(ImporterRequest). Documents imported via Importer.importDocument(Doc) already have their memory settings built in.
public void setMaxMemoryInstance(long maxMemoryInstance)
Sets the maximum number of bytes used for memory caching of a single document being processed.
This is only used when the Importer is launched directly from the command line or when importing documents via Importer.importDocument(ImporterRequest). Documents imported via Importer.importDocument(Doc) already have their memory settings built in.
maxMemoryInstance - max document memory cache size
public long getMaxMemoryPool()
Gets the maximum number of bytes used for memory caching of data for all documents concurrently being processed. Default is DEFAULT_MAX_MEM_POOL.
This is only used when the Importer is launched directly from the command line or when importing documents via Importer.importDocument(ImporterRequest). Documents imported via Importer.importDocument(Doc) already have their memory settings built in.
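Because the memory limits are expressed as raw byte counts in a long, values are usually computed rather than typed out. The sketch below shows one way to do that arithmetic safely. The helper names are hypothetical, and this page does not say whether "100 MB" and "1 GB" are meant as decimal or binary units, so both conventions are shown.

```java
class MemoryLimitSketch {

    // Binary units (1 MiB = 1024 * 1024 bytes), computed with long arithmetic.
    static long mebibytes(long n) {
        return n * 1024L * 1024L;
    }

    static long gibibytes(long n) {
        return n * 1024L * 1024L * 1024L;
    }

    // Decimal units (1 MB = 1,000,000 bytes).
    static long megabytes(long n) {
        return n * 1_000_000L;
    }

    // Pitfall: all operands are int, so the product overflows before it is
    // widened to long. This returns -2147483648, not 2147483648.
    static long overflowedTwoGiB() {
        return 2 * 1024 * 1024 * 1024;
    }

    public static void main(String[] args) {
        System.out.println(mebibytes(100));      // 104857600
        System.out.println(megabytes(100));      // 100000000
        System.out.println(gibibytes(1));        // 1073741824
        System.out.println(overflowedTwoGiB());  // -2147483648
    }
}
```

A value computed this way would be the argument passed to setMaxMemoryInstance(long) or setMaxMemoryPool(long).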
public void setMaxMemoryPool(long maxMemoryPool)
Sets the maximum number of bytes used for memory caching of data for all documents concurrently being processed.
This is only used when the Importer is launched directly from the command line or when importing documents via Importer.importDocument(ImporterRequest). Documents imported via Importer.importDocument(Doc) already have their memory settings built in.
maxMemoryPool - max documents memory pool cache size
@Deprecated public long getMaxFileCacheSize()
Deprecated. Since 3.0.0, use getMaxMemoryInstance().
@Deprecated public void setMaxFileCacheSize(long maxFileCacheSize)
Deprecated. Since 3.0.0, use setMaxMemoryInstance(long).
maxFileCacheSize - byte amount
@Deprecated public long getMaxFilePoolCacheSize()
Deprecated. Since 3.0.0, use getMaxMemoryPool().
@Deprecated public void setMaxFilePoolCacheSize(long maxFilePoolCacheSize)
Deprecated. Since 3.0.0, use setMaxMemoryPool(long).
maxFilePoolCacheSize - byte amount
public void loadFromXML(XML xml)
loadFromXML in interface IXMLConfigurable
public void saveToXML(XML xml)
saveToXML in interface IXMLConfigurable
Copyright © 2009–2023 Norconex Inc. All rights reserved.