
A

AbstractDocumentChecksummer - Class in com.norconex.collector.core.checksum
Abstract implementation of IDocumentChecksummer giving the option to keep the generated checksum in a metadata field.
AbstractDocumentChecksummer() - Constructor for class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
 
AbstractMetadataChecksummer - Class in com.norconex.collector.core.checksum
Abstract implementation of IMetadataChecksummer giving the option to keep the generated checksum.
AbstractMetadataChecksummer() - Constructor for class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
 
AbstractPipelineContext - Class in com.norconex.collector.core.pipeline
Base IPipelineStage context for collector Pipelines.
AbstractPipelineContext(Crawler) - Constructor for class com.norconex.collector.core.pipeline.AbstractPipelineContext
Constructor.
AbstractSubCommand - Class in com.norconex.collector.core.cmdline
Base class for subcommands.
AbstractSubCommand() - Constructor for class com.norconex.collector.core.cmdline.AbstractSubCommand
 
accept(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
 
accept(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
 
accept(Event) - Method in class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
 
accept(Event) - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
 
acceptDocument(Doc) - Method in interface com.norconex.collector.core.filter.IDocumentFilter
Whether to accept a document.
acceptDocument(Doc) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 
acceptDocument(Doc) - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
 
acceptDocument(Doc) - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
 
acceptDocument(Doc) - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
acceptDocument(Doc) - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
Deprecated.
 
acceptMetadata(String, Properties) - Method in interface com.norconex.collector.core.filter.IMetadataFilter
Whether to accept the metadata.
acceptMetadata(String, Properties) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 
acceptMetadata(String, Properties) - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
 
acceptMetadata(String, Properties) - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
 
acceptMetadata(String, Properties) - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
acceptMetadata(String, Properties) - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
Deprecated.
 
acceptReference(String) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 
acceptReference(String) - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
 
acceptReference(String) - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
Deprecated.
 
acceptReference(String) - Method in interface com.norconex.collector.core.filter.IReferenceFilter
Whether to accept this reference.
ACTIVE - com.norconex.collector.core.doc.CrawlDocInfo.Stage
 
addEventListeners(IEventListener<?>...) - Method in class com.norconex.collector.core.CollectorConfig
Adds event listeners.
addEventListeners(IEventListener<?>...) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Adds event listeners.
addEventListeners(List<IEventListener<?>>) - Method in class com.norconex.collector.core.CollectorConfig
Adds event listeners.
addEventListeners(List<IEventListener<?>>) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Adds event listeners.
addMapping(CrawlState, SpoiledReferenceStrategy) - Method in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
 
afterCrawlerExecution() - Method in class com.norconex.collector.core.crawler.Crawler
Gives crawler implementations a chance to do something right after the crawler is done processing its last reference, before all resources are shut down.
ALL - com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener.OnMultiple
Stop the crawler when all of the matching event counts have reached the maximum.
ANY - com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener.OnMultiple
Stop the crawler when any of the matching event counts reaches the specified maximum.
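The ALL and ANY values above differ only in how several matching event counts are combined. The following is a minimal sketch of that distinction using only the JDK. The enum, the shouldStop method, and the event names placed in the map are hypothetical stand-ins chosen for illustration; they are not the actual StopCrawlerOnMaxEventListener implementation, whose internals this index does not describe.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: not part of com.norconex.collector.core.
class OnMultipleSketch {

    // Mirrors the two documented values; the real enum may define more.
    enum OnMultiple { ANY, ALL }

    // Decides whether to stop, given one counter per matching event name.
    static boolean shouldStop(
            Map<String, Long> counts, long maximum, OnMultiple mode) {
        if (counts.isEmpty()) {
            return false; // nothing matched yet, so nothing can trigger a stop
        }
        if (mode == OnMultiple.ALL) {
            // ALL: every matching counter must have reached the maximum
            return counts.values().stream().allMatch(c -> c >= maximum);
        }
        // ANY: a single counter reaching the maximum is enough
        return counts.values().stream().anyMatch(c -> c >= maximum);
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("DOCUMENT_COMMITTED_UPSERT", 100L);
        counts.put("DOCUMENT_COMMITTED_DELETE", 40L);

        System.out.println(shouldStop(counts, 100, OnMultiple.ANY)); // true
        System.out.println(shouldStop(counts, 100, OnMultiple.ALL)); // false
    }
}
```

With a maximum of 100, ANY triggers as soon as one counter reaches 100, while ALL waits until the second counter catches up.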

B

BAD_STATUS - Static variable in class com.norconex.collector.core.doc.CrawlState
 
beforeCrawlerExecution(boolean) - Method in class com.norconex.collector.core.crawler.Crawler
Gives crawler implementations a chance to prepare before execution starts. Invoked right after the CrawlerEvent.CRAWLER_RUN_BEGIN event is fired.
beforeFinalizeDocumentProcessing(CrawlDoc) - Method in class com.norconex.collector.core.crawler.Crawler
Gives implementors a chance to take action on a document before its processing is finalized (cycle end-of-life for a crawled reference).
build() - Method in class com.norconex.collector.core.CollectorEvent.Builder
 
build() - Method in class com.norconex.collector.core.crawler.CrawlerEvent.Builder
 
Builder(String, Collector) - Constructor for class com.norconex.collector.core.CollectorEvent.Builder
 
Builder(String, Crawler) - Constructor for class com.norconex.collector.core.crawler.CrawlerEvent.Builder
 

C

call() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
 
call() - Method in class com.norconex.collector.core.cmdline.CollectorCommand
 
CHECKSUM_DOC - Static variable in class com.norconex.collector.core.doc.CrawlDocMetadata
 
CHECKSUM_METADATA - Static variable in class com.norconex.collector.core.doc.CrawlDocMetadata
 
checksumMD5(InputStream) - Static method in class com.norconex.collector.core.checksum.ChecksumUtil
 
checksumMD5(String) - Static method in class com.norconex.collector.core.checksum.ChecksumUtil
 
ChecksumStageUtil - Class in com.norconex.collector.core.pipeline
Checksum stage utility methods.
ChecksumUtil - Class in com.norconex.collector.core.checksum
Checksum utility methods.
clean() - Method in class com.norconex.collector.core.Collector
 
clean() - Method in class com.norconex.collector.core.crawler.Crawler
 
clean() - Method in class com.norconex.collector.core.crawler.CrawlerCommitterService
 
clean() - Method in interface com.norconex.collector.core.store.IDataStoreEngine
 
clean() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
clean() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
 
clean() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
 
CleanCommand - Class in com.norconex.collector.core.cmdline
Clean the Collector crawling history.
CleanCommand() - Constructor for class com.norconex.collector.core.cmdline.CleanCommand
 
clear() - Method in interface com.norconex.collector.core.store.IDataStore
 
clear() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
 
clear() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
 
clear() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
 
clearEventListeners() - Method in class com.norconex.collector.core.CollectorConfig
Clears all event listeners.
clearEventListeners() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Clears all event listeners.
close() - Method in class com.norconex.collector.core.crawler.CrawlerCommitterService
 
close() - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
 
close() - Method in interface com.norconex.collector.core.store.IDataStore
 
close() - Method in interface com.norconex.collector.core.store.IDataStoreEngine
 
close() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
 
close() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
close() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
 
close() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
 
close() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
 
close() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
 
Collector - Class in com.norconex.collector.core
Base implementation of a Collector.
Collector(CollectorConfig) - Constructor for class com.norconex.collector.core.Collector
Creates and configures a Collector with the provided configuration.
Collector(CollectorConfig, EventManager) - Constructor for class com.norconex.collector.core.Collector
Creates and configures a Collector with the provided configuration.
COLLECTOR_CLEAN_BEGIN - Static variable in class com.norconex.collector.core.CollectorEvent
 
COLLECTOR_CLEAN_END - Static variable in class com.norconex.collector.core.CollectorEvent
 
COLLECTOR_ERROR - Static variable in class com.norconex.collector.core.CollectorEvent
 
COLLECTOR_RUN_BEGIN - Static variable in class com.norconex.collector.core.CollectorEvent
 
COLLECTOR_RUN_END - Static variable in class com.norconex.collector.core.CollectorEvent
 
COLLECTOR_STOP_BEGIN - Static variable in class com.norconex.collector.core.CollectorEvent
 
COLLECTOR_STOP_END - Static variable in class com.norconex.collector.core.CollectorEvent
 
COLLECTOR_STORE_EXPORT_BEGIN - Static variable in class com.norconex.collector.core.CollectorEvent
 
COLLECTOR_STORE_EXPORT_END - Static variable in class com.norconex.collector.core.CollectorEvent
 
COLLECTOR_STORE_IMPORT_BEGIN - Static variable in class com.norconex.collector.core.CollectorEvent
 
COLLECTOR_STORE_IMPORT_END - Static variable in class com.norconex.collector.core.CollectorEvent
 
CollectorCommand - Class in com.norconex.collector.core.cmdline
Encapsulates command line arguments when running the Collector from a command prompt.
CollectorCommand(Collector) - Constructor for class com.norconex.collector.core.cmdline.CollectorCommand
 
CollectorCommandLauncher - Class in com.norconex.collector.core.cmdline
Launches a collector implementation from a string array representing command line arguments.
CollectorCommandLauncher() - Constructor for class com.norconex.collector.core.cmdline.CollectorCommandLauncher
 
CollectorConfig - Class in com.norconex.collector.core
Base Collector configuration.
CollectorConfig() - Constructor for class com.norconex.collector.core.CollectorConfig
 
CollectorConfig(Class<? extends CrawlerConfig>) - Constructor for class com.norconex.collector.core.CollectorConfig
 
CollectorEvent - Class in com.norconex.collector.core
A collector event.
CollectorEvent.Builder - Class in com.norconex.collector.core
 
CollectorException - Exception in com.norconex.collector.core
Runtime exception for most unrecoverable issues thrown by Collector classes.
CollectorException() - Constructor for exception com.norconex.collector.core.CollectorException
 
CollectorException(String) - Constructor for exception com.norconex.collector.core.CollectorException
 
CollectorException(String, Throwable) - Constructor for exception com.norconex.collector.core.CollectorException
 
CollectorException(Throwable) - Constructor for exception com.norconex.collector.core.CollectorException
 
CollectorLifeCycleListener - Class in com.norconex.collector.core
Collector event listener adapter for collector startup/shutdown.
CollectorLifeCycleListener() - Constructor for class com.norconex.collector.core.CollectorLifeCycleListener
 
CollectorStopperException - Exception in com.norconex.collector.core.stop
Exception thrown when a problem occurred while trying to stop a collector.
CollectorStopperException(String) - Constructor for exception com.norconex.collector.core.stop.CollectorStopperException
 
CollectorStopperException(String, Throwable) - Constructor for exception com.norconex.collector.core.stop.CollectorStopperException
 
CollectorStopperException(Throwable) - Constructor for exception com.norconex.collector.core.stop.CollectorStopperException
 
com.norconex.collector.core - package com.norconex.collector.core
 
com.norconex.collector.core.checksum - package com.norconex.collector.core.checksum
 
com.norconex.collector.core.checksum.impl - package com.norconex.collector.core.checksum.impl
 
com.norconex.collector.core.cmdline - package com.norconex.collector.core.cmdline
 
com.norconex.collector.core.crawler - package com.norconex.collector.core.crawler
 
com.norconex.collector.core.crawler.event.impl - package com.norconex.collector.core.crawler.event.impl
 
com.norconex.collector.core.doc - package com.norconex.collector.core.doc
 
com.norconex.collector.core.filter - package com.norconex.collector.core.filter
 
com.norconex.collector.core.filter.impl - package com.norconex.collector.core.filter.impl
 
com.norconex.collector.core.monitor - package com.norconex.collector.core.monitor
 
com.norconex.collector.core.pipeline - package com.norconex.collector.core.pipeline
 
com.norconex.collector.core.pipeline.committer - package com.norconex.collector.core.pipeline.committer
 
com.norconex.collector.core.pipeline.importer - package com.norconex.collector.core.pipeline.importer
 
com.norconex.collector.core.pipeline.queue - package com.norconex.collector.core.pipeline.queue
 
com.norconex.collector.core.spoil - package com.norconex.collector.core.spoil
 
com.norconex.collector.core.spoil.impl - package com.norconex.collector.core.spoil.impl
 
com.norconex.collector.core.stop - package com.norconex.collector.core.stop
 
com.norconex.collector.core.stop.impl - package com.norconex.collector.core.stop.impl
 
com.norconex.collector.core.store - package com.norconex.collector.core.store
 
com.norconex.collector.core.store.impl.jdbc - package com.norconex.collector.core.store.impl.jdbc
 
com.norconex.collector.core.store.impl.mongodb - package com.norconex.collector.core.store.impl.mongodb
 
com.norconex.collector.core.store.impl.mvstore - package com.norconex.collector.core.store.impl.mvstore
 
commandLine() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
 
CommitModuleStage - Class in com.norconex.collector.core.pipeline.committer
Common pipeline stage for committing documents.
CommitModuleStage() - Constructor for class com.norconex.collector.core.pipeline.committer.CommitModuleStage
 
ConfigCheckCommand - Class in com.norconex.collector.core.cmdline
Validate configuration file format and quit.
ConfigCheckCommand() - Constructor for class com.norconex.collector.core.cmdline.ConfigCheckCommand
 
ConfigRenderCommand - Class in com.norconex.collector.core.cmdline
Resolve all includes and variable substitutions and print the resulting configuration to facilitate sharing.
ConfigRenderCommand() - Constructor for class com.norconex.collector.core.cmdline.ConfigRenderCommand
 
count() - Method in interface com.norconex.collector.core.store.IDataStore
 
count() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
 
count() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
 
count() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
 
CrawlDoc - Class in com.norconex.collector.core.doc
A crawl document, which holds an additional DocInfo from cache (if any).
CrawlDoc(DocInfo, CrawlDocInfo, CachedInputStream) - Constructor for class com.norconex.collector.core.doc.CrawlDoc
 
CrawlDoc(DocInfo, CrawlDocInfo, CachedInputStream, boolean) - Constructor for class com.norconex.collector.core.doc.CrawlDoc
 
CrawlDoc(DocInfo, CachedInputStream) - Constructor for class com.norconex.collector.core.doc.CrawlDoc
 
crawlDocInfo(CrawlDocInfo) - Method in class com.norconex.collector.core.crawler.CrawlerEvent.Builder
 
CrawlDocInfo - Class in com.norconex.collector.core.doc
 
CrawlDocInfo() - Constructor for class com.norconex.collector.core.doc.CrawlDocInfo
 
CrawlDocInfo(DocInfo) - Constructor for class com.norconex.collector.core.doc.CrawlDocInfo
Copy constructor.
CrawlDocInfo(String) - Constructor for class com.norconex.collector.core.doc.CrawlDocInfo
 
CrawlDocInfo.Stage - Enum in com.norconex.collector.core.doc
 
CrawlDocInfoService - Class in com.norconex.collector.core.doc
 
CrawlDocInfoService(Crawler, Class<? extends CrawlDocInfo>) - Constructor for class com.norconex.collector.core.doc.CrawlDocInfoService
 
CrawlDocMetadata - Class in com.norconex.collector.core.doc
Metadata constants for common metadata field names typically set by a collector crawler.
Crawler - Class in com.norconex.collector.core.crawler
Abstract crawler implementation providing a common base for building crawlers.
Crawler(CrawlerConfig, Collector) - Constructor for class com.norconex.collector.core.crawler.Crawler
Constructor.
CRAWLER_CLEAN_BEGIN - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
 
CRAWLER_CLEAN_END - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
 
CRAWLER_INIT_BEGIN - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
The crawler began its initialization.
CRAWLER_INIT_END - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
The crawler has been initialized.
CRAWLER_RUN_BEGIN - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
The crawler is about to begin crawling.
CRAWLER_RUN_END - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
The crawler completed crawling execution normally (without being stopped).
CRAWLER_RUN_THREAD_BEGIN - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
The crawler just started a new crawling thread.
CRAWLER_RUN_THREAD_END - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
The crawler completed execution of a crawling thread.
CRAWLER_STOP_BEGIN - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
Issued when a request to stop the crawler has been received.
CRAWLER_STOP_END - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
Issued when a request to stop the crawler has been fully executed (crawler stopped).
Crawler.ReferenceProcessStatus - Enum in com.norconex.collector.core.crawler
 
CrawlerCommitterService - Class in com.norconex.collector.core.crawler
Wrapper around multiple Committers so they can all be handled as one.
CrawlerCommitterService(Crawler) - Constructor for class com.norconex.collector.core.crawler.CrawlerCommitterService
 
CrawlerConfig - Class in com.norconex.collector.core.crawler
Base Crawler configuration.
CrawlerConfig() - Constructor for class com.norconex.collector.core.crawler.CrawlerConfig
Creates a new crawler configuration.
CrawlerConfig.OrphansStrategy - Enum in com.norconex.collector.core.crawler
 
CrawlerConfigLoader - Class in com.norconex.collector.core.crawler
HTTP Crawler configuration loader.
CrawlerConfigLoader(Class<? extends CrawlerConfig>) - Constructor for class com.norconex.collector.core.crawler.CrawlerConfigLoader
 
CrawlerEvent - Class in com.norconex.collector.core.crawler
A crawler event.
CrawlerEvent.Builder - Class in com.norconex.collector.core.crawler
 
CrawlerLifeCycleListener - Class in com.norconex.collector.core.crawler
Listener adapter for crawler events.
CrawlerLifeCycleListener() - Constructor for class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
 
CrawlerMonitor - Class in com.norconex.collector.core.monitor
 
CrawlerMonitor(Crawler) - Constructor for class com.norconex.collector.core.monitor.CrawlerMonitor
 
CrawlerMonitorJMX - Class in com.norconex.collector.core.monitor
 
CrawlerMonitorMXBean - Interface in com.norconex.collector.core.monitor
 
CrawlState - Class in com.norconex.collector.core.doc
Reference processing status.
CrawlState(String) - Constructor for class com.norconex.collector.core.doc.CrawlState
Constructor.
createChildDocInfo(String, CrawlDocInfo) - Method in class com.norconex.collector.core.crawler.Crawler
 
createCrawler(CrawlerConfig) - Method in class com.norconex.collector.core.Collector
Creates a new crawler instance.
createDocumentChecksum(Doc) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
 
createDocumentChecksum(Doc) - Method in interface com.norconex.collector.core.checksum.IDocumentChecksummer
Creates a document checksum.
createMetadataChecksum(Properties) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
 
createMetadataChecksum(Properties) - Method in interface com.norconex.collector.core.checksum.IMetadataChecksummer
Creates a metadata checksum.
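The checksum entries in this section (checksumMD5, createDocumentChecksum, createMetadataChecksum) list signatures but not behavior. As a rough illustration of the general idea, here is a minimal JDK-only sketch that computes an MD5 hex digest of a string and compares two digests to detect a change. The class and method names are hypothetical, and the UTF-8 encoding and lowercase hex output are assumptions; the index does not state what ChecksumUtil.checksumMD5 actually does internally or returns.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration only: not the library's ChecksumUtil.
class Md5ChecksumSketch {

    // Returns the MD5 digest of the text as a lowercase hex string.
    static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            // Assumption: content is hashed as UTF-8 bytes
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Mask to an unsigned value so each byte yields two hex digits
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Two versions are considered unchanged when their checksums match.
    static boolean unchanged(String previous, String current) {
        return md5Hex(previous).equals(md5Hex(current));
    }

    public static void main(String[] args) {
        System.out.println(md5Hex("abc")); // 900150983cd24fb0d6963f7d28e17f72
        System.out.println(unchanged("same body", "same body"));   // true
        System.out.println(unchanged("old body", "edited body"));  // false
    }
}
```

Comparing a stored checksum with a freshly computed one is a common way for a crawler to skip documents that have not changed since the previous run.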

D

DataStoreException - Exception in com.norconex.collector.core.store
Crawl data store runtime exception.
DataStoreException() - Constructor for exception com.norconex.collector.core.store.DataStoreException
 
DataStoreException(String) - Constructor for exception com.norconex.collector.core.store.DataStoreException
 
DataStoreException(String, Throwable) - Constructor for exception com.norconex.collector.core.store.DataStoreException
 
DataStoreException(Throwable) - Constructor for exception com.norconex.collector.core.store.DataStoreException
 
DataStoreExporter - Exception in com.norconex.collector.core.store
Exports data stores to a format that can be imported back to the same or different store implementation.
DataStoreImporter - Exception in com.norconex.collector.core.store
Imports from a previously exported data store.
DEFAULT_FALLBACK_STRATEGY - Static variable in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
 
DEFAULT_FILENAME_PREFIX - Static variable in class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
 
DEFAULT_WORK_DIR - Static variable in class com.norconex.collector.core.CollectorConfig
Default relative directory where progress files are stored.
delete(CrawlDoc) - Method in class com.norconex.collector.core.crawler.CrawlerCommitterService
Performs a document delete operation using all accepting committers.
delete(String) - Method in interface com.norconex.collector.core.store.IDataStore
 
delete(String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
 
delete(String) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
 
delete(String) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
 
DELETE - com.norconex.collector.core.crawler.CrawlerConfig.OrphansStrategy
Deleting orphans sends them to the Committer for deletion and removes them from the internal reference cache.
DELETE - com.norconex.collector.core.spoil.SpoiledReferenceStrategy
Deleting spoiled references sends them to the Committer for deletion and removes them from the internal reference cache.
deleteCacheOrphans() - Method in class com.norconex.collector.core.crawler.Crawler
 
DELETED - Static variable in class com.norconex.collector.core.doc.CrawlState
 
deleteFirst() - Method in interface com.norconex.collector.core.store.IDataStore
 
deleteFirst() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
 
deleteFirst() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
 
deleteFirst() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
 
DeleteRejectedEventListener - Class in com.norconex.collector.core.crawler.event.impl
Provides the ability to send deletion requests to your configured committer(s) whenever a reference is rejected, regardless of whether it was encountered in a previous crawling session.
DeleteRejectedEventListener() - Constructor for class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
 
destroy() - Method in interface com.norconex.collector.core.stop.ICollectorStopper
Destroys resources allocated with this stopper.
destroy() - Method in class com.norconex.collector.core.stop.impl.FileBasedStopper
 
destroyCollector() - Method in class com.norconex.collector.core.Collector
 
destroyCrawler() - Method in class com.norconex.collector.core.crawler.Crawler
 
DocInfoPipelineContext - Class in com.norconex.collector.core.pipeline
An IPipelineStage context for collector Pipelines dealing with a CrawlDocInfo (e.g. document queuing).
DocInfoPipelineContext(Crawler, CrawlDocInfo) - Constructor for class com.norconex.collector.core.pipeline.DocInfoPipelineContext
Constructor.
doCreateDocumentChecksum(Doc) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
 
doCreateDocumentChecksum(Doc) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
 
doCreateMetaChecksum(Properties) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
 
doCreateMetaChecksum(Properties) - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
 
DOCUMENT_COMMITTED_DELETE - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
A document was submitted to a committer for removal.
DOCUMENT_COMMITTED_UPSERT - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
A document was submitted to a committer for upsert.
DOCUMENT_FETCHED - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
A document was successfully retrieved for processing.
DOCUMENT_IMPORTED - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
A document was imported.
DOCUMENT_METADATA_FETCHED - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
A document's metadata fields were successfully retrieved.
DOCUMENT_POSTIMPORTED - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
A document post-import processor was executed properly.
DOCUMENT_PREIMPORTED - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
A document pre-import processor was executed properly.
DOCUMENT_PROCESSED - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
A document was processed (successfully or not).
DOCUMENT_QUEUED - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
A document reference was queued in the data store for processing.
DOCUMENT_SAVED - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
A document was saved.
DocumentChecksumStage - Class in com.norconex.collector.core.pipeline.committer
Common pipeline stage for creating a document checksum.
DocumentChecksumStage() - Constructor for class com.norconex.collector.core.pipeline.committer.DocumentChecksumStage
 
DocumentFiltersStage - Class in com.norconex.collector.core.pipeline.importer
 
DocumentFiltersStage() - Constructor for class com.norconex.collector.core.pipeline.importer.DocumentFiltersStage
 
DocumentPipelineContext - Class in com.norconex.collector.core.pipeline
IPipelineStage context for collector Pipelines dealing with a Doc.
DocumentPipelineContext(Crawler, CrawlDoc) - Constructor for class com.norconex.collector.core.pipeline.DocumentPipelineContext
 
doExecute() - Method in class com.norconex.collector.core.crawler.Crawler
 
dropStore(String) - Method in interface com.norconex.collector.core.store.IDataStoreEngine
 
dropStore(String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
dropStore(String) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
 
dropStore(String) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
 

E

equals(Object) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
 
equals(Object) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
 
equals(Object) - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
 
equals(Object) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
 
equals(Object) - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
 
equals(Object) - Method in class com.norconex.collector.core.cmdline.CollectorCommand
 
equals(Object) - Method in class com.norconex.collector.core.cmdline.ConfigRenderCommand
 
equals(Object) - Method in class com.norconex.collector.core.cmdline.StartCommand
 
equals(Object) - Method in class com.norconex.collector.core.cmdline.StoreExportCommand
 
equals(Object) - Method in class com.norconex.collector.core.cmdline.StoreImportCommand
 
equals(Object) - Method in class com.norconex.collector.core.CollectorConfig
 
equals(Object) - Method in class com.norconex.collector.core.CollectorEvent
 
equals(Object) - Method in class com.norconex.collector.core.crawler.CrawlerCommitterService
 
equals(Object) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
 
equals(Object) - Method in class com.norconex.collector.core.crawler.CrawlerEvent
 
equals(Object) - Method in class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
 
equals(Object) - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
 
equals(Object) - Method in class com.norconex.collector.core.doc.CrawlDoc
 
equals(Object) - Method in class com.norconex.collector.core.doc.CrawlDocInfo
 
equals(Object) - Method in class com.norconex.collector.core.doc.CrawlState
 
equals(Object) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 
equals(Object) - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
 
equals(Object) - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
 
equals(Object) - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
equals(Object) - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
Deprecated.
 
equals(Object) - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
 
equals(Object) - Method in class com.norconex.collector.core.pipeline.DocInfoPipelineContext
 
equals(Object) - Method in class com.norconex.collector.core.pipeline.DocumentPipelineContext
 
equals(Object) - Method in class com.norconex.collector.core.pipeline.importer.ImporterPipelineContext
 
equals(Object) - Method in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
 
equals(Object) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
 
equals(Object) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
 
ERROR - Static variable in class com.norconex.collector.core.doc.CrawlState
 
execute(DocInfoPipelineContext) - Method in class com.norconex.collector.core.pipeline.queue.QueueReferenceStage
 
execute(DocInfoPipelineContext) - Method in class com.norconex.collector.core.pipeline.queue.ReferenceFiltersStage
 
execute(DocumentPipelineContext) - Method in class com.norconex.collector.core.pipeline.committer.CommitModuleStage
 
execute(DocumentPipelineContext) - Method in class com.norconex.collector.core.pipeline.committer.DocumentChecksumStage
 
execute(ImporterPipelineContext) - Method in class com.norconex.collector.core.pipeline.importer.DocumentFiltersStage
 
execute(ImporterPipelineContext) - Method in class com.norconex.collector.core.pipeline.importer.ImportModuleStage
 
execute(ImporterPipelineContext) - Method in class com.norconex.collector.core.pipeline.importer.SaveDocumentStage
 
executeCommitterPipeline(Crawler, CrawlDoc) - Method in class com.norconex.collector.core.crawler.Crawler
 
executeImporterPipeline(ImporterPipelineContext) - Method in class com.norconex.collector.core.crawler.Crawler
 
executeQueuePipeline(CrawlDocInfo) - Method in class com.norconex.collector.core.crawler.Crawler
 
exists(String) - Method in interface com.norconex.collector.core.store.IDataStore
 
exists(String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
 
exists(String) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
 
exists(String) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
 
exportDataStore(Crawler, Path) - Static method in exception com.norconex.collector.core.store.DataStoreExporter
 
exportDataStore(Path) - Method in class com.norconex.collector.core.Collector
 
exportDataStore(Path) - Method in class com.norconex.collector.core.crawler.Crawler
 
ExtensionReferenceFilter - Class in com.norconex.collector.core.filter.impl
Filters a reference based on a comma-separated list of extensions.
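The idea of filtering a reference against a comma-separated extension list can be sketched as follows. This is a minimal, standalone illustration and not the library's implementation: the class name `ExtensionFilterSketch`, its `matches` method, and the choice to ignore query strings and fragments are all assumptions, and the `OnMatch` behaviour accepted by the real constructors is left out.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-in for illustration; NOT the ExtensionReferenceFilter class.
final class ExtensionFilterSketch {

    private final Set<String> extensions;
    private final boolean caseSensitive;

    ExtensionFilterSketch(String commaSeparated, boolean caseSensitive) {
        this.caseSensitive = caseSensitive;
        // Split on commas, trim whitespace, and drop empty entries.
        this.extensions = Arrays.stream(commaSeparated.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .map(this::normalize)
                .collect(Collectors.toSet());
    }

    private String normalize(String s) {
        return caseSensitive ? s : s.toLowerCase(Locale.ROOT);
    }

    // Returns true when the reference ends with one of the configured extensions.
    boolean matches(String reference) {
        // Assumption: drop any query string or fragment before looking for the extension.
        String path = reference.split("[?#]", 2)[0];
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        if (dot <= slash || dot == path.length() - 1) {
            return false; // no extension in the last path segment
        }
        return extensions.contains(normalize(path.substring(dot + 1)));
    }

    public static void main(String[] args) {
        ExtensionFilterSketch filter = new ExtensionFilterSketch("pdf, html ,DOCX", false);
        System.out.println(filter.matches("https://example.com/a/report.PDF")); // true
        System.out.println(filter.matches("https://example.com/a/image.png"));  // false
    }
}
```

Whether a match means "include" or "exclude" is a separate decision, which is why the sketch only reports the match itself.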
ExtensionReferenceFilter() - Constructor for class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 
ExtensionReferenceFilter(String) - Constructor for class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 
ExtensionReferenceFilter(String, OnMatch) - Constructor for class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 
ExtensionReferenceFilter(String, OnMatch, boolean) - Constructor for class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 

F

FileBasedStopper - Class in com.norconex.collector.core.stop.impl
Listens for STOP requests using a stop file.
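A stop file lets one process signal another through the file system. The sketch below shows the general pattern only, assuming a marker file in a shared directory; the class `StopFileSketch`, the file name `stop-request.marker`, and the poll-and-delete approach are hypothetical and do not describe how FileBasedStopper is actually written.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of stop-file signalling; NOT the FileBasedStopper class.
final class StopFileSketch {

    private final Path stopFile;

    StopFileSketch(Path workDir) {
        // Assumed marker name; the real file name and location may differ.
        this.stopFile = workDir.resolve("stop-request.marker");
    }

    // Convenience factory that uses a fresh temporary directory.
    static StopFileSketch inTempDir() {
        try {
            return new StopFileSketch(Files.createTempDirectory("stop-sketch"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Called by whoever wants the running process to stop (same or different JVM).
    void requestStop() {
        try {
            Files.write(stopFile, new byte[0]);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Called periodically by the running process; consumes the request if present.
    boolean pollStopRequested() {
        try {
            return Files.deleteIfExists(stopFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StopFileSketch listener = StopFileSketch.inTempDir();
        System.out.println(listener.pollStopRequested()); // false: nothing requested yet
        listener.requestStop();
        System.out.println(listener.pollStopRequested()); // true: marker found and removed
    }
}
```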
FileBasedStopper() - Constructor for class com.norconex.collector.core.stop.impl.FileBasedStopper
 
find(String) - Method in interface com.norconex.collector.core.store.IDataStore
 
find(String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
 
find(String) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
 
find(String) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
 
findFirst() - Method in interface com.norconex.collector.core.store.IDataStore
 
findFirst() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
 
findFirst() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
 
findFirst() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
 
fire(CrawlerEvent) - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
 
fire(String, Consumer<CrawlerEvent.Builder>) - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
Fires a crawler event with the current crawler as source.
fireStopRequest() - Method in class com.norconex.collector.core.Collector
 
fireStopRequest(Collector) - Method in interface com.norconex.collector.core.stop.ICollectorStopper
Stops a currently running Collector.
fireStopRequest(Collector) - Method in class com.norconex.collector.core.stop.impl.FileBasedStopper
 
forEach(BiPredicate<String, T>) - Method in interface com.norconex.collector.core.store.IDataStore
 
forEach(BiPredicate<String, T>) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
 
forEach(BiPredicate<String, T>) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
 
forEach(BiPredicate<String, T>) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
 
forEachActive(BiPredicate<String, CrawlDocInfo>) - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
 
forEachCached(BiPredicate<String, CrawlDocInfo>) - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
 
forEachProcessed(BiPredicate<String, CrawlDocInfo>) - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
 
forEachQueued(BiPredicate<String, CrawlDocInfo>) - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
 

G

GenericMetadataChecksummer - Class in com.norconex.collector.core.checksum.impl
Generic implementation of IMetadataChecksummer that uses specified field names and their values to create a checksum.
GenericMetadataChecksummer() - Constructor for class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
 
GenericSpoiledReferenceStrategizer - Class in com.norconex.collector.core.spoil.impl
Generic implementation of ISpoiledReferenceStrategizer that offers a simple mapping between the crawl state of references that have turned "bad" and the strategy to adopt for each.
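A state-to-strategy mapping with a fallback can be pictured as a plain map lookup. The following sketch assumes string state names and a local `Strategy` enum; the class `SpoiledMappingSketch`, the `NOT_FOUND` state name, and the `DELETE` constant are assumptions made for illustration (only `ERROR`, `GRACE_ONCE`, and `IGNORE` appear in this index), and none of it is the library's code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a state-to-strategy mapping; NOT the library class.
final class SpoiledMappingSketch {

    // GRACE_ONCE and IGNORE appear in this index; DELETE is assumed here.
    enum Strategy { DELETE, GRACE_ONCE, IGNORE }

    private final Map<String, Strategy> mappings = new HashMap<>();
    private final Strategy fallback;

    SpoiledMappingSketch(Strategy fallback) {
        this.fallback = fallback;
    }

    // Associates a "bad" crawl state with the strategy to adopt for it.
    SpoiledMappingSketch map(String badState, Strategy strategy) {
        mappings.put(badState, strategy);
        return this;
    }

    // Looks up the strategy for a state, using the fallback when none is mapped.
    Strategy resolve(String badState) {
        return mappings.getOrDefault(badState, fallback);
    }

    public static void main(String[] args) {
        SpoiledMappingSketch sketch = new SpoiledMappingSketch(Strategy.DELETE)
                .map("ERROR", Strategy.GRACE_ONCE)
                .map("NOT_FOUND", Strategy.IGNORE); // hypothetical state name
        System.out.println(sketch.resolve("ERROR"));      // GRACE_ONCE
        System.out.println(sketch.resolve("SOMETHING"));  // DELETE (fallback)
    }
}
```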
GenericSpoiledReferenceStrategizer() - Constructor for class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
 
get() - Static method in class com.norconex.collector.core.Collector
 
getActiveCount() - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
 
getActiveCount() - Method in class com.norconex.collector.core.monitor.CrawlerMonitor
 
getActiveCount() - Method in interface com.norconex.collector.core.monitor.CrawlerMonitorMXBean
 
getAutoCommitBufferSize() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
 
getAutoCommitDelay() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
 
getAutoCompactFillRate() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
 
getCacheConcurrency() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
 
getCached(String) - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
 
getCachedDocInfo() - Method in class com.norconex.collector.core.doc.CrawlDoc
 
getCachedDocInfo() - Method in class com.norconex.collector.core.pipeline.DocumentPipelineContext
Gets cached crawl data.
getCacheSize() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
 
getCollector() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
 
getCollector() - Method in class com.norconex.collector.core.crawler.Crawler
 
getCollectorConfig() - Method in class com.norconex.collector.core.Collector
Gets the collector configuration.
getCommitter() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Deprecated.
getCommitters() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Gets Committers responsible for persisting information to a target location/repository.
getCommitterService() - Method in class com.norconex.collector.core.crawler.Crawler
 
getCommitterService() - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
 
getCompress() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
 
getConfig() - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
 
getConfigFile() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
 
getConfigProperties() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
getConfiguration() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
 
getConnectionString() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
 
getContent() - Method in class com.norconex.collector.core.pipeline.DocumentPipelineContext
 
getContentChecksum() - Method in class com.norconex.collector.core.doc.CrawlDocInfo
Gets the content checksum.
getContentReader() - Method in class com.norconex.collector.core.pipeline.DocumentPipelineContext
 
getCrawlDate() - Method in class com.norconex.collector.core.doc.CrawlDocInfo
Gets the crawl date.
getCrawlDocInfo() - Method in class com.norconex.collector.core.crawler.CrawlerEvent
Gets the crawl data holding contextual information about the crawled reference.
getCrawlDocInfoType() - Method in class com.norconex.collector.core.crawler.Crawler
 
getCrawler() - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
 
getCrawlerConfig() - Method in class com.norconex.collector.core.crawler.Crawler
Gets the crawler configuration.
getCrawlerConfigs() - Method in class com.norconex.collector.core.CollectorConfig
Gets crawler configurations.
getCrawlers() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
 
getCrawlers() - Method in class com.norconex.collector.core.Collector
Gets all crawler instances in this collector.
getCrawlersStartInterval() - Method in class com.norconex.collector.core.CollectorConfig
Gets the amount of time to wait between starting each concurrent crawler.
getDataStoreEngine() - Method in class com.norconex.collector.core.crawler.Crawler
 
getDataStoreEngine() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Gets the crawl data store factory.
getDeferredShutdownDuration() - Method in class com.norconex.collector.core.CollectorConfig
Gets the amount of time to defer the collector shutdown when it is done executing.
getDocInfo() - Method in class com.norconex.collector.core.doc.CrawlDoc
 
getDocInfo() - Method in class com.norconex.collector.core.pipeline.DocInfoPipelineContext
 
getDocInfo() - Method in class com.norconex.collector.core.pipeline.DocumentPipelineContext
 
getDocInfoService() - Method in class com.norconex.collector.core.crawler.Crawler
 
getDocInfoService() - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
 
getDocument() - Method in class com.norconex.collector.core.pipeline.DocumentPipelineContext
 
getDocumentChecksummer() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Gets the document checksummer.
getDocumentFilters() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Gets the document filters.
getDownloadDir() - Method in class com.norconex.collector.core.crawler.Crawler
 
getEventCounts() - Method in class com.norconex.collector.core.monitor.CrawlerMonitor
 
getEventCounts() - Method in interface com.norconex.collector.core.monitor.CrawlerMonitorMXBean
 
getEventListeners() - Method in class com.norconex.collector.core.CollectorConfig
Gets event listeners.
getEventListeners() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Gets event listeners.
getEventManager() - Method in class com.norconex.collector.core.Collector
Gets the event manager.
getEventManager() - Method in class com.norconex.collector.core.crawler.Crawler
Gets the event manager.
getEventManager() - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
 
getEventMatcher() - Method in class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
Gets the event matcher used to identify which events can trigger a deletion request.
getEventMatcher() - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
Gets the event matcher used to identify which events will be counted.
getExtensions() - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 
getFallbackStrategy() - Method in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
 
getField() - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
getFieldMatcher() - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
Gets the field matcher.
getFieldMatcher() - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
Gets the field matcher.
getFieldMatcher() - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
Gets the field matcher.
getId() - Method in class com.norconex.collector.core.Collector
Gets the collector's unique identifier.
getId() - Method in class com.norconex.collector.core.CollectorConfig
Gets this collector's unique identifier.
getId() - Method in class com.norconex.collector.core.crawler.Crawler
 
getId() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Gets this crawler's unique identifier.
getImporter() - Method in class com.norconex.collector.core.crawler.Crawler
Gets the crawler Importer module.
getImporterConfig() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Gets the Importer module configuration.
getImporterResponse() - Method in class com.norconex.collector.core.pipeline.importer.ImporterPipelineContext
 
getMaxConcurrentCrawlers() - Method in class com.norconex.collector.core.CollectorConfig
Gets the maximum number of crawlers that can be executed concurrently.
getMaxDocuments() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Gets the maximum number of documents that can be processed.
getMaximum() - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
 
getMaxMemoryInstance() - Method in class com.norconex.collector.core.CollectorConfig
 
getMaxMemoryPool() - Method in class com.norconex.collector.core.CollectorConfig
 
getMaxParallelCrawlers() - Method in class com.norconex.collector.core.CollectorConfig
Deprecated.
getMetaChecksum() - Method in class com.norconex.collector.core.doc.CrawlDocInfo
 
getMetadataChecksummer() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Gets the metadata checksummer.
getMetadataFilters() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Gets metadata filters.
getMonitor() - Method in class com.norconex.collector.core.crawler.Crawler
 
getName() - Method in interface com.norconex.collector.core.store.IDataStore
 
getName() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
 
getName() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
 
getName() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
 
getNumThreads() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Gets the maximum number of threads a crawler can use.
getOnMatch() - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 
getOnMatch() - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
 
getOnMatch() - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
 
getOnMatch() - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
getOnMatch() - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
Deprecated.
 
getOnMultiple() - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
 
getOnSet() - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
Gets the property setter to use when a value is set.
getOnSet() - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
Gets the property setter to use when a value is set.
getOrphansStrategy() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Gets the strategy to adopt when there are orphans.
getPageSplitSize() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
Gets the maximum memory page size, in bytes, before the page is split.
getParentRootReference() - Method in class com.norconex.collector.core.doc.CrawlDocInfo
 
getProcessed(String) - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
 
getProcessedCount() - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
 
getProcessedCount() - Method in class com.norconex.collector.core.monitor.CrawlerMonitor
 
getProcessedCount() - Method in interface com.norconex.collector.core.monitor.CrawlerMonitorMXBean
 
getProcessingStage(String) - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
 
getQueueCount() - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
 
getQueuedCount() - Method in class com.norconex.collector.core.monitor.CrawlerMonitor
 
getQueuedCount() - Method in interface com.norconex.collector.core.monitor.CrawlerMonitorMXBean
 
getReferenceFilters() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Gets reference filters.
getRegex() - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
getRegex() - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
Deprecated.
 
getReleaseVersions() - Method in class com.norconex.collector.core.Collector
 
getSource() - Method in class com.norconex.collector.core.CollectorEvent
 
getSource() - Method in class com.norconex.collector.core.crawler.CrawlerEvent
 
getSourceFields() - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
Deprecated.
getSourceFields() - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
Deprecated.
getSourceFieldsRegex() - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
Deprecated.
getSourceFieldsRegex() - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
Deprecated.
getSpoiledReferenceStrategizer() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Gets the spoiled state strategy resolver.
getState() - Method in class com.norconex.collector.core.doc.CrawlDocInfo
 
getStopOnExceptions() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Gets the exceptions we want to stop the crawler on.
getStoreNames() - Method in interface com.norconex.collector.core.store.IDataStoreEngine
 
getStoreNames() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
getStoreNames() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
 
getStoreNames() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
 
getStoreType(String) - Method in interface com.norconex.collector.core.store.IDataStoreEngine
 
getStoreType(String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
getStoreType(String) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
 
getStoreType(String) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
 
getStreamFactory() - Method in class com.norconex.collector.core.Collector
 
getStreamFactory() - Method in class com.norconex.collector.core.crawler.Crawler
 
getSubject() - Method in class com.norconex.collector.core.crawler.CrawlerEvent
Gets the subject.
getTablePrefix() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
getTargetField() - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
Deprecated.
getTargetField() - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
Deprecated.
getTempDir() - Method in class com.norconex.collector.core.Collector
 
getTempDir() - Method in class com.norconex.collector.core.CollectorConfig
Gets the temporary directory where files can be deleted safely by the OS or other processes when the collector is not running.
getTempDir() - Method in class com.norconex.collector.core.crawler.Crawler
Gets the directory where most temporary files are created for the duration of a crawling session.
getTextType() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
getTimestapType() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
getToField() - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
Gets the metadata field to use to store the checksum value.
getToField() - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
Gets the metadata field to use to store the checksum value.
getValueMatcher() - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
Gets the value matcher.
getValueMatcher() - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
Gets the value matcher.
getVarcharType() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
getVariablesFile() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
 
getVersion() - Method in class com.norconex.collector.core.Collector
 
getWorkDir() - Method in class com.norconex.collector.core.Collector
 
getWorkDir() - Method in class com.norconex.collector.core.CollectorConfig
Gets the base directory where files generated during execution are stored.
getWorkDir() - Method in class com.norconex.collector.core.crawler.Crawler
Gets the directory where files needing to be persisted between crawling sessions are kept.
GRACE_ONCE - com.norconex.collector.core.spoil.SpoiledReferenceStrategy
Gracing spoiled references gives them one chance (and only one) to recover by not sending a deletion request to the Committer the first time, but doing so if the reference is still spoiled on the next crawl.
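The "one chance" rule described for GRACE_ONCE reduces to a small decision: delete only when a reference is spoiled now and was already spoiled on the previous crawl. The sketch below encodes just that rule; the class `GraceOnceSketch` and its boolean parameters are hypothetical, and how the previous state is actually tracked is not shown in this index.

```java
// Hypothetical illustration of the grace-once rule; NOT the library's implementation.
final class GraceOnceSketch {

    // Returns true when a deletion request should be sent to the Committer.
    static boolean shouldDelete(boolean spoiledNow, boolean spoiledOnPreviousCrawl) {
        if (!spoiledNow) {
            return false; // the reference is fine (or recovered): nothing to delete
        }
        // First time spoiled: grant the grace period. Still spoiled next crawl: delete.
        return spoiledOnPreviousCrawl;
    }

    public static void main(String[] args) {
        System.out.println(shouldDelete(true, false));  // false: first offence is graced
        System.out.println(shouldDelete(true, true));   // true: still spoiled on the next crawl
        System.out.println(shouldDelete(false, true));  // false: recovered
    }
}
```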

H

handleExecutionException(Exception, CommandLine, CommandLine.ParseResult) - Method in class com.norconex.collector.core.cmdline.CollectorCommand
 
handleOrphans() - Method in class com.norconex.collector.core.crawler.Crawler
 
hasCache() - Method in class com.norconex.collector.core.doc.CrawlDoc
 
hashCode() - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
 
hashCode() - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
 
hashCode() - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
 
hashCode() - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
 
hashCode() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
 
hashCode() - Method in class com.norconex.collector.core.cmdline.CollectorCommand
 
hashCode() - Method in class com.norconex.collector.core.cmdline.ConfigRenderCommand
 
hashCode() - Method in class com.norconex.collector.core.cmdline.StartCommand
 
hashCode() - Method in class com.norconex.collector.core.cmdline.StoreExportCommand
 
hashCode() - Method in class com.norconex.collector.core.cmdline.StoreImportCommand
 
hashCode() - Method in class com.norconex.collector.core.CollectorConfig
 
hashCode() - Method in class com.norconex.collector.core.CollectorEvent
 
hashCode() - Method in class com.norconex.collector.core.crawler.CrawlerCommitterService
 
hashCode() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
 
hashCode() - Method in class com.norconex.collector.core.crawler.CrawlerEvent
 
hashCode() - Method in class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
 
hashCode() - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
 
hashCode() - Method in class com.norconex.collector.core.doc.CrawlDoc
 
hashCode() - Method in class com.norconex.collector.core.doc.CrawlDocInfo
 
hashCode() - Method in class com.norconex.collector.core.doc.CrawlState
 
hashCode() - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 
hashCode() - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
 
hashCode() - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
 
hashCode() - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
hashCode() - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
Deprecated.
 
hashCode() - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
 
hashCode() - Method in class com.norconex.collector.core.pipeline.DocInfoPipelineContext
 
hashCode() - Method in class com.norconex.collector.core.pipeline.DocumentPipelineContext
 
hashCode() - Method in class com.norconex.collector.core.pipeline.importer.ImporterPipelineContext
 
hashCode() - Method in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
 
hashCode() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
 
hashCode() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
 

I

ICollectorStopper - Interface in com.norconex.collector.core.stop
Responsible for shutting down a Collector upon explicit invocation of ICollectorStopper.fireStopRequest(Collector) or when specific conditions are met.
IDataStore<T> - Interface in com.norconex.collector.core.store
 
IDataStoreEngine - Interface in com.norconex.collector.core.store
 
IDocumentChecksummer - Interface in com.norconex.collector.core.checksum
Creates a checksum representing a document.
IDocumentFilter - Interface in com.norconex.collector.core.filter
Filter a document after the document content is fetched, downloaded, or otherwise read or acquired.
IGNORE - com.norconex.collector.core.crawler.CrawlerConfig.OrphansStrategy
Ignoring orphans effectively does nothing with them (not deleted, not processed).
IGNORE - com.norconex.collector.core.spoil.SpoiledReferenceStrategy
Ignoring spoiled references does not send a deletion request to the Committer.
IMetadataChecksummer - Interface in com.norconex.collector.core.checksum
Creates a checksum representing a document based on document metadata values obtained prior to fetching that document (e.g.
IMetadataFilter - Interface in com.norconex.collector.core.filter
Filter a reference based on the metadata that could be obtained for a document, before it was fetched, downloaded, or otherwise read or acquired (e.g.
importDataStore(Crawler, Path) - Static method in exception com.norconex.collector.core.store.DataStoreImporter
 
importDataStore(Path) - Method in class com.norconex.collector.core.crawler.Crawler
 
importDataStore(List<Path>) - Method in class com.norconex.collector.core.Collector
 
ImporterPipelineContext - Class in com.norconex.collector.core.pipeline.importer
IPipelineStage context for collector Pipelines dealing with ImporterResponse.
ImporterPipelineContext(Crawler, CrawlDoc) - Constructor for class com.norconex.collector.core.pipeline.importer.ImporterPipelineContext
Constructor.
ImportModuleStage - Class in com.norconex.collector.core.pipeline.importer
Common pipeline stage for importing documents.
ImportModuleStage() - Constructor for class com.norconex.collector.core.pipeline.importer.ImportModuleStage
 
init(Crawler) - Method in interface com.norconex.collector.core.store.IDataStoreEngine
 
init(Crawler) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
init(Crawler) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
 
init(Crawler) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
 
init(CommitterContext) - Method in class com.norconex.collector.core.crawler.CrawlerCommitterService
 
initCollector() - Method in class com.norconex.collector.core.Collector
 
initCrawlDoc(CrawlDoc) - Method in class com.norconex.collector.core.crawler.Crawler
 
initCrawler() - Method in class com.norconex.collector.core.crawler.Crawler
 
IReferenceFilter - Interface in com.norconex.collector.core.filter
Filter a document based on its reference, before its properties or content gets read or otherwise acquired.
is(CrawlDocInfo.Stage) - Method in enum com.norconex.collector.core.doc.CrawlDocInfo.Stage
 
IS_CRAWL_NEW - Static variable in class com.norconex.collector.core.doc.CrawlDocMetadata
Boolean flag indicating whether a document is new to the crawler that fetched it.
isActiveEmpty() - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
 
isCaseSensitive() - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 
isCaseSensitive() - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
isCaseSensitive() - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
Deprecated.
 
isCollectorShutdown(Event) - Method in class com.norconex.collector.core.CollectorEvent
 
isCombineFieldsAndContent() - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
Gets whether we are combining the fields and content checksums.
isCrawlerShutdown() - Method in class com.norconex.collector.core.crawler.CrawlerEvent
 
isDisabled() - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
Deprecated.
Since 2.0.0, not having a checksummer defined or setting one explicitly to null effectively disables it.
isDisabled() - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
Deprecated.
Since 2.0.0, not having a checksummer defined or setting one explicitly to null effectively disables it.
isDocumentDeduplicate() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Gets whether to turn on deduplication based on document checksum.
isEmpty() - Method in class com.norconex.collector.core.crawler.CrawlerCommitterService
 
isEmpty() - Method in interface com.norconex.collector.core.store.IDataStore
 
isEmpty() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
 
isEmpty() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
 
isEmpty() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
 
isGoodState() - Method in class com.norconex.collector.core.doc.CrawlState
Returns whether a reference should be considered "good" (the corresponding document is not in a "bad" state, such as being rejected or having produced an error).
isKeep() - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
Whether to keep the document checksum value as a new field in the document metadata.
isKeep() - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
Whether to keep the metadata checksum value as a new metadata field.
isMaxDocuments() - Method in class com.norconex.collector.core.crawler.Crawler
 
isMetadataDeduplicate() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Gets whether to turn on deduplication based on metadata checksum.
isNewOrModified() - Method in class com.norconex.collector.core.doc.CrawlState
Returns whether a state indicates new or modified.
isOneOf(CrawlState...) - Method in class com.norconex.collector.core.doc.CrawlState
 
isOrphan() - Method in class com.norconex.collector.core.doc.CrawlDoc
 
ISpoiledReferenceStrategizer - Interface in com.norconex.collector.core.spoil
Decides which strategy to adopt for a given reference with a bad state.
isProcessedEmpty() - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
 
isQueueEmpty() - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
 
isQueueInitialized() - Method in class com.norconex.collector.core.crawler.Crawler
 
isRunning() - Method in class com.norconex.collector.core.Collector
 
isSkipped() - Method in class com.norconex.collector.core.doc.CrawlState
Returns whether a state indicates the document is to be skipped (CrawlState.UNMODIFIED or CrawlState.PREMATURE).
isStopped() - Method in class com.norconex.collector.core.crawler.Crawler
Whether the crawler job was stopped.

J

JdbcDataStore<T> - Class in com.norconex.collector.core.store.impl.jdbc
 
JdbcDataStoreEngine - Class in com.norconex.collector.core.store.impl.jdbc
Data store engine using a JDBC-compatible database for storing crawl data.
JdbcDataStoreEngine() - Constructor for class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 

L

launch(Collector, String[]) - Method in class com.norconex.collector.core.cmdline.CollectorCommandLauncher
 
listenForStopRequest(Collector) - Method in interface com.norconex.collector.core.stop.ICollectorStopper
Sets up and/or starts the stopper, which can be terminated by invoking stop in the same or a different JVM (see concrete implementation for details).
listenForStopRequest(Collector) - Method in class com.norconex.collector.core.stop.impl.FileBasedStopper
 
loadChecksummerFromXML(XML) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
 
loadChecksummerFromXML(XML) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
 
loadChecksummerFromXML(XML) - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
 
loadChecksummerFromXML(XML) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
 
loadCollectorConfigFromXML(XML) - Method in class com.norconex.collector.core.CollectorConfig
 
loadConfig() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
 
loadCrawlerConfig(CrawlerConfig, XML) - Method in class com.norconex.collector.core.crawler.CrawlerConfigLoader
Loads a crawler configuration, which can be either the default crawler or real crawler configuration instances (keeping defaults).
loadCrawlerConfigFromXML(XML) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
 
loadCrawlerConfigs(XML) - Method in class com.norconex.collector.core.crawler.CrawlerConfigLoader
 
loadCrawlerConfigs(File) - Method in class com.norconex.collector.core.crawler.CrawlerConfigLoader
Deprecated.
loadCrawlerConfigs(File, File) - Method in class com.norconex.collector.core.crawler.CrawlerConfigLoader
Deprecated.
loadCrawlerConfigs(Path) - Method in class com.norconex.collector.core.crawler.CrawlerConfigLoader
Loads crawler configurations.
loadCrawlerConfigs(Path, Path) - Method in class com.norconex.collector.core.crawler.CrawlerConfigLoader
Loads crawler configurations.
loadFromXML(XML) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
 
loadFromXML(XML) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
 
loadFromXML(XML) - Method in class com.norconex.collector.core.CollectorConfig
 
loadFromXML(XML) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
 
loadFromXML(XML) - Method in class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
 
loadFromXML(XML) - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
 
loadFromXML(XML) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 
loadFromXML(XML) - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
 
loadFromXML(XML) - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
 
loadFromXML(XML) - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
loadFromXML(XML) - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
Deprecated.
 
loadFromXML(XML) - Method in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
 
loadFromXML(XML) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
loadFromXML(XML) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
 
loadFromXML(XML) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
 
lock() - Method in class com.norconex.collector.core.Collector
 

M

markReferenceVariationsAsProcessed(CrawlDocInfo) - Method in class com.norconex.collector.core.crawler.Crawler
 
MAX_REACHED - com.norconex.collector.core.crawler.Crawler.ReferenceProcessStatus
 
MD5DocumentChecksummer - Class in com.norconex.collector.core.checksum.impl
Implementation of IDocumentChecksummer which returns an MD5 checksum value of the extracted document content unless one or more given source fields are specified, in which case the MD5 checksum value is constructed from those fields.
MD5DocumentChecksummer() - Constructor for class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
 
MdcUtil - Class in com.norconex.collector.core.monitor
Utility methods to simplify adding Mapped Diagnostic Context (MDC) to logging in a consistent way for crawlers and collectors, as well as offering filename-friendly versions.
metadataChecksumMD5(Properties, TextMatcher) - Static method in class com.norconex.collector.core.checksum.ChecksumUtil
 
metadataChecksumMD5(Properties, String, List<String>) - Static method in class com.norconex.collector.core.checksum.ChecksumUtil
Deprecated.
metadataChecksumPlain(Properties, TextMatcher) - Static method in class com.norconex.collector.core.checksum.ChecksumUtil
 
metadataChecksumPlain(Properties, String, List<String>) - Static method in class com.norconex.collector.core.checksum.ChecksumUtil
Deprecated.
MetadataFilter - Class in com.norconex.collector.core.filter.impl
Accepts or rejects a reference based on whether one or more metadata field values are matching.
MetadataFilter() - Constructor for class com.norconex.collector.core.filter.impl.MetadataFilter
 
MetadataFilter(TextMatcher, TextMatcher) - Constructor for class com.norconex.collector.core.filter.impl.MetadataFilter
 
MetadataFilter(TextMatcher, TextMatcher, OnMatch) - Constructor for class com.norconex.collector.core.filter.impl.MetadataFilter
 
MODIFIED - Static variable in class com.norconex.collector.core.doc.CrawlState
 
MongoDataStore<T> - Class in com.norconex.collector.core.store.impl.mongodb
 
MongoDataStoreEngine - Class in com.norconex.collector.core.store.impl.mongodb
Data store engine using MongoDB for storing crawl data.
MongoDataStoreEngine() - Constructor for class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
 
MVStoreDataStore<T> - Class in com.norconex.collector.core.store.impl.mvstore
 
MVStoreDataStoreConfig - Class in com.norconex.collector.core.store.impl.mvstore
MVStore configuration parameters.
MVStoreDataStoreConfig() - Constructor for class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
 
MVStoreDataStoreEngine - Class in com.norconex.collector.core.store.impl.mvstore
 
MVStoreDataStoreEngine() - Constructor for class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
 

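The MD5DocumentChecksummer summary above describes two modes: a checksum of the document content, or a checksum built from selected source fields. The following is a minimal, self-contained sketch of that idea using only the JDK. It is not the library's implementation: the class name Md5ChecksumSketch, the method names, and the way field values are joined before hashing are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical illustration only; not part of com.norconex.collector.core.
final class Md5ChecksumSketch {

    /** Returns the lowercase hex MD5 digest of the UTF-8 bytes of the text. */
    static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a mandatory algorithm on every Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /**
     * Hashes the content when no source fields are given; otherwise
     * hashes only the values of the named metadata fields.
     */
    static String checksum(
            String content,
            Map<String, List<String>> metadata,
            List<String> sourceFields) {
        if (sourceFields == null || sourceFields.isEmpty()) {
            return md5Hex(content);
        }
        StringBuilder sb = new StringBuilder();
        // Sort field names so the result does not depend on argument order
        // (an assumption of this sketch).
        for (String field : new TreeSet<>(sourceFields)) {
            List<String> values = metadata.get(field);
            if (values != null) {
                sb.append(field).append('=')
                  .append(String.join(",", values)).append(';');
            }
        }
        return md5Hex(sb.toString());
    }
}
```

In this sketch, two documents with different content but identical values for the chosen source fields produce the same checksum, which is the practical consequence of switching to field-based checksums.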
N

NEW - Static variable in class com.norconex.collector.core.doc.CrawlState
 
NORCONEX_ASCII - Static variable in class com.norconex.collector.core.Collector
Simple ASCII art of Norconex.
NOT_FOUND - Static variable in class com.norconex.collector.core.doc.CrawlState
 

O

OK - com.norconex.collector.core.crawler.Crawler.ReferenceProcessStatus
 
onCollectorCleanBegin(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
 
onCollectorCleanEnd(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
 
onCollectorError(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
 
onCollectorEvent(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
 
onCollectorRunBegin(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
 
onCollectorRunEnd(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
 
onCollectorShutdown(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
Triggered when a collector is ending its execution on either a CollectorEvent.COLLECTOR_ERROR, CollectorEvent.COLLECTOR_RUN_END or CollectorEvent.COLLECTOR_STOP_END event.
onCollectorStopBegin(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
 
onCollectorStopEnd(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
 
onCrawlerCleanBegin(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
 
onCrawlerCleanEnd(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
 
onCrawlerEvent(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
 
onCrawlerInitBegin(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
 
onCrawlerInitEnd(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
 
onCrawlerRunBegin(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
 
onCrawlerRunEnd(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
 
onCrawlerRunThreadBegin(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
 
onCrawlerRunThreadEnd(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
 
onCrawlerShutdown(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
Triggered when a crawler is ending its execution on either a CrawlerEvent.CRAWLER_RUN_END or CrawlerEvent.CRAWLER_STOP_END event.
onCrawlerStopBegin(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
 
onCrawlerStopEnd(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
 
open() - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
 
openStore(String, Class<? extends T>) - Method in interface com.norconex.collector.core.store.IDataStoreEngine
 
openStore(String, Class<? extends T>) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
openStore(String, Class<? extends T>) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
 
openStore(String, Class<? extends T>) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
 

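The onCollectorShutdown summary above says the callback fires on any of three ending events (error, run end, or stop end). The dispatch pattern this implies can be sketched as follows. This is a simplified stand-in, not the actual CollectorLifeCycleListener: the class LifeCycleDispatchSketch, the string event names, and the order in which callbacks are invoked are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of com.norconex.collector.core.
final class LifeCycleDispatchSketch {

    // Assumed event names, modeled on the constants named in the summary.
    static final String RUN_BEGIN = "COLLECTOR_RUN_BEGIN";
    static final String RUN_END = "COLLECTOR_RUN_END";
    static final String STOP_END = "COLLECTOR_STOP_END";
    static final String ERROR = "COLLECTOR_ERROR";

    // Events that mark the end of a collector execution.
    static final Set<String> SHUTDOWN_EVENTS = Set.of(RUN_END, STOP_END, ERROR);

    // Records which callbacks ran, in order, so the flow can be inspected.
    final List<String> calls = new ArrayList<>();

    /** Routes an event name to the generic, specific, and shutdown callbacks. */
    void accept(String eventName) {
        calls.add("onCollectorEvent");
        switch (eventName) {
            case RUN_BEGIN:
                calls.add("onCollectorRunBegin");
                break;
            case RUN_END:
                calls.add("onCollectorRunEnd");
                break;
            case STOP_END:
                calls.add("onCollectorStopEnd");
                break;
            case ERROR:
                calls.add("onCollectorError");
                break;
            default:
                break;
        }
        if (SHUTDOWN_EVENTS.contains(eventName)) {
            calls.add("onCollectorShutdown");
        }
    }
}
```

With this arrangement, cleanup logic placed in the shutdown callback runs once for each ending event without being repeated in three separate handlers.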
P

pollQueue() - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
 
PREFIX - Static variable in class com.norconex.collector.core.doc.CrawlDocMetadata
 
PREMATURE - Static variable in class com.norconex.collector.core.doc.CrawlState
For collectors that support it, this state indicates a previously crawled document is not yet ready to be re-crawled.
printErr() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
 
printErr(String) - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
 
printOut() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
 
printOut(String) - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
 
PROCESS - com.norconex.collector.core.crawler.CrawlerConfig.OrphansStrategy
Processing orphans tries to obtain and process them again, normally.
processed(CrawlDocInfo) - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
 
PROCESSED - com.norconex.collector.core.doc.CrawlDocInfo.Stage
 
processNextReference(Crawler.ProcessFlags) - Method in class com.norconex.collector.core.crawler.Crawler
 
processReferences(Crawler.ProcessFlags) - Method in class com.norconex.collector.core.crawler.Crawler
 

Q

queue(CrawlDocInfo) - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
 
QUEUE_EMPTY - com.norconex.collector.core.crawler.Crawler.ReferenceProcessStatus
 
QUEUED - com.norconex.collector.core.doc.CrawlDocInfo.Stage
 
QueueReferenceStage - Class in com.norconex.collector.core.pipeline.queue
Common pipeline stage for queuing documents.
QueueReferenceStage() - Constructor for class com.norconex.collector.core.pipeline.queue.QueueReferenceStage
Constructor.

R

ReferenceFilter - Class in com.norconex.collector.core.filter.impl
Filters URLs based on a regular expression.
ReferenceFilter() - Constructor for class com.norconex.collector.core.filter.impl.ReferenceFilter
 
ReferenceFilter(TextMatcher) - Constructor for class com.norconex.collector.core.filter.impl.ReferenceFilter
 
ReferenceFilter(TextMatcher, OnMatch) - Constructor for class com.norconex.collector.core.filter.impl.ReferenceFilter
 
ReferenceFiltersStage - Class in com.norconex.collector.core.pipeline.queue
Common pipeline stage for filtering references.
ReferenceFiltersStage() - Constructor for class com.norconex.collector.core.pipeline.queue.ReferenceFiltersStage
 
ReferenceFiltersStage(String) - Constructor for class com.norconex.collector.core.pipeline.queue.ReferenceFiltersStage
 
ReferenceFiltersStageUtil - Class in com.norconex.collector.core.pipeline.queue
Reference-filtering stage utility methods.
RegexMetadataFilter - Class in com.norconex.collector.core.filter.impl
Deprecated.
Since 2.0.0, use MetadataFilter instead.
RegexMetadataFilter() - Constructor for class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
RegexMetadataFilter(String, String) - Constructor for class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
RegexMetadataFilter(String, String, OnMatch) - Constructor for class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
RegexMetadataFilter(String, String, OnMatch, boolean) - Constructor for class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
RegexReferenceFilter - Class in com.norconex.collector.core.filter.impl
Deprecated.
Since 2.0.0, use ReferenceFilter instead.
RegexReferenceFilter() - Constructor for class com.norconex.collector.core.filter.impl.RegexReferenceFilter
Deprecated.
 
RegexReferenceFilter(String) - Constructor for class com.norconex.collector.core.filter.impl.RegexReferenceFilter
Deprecated.
 
RegexReferenceFilter(String, OnMatch) - Constructor for class com.norconex.collector.core.filter.impl.RegexReferenceFilter
Deprecated.
 
RegexReferenceFilter(String, OnMatch, boolean) - Constructor for class com.norconex.collector.core.filter.impl.RegexReferenceFilter
Deprecated.
 
register(Crawler) - Static method in class com.norconex.collector.core.monitor.CrawlerMonitorJMX
 
REJECTED - Static variable in class com.norconex.collector.core.doc.CrawlState
 
REJECTED_BAD_STATUS - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
A document was rejected because the status obtained when trying to obtain it was not accepted (e.g., 500 HTTP error code).
REJECTED_DUPLICATE - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
A document was rejected since another document with a different reference was already processed with the same digital signature (checksum).
REJECTED_ERROR - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
A document was rejected because an error occurred when processing it.
REJECTED_FILTER - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
A document was rejected by a filter.
REJECTED_IMPORT - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
A document was rejected by the Importer module.
REJECTED_NOTFOUND - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
A document was rejected because it could not be found (e.g., no longer exists at a given location).
REJECTED_PREMATURE - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
A document could not be re-crawled because it is not yet ready to be re-crawled.
REJECTED_UNMODIFIED - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
A document was rejected as it was not modified since the last time it was crawled.
renameStore(IDataStore<?>, String) - Method in interface com.norconex.collector.core.store.IDataStoreEngine
 
renameStore(IDataStore<?>, String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
renameStore(IDataStore<?>, String) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
 
renameStore(IDataStore<?>, String) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
 
reprocessCacheOrphans() - Method in class com.norconex.collector.core.crawler.Crawler
 
resolveDocumentChecksum(String, DocumentPipelineContext, Object) - Static method in class com.norconex.collector.core.pipeline.ChecksumStageUtil
 
resolveMetaChecksum(String, DocumentPipelineContext, Object) - Static method in class com.norconex.collector.core.pipeline.ChecksumStageUtil
 
resolveReferenceFilters(List<IReferenceFilter>, DocInfoPipelineContext, String) - Static method in class com.norconex.collector.core.pipeline.queue.ReferenceFiltersStageUtil
 
resolveSpoiledReferenceStrategy(String, CrawlState) - Method in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
 
resolveSpoiledReferenceStrategy(String, CrawlState) - Method in interface com.norconex.collector.core.spoil.ISpoiledReferenceStrategizer
Establish which spoiled reference strategy to adopt.
runCommand() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
 
runCommand() - Method in class com.norconex.collector.core.cmdline.CleanCommand
 
runCommand() - Method in class com.norconex.collector.core.cmdline.ConfigCheckCommand
 
runCommand() - Method in class com.norconex.collector.core.cmdline.ConfigRenderCommand
 
runCommand() - Method in class com.norconex.collector.core.cmdline.StartCommand
 
runCommand() - Method in class com.norconex.collector.core.cmdline.StopCommand
 
runCommand() - Method in class com.norconex.collector.core.cmdline.StoreExportCommand
 
runCommand() - Method in class com.norconex.collector.core.cmdline.StoreImportCommand
 

S

save(String, T) - Method in interface com.norconex.collector.core.store.IDataStore
 
save(String, T) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
 
save(String, T) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
 
save(String, T) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
 
saveChecksummerToXML(XML) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
 
saveChecksummerToXML(XML) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
 
saveChecksummerToXML(XML) - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
 
saveChecksummerToXML(XML) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
 
saveCollectorConfigToXML(XML) - Method in class com.norconex.collector.core.CollectorConfig
 
saveCrawlerConfigToXML(XML) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
 
SaveDocumentStage - Class in com.norconex.collector.core.pipeline.importer
Common pipeline stage for saving documents.
SaveDocumentStage() - Constructor for class com.norconex.collector.core.pipeline.importer.SaveDocumentStage
 
saveToXML(XML) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
 
saveToXML(XML) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
 
saveToXML(XML) - Method in class com.norconex.collector.core.CollectorConfig
 
saveToXML(XML) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
 
saveToXML(XML) - Method in class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
 
saveToXML(XML) - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
 
saveToXML(XML) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 
saveToXML(XML) - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
 
saveToXML(XML) - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
 
saveToXML(XML) - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
saveToXML(XML) - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
Deprecated.
 
saveToXML(XML) - Method in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
 
saveToXML(XML) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
saveToXML(XML) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
 
saveToXML(XML) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
 
setAutoCommitBufferSize(Long) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
 
setAutoCommitDelay(Long) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
 
setAutoCompactFillRate(Integer) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
 
setCacheConcurrency(Integer) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
 
setCacheSize(Long) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
 
setCaseSensitive(boolean) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 
setCaseSensitive(boolean) - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
setCaseSensitive(boolean) - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
Deprecated.
 
setCollectorId(String) - Static method in class com.norconex.collector.core.monitor.MdcUtil
Sets two representations of the supplied collector ID in the MDC.
setCombineFieldsAndContent(boolean) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
Sets whether to combine the fields and content checksums.
setCommitter(ICommitter) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Deprecated.
setCommitters(ICommitter...) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets Committers responsible for persisting information to a target location/repository.
setCommitters(List<ICommitter>) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets Committers responsible for persisting information to a target location/repository.
setCompress(Integer) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
 
setConfigFile(Path) - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
 
setConfigProperties(Properties) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
setConnectionString(String) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
 
setContentChecksum(String) - Method in class com.norconex.collector.core.doc.CrawlDocInfo
Sets the content checksum.
setCrawlDate(ZonedDateTime) - Method in class com.norconex.collector.core.doc.CrawlDocInfo
Sets the crawl date.
setCrawlerConfigs(CrawlerConfig...) - Method in class com.norconex.collector.core.CollectorConfig
Sets crawler configurations.
setCrawlerConfigs(List<CrawlerConfig>) - Method in class com.norconex.collector.core.CollectorConfig
Sets crawler configurations.
setCrawlerId(String) - Static method in class com.norconex.collector.core.monitor.MdcUtil
Sets two representations of the supplied crawler ID in the MDC.
setCrawlersStartInterval(Duration) - Method in class com.norconex.collector.core.CollectorConfig
Sets the amount of time to wait between the start of each concurrent crawler.
setDataStoreEngine(IDataStoreEngine) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets the crawl data store factory.
setDeferredShutdownDuration(Duration) - Method in class com.norconex.collector.core.CollectorConfig
Sets the amount of time to defer the collector shutdown when it is done executing.
setDisabled(boolean) - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
Deprecated.
Since 2.0.0, not having a checksummer defined or setting one explicitly to null effectively disables it.
setDisabled(boolean) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
Deprecated.
Since 2.0.0, not having a checksummer defined or setting one explicitly to null effectively disables it.
setDocumentChecksummer(IDocumentChecksummer) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets the document checksummer.
setDocumentDeduplicate(boolean) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets whether to turn on deduplication based on document checksum.
setDocumentFilters(IDocumentFilter...) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets document filters.
setDocumentFilters(List<IDocumentFilter>) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets document filters.
setEventListeners(IEventListener<?>...) - Method in class com.norconex.collector.core.CollectorConfig
Sets event listeners.
setEventListeners(IEventListener<?>...) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets event listeners.
setEventListeners(List<IEventListener<?>>) - Method in class com.norconex.collector.core.CollectorConfig
Sets event listeners.
setEventListeners(List<IEventListener<?>>) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets event listeners.
setEventMatcher(TextMatcher) - Method in class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
Sets the event matcher used to identify which events can trigger a deletion request.
setEventMatcher(TextMatcher) - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
Sets the event matcher used to identify which events will be counted.
setExtensions(String...) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 
setExtensions(List<String>) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 
setFallbackStrategy(SpoiledReferenceStrategy) - Method in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
 
setField(String) - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
setFieldMatcher(TextMatcher) - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
Sets the field matcher.
setFieldMatcher(TextMatcher) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
Sets the field matcher.
setFieldMatcher(TextMatcher) - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
Sets the field matcher.
setId(String) - Method in class com.norconex.collector.core.CollectorConfig
Sets this collector's unique identifier.
setId(String) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets this crawler's unique identifier.
setImporterConfig(ImporterConfig) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets the Importer module configuration.
setImporterResponse(ImporterResponse) - Method in class com.norconex.collector.core.pipeline.importer.ImporterPipelineContext
 
setKeep(boolean) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
Sets whether to keep the document checksum value as a new field in the document metadata.
setKeep(boolean) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
Sets whether to keep the metadata checksum value as a new metadata field.
setMaxConcurrentCrawlers(int) - Method in class com.norconex.collector.core.CollectorConfig
Sets the maximum number of crawlers that can be executed concurrently.
setMaxDocuments(int) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets the maximum number of documents that can be processed.
setMaximum(long) - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
 
setMaxMemoryInstance(long) - Method in class com.norconex.collector.core.CollectorConfig
 
setMaxMemoryPool(long) - Method in class com.norconex.collector.core.CollectorConfig
 
setMaxParallelCrawlers(int) - Method in class com.norconex.collector.core.CollectorConfig
Deprecated.
setMetaChecksum(String) - Method in class com.norconex.collector.core.doc.CrawlDocInfo
 
setMetadataChecksummer(IMetadataChecksummer) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets the metadata checksummer.
setMetadataDeduplicate(boolean) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets whether to turn on deduplication based on metadata checksum.
setMetadataFilters(IMetadataFilter...) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets metadata filters.
setMetadataFilters(List<IMetadataFilter>) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets metadata filters.
setNumThreads(int) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets the maximum number of threads a crawler can use.
setOnMatch(OnMatch) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 
setOnMatch(OnMatch) - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
 
setOnMatch(OnMatch) - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
 
setOnMatch(OnMatch) - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
setOnMatch(OnMatch) - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
Deprecated.
 
setOnMultiple(StopCrawlerOnMaxEventListener.OnMultiple) - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
 
setOnSet(PropertySetter) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
Sets the property setter to use when a value is set.
setOnSet(PropertySetter) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
Sets the property setter to use when a value is set.
setOrphansStrategy(CrawlerConfig.OrphansStrategy) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets the strategy to adopt when there are orphans.
setPageSplitSize(Long) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
 
setParentRootReference(String) - Method in class com.norconex.collector.core.doc.CrawlDocInfo
 
setReferenceFilters(IReferenceFilter...) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets reference filters.
setReferenceFilters(List<IReferenceFilter>) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets reference filters.
setRegex(String) - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
setRegex(String) - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
Deprecated.
 
setSourceFields(String...) - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
setSourceFields(String...) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
setSourceFields(List<String>) - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
setSourceFields(List<String>) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
setSourceFieldsRegex(String) - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
setSourceFieldsRegex(String) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
setSpoiledReferenceStrategizer(ISpoiledReferenceStrategizer) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets the spoiled state strategy resolver.
setState(CrawlState) - Method in class com.norconex.collector.core.doc.CrawlDocInfo
 
setStopOnExceptions(Class<? extends Exception>...) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets the exceptions we want to stop the crawler on.
setStopOnExceptions(List<Class<? extends Exception>>) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
Sets the exceptions we want to stop the crawler on.
setTablePrefix(String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
setTargetField(String) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
Deprecated.
setTargetField(String) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
Deprecated.
setTempDir(Path) - Method in class com.norconex.collector.core.CollectorConfig
Sets the temporary directory where files can be deleted safely by the OS or other processes when the collector is not running.
setTextType(String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
setTimestapType(String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
setToField(String) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
Sets the metadata field name to use to store the checksum value.
setToField(String) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
Sets the metadata field name to use to store the checksum value.
setValueMatcher(TextMatcher) - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
Sets the value matcher.
setValueMatcher(TextMatcher) - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
Sets the value matcher.
setVarcharType(String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
 
setVariablesFile(Path) - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
 
setWorkDir(Path) - Method in class com.norconex.collector.core.CollectorConfig
Sets the base directory where files generated during execution are stored.
SpoiledReferenceStrategy - Enum in com.norconex.collector.core.spoil
Markers indicating what to do with references that were once processed properly, but failed to get a good processing state on a subsequent attempt.
start() - Method in class com.norconex.collector.core.Collector
Starts all crawlers defined in configuration.
start() - Method in class com.norconex.collector.core.crawler.Crawler
Starts crawling.
StartCommand - Class in com.norconex.collector.core.cmdline
Start the Collector.
StartCommand() - Constructor for class com.norconex.collector.core.cmdline.StartCommand
 
stop() - Method in class com.norconex.collector.core.Collector
Stops a running instance of this Collector.
stop() - Method in class com.norconex.collector.core.crawler.Crawler
 
StopCommand - Class in com.norconex.collector.core.cmdline
Stop the Collector.
StopCommand() - Constructor for class com.norconex.collector.core.cmdline.StopCommand
 
StopCrawlerOnMaxEventListener - Class in com.norconex.collector.core.crawler.event.impl
Alternative to CrawlerConfig.setMaxDocuments(int) for stopping the crawler upon reaching specific event counts.
StopCrawlerOnMaxEventListener() - Constructor for class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
 
StopCrawlerOnMaxEventListener.OnMultiple - Enum in com.norconex.collector.core.crawler.event.impl
 
StoreExportCommand - Class in com.norconex.collector.core.cmdline
Export crawl store to specified file.
StoreExportCommand() - Constructor for class com.norconex.collector.core.cmdline.StoreExportCommand
 
StoreImportCommand - Class in com.norconex.collector.core.cmdline
Import crawl store from specified file.
StoreImportCommand() - Constructor for class com.norconex.collector.core.cmdline.StoreImportCommand
 
subject(Object) - Method in class com.norconex.collector.core.crawler.CrawlerEvent.Builder
 
SUM - com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener.OnMultiple
Stop the crawler when the sum of all matching event counts has reached the maximum.

T

toString() - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
 
toString() - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
 
toString() - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
 
toString() - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
 
toString() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
 
toString() - Method in class com.norconex.collector.core.cmdline.CollectorCommand
 
toString() - Method in class com.norconex.collector.core.cmdline.ConfigRenderCommand
 
toString() - Method in class com.norconex.collector.core.cmdline.StartCommand
 
toString() - Method in class com.norconex.collector.core.cmdline.StoreExportCommand
 
toString() - Method in class com.norconex.collector.core.cmdline.StoreImportCommand
 
toString() - Method in class com.norconex.collector.core.Collector
 
toString() - Method in class com.norconex.collector.core.CollectorConfig
 
toString() - Method in class com.norconex.collector.core.crawler.Crawler
 
toString() - Method in class com.norconex.collector.core.crawler.CrawlerCommitterService
 
toString() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
 
toString() - Method in class com.norconex.collector.core.crawler.CrawlerEvent
 
toString() - Method in class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
 
toString() - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
 
toString() - Method in class com.norconex.collector.core.doc.CrawlDoc
 
toString() - Method in class com.norconex.collector.core.doc.CrawlDocInfo
 
toString() - Method in class com.norconex.collector.core.doc.CrawlState
 
toString() - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
 
toString() - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
 
toString() - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
 
toString() - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
Deprecated.
 
toString() - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
Deprecated.
 
toString() - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
 
toString() - Method in class com.norconex.collector.core.pipeline.DocInfoPipelineContext
 
toString() - Method in class com.norconex.collector.core.pipeline.DocumentPipelineContext
 
toString() - Method in class com.norconex.collector.core.pipeline.importer.ImporterPipelineContext
 
toString() - Method in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
 
toString() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
 
toString() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
 

U

unlock() - Method in class com.norconex.collector.core.Collector
 
UNMODIFIED - Static variable in class com.norconex.collector.core.doc.CrawlState
 
unregister(Crawler) - Static method in class com.norconex.collector.core.monitor.CrawlerMonitorJMX
 
UNSUPPORTED - Static variable in class com.norconex.collector.core.doc.CrawlState
Typically when a reference cannot be processed because it is not supported by the collector or one of its configured components.
upsert(CrawlDoc) - Method in class com.norconex.collector.core.crawler.CrawlerCommitterService
Updates or inserts a document using all accepting committers.
urlToPath(String) - Static method in class com.norconex.collector.core.pipeline.importer.SaveDocumentStage
 

V

valueOf(String) - Static method in enum com.norconex.collector.core.crawler.Crawler.ReferenceProcessStatus
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum com.norconex.collector.core.crawler.CrawlerConfig.OrphansStrategy
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener.OnMultiple
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in enum com.norconex.collector.core.doc.CrawlDocInfo.Stage
Returns the enum constant of this type with the specified name.
valueOf(String) - Static method in class com.norconex.collector.core.doc.CrawlState
 
valueOf(String) - Static method in enum com.norconex.collector.core.spoil.SpoiledReferenceStrategy
Returns the enum constant of this type with the specified name.
values() - Static method in enum com.norconex.collector.core.crawler.Crawler.ReferenceProcessStatus
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum com.norconex.collector.core.crawler.CrawlerConfig.OrphansStrategy
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener.OnMultiple
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum com.norconex.collector.core.doc.CrawlDocInfo.Stage
Returns an array containing the constants of this enum type, in the order they are declared.
values() - Static method in enum com.norconex.collector.core.spoil.SpoiledReferenceStrategy
Returns an array containing the constants of this enum type, in the order they are declared.