A B C D E F G H I J L M N O P Q R S T U V
All Classes All Packages
A
- AbstractDocumentChecksummer - Class in com.norconex.collector.core.checksum
-
Abstract implementation of IDocumentChecksummer giving the option to keep the generated checksum in a metadata field.
- AbstractDocumentChecksummer() - Constructor for class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
- AbstractMetadataChecksummer - Class in com.norconex.collector.core.checksum
-
Abstract implementation of IMetadataChecksummer giving the option to keep the generated checksum.
- AbstractMetadataChecksummer() - Constructor for class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
- AbstractPipelineContext - Class in com.norconex.collector.core.pipeline
-
Base IPipelineStage context for collector Pipelines.
- AbstractPipelineContext(Crawler) - Constructor for class com.norconex.collector.core.pipeline.AbstractPipelineContext
-
Constructor.
- AbstractSubCommand - Class in com.norconex.collector.core.cmdline
-
Base class for subcommands.
- AbstractSubCommand() - Constructor for class com.norconex.collector.core.cmdline.AbstractSubCommand
- accept(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
- accept(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
- accept(Event) - Method in class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
- accept(Event) - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
- acceptDocument(Doc) - Method in interface com.norconex.collector.core.filter.IDocumentFilter
-
Whether to accept a document.
- acceptDocument(Doc) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
- acceptDocument(Doc) - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
- acceptDocument(Doc) - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
- acceptDocument(Doc) - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- acceptDocument(Doc) - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
-
Deprecated.
- acceptMetadata(String, Properties) - Method in interface com.norconex.collector.core.filter.IMetadataFilter
-
Whether to accept the metadata.
- acceptMetadata(String, Properties) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
- acceptMetadata(String, Properties) - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
- acceptMetadata(String, Properties) - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
- acceptMetadata(String, Properties) - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- acceptMetadata(String, Properties) - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
-
Deprecated.
- acceptReference(String) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
- acceptReference(String) - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
- acceptReference(String) - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
-
Deprecated.
- acceptReference(String) - Method in interface com.norconex.collector.core.filter.IReferenceFilter
-
Whether to accept this reference.
- ACTIVE - com.norconex.collector.core.doc.CrawlDocInfo.Stage
- addEventListeners(IEventListener<?>...) - Method in class com.norconex.collector.core.CollectorConfig
-
Adds event listeners.
- addEventListeners(IEventListener<?>...) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Adds event listeners.
- addEventListeners(List<IEventListener<?>>) - Method in class com.norconex.collector.core.CollectorConfig
-
Adds event listeners.
- addEventListeners(List<IEventListener<?>>) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Adds event listeners.
- addMapping(CrawlState, SpoiledReferenceStrategy) - Method in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
- afterCrawlerExecution() - Method in class com.norconex.collector.core.crawler.Crawler
-
Gives crawler implementations a chance to do something right after the crawler is done processing its last reference, before all resources are shut down.
- ALL - com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener.OnMultiple
-
Stop the crawler when all of the matching event counts have reached the maximum.
- ANY - com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener.OnMultiple
-
Stop the crawler when any of the matching event counts reaches the specified maximum.
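The difference between the ALL and ANY entries above can be illustrated with a minimal, self-contained sketch. It is not the library's StopCrawlerOnMaxEventListener: the class name `StopOnMaxSketch`, the method `shouldStop`, and the map of per-event counts are hypothetical names introduced only for this illustration.

```java
import java.util.Map;

/** Hypothetical sketch; not the library's StopCrawlerOnMaxEventListener. */
final class StopOnMaxSketch {

    /** Mirrors the two index entries ANY and ALL. */
    enum OnMultiple { ANY, ALL }

    /**
     * Decides whether to stop, given how many times each matching event
     * has been seen so far and the configured maximum.
     */
    static boolean shouldStop(
            Map<String, Long> eventCounts, long maximum, OnMultiple mode) {
        if (eventCounts.isEmpty()) {
            return false; // nothing matched yet, so there is nothing to stop on
        }
        if (mode == OnMultiple.ALL) {
            // ALL: every matching event count must have reached the maximum
            return eventCounts.values().stream().allMatch(c -> c >= maximum);
        }
        // ANY: a single matching event count reaching the maximum is enough
        return eventCounts.values().stream().anyMatch(c -> c >= maximum);
    }

    public static void main(String[] args) {
        Map<String, Long> counts = Map.of(
                "DOCUMENT_COMMITTED_UPSERT", 100L,
                "DOCUMENT_COMMITTED_DELETE", 40L);
        System.out.println(shouldStop(counts, 100, OnMultiple.ANY));
        System.out.println(shouldStop(counts, 100, OnMultiple.ALL));
    }
}
```

With one count at the maximum and the other below it, the ANY mode reports that the crawler should stop while the ALL mode does not.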
B
- BAD_STATUS - Static variable in class com.norconex.collector.core.doc.CrawlState
- beforeCrawlerExecution(boolean) - Method in class com.norconex.collector.core.crawler.Crawler
-
Gives crawler implementations a chance to prepare before execution starts. Invoked right after the CrawlerEvent.CRAWLER_RUN_BEGIN is fired.
- beforeFinalizeDocumentProcessing(CrawlDoc) - Method in class com.norconex.collector.core.crawler.Crawler
-
Gives implementors a chance to take action on a document before its processing is finalized (cycle end-of-life for a crawled reference).
- build() - Method in class com.norconex.collector.core.CollectorEvent.Builder
- build() - Method in class com.norconex.collector.core.crawler.CrawlerEvent.Builder
- Builder(String, Collector) - Constructor for class com.norconex.collector.core.CollectorEvent.Builder
- Builder(String, Crawler) - Constructor for class com.norconex.collector.core.crawler.CrawlerEvent.Builder
C
- call() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
- call() - Method in class com.norconex.collector.core.cmdline.CollectorCommand
- CHECKSUM_DOC - Static variable in class com.norconex.collector.core.doc.CrawlDocMetadata
- CHECKSUM_METADATA - Static variable in class com.norconex.collector.core.doc.CrawlDocMetadata
- checksumMD5(InputStream) - Static method in class com.norconex.collector.core.checksum.ChecksumUtil
- checksumMD5(String) - Static method in class com.norconex.collector.core.checksum.ChecksumUtil
- ChecksumStageUtil - Class in com.norconex.collector.core.pipeline
-
Checksum stage utility methods.
- ChecksumUtil - Class in com.norconex.collector.core.checksum
-
Checksum utility methods.
- clean() - Method in class com.norconex.collector.core.Collector
- clean() - Method in class com.norconex.collector.core.crawler.Crawler
- clean() - Method in class com.norconex.collector.core.crawler.CrawlerCommitterService
- clean() - Method in interface com.norconex.collector.core.store.IDataStoreEngine
- clean() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- clean() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
- clean() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
- CleanCommand - Class in com.norconex.collector.core.cmdline
-
Clean the Collector crawling history.
- CleanCommand() - Constructor for class com.norconex.collector.core.cmdline.CleanCommand
- clear() - Method in interface com.norconex.collector.core.store.IDataStore
- clear() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
- clear() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
- clear() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
- clearEventListeners() - Method in class com.norconex.collector.core.CollectorConfig
-
Clears all event listeners.
- clearEventListeners() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Clears all event listeners.
- close() - Method in class com.norconex.collector.core.crawler.CrawlerCommitterService
- close() - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
- close() - Method in interface com.norconex.collector.core.store.IDataStore
- close() - Method in interface com.norconex.collector.core.store.IDataStoreEngine
- close() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
- close() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- close() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
- close() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
- close() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
- close() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
- Collector - Class in com.norconex.collector.core
-
Base implementation of a Collector.
- Collector(CollectorConfig) - Constructor for class com.norconex.collector.core.Collector
-
Creates and configures a Collector with the provided configuration.
- Collector(CollectorConfig, EventManager) - Constructor for class com.norconex.collector.core.Collector
-
Creates and configures a Collector with the provided configuration.
- COLLECTOR_CLEAN_BEGIN - Static variable in class com.norconex.collector.core.CollectorEvent
- COLLECTOR_CLEAN_END - Static variable in class com.norconex.collector.core.CollectorEvent
- COLLECTOR_ERROR - Static variable in class com.norconex.collector.core.CollectorEvent
- COLLECTOR_RUN_BEGIN - Static variable in class com.norconex.collector.core.CollectorEvent
- COLLECTOR_RUN_END - Static variable in class com.norconex.collector.core.CollectorEvent
- COLLECTOR_STOP_BEGIN - Static variable in class com.norconex.collector.core.CollectorEvent
- COLLECTOR_STOP_END - Static variable in class com.norconex.collector.core.CollectorEvent
- COLLECTOR_STORE_EXPORT_BEGIN - Static variable in class com.norconex.collector.core.CollectorEvent
- COLLECTOR_STORE_EXPORT_END - Static variable in class com.norconex.collector.core.CollectorEvent
- COLLECTOR_STORE_IMPORT_BEGIN - Static variable in class com.norconex.collector.core.CollectorEvent
- COLLECTOR_STORE_IMPORT_END - Static variable in class com.norconex.collector.core.CollectorEvent
- CollectorCommand - Class in com.norconex.collector.core.cmdline
-
Encapsulates command line arguments when running the Collector from a command prompt.
- CollectorCommand(Collector) - Constructor for class com.norconex.collector.core.cmdline.CollectorCommand
- CollectorCommandLauncher - Class in com.norconex.collector.core.cmdline
-
Launches a collector implementation from a string array representing command line arguments.
- CollectorCommandLauncher() - Constructor for class com.norconex.collector.core.cmdline.CollectorCommandLauncher
- CollectorConfig - Class in com.norconex.collector.core
-
Base Collector configuration.
- CollectorConfig() - Constructor for class com.norconex.collector.core.CollectorConfig
- CollectorConfig(Class<? extends CrawlerConfig>) - Constructor for class com.norconex.collector.core.CollectorConfig
- CollectorEvent - Class in com.norconex.collector.core
-
A collector event.
- CollectorEvent.Builder - Class in com.norconex.collector.core
- CollectorException - Exception in com.norconex.collector.core
-
Runtime exception for most unrecoverable issues thrown by Collector classes.
- CollectorException() - Constructor for exception com.norconex.collector.core.CollectorException
- CollectorException(String) - Constructor for exception com.norconex.collector.core.CollectorException
- CollectorException(String, Throwable) - Constructor for exception com.norconex.collector.core.CollectorException
- CollectorException(Throwable) - Constructor for exception com.norconex.collector.core.CollectorException
- CollectorLifeCycleListener - Class in com.norconex.collector.core
-
Collector event listener adapter for collector startup/shutdown.
- CollectorLifeCycleListener() - Constructor for class com.norconex.collector.core.CollectorLifeCycleListener
- CollectorStopperException - Exception in com.norconex.collector.core.stop
-
Exception thrown when a problem occurred while trying to stop a collector.
- CollectorStopperException(String) - Constructor for exception com.norconex.collector.core.stop.CollectorStopperException
- CollectorStopperException(String, Throwable) - Constructor for exception com.norconex.collector.core.stop.CollectorStopperException
- CollectorStopperException(Throwable) - Constructor for exception com.norconex.collector.core.stop.CollectorStopperException
- com.norconex.collector.core - package com.norconex.collector.core
- com.norconex.collector.core.checksum - package com.norconex.collector.core.checksum
- com.norconex.collector.core.checksum.impl - package com.norconex.collector.core.checksum.impl
- com.norconex.collector.core.cmdline - package com.norconex.collector.core.cmdline
- com.norconex.collector.core.crawler - package com.norconex.collector.core.crawler
- com.norconex.collector.core.crawler.event.impl - package com.norconex.collector.core.crawler.event.impl
- com.norconex.collector.core.doc - package com.norconex.collector.core.doc
- com.norconex.collector.core.filter - package com.norconex.collector.core.filter
- com.norconex.collector.core.filter.impl - package com.norconex.collector.core.filter.impl
- com.norconex.collector.core.monitor - package com.norconex.collector.core.monitor
- com.norconex.collector.core.pipeline - package com.norconex.collector.core.pipeline
- com.norconex.collector.core.pipeline.committer - package com.norconex.collector.core.pipeline.committer
- com.norconex.collector.core.pipeline.importer - package com.norconex.collector.core.pipeline.importer
- com.norconex.collector.core.pipeline.queue - package com.norconex.collector.core.pipeline.queue
- com.norconex.collector.core.spoil - package com.norconex.collector.core.spoil
- com.norconex.collector.core.spoil.impl - package com.norconex.collector.core.spoil.impl
- com.norconex.collector.core.stop - package com.norconex.collector.core.stop
- com.norconex.collector.core.stop.impl - package com.norconex.collector.core.stop.impl
- com.norconex.collector.core.store - package com.norconex.collector.core.store
- com.norconex.collector.core.store.impl.jdbc - package com.norconex.collector.core.store.impl.jdbc
- com.norconex.collector.core.store.impl.mongodb - package com.norconex.collector.core.store.impl.mongodb
- com.norconex.collector.core.store.impl.mvstore - package com.norconex.collector.core.store.impl.mvstore
- commandLine() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
- CommitModuleStage - Class in com.norconex.collector.core.pipeline.committer
-
Common pipeline stage for committing documents.
- CommitModuleStage() - Constructor for class com.norconex.collector.core.pipeline.committer.CommitModuleStage
- ConfigCheckCommand - Class in com.norconex.collector.core.cmdline
-
Validate configuration file format and quit.
- ConfigCheckCommand() - Constructor for class com.norconex.collector.core.cmdline.ConfigCheckCommand
- ConfigRenderCommand - Class in com.norconex.collector.core.cmdline
-
Resolves all includes and variable substitutions and prints the resulting configuration to facilitate sharing.
- ConfigRenderCommand() - Constructor for class com.norconex.collector.core.cmdline.ConfigRenderCommand
- count() - Method in interface com.norconex.collector.core.store.IDataStore
- count() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
- count() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
- count() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
- CrawlDoc - Class in com.norconex.collector.core.doc
-
A crawl document, which holds an additional DocInfo from cache (if any).
- CrawlDoc(DocInfo, CrawlDocInfo, CachedInputStream) - Constructor for class com.norconex.collector.core.doc.CrawlDoc
- CrawlDoc(DocInfo, CrawlDocInfo, CachedInputStream, boolean) - Constructor for class com.norconex.collector.core.doc.CrawlDoc
- CrawlDoc(DocInfo, CachedInputStream) - Constructor for class com.norconex.collector.core.doc.CrawlDoc
- crawlDocInfo(CrawlDocInfo) - Method in class com.norconex.collector.core.crawler.CrawlerEvent.Builder
- CrawlDocInfo - Class in com.norconex.collector.core.doc
- CrawlDocInfo() - Constructor for class com.norconex.collector.core.doc.CrawlDocInfo
- CrawlDocInfo(DocInfo) - Constructor for class com.norconex.collector.core.doc.CrawlDocInfo
-
Copy constructor.
- CrawlDocInfo(String) - Constructor for class com.norconex.collector.core.doc.CrawlDocInfo
- CrawlDocInfo.Stage - Enum in com.norconex.collector.core.doc
- CrawlDocInfoService - Class in com.norconex.collector.core.doc
- CrawlDocInfoService(Crawler, Class<? extends CrawlDocInfo>) - Constructor for class com.norconex.collector.core.doc.CrawlDocInfoService
- CrawlDocMetadata - Class in com.norconex.collector.core.doc
-
Metadata constants for common metadata field names typically set by a collector crawler.
- Crawler - Class in com.norconex.collector.core.crawler
-
Abstract crawler implementation providing a common base to building crawlers.
- Crawler(CrawlerConfig, Collector) - Constructor for class com.norconex.collector.core.crawler.Crawler
-
Constructor.
- CRAWLER_CLEAN_BEGIN - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
- CRAWLER_CLEAN_END - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
- CRAWLER_INIT_BEGIN - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
The crawler began its initialization.
- CRAWLER_INIT_END - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
The crawler has been initialized.
- CRAWLER_RUN_BEGIN - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
The crawler is about to begin crawling.
- CRAWLER_RUN_END - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
The crawler completed crawling execution normally (without being stopped).
- CRAWLER_RUN_THREAD_BEGIN - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
The crawler just started a new crawling thread.
- CRAWLER_RUN_THREAD_END - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
The crawler completed execution of a crawling thread.
- CRAWLER_STOP_BEGIN - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
Issued when a request to stop the crawler has been received.
- CRAWLER_STOP_END - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
Issued when a request to stop the crawler has been fully executed (crawler stopped).
- Crawler.ReferenceProcessStatus - Enum in com.norconex.collector.core.crawler
- CrawlerCommitterService - Class in com.norconex.collector.core.crawler
-
Wrapper around multiple Committers so they can all be handled as one.
- CrawlerCommitterService(Crawler) - Constructor for class com.norconex.collector.core.crawler.CrawlerCommitterService
- CrawlerConfig - Class in com.norconex.collector.core.crawler
-
Base Crawler configuration.
- CrawlerConfig() - Constructor for class com.norconex.collector.core.crawler.CrawlerConfig
-
Creates a new crawler configuration.
- CrawlerConfig.OrphansStrategy - Enum in com.norconex.collector.core.crawler
- CrawlerConfigLoader - Class in com.norconex.collector.core.crawler
-
HTTP Crawler configuration loader.
- CrawlerConfigLoader(Class<? extends CrawlerConfig>) - Constructor for class com.norconex.collector.core.crawler.CrawlerConfigLoader
- CrawlerEvent - Class in com.norconex.collector.core.crawler
-
A crawler event.
- CrawlerEvent.Builder - Class in com.norconex.collector.core.crawler
- CrawlerLifeCycleListener - Class in com.norconex.collector.core.crawler
-
Listener adapter for crawler events.
- CrawlerLifeCycleListener() - Constructor for class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
- CrawlerMonitor - Class in com.norconex.collector.core.monitor
- CrawlerMonitor(Crawler) - Constructor for class com.norconex.collector.core.monitor.CrawlerMonitor
- CrawlerMonitorJMX - Class in com.norconex.collector.core.monitor
- CrawlerMonitorMXBean - Interface in com.norconex.collector.core.monitor
- CrawlState - Class in com.norconex.collector.core.doc
-
Reference processing status.
- CrawlState(String) - Constructor for class com.norconex.collector.core.doc.CrawlState
-
Constructor.
- createChildDocInfo(String, CrawlDocInfo) - Method in class com.norconex.collector.core.crawler.Crawler
- createCrawler(CrawlerConfig) - Method in class com.norconex.collector.core.Collector
-
Creates a new crawler instance.
- createDocumentChecksum(Doc) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
- createDocumentChecksum(Doc) - Method in interface com.norconex.collector.core.checksum.IDocumentChecksummer
-
Creates a document checksum.
- createMetadataChecksum(Properties) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
- createMetadataChecksum(Properties) - Method in interface com.norconex.collector.core.checksum.IMetadataChecksummer
-
Creates a metadata checksum.
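The checksumMD5(String) entry listed above for ChecksumUtil suggests an MD5 digest of a string. The following is a minimal sketch of that idea using only the JDK. The class `Md5ChecksumSketch` and method `md5Hex` are hypothetical, and the UTF-8 encoding and lowercase hexadecimal output are assumptions; the library's actual encoding and output format are not stated in this index.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical sketch; not the library's ChecksumUtil. */
final class Md5ChecksumSketch {

    /** Returns the MD5 digest of the text as lowercase hexadecimal. */
    static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            // Assumption: the text is encoded as UTF-8 before hashing
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform implementation is required to support MD5
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(md5Hex("abc"));
    }
}
```

A checksum of this kind lets two versions of a document be compared without keeping the full content of the earlier version. MD5 is suitable for change detection only, not for security purposes.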
D
- DataStoreException - Exception in com.norconex.collector.core.store
-
Crawl data store runtime exception.
- DataStoreException() - Constructor for exception com.norconex.collector.core.store.DataStoreException
- DataStoreException(String) - Constructor for exception com.norconex.collector.core.store.DataStoreException
- DataStoreException(String, Throwable) - Constructor for exception com.norconex.collector.core.store.DataStoreException
- DataStoreException(Throwable) - Constructor for exception com.norconex.collector.core.store.DataStoreException
- DataStoreExporter - Exception in com.norconex.collector.core.store
-
Exports data stores to a format that can be imported back to the same or different store implementation.
- DataStoreImporter - Exception in com.norconex.collector.core.store
-
Imports from a previously exported data store.
- DEFAULT_FALLBACK_STRATEGY - Static variable in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
- DEFAULT_FILENAME_PREFIX - Static variable in class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
- DEFAULT_WORK_DIR - Static variable in class com.norconex.collector.core.CollectorConfig
-
Default relative directory where progress files are stored.
- delete(CrawlDoc) - Method in class com.norconex.collector.core.crawler.CrawlerCommitterService
-
Performs a document delete operation using all accepting committers.
- delete(String) - Method in interface com.norconex.collector.core.store.IDataStore
- delete(String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
- delete(String) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
- delete(String) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
- DELETE - com.norconex.collector.core.crawler.CrawlerConfig.OrphansStrategy
-
Deleting orphans sends them to the Committer for deletion and they are removed from the internal reference cache.
- DELETE - com.norconex.collector.core.spoil.SpoiledReferenceStrategy
-
Deleting spoiled references sends them to the Committer for deletion and they are removed from the internal reference cache.
- deleteCacheOrphans() - Method in class com.norconex.collector.core.crawler.Crawler
- DELETED - Static variable in class com.norconex.collector.core.doc.CrawlState
- deleteFirst() - Method in interface com.norconex.collector.core.store.IDataStore
- deleteFirst() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
- deleteFirst() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
- deleteFirst() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
- DeleteRejectedEventListener - Class in com.norconex.collector.core.crawler.event.impl
-
Provides the ability to send deletion requests to your configured committer(s) whenever a reference is rejected, regardless of whether it was encountered in a previous crawling session or not.
- DeleteRejectedEventListener() - Constructor for class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
- destroy() - Method in interface com.norconex.collector.core.stop.ICollectorStopper
-
Destroys resources allocated with this stopper.
- destroy() - Method in class com.norconex.collector.core.stop.impl.FileBasedStopper
- destroyCollector() - Method in class com.norconex.collector.core.Collector
- destroyCrawler() - Method in class com.norconex.collector.core.crawler.Crawler
- DocInfoPipelineContext - Class in com.norconex.collector.core.pipeline
-
An IPipelineStage context for collector Pipelines dealing with a CrawlDocInfo (e.g. document queuing).
- DocInfoPipelineContext(Crawler, CrawlDocInfo) - Constructor for class com.norconex.collector.core.pipeline.DocInfoPipelineContext
-
Constructor.
- doCreateDocumentChecksum(Doc) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
- doCreateDocumentChecksum(Doc) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
- doCreateMetaChecksum(Properties) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
- doCreateMetaChecksum(Properties) - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
- DOCUMENT_COMMITTED_DELETE - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
A document was submitted to a committer for removal.
- DOCUMENT_COMMITTED_UPSERT - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
A document was submitted to a committer for upsert.
- DOCUMENT_FETCHED - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
A document was successfully retrieved for processing.
- DOCUMENT_IMPORTED - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
A document was imported.
- DOCUMENT_METADATA_FETCHED - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
A document's metadata fields were successfully retrieved.
- DOCUMENT_POSTIMPORTED - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
A document post-import processor was executed properly.
- DOCUMENT_PREIMPORTED - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
A document pre-import processor was executed properly.
- DOCUMENT_PROCESSED - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
A document was processed (successfully or not).
- DOCUMENT_QUEUED - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
A document reference was queued in the data store for processing.
- DOCUMENT_SAVED - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
A document was saved.
- DocumentChecksumStage - Class in com.norconex.collector.core.pipeline.committer
-
Common pipeline stage for creating a document checksum.
- DocumentChecksumStage() - Constructor for class com.norconex.collector.core.pipeline.committer.DocumentChecksumStage
- DocumentFiltersStage - Class in com.norconex.collector.core.pipeline.importer
- DocumentFiltersStage() - Constructor for class com.norconex.collector.core.pipeline.importer.DocumentFiltersStage
- DocumentPipelineContext - Class in com.norconex.collector.core.pipeline
- DocumentPipelineContext(Crawler, CrawlDoc) - Constructor for class com.norconex.collector.core.pipeline.DocumentPipelineContext
- doExecute() - Method in class com.norconex.collector.core.crawler.Crawler
- dropStore(String) - Method in interface com.norconex.collector.core.store.IDataStoreEngine
- dropStore(String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- dropStore(String) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
- dropStore(String) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
E
- equals(Object) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
- equals(Object) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
- equals(Object) - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
- equals(Object) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
- equals(Object) - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
- equals(Object) - Method in class com.norconex.collector.core.cmdline.CollectorCommand
- equals(Object) - Method in class com.norconex.collector.core.cmdline.ConfigRenderCommand
- equals(Object) - Method in class com.norconex.collector.core.cmdline.StartCommand
- equals(Object) - Method in class com.norconex.collector.core.cmdline.StoreExportCommand
- equals(Object) - Method in class com.norconex.collector.core.cmdline.StoreImportCommand
- equals(Object) - Method in class com.norconex.collector.core.CollectorConfig
- equals(Object) - Method in class com.norconex.collector.core.CollectorEvent
- equals(Object) - Method in class com.norconex.collector.core.crawler.CrawlerCommitterService
- equals(Object) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
- equals(Object) - Method in class com.norconex.collector.core.crawler.CrawlerEvent
- equals(Object) - Method in class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
- equals(Object) - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
- equals(Object) - Method in class com.norconex.collector.core.doc.CrawlDoc
- equals(Object) - Method in class com.norconex.collector.core.doc.CrawlDocInfo
- equals(Object) - Method in class com.norconex.collector.core.doc.CrawlState
- equals(Object) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
- equals(Object) - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
- equals(Object) - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
- equals(Object) - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- equals(Object) - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
-
Deprecated.
- equals(Object) - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
- equals(Object) - Method in class com.norconex.collector.core.pipeline.DocInfoPipelineContext
- equals(Object) - Method in class com.norconex.collector.core.pipeline.DocumentPipelineContext
- equals(Object) - Method in class com.norconex.collector.core.pipeline.importer.ImporterPipelineContext
- equals(Object) - Method in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
- equals(Object) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
- equals(Object) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
- ERROR - Static variable in class com.norconex.collector.core.doc.CrawlState
- execute(DocInfoPipelineContext) - Method in class com.norconex.collector.core.pipeline.queue.QueueReferenceStage
- execute(DocInfoPipelineContext) - Method in class com.norconex.collector.core.pipeline.queue.ReferenceFiltersStage
- execute(DocumentPipelineContext) - Method in class com.norconex.collector.core.pipeline.committer.CommitModuleStage
- execute(DocumentPipelineContext) - Method in class com.norconex.collector.core.pipeline.committer.DocumentChecksumStage
- execute(ImporterPipelineContext) - Method in class com.norconex.collector.core.pipeline.importer.DocumentFiltersStage
- execute(ImporterPipelineContext) - Method in class com.norconex.collector.core.pipeline.importer.ImportModuleStage
- execute(ImporterPipelineContext) - Method in class com.norconex.collector.core.pipeline.importer.SaveDocumentStage
- executeCommitterPipeline(Crawler, CrawlDoc) - Method in class com.norconex.collector.core.crawler.Crawler
- executeImporterPipeline(ImporterPipelineContext) - Method in class com.norconex.collector.core.crawler.Crawler
- executeQueuePipeline(CrawlDocInfo) - Method in class com.norconex.collector.core.crawler.Crawler
- exists(String) - Method in interface com.norconex.collector.core.store.IDataStore
- exists(String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
- exists(String) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
- exists(String) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
- exportDataStore(Crawler, Path) - Static method in exception com.norconex.collector.core.store.DataStoreExporter
- exportDataStore(Path) - Method in class com.norconex.collector.core.Collector
- exportDataStore(Path) - Method in class com.norconex.collector.core.crawler.Crawler
- ExtensionReferenceFilter - Class in com.norconex.collector.core.filter.impl
-
Filters a reference based on a comma-separated list of extensions.
- ExtensionReferenceFilter() - Constructor for class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
- ExtensionReferenceFilter(String) - Constructor for class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
- ExtensionReferenceFilter(String, OnMatch) - Constructor for class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
- ExtensionReferenceFilter(String, OnMatch, boolean) - Constructor for class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
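The extension matching described for ExtensionReferenceFilter can be pictured with the sketch below. It is not the library's implementation: the class ExtensionMatchSketch and its methods are hypothetical names, and it assumes the boolean constructor argument controls case sensitivity (suggested by isCaseSensitive() in this index). It also assumes the extension is whatever follows the last dot of the last path segment.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical illustration only; this is NOT com.norconex...ExtensionReferenceFilter.
final class ExtensionMatchSketch {

    private ExtensionMatchSketch() {
    }

    /** Returns the path part of a reference, or the raw reference if it is not a parsable URI. */
    static String pathOf(String reference) {
        try {
            String path = new URI(reference).getPath();
            return path != null ? path : reference;
        } catch (URISyntaxException e) {
            // Not a URI (for example a Windows file path): use the text as-is.
            return reference;
        }
    }

    /** Returns true when the reference's extension appears in the comma-separated list. */
    static boolean matches(String reference, String extensionsCsv, boolean caseSensitive) {
        String path = pathOf(reference);
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        if (dot <= slash || dot == path.length() - 1) {
            return false; // the last path segment has no extension
        }
        String ext = path.substring(dot + 1);
        for (String candidate : extensionsCsv.split(",")) {
            String wanted = candidate.trim();
            if (wanted.isEmpty()) {
                continue; // tolerate stray commas
            }
            boolean same = caseSensitive
                    ? wanted.equals(ext)
                    : wanted.equalsIgnoreCase(ext);
            if (same) {
                return true;
            }
        }
        return false;
    }
}
```

In this sketch a match only reports that the extension is in the list. Whether a match includes or excludes the reference would be decided separately, which is presumably the role of the OnMatch argument in the real constructors.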
F
- FileBasedStopper - Class in com.norconex.collector.core.stop.impl
-
Listens for STOP requests using a stop file.
- FileBasedStopper() - Constructor for class com.norconex.collector.core.stop.impl.FileBasedStopper
- find(String) - Method in interface com.norconex.collector.core.store.IDataStore
- find(String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
- find(String) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
- find(String) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
- findFirst() - Method in interface com.norconex.collector.core.store.IDataStore
- findFirst() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
- findFirst() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
- findFirst() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
- fire(CrawlerEvent) - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
- fire(String, Consumer<CrawlerEvent.Builder>) - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
-
Fires a crawler event with the current crawler as source.
- fireStopRequest() - Method in class com.norconex.collector.core.Collector
- fireStopRequest(Collector) - Method in interface com.norconex.collector.core.stop.ICollectorStopper
-
Stops a currently running Collector.
- fireStopRequest(Collector) - Method in class com.norconex.collector.core.stop.impl.FileBasedStopper
- forEach(BiPredicate<String, T>) - Method in interface com.norconex.collector.core.store.IDataStore
- forEach(BiPredicate<String, T>) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
- forEach(BiPredicate<String, T>) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
- forEach(BiPredicate<String, T>) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
- forEachActive(BiPredicate<String, CrawlDocInfo>) - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
- forEachCached(BiPredicate<String, CrawlDocInfo>) - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
- forEachProcessed(BiPredicate<String, CrawlDocInfo>) - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
- forEachQueued(BiPredicate<String, CrawlDocInfo>) - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
G
- GenericMetadataChecksummer - Class in com.norconex.collector.core.checksum.impl
-
Generic implementation of IMetadataChecksummer that uses specified field names and their values to create a checksum.
- GenericMetadataChecksummer() - Constructor for class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
- GenericSpoiledReferenceStrategizer - Class in com.norconex.collector.core.spoil.impl
-
Generic implementation of ISpoiledReferenceStrategizer that offers a simple mapping between the crawl state of references that have turned "bad" and the strategy to adopt for each.
- GenericSpoiledReferenceStrategizer() - Constructor for class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
- get() - Static method in class com.norconex.collector.core.Collector
- getActiveCount() - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
- getActiveCount() - Method in class com.norconex.collector.core.monitor.CrawlerMonitor
- getActiveCount() - Method in interface com.norconex.collector.core.monitor.CrawlerMonitorMXBean
- getAutoCommitBufferSize() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
- getAutoCommitDelay() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
- getAutoCompactFillRate() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
- getCacheConcurrency() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
- getCached(String) - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
- getCachedDocInfo() - Method in class com.norconex.collector.core.doc.CrawlDoc
- getCachedDocInfo() - Method in class com.norconex.collector.core.pipeline.DocumentPipelineContext
-
Gets cached crawl data.
- getCacheSize() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
- getCollector() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
- getCollector() - Method in class com.norconex.collector.core.crawler.Crawler
- getCollectorConfig() - Method in class com.norconex.collector.core.Collector
-
Gets the collector configuration.
- getCommitter() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Deprecated. Since 2.0.0, use CrawlerConfig.getCommitters().
- getCommitters() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Gets Committers responsible for persisting information to a target location/repository.
- getCommitterService() - Method in class com.norconex.collector.core.crawler.Crawler
- getCommitterService() - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
- getCompress() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
- getConfig() - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
- getConfigFile() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
- getConfigProperties() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- getConfiguration() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
- getConnectionString() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
- getContent() - Method in class com.norconex.collector.core.pipeline.DocumentPipelineContext
- getContentChecksum() - Method in class com.norconex.collector.core.doc.CrawlDocInfo
-
Gets the content checksum.
- getContentReader() - Method in class com.norconex.collector.core.pipeline.DocumentPipelineContext
- getCrawlDate() - Method in class com.norconex.collector.core.doc.CrawlDocInfo
-
Gets the crawl date.
- getCrawlDocInfo() - Method in class com.norconex.collector.core.crawler.CrawlerEvent
-
Gets the crawl data holding contextual information about the crawled reference.
- getCrawlDocInfoType() - Method in class com.norconex.collector.core.crawler.Crawler
- getCrawler() - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
- getCrawlerConfig() - Method in class com.norconex.collector.core.crawler.Crawler
-
Gets the crawler configuration.
- getCrawlerConfigs() - Method in class com.norconex.collector.core.CollectorConfig
-
Gets crawler configurations.
- getCrawlers() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
- getCrawlers() - Method in class com.norconex.collector.core.Collector
-
Gets all crawler instances in this collector.
- getCrawlersStartInterval() - Method in class com.norconex.collector.core.CollectorConfig
-
Gets the amount of time to wait between the start of each concurrent crawler.
- getDataStoreEngine() - Method in class com.norconex.collector.core.crawler.Crawler
- getDataStoreEngine() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Gets the crawl data store factory.
- getDeferredShutdownDuration() - Method in class com.norconex.collector.core.CollectorConfig
-
Gets the amount of time to defer the collector shutdown when it is done executing.
- getDocInfo() - Method in class com.norconex.collector.core.doc.CrawlDoc
- getDocInfo() - Method in class com.norconex.collector.core.pipeline.DocInfoPipelineContext
- getDocInfo() - Method in class com.norconex.collector.core.pipeline.DocumentPipelineContext
- getDocInfoService() - Method in class com.norconex.collector.core.crawler.Crawler
- getDocInfoService() - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
- getDocument() - Method in class com.norconex.collector.core.pipeline.DocumentPipelineContext
- getDocumentChecksummer() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Gets the document checksummer.
- getDocumentFilters() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Gets the document filters.
- getDownloadDir() - Method in class com.norconex.collector.core.crawler.Crawler
- getEventCounts() - Method in class com.norconex.collector.core.monitor.CrawlerMonitor
- getEventCounts() - Method in interface com.norconex.collector.core.monitor.CrawlerMonitorMXBean
- getEventListeners() - Method in class com.norconex.collector.core.CollectorConfig
-
Gets event listeners.
- getEventListeners() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Gets event listeners.
- getEventManager() - Method in class com.norconex.collector.core.Collector
-
Gets the event manager.
- getEventManager() - Method in class com.norconex.collector.core.crawler.Crawler
-
Gets the event manager.
- getEventManager() - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
- getEventMatcher() - Method in class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
-
Gets the event matcher used to identify which events can trigger a deletion request.
- getEventMatcher() - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
-
Gets the event matcher used to identify which events will be counted.
- getExtensions() - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
- getFallbackStrategy() - Method in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
- getField() - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- getFieldMatcher() - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
-
Gets the field matcher.
- getFieldMatcher() - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
-
Gets the field matcher.
- getFieldMatcher() - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
-
Gets the field matcher.
- getId() - Method in class com.norconex.collector.core.Collector
-
Gets the collector unique identifier.
- getId() - Method in class com.norconex.collector.core.CollectorConfig
-
Gets this collector unique identifier.
- getId() - Method in class com.norconex.collector.core.crawler.Crawler
- getId() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Gets this crawler unique identifier.
- getImporter() - Method in class com.norconex.collector.core.crawler.Crawler
-
Gets the crawler Importer module.
- getImporterConfig() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Gets the Importer module configuration.
- getImporterResponse() - Method in class com.norconex.collector.core.pipeline.importer.ImporterPipelineContext
- getMaxConcurrentCrawlers() - Method in class com.norconex.collector.core.CollectorConfig
-
Gets the maximum number of crawlers that can be executed concurrently.
- getMaxDocuments() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Gets the maximum number of documents that can be processed.
- getMaximum() - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
- getMaxMemoryInstance() - Method in class com.norconex.collector.core.CollectorConfig
- getMaxMemoryPool() - Method in class com.norconex.collector.core.CollectorConfig
- getMaxParallelCrawlers() - Method in class com.norconex.collector.core.CollectorConfig
-
Deprecated. Since 2.0.0, use CollectorConfig.getMaxConcurrentCrawlers().
- getMetaChecksum() - Method in class com.norconex.collector.core.doc.CrawlDocInfo
- getMetadataChecksummer() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Gets the metadata checksummer.
- getMetadataFilters() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Gets metadata filters.
- getMonitor() - Method in class com.norconex.collector.core.crawler.Crawler
- getName() - Method in interface com.norconex.collector.core.store.IDataStore
- getName() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
- getName() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
- getName() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
- getNumThreads() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Gets the maximum number of threads a crawler can use.
- getOnMatch() - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
- getOnMatch() - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
- getOnMatch() - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
- getOnMatch() - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- getOnMatch() - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
-
Deprecated.
- getOnMultiple() - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
- getOnSet() - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
-
Gets the property setter to use when a value is set.
- getOnSet() - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
-
Gets the property setter to use when a value is set.
- getOrphansStrategy() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Gets the strategy to adopt when there are orphans.
- getPageSplitSize() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
-
Gets the maximum memory page size, in bytes, before the page is split.
- getParentRootReference() - Method in class com.norconex.collector.core.doc.CrawlDocInfo
- getProcessed(String) - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
- getProcessedCount() - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
- getProcessedCount() - Method in class com.norconex.collector.core.monitor.CrawlerMonitor
- getProcessedCount() - Method in interface com.norconex.collector.core.monitor.CrawlerMonitorMXBean
- getProcessingStage(String) - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
- getQueueCount() - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
- getQueuedCount() - Method in class com.norconex.collector.core.monitor.CrawlerMonitor
- getQueuedCount() - Method in interface com.norconex.collector.core.monitor.CrawlerMonitorMXBean
- getReferenceFilters() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Gets reference filters.
- getRegex() - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- getRegex() - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
-
Deprecated.
- getReleaseVersions() - Method in class com.norconex.collector.core.Collector
- getSource() - Method in class com.norconex.collector.core.CollectorEvent
- getSource() - Method in class com.norconex.collector.core.crawler.CrawlerEvent
- getSourceFields() - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
-
Deprecated. Since 2.0.0, use GenericMetadataChecksummer.getFieldMatcher().
- getSourceFields() - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
-
Deprecated. Since 2.0.0, use MD5DocumentChecksummer.getFieldMatcher().
- getSourceFieldsRegex() - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
-
Deprecated. Since 2.0.0, use GenericMetadataChecksummer.getFieldMatcher().
- getSourceFieldsRegex() - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
-
Deprecated. Since 2.0.0, use MD5DocumentChecksummer.getFieldMatcher().
- getSpoiledReferenceStrategizer() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Gets the spoiled state strategy resolver.
- getState() - Method in class com.norconex.collector.core.doc.CrawlDocInfo
- getStopOnExceptions() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Gets the exceptions we want to stop the crawler on.
- getStoreNames() - Method in interface com.norconex.collector.core.store.IDataStoreEngine
- getStoreNames() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- getStoreNames() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
- getStoreNames() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
- getStoreType(String) - Method in interface com.norconex.collector.core.store.IDataStoreEngine
- getStoreType(String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- getStoreType(String) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
- getStoreType(String) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
- getStreamFactory() - Method in class com.norconex.collector.core.Collector
- getStreamFactory() - Method in class com.norconex.collector.core.crawler.Crawler
- getSubject() - Method in class com.norconex.collector.core.crawler.CrawlerEvent
-
Gets the subject.
- getTablePrefix() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- getTargetField() - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
-
Deprecated. Since 2.0.0, use AbstractDocumentChecksummer.getToField().
- getTargetField() - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
-
Deprecated. Since 2.0.0, use AbstractMetadataChecksummer.getToField().
- getTempDir() - Method in class com.norconex.collector.core.Collector
- getTempDir() - Method in class com.norconex.collector.core.CollectorConfig
-
Gets the temporary directory where files can be deleted safely by the OS or other processes when the collector is not running.
- getTempDir() - Method in class com.norconex.collector.core.crawler.Crawler
-
Gets the directory where most temporary files are created for the duration of a crawling session.
- getTextType() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- getTimestapType() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- getToField() - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
-
Gets the metadata field to use to store the checksum value.
- getToField() - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
-
Gets the metadata field to use to store the checksum value.
- getValueMatcher() - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
-
Gets the value matcher.
- getValueMatcher() - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
-
Gets the value matcher.
- getVarcharType() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- getVariablesFile() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
- getVersion() - Method in class com.norconex.collector.core.Collector
- getWorkDir() - Method in class com.norconex.collector.core.Collector
- getWorkDir() - Method in class com.norconex.collector.core.CollectorConfig
-
Gets the base directory where files generated during execution are stored.
- getWorkDir() - Method in class com.norconex.collector.core.crawler.Crawler
-
Gets the directory where files needing to be persisted between crawling sessions are kept.
- GRACE_ONCE - com.norconex.collector.core.spoil.SpoiledReferenceStrategy
-
Gracing spoiled references gives them one chance (and only one) to recover by not sending a deletion request to the Committer the first time, but doing so if the reference is still spoiled on the next crawl.
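The GRACE_ONCE behaviour described above can be sketched as a small decision function. This is a minimal illustration, not the library's code: the class SpoilSketch, its nested Strategy enum, and the spoiledOnPreviousCrawl flag are hypothetical stand-ins, and the sketch assumes the crawler can tell whether a reference was already spoiled on the previous crawl.

```java
// Hypothetical illustration only; names here are not part of the Norconex API.
final class SpoilSketch {

    /** Stand-in for the strategies a spoiled reference can be handled with. */
    enum Strategy { DELETE, GRACE_ONCE, IGNORE }

    private SpoilSketch() {
    }

    /**
     * Decides whether a deletion request would be sent for a reference that
     * is spoiled on the current crawl.
     *
     * @param strategy the strategy configured for the reference's crawl state
     * @param spoiledOnPreviousCrawl assumed flag: true if the reference was
     *        already spoiled the last time it was crawled
     */
    static boolean sendDeletion(Strategy strategy, boolean spoiledOnPreviousCrawl) {
        switch (strategy) {
            case IGNORE:
                return false; // never request a deletion
            case DELETE:
                return true; // request a deletion right away
            case GRACE_ONCE:
                // First bad crawl: spare the reference. Second in a row: delete.
                return spoiledOnPreviousCrawl;
            default:
                throw new IllegalArgumentException("Unknown strategy: " + strategy);
        }
    }
}
```

Under these assumptions, a reference that recovers on the next crawl is never deleted with GRACE_ONCE, while one that stays spoiled is deleted on its second consecutive bad crawl.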
H
- handleExecutionException(Exception, CommandLine, CommandLine.ParseResult) - Method in class com.norconex.collector.core.cmdline.CollectorCommand
- handleOrphans() - Method in class com.norconex.collector.core.crawler.Crawler
- hasCache() - Method in class com.norconex.collector.core.doc.CrawlDoc
- hashCode() - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
- hashCode() - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
- hashCode() - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
- hashCode() - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
- hashCode() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
- hashCode() - Method in class com.norconex.collector.core.cmdline.CollectorCommand
- hashCode() - Method in class com.norconex.collector.core.cmdline.ConfigRenderCommand
- hashCode() - Method in class com.norconex.collector.core.cmdline.StartCommand
- hashCode() - Method in class com.norconex.collector.core.cmdline.StoreExportCommand
- hashCode() - Method in class com.norconex.collector.core.cmdline.StoreImportCommand
- hashCode() - Method in class com.norconex.collector.core.CollectorConfig
- hashCode() - Method in class com.norconex.collector.core.CollectorEvent
- hashCode() - Method in class com.norconex.collector.core.crawler.CrawlerCommitterService
- hashCode() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
- hashCode() - Method in class com.norconex.collector.core.crawler.CrawlerEvent
- hashCode() - Method in class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
- hashCode() - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
- hashCode() - Method in class com.norconex.collector.core.doc.CrawlDoc
- hashCode() - Method in class com.norconex.collector.core.doc.CrawlDocInfo
- hashCode() - Method in class com.norconex.collector.core.doc.CrawlState
- hashCode() - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
- hashCode() - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
- hashCode() - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
- hashCode() - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- hashCode() - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
-
Deprecated.
- hashCode() - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
- hashCode() - Method in class com.norconex.collector.core.pipeline.DocInfoPipelineContext
- hashCode() - Method in class com.norconex.collector.core.pipeline.DocumentPipelineContext
- hashCode() - Method in class com.norconex.collector.core.pipeline.importer.ImporterPipelineContext
- hashCode() - Method in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
- hashCode() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
- hashCode() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
I
- ICollectorStopper - Interface in com.norconex.collector.core.stop
-
Responsible for shutting down a Collector upon explicit invocation of ICollectorStopper.fireStopRequest(Collector) or when specific conditions are met.
- IDataStore<T> - Interface in com.norconex.collector.core.store
- IDataStoreEngine - Interface in com.norconex.collector.core.store
- IDocumentChecksummer - Interface in com.norconex.collector.core.checksum
-
Creates a checksum representing a document.
- IDocumentFilter - Interface in com.norconex.collector.core.filter
-
Filter a document after the document content is fetched, downloaded, or otherwise read or acquired.
- IGNORE - com.norconex.collector.core.crawler.CrawlerConfig.OrphansStrategy
-
Ignoring orphans effectively does nothing with them (not deleted, not processed).
- IGNORE - com.norconex.collector.core.spoil.SpoiledReferenceStrategy
-
Ignoring spoiled references does not send a deletion request to the Committer.
- IMetadataChecksummer - Interface in com.norconex.collector.core.checksum
-
Creates a checksum representing a document based on document metadata values obtained prior to fetching that document (e.g.
- IMetadataFilter - Interface in com.norconex.collector.core.filter
-
Filter a reference based on the metadata that could be obtained for a document, before it was fetched, downloaded, or otherwise read or acquired (e.g.
- importDataStore(Crawler, Path) - Static method in exception com.norconex.collector.core.store.DataStoreImporter
- importDataStore(Path) - Method in class com.norconex.collector.core.crawler.Crawler
- importDataStore(List<Path>) - Method in class com.norconex.collector.core.Collector
- ImporterPipelineContext - Class in com.norconex.collector.core.pipeline.importer
- ImporterPipelineContext(Crawler, CrawlDoc) - Constructor for class com.norconex.collector.core.pipeline.importer.ImporterPipelineContext
-
Constructor.
- ImportModuleStage - Class in com.norconex.collector.core.pipeline.importer
-
Common pipeline stage for importing documents.
- ImportModuleStage() - Constructor for class com.norconex.collector.core.pipeline.importer.ImportModuleStage
- init(Crawler) - Method in interface com.norconex.collector.core.store.IDataStoreEngine
- init(Crawler) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- init(Crawler) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
- init(Crawler) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
- init(CommitterContext) - Method in class com.norconex.collector.core.crawler.CrawlerCommitterService
- initCollector() - Method in class com.norconex.collector.core.Collector
- initCrawlDoc(CrawlDoc) - Method in class com.norconex.collector.core.crawler.Crawler
- initCrawler() - Method in class com.norconex.collector.core.crawler.Crawler
- IReferenceFilter - Interface in com.norconex.collector.core.filter
-
Filter a document based on its reference, before its properties or content gets read or otherwise acquired.
- is(CrawlDocInfo.Stage) - Method in enum com.norconex.collector.core.doc.CrawlDocInfo.Stage
- IS_CRAWL_NEW - Static variable in class com.norconex.collector.core.doc.CrawlDocMetadata
-
Boolean flag indicating whether a document is new to the crawler that fetched it.
- isActiveEmpty() - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
- isCaseSensitive() - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
- isCaseSensitive() - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- isCaseSensitive() - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
-
Deprecated.
- isCollectorShutdown(Event) - Method in class com.norconex.collector.core.CollectorEvent
- isCombineFieldsAndContent() - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
-
Gets whether we are combining the fields and content checksums.
- isCrawlerShutdown() - Method in class com.norconex.collector.core.crawler.CrawlerEvent
- isDisabled() - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
-
Deprecated. Since 2.0.0, not having a checksummer defined or setting one explicitly to null effectively disables it.
- isDisabled() - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
-
Deprecated. Since 2.0.0, not having a checksummer defined or setting one explicitly to null effectively disables it.
- isDocumentDeduplicate() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Gets whether to turn on deduplication based on document checksum.
- isEmpty() - Method in class com.norconex.collector.core.crawler.CrawlerCommitterService
- isEmpty() - Method in interface com.norconex.collector.core.store.IDataStore
- isEmpty() - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
- isEmpty() - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
- isEmpty() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
- isGoodState() - Method in class com.norconex.collector.core.doc.CrawlState
-
Returns whether a reference should be considered "good" (the corresponding document is not in a "bad" state, such as being rejected or having produced an error).
- isKeep() - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
-
Whether to keep the document checksum value as a new field in the document metadata.
- isKeep() - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
-
Whether to keep the metadata checksum value as a new metadata field.
- isMaxDocuments() - Method in class com.norconex.collector.core.crawler.Crawler
- isMetadataDeduplicate() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Gets whether to turn on deduplication based on metadata checksum.
- isNewOrModified() - Method in class com.norconex.collector.core.doc.CrawlState
-
Returns whether a state indicates new or modified.
- isOneOf(CrawlState...) - Method in class com.norconex.collector.core.doc.CrawlState
- isOrphan() - Method in class com.norconex.collector.core.doc.CrawlDoc
- ISpoiledReferenceStrategizer - Interface in com.norconex.collector.core.spoil
-
Decides which strategy to adopt for a given reference with a bad state.
- isProcessedEmpty() - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
- isQueueEmpty() - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
- isQueueInitialized() - Method in class com.norconex.collector.core.crawler.Crawler
- isRunning() - Method in class com.norconex.collector.core.Collector
- isSkipped() - Method in class com.norconex.collector.core.doc.CrawlState
-
Returns whether a state indicates the document is to be skipped (CrawlState.UNMODIFIED or CrawlState.PREMATURE).
- isStopped() - Method in class com.norconex.collector.core.crawler.Crawler
-
Whether the crawler job was stopped.
J
- JdbcDataStore<T> - Class in com.norconex.collector.core.store.impl.jdbc
- JdbcDataStoreEngine - Class in com.norconex.collector.core.store.impl.jdbc
-
Data store engine using a JDBC-compatible database for storing crawl data.
- JdbcDataStoreEngine() - Constructor for class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
L
- launch(Collector, String[]) - Method in class com.norconex.collector.core.cmdline.CollectorCommandLauncher
- listenForStopRequest(Collector) - Method in interface com.norconex.collector.core.stop.ICollectorStopper
-
Set up and/or start the stopper, which can be terminated by invoking stop in the same or a different JVM (see concrete implementation for details).
- listenForStopRequest(Collector) - Method in class com.norconex.collector.core.stop.impl.FileBasedStopper
- loadChecksummerFromXML(XML) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
- loadChecksummerFromXML(XML) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
- loadChecksummerFromXML(XML) - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
- loadChecksummerFromXML(XML) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
- loadCollectorConfigFromXML(XML) - Method in class com.norconex.collector.core.CollectorConfig
- loadConfig() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
- loadCrawlerConfig(CrawlerConfig, XML) - Method in class com.norconex.collector.core.crawler.CrawlerConfigLoader
-
Loads a crawler configuration, which can be either the default crawler or real crawler configuration instances (keeping defaults).
- loadCrawlerConfigFromXML(XML) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
- loadCrawlerConfigs(XML) - Method in class com.norconex.collector.core.crawler.CrawlerConfigLoader
- loadCrawlerConfigs(File) - Method in class com.norconex.collector.core.crawler.CrawlerConfigLoader
-
Deprecated. Since 2.0.0, use CrawlerConfigLoader.loadCrawlerConfigs(Path) instead.
- loadCrawlerConfigs(File, File) - Method in class com.norconex.collector.core.crawler.CrawlerConfigLoader
-
Deprecated. Since 2.0.0, use CrawlerConfigLoader.loadCrawlerConfigs(Path, Path) instead.
- loadCrawlerConfigs(Path) - Method in class com.norconex.collector.core.crawler.CrawlerConfigLoader
-
Loads crawler configurations.
- loadCrawlerConfigs(Path, Path) - Method in class com.norconex.collector.core.crawler.CrawlerConfigLoader
-
Loads crawler configurations.
- loadFromXML(XML) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
- loadFromXML(XML) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
- loadFromXML(XML) - Method in class com.norconex.collector.core.CollectorConfig
- loadFromXML(XML) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
- loadFromXML(XML) - Method in class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
- loadFromXML(XML) - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
- loadFromXML(XML) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
- loadFromXML(XML) - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
- loadFromXML(XML) - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
- loadFromXML(XML) - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- loadFromXML(XML) - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
-
Deprecated.
- loadFromXML(XML) - Method in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
- loadFromXML(XML) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- loadFromXML(XML) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
- loadFromXML(XML) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
- lock() - Method in class com.norconex.collector.core.Collector
M
- markReferenceVariationsAsProcessed(CrawlDocInfo) - Method in class com.norconex.collector.core.crawler.Crawler
- MAX_REACHED - com.norconex.collector.core.crawler.Crawler.ReferenceProcessStatus
- MD5DocumentChecksummer - Class in com.norconex.collector.core.checksum.impl
-
Implementation of IDocumentChecksummer which returns an MD5 checksum value of the extracted document content unless one or more source fields are specified, in which case the MD5 checksum value is constructed from those fields.
- MD5DocumentChecksummer() - Constructor for class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
- MdcUtil - Class in com.norconex.collector.core.monitor
-
Utility methods to simplify adding Mapped Diagnostic Context (MDC) to logging in a consistent way for crawlers and collectors, as well as offering filename-friendly versions.
- metadataChecksumMD5(Properties, TextMatcher) - Static method in class com.norconex.collector.core.checksum.ChecksumUtil
- metadataChecksumMD5(Properties, String, List<String>) - Static method in class com.norconex.collector.core.checksum.ChecksumUtil
-
Deprecated.
- metadataChecksumPlain(Properties, TextMatcher) - Static method in class com.norconex.collector.core.checksum.ChecksumUtil
- metadataChecksumPlain(Properties, String, List<String>) - Static method in class com.norconex.collector.core.checksum.ChecksumUtil
-
Deprecated.
- MetadataFilter - Class in com.norconex.collector.core.filter.impl
-
Accepts or rejects a reference based on whether one or more metadata field values are matching.
- MetadataFilter() - Constructor for class com.norconex.collector.core.filter.impl.MetadataFilter
- MetadataFilter(TextMatcher, TextMatcher) - Constructor for class com.norconex.collector.core.filter.impl.MetadataFilter
- MetadataFilter(TextMatcher, TextMatcher, OnMatch) - Constructor for class com.norconex.collector.core.filter.impl.MetadataFilter
- MODIFIED - Static variable in class com.norconex.collector.core.doc.CrawlState
- MongoDataStore<T> - Class in com.norconex.collector.core.store.impl.mongodb
- MongoDataStoreEngine - Class in com.norconex.collector.core.store.impl.mongodb
-
Data store engine using MongoDB for storing crawl data.
- MongoDataStoreEngine() - Constructor for class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
- MVStoreDataStore<T> - Class in com.norconex.collector.core.store.impl.mvstore
- MVStoreDataStoreConfig - Class in com.norconex.collector.core.store.impl.mvstore
-
MVStore configuration parameters.
- MVStoreDataStoreConfig() - Constructor for class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
- MVStoreDataStoreEngine - Class in com.norconex.collector.core.store.impl.mvstore
- MVStoreDataStoreEngine() - Constructor for class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
N
- NEW - Static variable in class com.norconex.collector.core.doc.CrawlState
- NORCONEX_ASCII - Static variable in class com.norconex.collector.core.Collector
-
Simple ASCII art of Norconex.
- NOT_FOUND - Static variable in class com.norconex.collector.core.doc.CrawlState
O
- OK - com.norconex.collector.core.crawler.Crawler.ReferenceProcessStatus
- onCollectorCleanBegin(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
- onCollectorCleanEnd(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
- onCollectorError(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
- onCollectorEvent(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
- onCollectorRunBegin(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
- onCollectorRunEnd(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
- onCollectorShutdown(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
-
Triggered when a collector is ending its execution on a CollectorEvent.COLLECTOR_ERROR, CollectorEvent.COLLECTOR_RUN_END, or CollectorEvent.COLLECTOR_STOP_END event.
- onCollectorStopBegin(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
- onCollectorStopEnd(CollectorEvent) - Method in class com.norconex.collector.core.CollectorLifeCycleListener
- onCrawlerCleanBegin(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
- onCrawlerCleanEnd(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
- onCrawlerEvent(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
- onCrawlerInitBegin(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
- onCrawlerInitEnd(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
- onCrawlerRunBegin(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
- onCrawlerRunEnd(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
- onCrawlerRunThreadBegin(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
- onCrawlerRunThreadEnd(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
- onCrawlerShutdown(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
-
Triggered when a crawler is ending its execution on either a CrawlerEvent.CRAWLER_RUN_END or CrawlerEvent.CRAWLER_STOP_END event.
- onCrawlerStopBegin(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
- onCrawlerStopEnd(CrawlerEvent) - Method in class com.norconex.collector.core.crawler.CrawlerLifeCycleListener
- open() - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
- openStore(String, Class<? extends T>) - Method in interface com.norconex.collector.core.store.IDataStoreEngine
- openStore(String, Class<? extends T>) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- openStore(String, Class<? extends T>) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
- openStore(String, Class<? extends T>) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
P
- pollQueue() - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
- PREFIX - Static variable in class com.norconex.collector.core.doc.CrawlDocMetadata
- PREMATURE - Static variable in class com.norconex.collector.core.doc.CrawlState
-
For collectors that support it, this state indicates a previously crawled document is not yet ready to be re-crawled.
- printErr() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
- printErr(String) - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
- printOut() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
- printOut(String) - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
- PROCESS - com.norconex.collector.core.crawler.CrawlerConfig.OrphansStrategy
-
Processing orphans tries to obtain and process them again, normally.
- processed(CrawlDocInfo) - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
- PROCESSED - com.norconex.collector.core.doc.CrawlDocInfo.Stage
- processNextReference(Crawler.ProcessFlags) - Method in class com.norconex.collector.core.crawler.Crawler
- processReferences(Crawler.ProcessFlags) - Method in class com.norconex.collector.core.crawler.Crawler
Q
- queue(CrawlDocInfo) - Method in class com.norconex.collector.core.doc.CrawlDocInfoService
- QUEUE_EMPTY - com.norconex.collector.core.crawler.Crawler.ReferenceProcessStatus
- QUEUED - com.norconex.collector.core.doc.CrawlDocInfo.Stage
- QueueReferenceStage - Class in com.norconex.collector.core.pipeline.queue
-
Common pipeline stage for queuing documents.
- QueueReferenceStage() - Constructor for class com.norconex.collector.core.pipeline.queue.QueueReferenceStage
-
Constructor.
R
- ReferenceFilter - Class in com.norconex.collector.core.filter.impl
-
Filters URLs based on a regular expression.
- ReferenceFilter() - Constructor for class com.norconex.collector.core.filter.impl.ReferenceFilter
- ReferenceFilter(TextMatcher) - Constructor for class com.norconex.collector.core.filter.impl.ReferenceFilter
- ReferenceFilter(TextMatcher, OnMatch) - Constructor for class com.norconex.collector.core.filter.impl.ReferenceFilter
- ReferenceFiltersStage - Class in com.norconex.collector.core.pipeline.queue
-
Common pipeline stage for filtering references.
- ReferenceFiltersStage() - Constructor for class com.norconex.collector.core.pipeline.queue.ReferenceFiltersStage
- ReferenceFiltersStage(String) - Constructor for class com.norconex.collector.core.pipeline.queue.ReferenceFiltersStage
- ReferenceFiltersStageUtil - Class in com.norconex.collector.core.pipeline.queue
-
Reference-filtering stage utility methods.
- RegexMetadataFilter - Class in com.norconex.collector.core.filter.impl
-
Deprecated. Since 2.0.0, use MetadataFilter instead.
- RegexMetadataFilter() - Constructor for class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- RegexMetadataFilter(String, String) - Constructor for class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- RegexMetadataFilter(String, String, OnMatch) - Constructor for class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- RegexMetadataFilter(String, String, OnMatch, boolean) - Constructor for class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- RegexReferenceFilter - Class in com.norconex.collector.core.filter.impl
-
Deprecated. Since 2.0.0, use ReferenceFilter.
- RegexReferenceFilter() - Constructor for class com.norconex.collector.core.filter.impl.RegexReferenceFilter
-
Deprecated.
- RegexReferenceFilter(String) - Constructor for class com.norconex.collector.core.filter.impl.RegexReferenceFilter
-
Deprecated.
- RegexReferenceFilter(String, OnMatch) - Constructor for class com.norconex.collector.core.filter.impl.RegexReferenceFilter
-
Deprecated.
- RegexReferenceFilter(String, OnMatch, boolean) - Constructor for class com.norconex.collector.core.filter.impl.RegexReferenceFilter
-
Deprecated.
- register(Crawler) - Static method in class com.norconex.collector.core.monitor.CrawlerMonitorJMX
- REJECTED - Static variable in class com.norconex.collector.core.doc.CrawlState
- REJECTED_BAD_STATUS - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
A document was rejected because the status obtained when trying to obtain it was not accepted (e.g., 500 HTTP error code).
- REJECTED_DUPLICATE - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
A document was rejected since another document with a different reference was already processed with the same digital signature (checksum).
- REJECTED_ERROR - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
A document was rejected because an error occurred when processing it.
- REJECTED_FILTER - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
A document was rejected by a filter.
- REJECTED_IMPORT - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
A document was rejected by the Importer module.
- REJECTED_NOTFOUND - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
A document was rejected because it could not be found (e.g., no longer exists at a given location).
- REJECTED_PREMATURE - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
A document could not be re-crawled because it is not yet ready to be re-crawled.
- REJECTED_UNMODIFIED - Static variable in class com.norconex.collector.core.crawler.CrawlerEvent
-
A document was rejected as it was not modified since the last time it was crawled.
- renameStore(IDataStore<?>, String) - Method in interface com.norconex.collector.core.store.IDataStoreEngine
- renameStore(IDataStore<?>, String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- renameStore(IDataStore<?>, String) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
- renameStore(IDataStore<?>, String) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
- reprocessCacheOrphans() - Method in class com.norconex.collector.core.crawler.Crawler
- resolveDocumentChecksum(String, DocumentPipelineContext, Object) - Static method in class com.norconex.collector.core.pipeline.ChecksumStageUtil
- resolveMetaChecksum(String, DocumentPipelineContext, Object) - Static method in class com.norconex.collector.core.pipeline.ChecksumStageUtil
- resolveReferenceFilters(List<IReferenceFilter>, DocInfoPipelineContext, String) - Static method in class com.norconex.collector.core.pipeline.queue.ReferenceFiltersStageUtil
- resolveSpoiledReferenceStrategy(String, CrawlState) - Method in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
- resolveSpoiledReferenceStrategy(String, CrawlState) - Method in interface com.norconex.collector.core.spoil.ISpoiledReferenceStrategizer
-
Establish which spoiled reference strategy to adopt.
- runCommand() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
- runCommand() - Method in class com.norconex.collector.core.cmdline.CleanCommand
- runCommand() - Method in class com.norconex.collector.core.cmdline.ConfigCheckCommand
- runCommand() - Method in class com.norconex.collector.core.cmdline.ConfigRenderCommand
- runCommand() - Method in class com.norconex.collector.core.cmdline.StartCommand
- runCommand() - Method in class com.norconex.collector.core.cmdline.StopCommand
- runCommand() - Method in class com.norconex.collector.core.cmdline.StoreExportCommand
- runCommand() - Method in class com.norconex.collector.core.cmdline.StoreImportCommand
S
- save(String, T) - Method in interface com.norconex.collector.core.store.IDataStore
- save(String, T) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStore
- save(String, T) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStore
- save(String, T) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStore
- saveChecksummerToXML(XML) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
- saveChecksummerToXML(XML) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
- saveChecksummerToXML(XML) - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
- saveChecksummerToXML(XML) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
- saveCollectorConfigToXML(XML) - Method in class com.norconex.collector.core.CollectorConfig
- saveCrawlerConfigToXML(XML) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
- SaveDocumentStage - Class in com.norconex.collector.core.pipeline.importer
-
Common pipeline stage for saving documents.
- SaveDocumentStage() - Constructor for class com.norconex.collector.core.pipeline.importer.SaveDocumentStage
- saveToXML(XML) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
- saveToXML(XML) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
- saveToXML(XML) - Method in class com.norconex.collector.core.CollectorConfig
- saveToXML(XML) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
- saveToXML(XML) - Method in class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
- saveToXML(XML) - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
- saveToXML(XML) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
- saveToXML(XML) - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
- saveToXML(XML) - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
- saveToXML(XML) - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- saveToXML(XML) - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
-
Deprecated.
- saveToXML(XML) - Method in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
- saveToXML(XML) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- saveToXML(XML) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
- saveToXML(XML) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
- setAutoCommitBufferSize(Long) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
- setAutoCommitDelay(Long) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
- setAutoCompactFillRate(Integer) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
- setCacheConcurrency(Integer) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
- setCacheSize(Long) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
- setCaseSensitive(boolean) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
- setCaseSensitive(boolean) - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- setCaseSensitive(boolean) - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
-
Deprecated.
- setCollectorId(String) - Static method in class com.norconex.collector.core.monitor.MdcUtil
-
Sets two representations of the supplied collector ID in the MDC:
- setCombineFieldsAndContent(boolean) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
-
Sets whether to combine the fields and content checksums.
- setCommitter(ICommitter) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Deprecated. Since 2.0.0, use CrawlerConfig.setCommitters(ICommitter...).
- setCommitters(ICommitter...) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets Committers responsible for persisting information to a target location/repository.
- setCommitters(List<ICommitter>) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets Committers responsible for persisting information to a target location/repository.
- setCompress(Integer) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
- setConfigFile(Path) - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
- setConfigProperties(Properties) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- setConnectionString(String) - Method in class com.norconex.collector.core.store.impl.mongodb.MongoDataStoreEngine
- setContentChecksum(String) - Method in class com.norconex.collector.core.doc.CrawlDocInfo
-
Sets the content checksum.
- setCrawlDate(ZonedDateTime) - Method in class com.norconex.collector.core.doc.CrawlDocInfo
-
Sets the crawl date.
- setCrawlerConfigs(CrawlerConfig...) - Method in class com.norconex.collector.core.CollectorConfig
-
Sets crawler configurations.
- setCrawlerConfigs(List<CrawlerConfig>) - Method in class com.norconex.collector.core.CollectorConfig
-
Sets crawler configurations.
- setCrawlerId(String) - Static method in class com.norconex.collector.core.monitor.MdcUtil
-
Sets two representations of the supplied crawler ID in the MDC:
- setCrawlersStartInterval(Duration) - Method in class com.norconex.collector.core.CollectorConfig
-
Sets the amount of time to wait between the start of each concurrent crawler.
- setDataStoreEngine(IDataStoreEngine) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets the crawl data store factory.
- setDeferredShutdownDuration(Duration) - Method in class com.norconex.collector.core.CollectorConfig
-
Sets the amount of time to defer the collector shutdown when it is done executing.
- setDisabled(boolean) - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
-
Deprecated. Since 2.0.0, not having a checksummer defined or setting one explicitly to null effectively disables it.
- setDisabled(boolean) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
-
Deprecated. Since 2.0.0, not having a checksummer defined or setting one explicitly to null effectively disables it.
- setDocumentChecksummer(IDocumentChecksummer) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets the document checksummer.
- setDocumentDeduplicate(boolean) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets whether to turn on deduplication based on document checksum.
- setDocumentFilters(IDocumentFilter...) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets document filters.
- setDocumentFilters(List<IDocumentFilter>) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets document filters.
- setEventListeners(IEventListener<?>...) - Method in class com.norconex.collector.core.CollectorConfig
-
Sets event listeners.
- setEventListeners(IEventListener<?>...) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets event listeners.
- setEventListeners(List<IEventListener<?>>) - Method in class com.norconex.collector.core.CollectorConfig
-
Sets event listeners.
- setEventListeners(List<IEventListener<?>>) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets event listeners.
- setEventMatcher(TextMatcher) - Method in class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
-
Sets the event matcher used to identify which events can trigger a deletion request.
- setEventMatcher(TextMatcher) - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
-
Sets the event matcher used to identify which events will be counted.
- setExtensions(String...) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
- setExtensions(List<String>) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
- setFallbackStrategy(SpoiledReferenceStrategy) - Method in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
- setField(String) - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- setFieldMatcher(TextMatcher) - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
-
Sets the field matcher.
- setFieldMatcher(TextMatcher) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
-
Sets the field matcher.
- setFieldMatcher(TextMatcher) - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
-
Sets the field matcher.
- setId(String) - Method in class com.norconex.collector.core.CollectorConfig
-
Sets this collector's unique identifier.
- setId(String) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets this crawler's unique identifier.
- setImporterConfig(ImporterConfig) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets the Importer module configuration.
- setImporterResponse(ImporterResponse) - Method in class com.norconex.collector.core.pipeline.importer.ImporterPipelineContext
- setKeep(boolean) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
-
Sets whether to keep the document checksum value as a new field in the document metadata.
- setKeep(boolean) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
-
Sets whether to keep the metadata checksum value as a new metadata field.
- setMaxConcurrentCrawlers(int) - Method in class com.norconex.collector.core.CollectorConfig
-
Sets the maximum number of crawlers that can be executed concurrently.
- setMaxDocuments(int) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets the maximum number of documents that can be processed.
- setMaximum(long) - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
- setMaxMemoryInstance(long) - Method in class com.norconex.collector.core.CollectorConfig
- setMaxMemoryPool(long) - Method in class com.norconex.collector.core.CollectorConfig
- setMaxParallelCrawlers(int) - Method in class com.norconex.collector.core.CollectorConfig
-
Deprecated. Since 2.0.0, use CollectorConfig.setMaxConcurrentCrawlers(int).
- setMetaChecksum(String) - Method in class com.norconex.collector.core.doc.CrawlDocInfo
- setMetadataChecksummer(IMetadataChecksummer) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets the metadata checksummer.
- setMetadataDeduplicate(boolean) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets whether to turn on deduplication based on metadata checksum.
- setMetadataFilters(IMetadataFilter...) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets metadata filters.
- setMetadataFilters(List<IMetadataFilter>) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets metadata filters.
- setNumThreads(int) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets the maximum number of threads a crawler can use.
- setOnMatch(OnMatch) - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
- setOnMatch(OnMatch) - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
- setOnMatch(OnMatch) - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
- setOnMatch(OnMatch) - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- setOnMatch(OnMatch) - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
-
Deprecated.
- setOnMultiple(StopCrawlerOnMaxEventListener.OnMultiple) - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
- setOnSet(PropertySetter) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
-
Sets the property setter to use when a value is set.
- setOnSet(PropertySetter) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
-
Sets the property setter to use when a value is set.
- setOrphansStrategy(CrawlerConfig.OrphansStrategy) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets the strategy to adopt when there are orphans.
- setPageSplitSize(Long) - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
- setParentRootReference(String) - Method in class com.norconex.collector.core.doc.CrawlDocInfo
- setReferenceFilters(IReferenceFilter...) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets reference filters.
- setReferenceFilters(List<IReferenceFilter>) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets reference filters.
- setRegex(String) - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- setRegex(String) - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
-
Deprecated.
- setSourceFields(String...) - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
-
Deprecated.Since 2.0.0, use
GenericMetadataChecksummer.setFieldMatcher(TextMatcher)
. - setSourceFields(String...) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
-
Deprecated. Since 2.0.0, use MD5DocumentChecksummer.setFieldMatcher(TextMatcher).
- setSourceFields(List<String>) - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
-
Deprecated. Since 2.0.0, use GenericMetadataChecksummer.setFieldMatcher(TextMatcher).
- setSourceFields(List<String>) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
-
Deprecated. Since 2.0.0, use MD5DocumentChecksummer.setFieldMatcher(TextMatcher).
- setSourceFieldsRegex(String) - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
-
Deprecated. Since 2.0.0, use GenericMetadataChecksummer.setFieldMatcher(TextMatcher).
- setSourceFieldsRegex(String) - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
-
Deprecated. Since 2.0.0, use MD5DocumentChecksummer.setFieldMatcher(TextMatcher).
- setSpoiledReferenceStrategizer(ISpoiledReferenceStrategizer) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets the spoiled state strategy resolver.
- setState(CrawlState) - Method in class com.norconex.collector.core.doc.CrawlDocInfo
- setStopOnExceptions(Class<? extends Exception>...) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets the exceptions we want to stop the crawler on.
- setStopOnExceptions(List<Class<? extends Exception>>) - Method in class com.norconex.collector.core.crawler.CrawlerConfig
-
Sets the exceptions we want to stop the crawler on.
- setTablePrefix(String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- setTargetField(String) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
-
Deprecated. Since 2.0.0, use AbstractDocumentChecksummer.setToField(String).
- setTargetField(String) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
-
Deprecated. Since 2.0.0, use AbstractMetadataChecksummer.setToField(String).
- setTempDir(Path) - Method in class com.norconex.collector.core.CollectorConfig
-
Sets the temporary directory where files can be deleted safely by the OS or other processes when the collector is not running.
- setTextType(String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- setTimestapType(String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- setToField(String) - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
-
Sets the metadata field name to use to store the checksum value.
- setToField(String) - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
-
Sets the metadata field name to use to store the checksum value.
- setValueMatcher(TextMatcher) - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
-
Sets the value matcher.
- setValueMatcher(TextMatcher) - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
-
Sets the value matcher.
- setVarcharType(String) - Method in class com.norconex.collector.core.store.impl.jdbc.JdbcDataStoreEngine
- setVariablesFile(Path) - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
- setWorkDir(Path) - Method in class com.norconex.collector.core.CollectorConfig
-
Sets the base directory where files generated during execution are written.
- SpoiledReferenceStrategy - Enum in com.norconex.collector.core.spoil
-
Markers indicating what to do with references that were once processed properly, but failed to reach a good processing state on a subsequent attempt.
- start() - Method in class com.norconex.collector.core.Collector
-
Starts all crawlers defined in configuration.
- start() - Method in class com.norconex.collector.core.crawler.Crawler
-
Starts crawling.
- StartCommand - Class in com.norconex.collector.core.cmdline
-
Start the Collector.
- StartCommand() - Constructor for class com.norconex.collector.core.cmdline.StartCommand
- stop() - Method in class com.norconex.collector.core.Collector
-
Stops a running instance of this Collector.
- stop() - Method in class com.norconex.collector.core.crawler.Crawler
- StopCommand - Class in com.norconex.collector.core.cmdline
-
Stop the Collector.
- StopCommand() - Constructor for class com.norconex.collector.core.cmdline.StopCommand
- StopCrawlerOnMaxEventListener - Class in com.norconex.collector.core.crawler.event.impl
-
Alternative to CrawlerConfig.setMaxDocuments(int) for stopping the crawler upon reaching specific event counts.
- StopCrawlerOnMaxEventListener() - Constructor for class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
- StopCrawlerOnMaxEventListener.OnMultiple - Enum in com.norconex.collector.core.crawler.event.impl
- StoreExportCommand - Class in com.norconex.collector.core.cmdline
-
Export crawl store to specified file.
- StoreExportCommand() - Constructor for class com.norconex.collector.core.cmdline.StoreExportCommand
- StoreImportCommand - Class in com.norconex.collector.core.cmdline
-
Import crawl store from specified file.
- StoreImportCommand() - Constructor for class com.norconex.collector.core.cmdline.StoreImportCommand
- subject(Object) - Method in class com.norconex.collector.core.crawler.CrawlerEvent.Builder
- SUM - com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener.OnMultiple
-
Stop the crawler when the sum of all matching event counts has reached the maximum.
T
- toString() - Method in class com.norconex.collector.core.checksum.AbstractDocumentChecksummer
- toString() - Method in class com.norconex.collector.core.checksum.AbstractMetadataChecksummer
- toString() - Method in class com.norconex.collector.core.checksum.impl.GenericMetadataChecksummer
- toString() - Method in class com.norconex.collector.core.checksum.impl.MD5DocumentChecksummer
- toString() - Method in class com.norconex.collector.core.cmdline.AbstractSubCommand
- toString() - Method in class com.norconex.collector.core.cmdline.CollectorCommand
- toString() - Method in class com.norconex.collector.core.cmdline.ConfigRenderCommand
- toString() - Method in class com.norconex.collector.core.cmdline.StartCommand
- toString() - Method in class com.norconex.collector.core.cmdline.StoreExportCommand
- toString() - Method in class com.norconex.collector.core.cmdline.StoreImportCommand
- toString() - Method in class com.norconex.collector.core.Collector
- toString() - Method in class com.norconex.collector.core.CollectorConfig
- toString() - Method in class com.norconex.collector.core.crawler.Crawler
- toString() - Method in class com.norconex.collector.core.crawler.CrawlerCommitterService
- toString() - Method in class com.norconex.collector.core.crawler.CrawlerConfig
- toString() - Method in class com.norconex.collector.core.crawler.CrawlerEvent
- toString() - Method in class com.norconex.collector.core.crawler.event.impl.DeleteRejectedEventListener
- toString() - Method in class com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener
- toString() - Method in class com.norconex.collector.core.doc.CrawlDoc
- toString() - Method in class com.norconex.collector.core.doc.CrawlDocInfo
- toString() - Method in class com.norconex.collector.core.doc.CrawlState
- toString() - Method in class com.norconex.collector.core.filter.impl.ExtensionReferenceFilter
- toString() - Method in class com.norconex.collector.core.filter.impl.MetadataFilter
- toString() - Method in class com.norconex.collector.core.filter.impl.ReferenceFilter
- toString() - Method in class com.norconex.collector.core.filter.impl.RegexMetadataFilter
-
Deprecated.
- toString() - Method in class com.norconex.collector.core.filter.impl.RegexReferenceFilter
-
Deprecated.
- toString() - Method in class com.norconex.collector.core.pipeline.AbstractPipelineContext
- toString() - Method in class com.norconex.collector.core.pipeline.DocInfoPipelineContext
- toString() - Method in class com.norconex.collector.core.pipeline.DocumentPipelineContext
- toString() - Method in class com.norconex.collector.core.pipeline.importer.ImporterPipelineContext
- toString() - Method in class com.norconex.collector.core.spoil.impl.GenericSpoiledReferenceStrategizer
- toString() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreConfig
- toString() - Method in class com.norconex.collector.core.store.impl.mvstore.MVStoreDataStoreEngine
U
- unlock() - Method in class com.norconex.collector.core.Collector
- UNMODIFIED - Static variable in class com.norconex.collector.core.doc.CrawlState
- unregister(Crawler) - Static method in class com.norconex.collector.core.monitor.CrawlerMonitorJMX
- UNSUPPORTED - Static variable in class com.norconex.collector.core.doc.CrawlState
-
Typically when a reference cannot be processed because it is not supported by the collector or one of its configured components.
- upsert(CrawlDoc) - Method in class com.norconex.collector.core.crawler.CrawlerCommitterService
-
Updates or inserts a document using all accepting committers.
- urlToPath(String) - Static method in class com.norconex.collector.core.pipeline.importer.SaveDocumentStage
V
- valueOf(String) - Static method in enum com.norconex.collector.core.crawler.Crawler.ReferenceProcessStatus
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum com.norconex.collector.core.crawler.CrawlerConfig.OrphansStrategy
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener.OnMultiple
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum com.norconex.collector.core.doc.CrawlDocInfo.Stage
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in class com.norconex.collector.core.doc.CrawlState
- valueOf(String) - Static method in enum com.norconex.collector.core.spoil.SpoiledReferenceStrategy
-
Returns the enum constant of this type with the specified name.
- values() - Static method in enum com.norconex.collector.core.crawler.Crawler.ReferenceProcessStatus
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum com.norconex.collector.core.crawler.CrawlerConfig.OrphansStrategy
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum com.norconex.collector.core.crawler.event.impl.StopCrawlerOnMaxEventListener.OnMultiple
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum com.norconex.collector.core.doc.CrawlDocInfo.Stage
-
Returns an array containing the constants of this enum type, in the order they are declared.
- values() - Static method in enum com.norconex.collector.core.spoil.SpoiledReferenceStrategy
-
Returns an array containing the constants of this enum type, in the order they are declared.