Class CrawlerEvent
java.lang.Object
java.util.EventObject
com.norconex.commons.lang.event.Event
com.norconex.collector.core.crawler.CrawlerEvent
- All Implemented Interfaces:
Serializable
A crawler event.
- Since:
- 2.0.0
- Author:
- Pascal Essiembre
-
Nested Class Summary
Nested Classes -
Field Summary
Fields (Modifier and Type, Field: Description)
static final String CRAWLER_CLEAN_BEGIN
static final String CRAWLER_CLEAN_END
static final String CRAWLER_INIT_BEGIN: The crawler began its initialization.
static final String CRAWLER_INIT_END: The crawler has been initialized.
static final String CRAWLER_RUN_BEGIN: The crawler is about to begin crawling.
static final String CRAWLER_RUN_END: The crawler completed crawling execution normally (without being stopped).
static final String CRAWLER_RUN_THREAD_BEGIN: The crawler just started a new crawling thread.
static final String CRAWLER_RUN_THREAD_END: The crawler completed execution of a crawling thread.
static final String CRAWLER_STOP_BEGIN: Issued when a request to stop the crawler has been received.
static final String CRAWLER_STOP_END: Issued when a request to stop the crawler has been fully executed (crawler stopped).
static final String DOCUMENT_COMMITTED_DELETE: A document was submitted to a committer for removal.
static final String DOCUMENT_COMMITTED_UPSERT: A document was submitted to a committer for upsert.
static final String DOCUMENT_FETCHED: A document was successfully retrieved for processing.
static final String DOCUMENT_IMPORTED: A document was imported.
static final String DOCUMENT_METADATA_FETCHED: A document's metadata fields were successfully retrieved.
static final String DOCUMENT_POSTIMPORTED: A document post-import processor was executed properly.
static final String DOCUMENT_PREIMPORTED: A document pre-import processor was executed properly.
static final String DOCUMENT_PROCESSED: A document was processed (successfully or not).
static final String DOCUMENT_QUEUED: A document reference was queued in the data store for processing.
static final String DOCUMENT_SAVED: A document was saved.
static final String REJECTED_BAD_STATUS: A document was rejected because the status obtained when trying to obtain it was not accepted (e.g., 500 HTTP error code).
static final String REJECTED_DUPLICATE: A document was rejected since another document with a different reference was already processed with the same digital signature (checksum).
static final String REJECTED_ERROR: A document was rejected because an error occurred when processing it.
static final String REJECTED_FILTER: A document was rejected by a filter.
static final String REJECTED_IMPORT: A document was rejected by the Importer module.
static final String REJECTED_NOTFOUND: A document was rejected because it could not be found (e.g., no longer exists at a given location).
static final String REJECTED_PREMATURE: A document could not be re-crawled because it is not yet ready to be re-crawled.
static final String REJECTED_UNMODIFIED: A document was rejected as it was not modified since the last time it was crawled.
Fields inherited from class java.util.EventObject
source
Method Summary
Methods inherited from class com.norconex.commons.lang.event.Event
getException, getMessage, getName, is, is
-
Field Details
-
CRAWLER_INIT_BEGIN
The crawler began its initialization.
-
CRAWLER_INIT_END
The crawler has been initialized.
-
CRAWLER_RUN_BEGIN
The crawler is about to begin crawling.
-
CRAWLER_RUN_END
The crawler completed crawling execution normally (without being stopped). This event is triggered before the crawler resources are released.
-
CRAWLER_RUN_THREAD_BEGIN
The crawler just started a new crawling thread.
-
CRAWLER_RUN_THREAD_END
The crawler completed execution of a crawling thread.
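Because CRAWLER_RUN_THREAD_BEGIN and CRAWLER_RUN_THREAD_END come in pairs, a listener could use them to keep a running count of active crawling threads. The following is a minimal sketch only: `ActiveThreadGauge` is a hypothetical helper, not part of the library, and it assumes the event names it receives are spelled like the field names above (the actual constant values are not shown on this page).

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of Norconex Collector Core.
// Assumption: event names match the field names documented above.
class ActiveThreadGauge {
    private final AtomicInteger active = new AtomicInteger();

    // Feed every event name here; unrelated events are ignored.
    void onEvent(String eventName) {
        if ("CRAWLER_RUN_THREAD_BEGIN".equals(eventName)) {
            active.incrementAndGet();
        } else if ("CRAWLER_RUN_THREAD_END".equals(eventName)) {
            active.decrementAndGet();
        }
    }

    // Number of crawling threads that have begun but not yet ended.
    int activeThreads() {
        return active.get();
    }
}
```

In a real listener, the name passed to `onEvent` would presumably come from the inherited `getName()` method.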
-
CRAWLER_STOP_BEGIN
Issued when a request to stop the crawler has been received.
-
CRAWLER_STOP_END
Issued when a request to stop the crawler has been fully executed (crawler stopped).
-
CRAWLER_CLEAN_BEGIN
-
CRAWLER_CLEAN_END
-
REJECTED_FILTER
A document was rejected by a filter.
-
REJECTED_UNMODIFIED
A document was rejected as it was not modified since the last time it was crawled.
-
REJECTED_DUPLICATE
A document was rejected since another document with a different reference was already processed with the same digital signature (checksum).
- Since:
- 2.0.0
-
REJECTED_PREMATURE
A document could not be re-crawled because it is not yet ready to be re-crawled.
-
REJECTED_NOTFOUND
A document was rejected because it could not be found (e.g., no longer exists at a given location).
-
REJECTED_BAD_STATUS
A document was rejected because the status obtained when trying to obtain it was not accepted (e.g., 500 HTTP error code).
-
REJECTED_IMPORT
A document was rejected by the Importer module.
-
REJECTED_ERROR
A document was rejected because an error occurred when processing it.
-
DOCUMENT_PREIMPORTED
A document pre-import processor was executed properly.
-
DOCUMENT_IMPORTED
A document was imported.
-
DOCUMENT_POSTIMPORTED
A document post-import processor was executed properly.
-
DOCUMENT_COMMITTED_UPSERT
A document was submitted to a committer for upsert.
-
DOCUMENT_COMMITTED_DELETE
A document was submitted to a committer for removal.
-
DOCUMENT_METADATA_FETCHED
A document's metadata fields were successfully retrieved.
-
DOCUMENT_FETCHED
A document was successfully retrieved for processing.
-
DOCUMENT_QUEUED
A document reference was queued in the data store for processing.
-
DOCUMENT_PROCESSED
A document was processed (successfully or not).
-
DOCUMENT_SAVED
A document was saved.
-
-
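The field names above fall into three families by prefix: CRAWLER_, DOCUMENT_, and REJECTED_. As an illustration of how a listener might use that naming, here is a minimal sketch that tallies events per family. `EventTally` is a hypothetical helper, not part of the library, and the sketch assumes each event name is spelled like its field name (this page does not list the constant values).

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Norconex Collector Core.
// Assumption: event names look like the field names above,
// e.g. "REJECTED_FILTER".
class EventTally {
    private final Map<String, Integer> counts = new TreeMap<>();

    // Returns the family of an event name: the text before the first underscore.
    static String familyOf(String eventName) {
        int idx = eventName.indexOf('_');
        return idx < 0 ? eventName : eventName.substring(0, idx);
    }

    // Increments the counter of the family the event belongs to.
    void record(String eventName) {
        counts.merge(familyOf(eventName), 1, Integer::sum);
    }

    // Number of events recorded for a family, or 0 if none were seen.
    int count(String family) {
        return counts.getOrDefault(family, 0);
    }

    public static void main(String[] args) {
        EventTally tally = new EventTally();
        tally.record("CRAWLER_RUN_BEGIN");
        tally.record("DOCUMENT_FETCHED");
        tally.record("REJECTED_FILTER");
        tally.record("REJECTED_NOTFOUND");
        System.out.println("Rejected so far: " + tally.count("REJECTED"));
    }
}
```

In a real listener, the name passed to `record` would presumably come from the inherited `getName()` method.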
Method Details
-
getCrawlDocInfo
Gets the crawl data holding contextual information about the crawled reference. CRAWLER_* events return null crawl data.
- Returns:
- crawl data
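Since CRAWLER_* events return null here, callers may want to guard the result before using it. The sketch below shows one way to do that with `Optional`. It is self-contained for illustration: `CrawlDataEvent` is a hypothetical stand-in for the two accessors used, and `Object` is a placeholder because the actual return type is not shown on this page.

```java
import java.util.Optional;

// Hypothetical stand-in for the part of CrawlerEvent used in this sketch,
// so it compiles without the library. Object is a placeholder return type.
interface CrawlDataEvent {
    String getName();
    Object getCrawlDocInfo();
}

class CrawlDataGuard {
    // Describes an event, guarding against the null crawl data
    // that CRAWLER_* events return.
    static String describe(CrawlDataEvent event) {
        return Optional.ofNullable(event.getCrawlDocInfo())
                .map(info -> event.getName() + " -> " + info)
                .orElse(event.getName() + " (no crawl data)");
    }

    // Small factory used only to build demo events.
    static CrawlDataEvent event(String name, Object info) {
        return new CrawlDataEvent() {
            @Override public String getName() { return name; }
            @Override public Object getCrawlDocInfo() { return info; }
        };
    }

    public static void main(String[] args) {
        System.out.println(describe(event("CRAWLER_INIT_BEGIN", null)));
        System.out.println(describe(event("DOCUMENT_FETCHED", "some-reference")));
    }
}
```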
-
getSubject
Gets the subject, that is, any other relevant source related to the event.
- Returns:
- the subject
-
getSource
-
isCrawlerShutdown
public boolean isCrawlerShutdown()
-
equals
-
hashCode
public int hashCode()
-
toString
-