Package com.norconex.collector.http
Class HttpCollector
java.lang.Object
com.norconex.collector.core.Collector
com.norconex.collector.http.HttpCollector
Main application class.
Instances of this class can hold several crawlers, running at once.
This is convenient when there are configuration settings to be shared among
crawlers. When you have many crawler jobs defined that have nothing
in common, it may be best to configure and run them separately, to facilitate
troubleshooting. There are no set rules for this; experimenting with your
target sites will help you.
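The idea of one collector holding several crawlers that share settings can be illustrated with a minimal, self-contained sketch. Every class below (SharedSettings, CrawlerSketch, CollectorSketch) is a hypothetical stand-in written for this illustration and is not part of the Norconex API; the field names userAgent and maxDepth are assumptions as well. With the real library, the configuration would instead be passed to the HttpCollector(HttpCollectorConfig) constructor and the collector launched with the inherited start() method.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for settings shared by every crawler (not a Norconex class).
final class SharedSettings {
    final String userAgent;
    final int maxDepth;

    SharedSettings(String userAgent, int maxDepth) {
        this.userAgent = userAgent;
        this.maxDepth = maxDepth;
    }
}

// Hypothetical stand-in for one crawler job; it only records what it would crawl.
final class CrawlerSketch implements Runnable {
    private final String id;
    private final String startUrl;
    private final SharedSettings shared;
    private final List<String> log;

    CrawlerSketch(String id, String startUrl, SharedSettings shared, List<String> log) {
        this.id = id;
        this.startUrl = startUrl;
        this.shared = shared;
        this.log = log;
    }

    String id() {
        return id;
    }

    @Override
    public void run() {
        log.add(id + " -> " + startUrl + " (agent=" + shared.userAgent
                + ", maxDepth=" + shared.maxDepth + ")");
    }
}

// Hypothetical stand-in for a collector that holds several crawlers.
final class CollectorSketch {
    private final SharedSettings shared;
    private final List<CrawlerSketch> crawlers = new ArrayList<>();
    private final List<String> log = Collections.synchronizedList(new ArrayList<>());

    CollectorSketch(SharedSettings shared) {
        this.shared = shared;
    }

    // Every crawler added here receives the same shared settings instance.
    void addCrawler(String id, String startUrl) {
        crawlers.add(new CrawlerSketch(id, startUrl, shared, log));
    }

    // Starts one thread per crawler, waits for all of them, returns sorted log lines.
    List<String> start() {
        List<Thread> threads = new ArrayList<>();
        for (CrawlerSketch crawler : crawlers) {
            Thread thread = new Thread(crawler, crawler.id());
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for crawlers", e);
            }
        }
        List<String> lines = new ArrayList<>(log);
        Collections.sort(lines);
        return lines;
    }

    public static void main(String[] args) {
        CollectorSketch collector = new CollectorSketch(new SharedSettings("sketch-agent", 2));
        collector.addCrawler("crawler-a", "https://example.com/a");
        collector.addCrawler("crawler-b", "https://example.com/b");
        collector.start().forEach(System.out::println);
    }
}
```

In this sketch, crawlers with nothing in common would simply go into separate CollectorSketch instances, which mirrors the advice above to configure and run unrelated crawler jobs separately when troubleshooting matters more than shared settings.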
Author:
Pascal Essiembre
Field Summary
Fields inherited from class com.norconex.collector.core.Collector
NORCONEX_ASCII
Constructor Summary
Constructors
Constructor                                         Description
HttpCollector()                                     Creates a non-configured HTTP collector.
HttpCollector(HttpCollectorConfig collectorConfig)  Creates and configures an HTTP Collector with the provided configuration.
Method Summary
Modifier and Type   Method                              Description
protected Crawler   createCrawler(CrawlerConfig config)
static void         main(String[] args)                 Invokes the HTTP Collector from the command line.
Methods inherited from class com.norconex.collector.core.Collector
clean, destroyCollector, exportDataStore, fireStopRequest, get, getCrawlers, getEventManager, getId, getReleaseVersions, getStreamFactory, getTempDir, getVersion, getWorkDir, importDataStore, initCollector, isRunning, lock, start, stop, toString, unlock
Constructor Details
HttpCollector
public HttpCollector()
Creates a non-configured HTTP collector.
HttpCollector
public HttpCollector(HttpCollectorConfig collectorConfig)
Creates and configures an HTTP Collector with the provided configuration.
Parameters:
collectorConfig - HTTP Collector configuration
Method Details
main
Invokes the HTTP Collector from the command line.
Parameters:
args - Invoke it once without any arguments to get a list of command-line options.
getCollectorConfig
Overrides:
getCollectorConfig in class Collector
createCrawler
Specified by:
createCrawler in class Collector