public class HttpCollector extends AbstractCollector
The collector is configured either with an HttpCollectorConfig instance,
or by XML configuration loaded using CollectorConfigLoader.
Instances of this class can hold several crawlers, running at once.
This is convenient when there are configuration settings to be shared amongst
crawlers. When you have many crawler jobs defined that have nothing
in common, it may be best to configure and run them separately, to facilitate
troubleshooting. There are no set rules for this; experimenting with your
target sites will help you.

Constructor and Description |
---|
HttpCollector() Creates a non-configured HTTP collector. |
HttpCollector(HttpCollectorConfig collectorConfig) Creates and configures an HTTP Collector with the provided configuration. |
Modifier and Type | Method and Description |
---|---|
protected ICrawler | createCrawler(ICrawlerConfig config) |
HttpCollectorConfig | getCollectorConfig() |
static void | main(String[] args) Invokes the HTTP Collector from the command line. |
Methods inherited from class AbstractCollector: createJobSuite, getCrawlers, getId, getJobSuite, getState, setCrawlers, start, stop
public HttpCollector()

public HttpCollector(HttpCollectorConfig collectorConfig)
Parameters:
collectorConfig - HTTP Collector configuration

public static void main(String[] args)
Parameters:
args - Invoke it once without any arguments to get a list of command-line options.

public HttpCollectorConfig getCollectorConfig()
getCollectorConfig in interface ICollector
getCollectorConfig in class AbstractCollector

protected ICrawler createCrawler(ICrawlerConfig config)
createCrawler in class AbstractCollector
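
The class description says that one collector can hold several crawlers running at once, and that this is convenient when they share configuration settings. The sketch below illustrates that idea only. It does not use the Norconex API: SharedConfigSketch, SharedSettings, CrawlerJob and runAll are hypothetical stand-in names, and the "crawl" is reduced to building a report string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in types: none of these are Norconex classes.
final class SharedConfigSketch {

    // Settings that every crawler in one collector would share (assumed fields).
    static final class SharedSettings {
        final String userAgent;
        final int delayMillis;

        SharedSettings(String userAgent, int delayMillis) {
            this.userAgent = userAgent;
            this.delayMillis = delayMillis;
        }
    }

    // One crawler job: its own id and start URL, plus the shared settings.
    static final class CrawlerJob {
        final String id;
        final String startUrl;
        final SharedSettings shared;

        CrawlerJob(String id, String startUrl, SharedSettings shared) {
            this.id = id;
            this.startUrl = startUrl;
            this.shared = shared;
        }

        // A real crawler would fetch pages here; the sketch only reports.
        String run() {
            return id + " -> " + startUrl + " as " + shared.userAgent;
        }
    }

    // Runs all jobs concurrently and returns their reports in submission order.
    static List<String> runAll(List<CrawlerJob> jobs) throws Exception {
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.max(1, jobs.size()));
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (CrawlerJob job : jobs) {
                Callable<String> task = job::run;
                pending.add(pool.submit(task));
            }
            List<String> reports = new ArrayList<>();
            for (Future<String> future : pending) {
                reports.add(future.get()); // waits for that job to finish
            }
            return reports;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        SharedSettings shared = new SharedSettings("sketch-bot", 200);
        List<CrawlerJob> jobs = Arrays.asList(
                new CrawlerJob("docs", "https://example.com/docs", shared),
                new CrawlerJob("blog", "https://example.com/blog", shared));
        for (String report : runAll(jobs)) {
            System.out.println(report);
        }
    }
}
```

Both jobs receive the same SharedSettings object, so a change to the shared values reaches every crawler. Jobs with nothing in common gain little from this grouping, which matches the advice above to configure and run unrelated crawlers separately.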
Copyright © 2009–2021 Norconex Inc. All rights reserved.