usage: collector-fs[.bat|.sh]
  -a,--action <arg>      Required: one of start|resume|stop|checkcfg
  -c,--config <arg>      Required: File System Crawler configuration file.
  -v,--variables <arg>   Optional: variable file.
  -k,--checkcfg          Validates XML configuration. When combined
                         with -a, prevents execution on configuration error.
The above File System Crawler startup script is found in the root directory of your installation (where you extracted the Zip file you downloaded). Refer to the Flow Diagram and Configuration pages for documentation on all configuration options. Refer to ConfigurationLoader Javadoc for details on the optional variables file.
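If you need to launch the startup script from another Java program, the options listed in the usage text can be assembled as an argument list. The following is only a minimal sketch: the CollectorCommand class, the script path, and the file names myconfig.xml and my.variables are hypothetical and are not part of the File System Crawler.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper (not part of the product): builds collector-fs arguments. */
final class CollectorCommand {

    // Actions accepted by -a/--action, as listed in the usage text.
    private static final List<String> ACTIONS =
            Arrays.asList("start", "resume", "stop", "checkcfg");

    private CollectorCommand() {
    }

    /**
     * Builds the command line as a list suitable for ProcessBuilder.
     *
     * @param script        path to collector-fs.sh or collector-fs.bat
     * @param action        one of start, resume, stop, checkcfg
     * @param configFile    crawler configuration file (required)
     * @param variablesFile optional variables file, may be null
     */
    static List<String> build(String script, String action,
            String configFile, String variablesFile) {
        if (!ACTIONS.contains(action)) {
            throw new IllegalArgumentException("Unsupported action: " + action);
        }
        if (configFile == null || configFile.isEmpty()) {
            throw new IllegalArgumentException("A configuration file is required.");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add(script);
        cmd.add("-a");
        cmd.add(action);
        cmd.add("-c");
        cmd.add(configFile);
        if (variablesFile != null) {
            cmd.add("-v");
            cmd.add(variablesFile);
        }
        return cmd;
    }

    public static void main(String[] args) {
        // Only prints the assembled command; nothing is executed here.
        List<String> cmd = build("./collector-fs.sh", "start", "myconfig.xml", null);
        System.out.println(String.join(" ", cmd));
    }
}
```

To actually run the script, the returned list can be handed to `new ProcessBuilder(cmd).inheritIO().start()`, with the working directory set to the installation root.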
If you are using Maven, simply add the project dependency to your pom.xml.
If you are not using Maven, you can add all JAR files found in your installation's "lib" folder to your application classpath. Configure the FilesystemCollector class by passing it a FilesystemCollectorConfig. You can build the configuration using Java, or by loading an XML configuration file using the CollectorConfigLoader class. Below is a sample usage:
/* XML configuration: */
//FilesystemCollectorConfig config = (FilesystemCollectorConfig)
//        new CollectorConfigLoader(FilesystemCollectorConfig.class)
//                .loadCollectorConfig(myXMLFile, myVariableFile);

/* Java configuration: */
// Collector-level settings.
FilesystemCollectorConfig collectorConfig = new FilesystemCollectorConfig();
collectorConfig.setId("MyFilesystemCollector");
collectorConfig.setLogsDir("/tmp/logs/");
...

// Crawler-level settings, including the paths to crawl.
FilesystemCrawlerConfig crawlerConfig = new FilesystemCrawlerConfig();
crawlerConfig.setId("MyFilesystemCrawler");
crawlerConfig.setStartPaths(
        new String[]{"/home/joe/myfiles", "/home/jack/hisfiles"});
...

// Register the crawler with the collector, then launch it.
collectorConfig.setCrawlerConfigs(crawlerConfig);
FilesystemCollector collector = new FilesystemCollector(collectorConfig);
collector.start(true);
Refer to the File System Crawler Javadoc for more documentation or the Configuration page for XML configuration options.
To create your own feature implementations, create a new Java project in your favourite IDE. Use Maven or add to your classpath all the files contained in the lib folder of the File System Crawler installation. Configure your project so that its binary output directory is the classes folder of the File System Crawler installation. Code created and stored under classes will automatically be picked up by the File System Crawler when you run it.
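Before running the crawler, it can be useful to confirm that your IDE really wrote compiled classes to the classes folder. The sketch below walks a folder and prints each compiled class by its fully qualified name. It assumes your configuration refers to custom implementations by fully qualified class name; the ClassesFolderLister class and the default folder name passed to it are hypothetical and only illustrate the idea.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper (not part of the product): lists compiled classes. */
final class ClassesFolderLister {

    private static final String SUFFIX = ".class";

    private ClassesFolderLister() {
    }

    /** Turns a path relative to the classes folder into a class name. */
    static String toClassName(Path relativeClassFile) {
        StringBuilder name = new StringBuilder();
        // Each path element is a package segment, the last one the class file.
        for (Path part : relativeClassFile) {
            if (name.length() > 0) {
                name.append('.');
            }
            name.append(part.toString());
        }
        // Drop the trailing ".class" extension.
        return name.substring(0, name.length() - SUFFIX.length());
    }

    /** Returns the sorted names of all compiled classes under classesDir. */
    static List<String> listClassNames(Path classesDir) throws IOException {
        try (Stream<Path> files = Files.walk(classesDir)) {
            return files
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(SUFFIX))
                    .map(p -> toClassName(classesDir.relativize(p)))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // The folder defaults to "classes" relative to the working directory.
        Path classesDir = Paths.get(args.length > 0 ? args[0] : "classes");
        if (!Files.isDirectory(classesDir)) {
            System.out.println("Not a directory: " + classesDir);
            return;
        }
        for (String name : listClassNames(classesDir)) {
            System.out.println(name);
        }
    }
}
```

An empty listing usually means the project's output directory still points somewhere else.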
To fetch documents using the SMB/CIFS protocol, you will need to manually download and install the following library: jcifs-1.3.17.jar. Command-line users can simply add it to the Collector's "lib" folder. Maven users can use the following:
<dependency>
  <groupId>jcifs</groupId>
  <artifactId>jcifs</artifactId>
  <version>1.3.17</version>
</dependency>
This extra step is required due to JCIFS licensing incompatibilities affecting distribution.
You should be using File System Crawler version 2.7.0 or higher for SMB/CIFS support.
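The two SMB/CIFS prerequisites above (the JCIFS JAR in the "lib" folder and a crawler version of at least 2.7.0) can be checked with a few lines of code. This is only a sketch: the SmbPrerequisites class is hypothetical, and the version string and "lib" path must be supplied by you, since the sketch does not ask the crawler for its version.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not part of the product): checks SMB/CIFS prerequisites. */
final class SmbPrerequisites {

    // Values taken from the text above.
    static final String JCIFS_JAR = "jcifs-1.3.17.jar";
    static final String MIN_VERSION = "2.7.0";

    private SmbPrerequisites() {
    }

    /**
     * Compares dotted numeric versions such as "2.7.0" and "2.10.1".
     * A qualifier after a dash (for example "-SNAPSHOT") is ignored.
     * Returns a negative number, zero, or a positive number.
     */
    static int compareVersions(String a, String b) {
        String[] x = a.split("-")[0].split("\\.");
        String[] y = b.split("-")[0].split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            // Missing segments count as zero, so "2.7" equals "2.7.0".
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** True when the given crawler version is 2.7.0 or higher. */
    static boolean supportsSmb(String crawlerVersion) {
        return compareVersions(crawlerVersion, MIN_VERSION) >= 0;
    }

    /** True when the JCIFS JAR is present in the given "lib" folder. */
    static boolean hasJcifsJar(Path libDir) {
        return Files.isRegularFile(libDir.resolve(JCIFS_JAR));
    }

    public static void main(String[] args) {
        // Both arguments are placeholders to be supplied by the caller.
        String version = args.length > 0 ? args[0] : MIN_VERSION;
        Path libDir = Paths.get(args.length > 1 ? args[1] : "lib");
        System.out.println("Version supports SMB/CIFS: " + supportsSmb(version));
        System.out.println("JCIFS JAR found: " + hasJcifsJar(libDir));
    }
}
```

Segments are compared numerically rather than as text, so that 2.10.0 is correctly treated as newer than 2.7.0.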