usage: importer[.bat|.sh]
 -c,--config <arg>            Optional: Importer XML configuration file.
 -e,--contentEncoding <arg>   Optional: The content encoding (charset) of
                              the input file.
 -i,--inputFile <arg>         File to be imported (required unless
                              "checkcfg" is used).
 -k,--checkcfg                Validates XML configuration. When combined
                              with -i, prevents execution on configuration
                              error.
 -o,--outputFile <arg>        Optional: File where the imported content
                              will be stored.
 -r,--reference <arg>         Optional: Alternate unique qualifier for the
                              input file (e.g. URL).
 -t,--contentType <arg>       Optional: The MIME Content-type of the input
                              file.
 -v,--variables <arg>         Optional: variable file.
The above Importer launch script is found in the root directory of your installation (where you extracted the Zip file you downloaded). Refer to the Configuration page for documentation on all configuration options. Refer to ConfigurationLoader Javadoc for details on the optional variables file.
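If you need to call the launch script from another Java program, the options listed in the usage text can be assembled into a command line. The following is only a minimal sketch: the class name ImporterCommand, its build method, and the file paths are hypothetical and are not part of the Importer distribution. Only the flags (-i, -o, -c, -t) come from the usage text above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Importer: assembles launch-script arguments.
class ImporterCommand {

    /** Builds the argument list; optional values passed as null are skipped. */
    static List<String> build(String script, String inputFile,
            String outputFile, String configFile, String contentType) {
        List<String> cmd = new ArrayList<>();
        cmd.add(script);
        addOption(cmd, "-i", inputFile);    // file to be imported
        addOption(cmd, "-o", outputFile);   // where the imported content is stored
        addOption(cmd, "-c", configFile);   // Importer XML configuration file
        addOption(cmd, "-t", contentType);  // MIME Content-type of the input file
        return cmd;
    }

    private static void addOption(List<String> cmd, String flag, String value) {
        if (value != null) {
            cmd.add(flag);
            cmd.add(value);
        }
    }

    public static void main(String[] args) {
        // Paths below are placeholders chosen for illustration only.
        List<String> cmd = build("./importer.sh", "/tmp/in.pdf",
                "/tmp/out.txt", "importer-config.xml", null);
        System.out.println(String.join(" ", cmd));
        // To actually launch the script, something like this could be used:
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Passing the arguments as a list rather than a single string avoids quoting problems when file names contain spaces.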
If you are using Maven, simply add the project dependency to your pom.xml.
If you are not using Maven, add all JAR files found in the "lib" folder of your installation to your application classpath. Configure the Importer class by passing it an ImporterConfig. You can build the configuration in Java, or load it from an XML configuration file using the ImporterConfigLoader class. Below is a sample usage:
    /* XML configuration: */
    //ImporterConfig config = ImporterConfigLoader.loadImporterConfig(
    //        myXMLFile, myVariableFile);

    /* Java configuration: */
    ImporterConfig config = new ImporterConfig();
    config.setTaggers(new IDocumentTagger[] {/* taggers here */});
    config.setTransformers(new IDocumentTransformer[] {/* transformers here */});
    config.setFilters(new IDocumentFilter[] {/* filters here */});

    Importer importer = new Importer(config);
    File inputFile = ... // the file to be converted
    File outputFile = ... // the file that will contain the extracted text
    Metadata metadata = new Metadata();

    // importDocument returns false when the document is rejected
    boolean accepted = importer.importDocument(inputFile, outputFile, metadata);
    if (accepted) {
        System.out.println("File was imported to: " + outputFile);
    } else {
        System.out.println("File was rejected.");
    }
Refer to the Importer Javadoc for more documentation or the Configuration page for XML configuration options.
To create your own feature implementations, create a new Java project in your favourite IDE. Use Maven or add to your classpath all the files contained in the lib folder of the Importer installation. Configure your project so that its binary output directory is the classes folder of the Importer. Code compiled into the classes folder is then picked up automatically by the Importer when you run it.
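To give a feel for what a small custom implementation might look like, here is a rough, self-contained sketch of a tagger that adds a constant field to a document's metadata. Note that SketchTagger is a hypothetical stand-in interface written for this example; it is NOT the Importer's IDocumentTagger, whose actual method signature should be taken from the Importer Javadoc. The class names, field name, and reference value are likewise assumptions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in interface; NOT the Importer's IDocumentTagger.
interface SketchTagger {
    void tag(String reference, Map<String, List<String>> metadata);
}

// Adds one constant field value to the metadata of every document it sees.
class ConstantFieldTagger implements SketchTagger {
    private final String field;
    private final String value;

    ConstantFieldTagger(String field, String value) {
        this.field = field;
        this.value = value;
    }

    @Override
    public void tag(String reference, Map<String, List<String>> metadata) {
        // Metadata fields are multi-valued here, so append rather than overwrite.
        metadata.computeIfAbsent(field, k -> new ArrayList<>()).add(value);
    }
}

class TaggerDemo {
    public static void main(String[] args) {
        Map<String, List<String>> metadata = new HashMap<>();
        // The reference and field values are placeholders for illustration.
        new ConstantFieldTagger("source", "intranet")
                .tag("file:///tmp/report.pdf", metadata);
        System.out.println(metadata);
    }
}
```

A real implementation would implement the Importer's own interface instead of the stand-in and be compiled into the classes folder as described above.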