Package com.norconex.importer.parser
Class OCRConfig
java.lang.Object
com.norconex.importer.parser.OCRConfig
OCR configuration details. OCR relies on the open-source Tesseract OCR product, which must already be installed on your system.
Since 2.10.0, it is recommended to specify the full path to the Tesseract executable file (as opposed to its installation directory).
- Since:
- 2.1.0
- Author:
- Pascal Essiembre
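The recommendation in the class description, to point at the Tesseract executable file rather than its installation directory, can be checked before a path is configured. The following is a minimal sketch using only the JDK; `TesseractPathCheck`, `Kind`, and `classify` are hypothetical names for illustration and are not part of Norconex Importer.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not part of Norconex Importer). */
final class TesseractPathCheck {

    /** What a candidate path turned out to be on the local file system. */
    enum Kind { EXECUTABLE_FILE, DIRECTORY, OTHER_FILE, MISSING }

    /** Classifies a candidate path before it is used as the OCR path. */
    static Kind classify(String path) {
        Path p = Paths.get(path);
        if (Files.isDirectory(p)) {
            return Kind.DIRECTORY;        // looks like an installation directory
        }
        if (!Files.exists(p)) {
            return Kind.MISSING;          // nothing at that location
        }
        if (Files.isRegularFile(p) && Files.isExecutable(p)) {
            return Kind.EXECUTABLE_FILE;  // the recommended form since 2.10.0
        }
        return Kind.OTHER_FILE;           // exists, but is not an executable file
    }

    public static void main(String[] args) {
        // Falls back to the Java home directory so the sketch runs without arguments.
        String candidate = args.length > 0 ? args[0] : System.getProperty("java.home");
        System.out.println(candidate + " -> " + classify(candidate));
    }
}
```

A path classified as `EXECUTABLE_FILE` is the kind of value the class description recommends passing to `setPath`.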
Constructor Summary
Constructors
- OCRConfig(): Constructor.
Method Summary
Modifier and Type | Method | Description
boolean | equals |
 | getContentTypes() | Gets the regular expression matching content types to restrict OCR to.
 | getLanguages() | Gets languages to use by OCR.
 | getPath() | Gets the Tesseract OCR engine executable file path.
int | hashCode() |
boolean | isEmpty() |
void | setContentTypes(String contentTypes) | Sets the regular expression matching content types to restrict OCR to.
void | setLanguages(String languages) | Sets languages to use by OCR.
void | setPath | Sets the Tesseract OCR engine executable file path.
 | toString() |
Constructor Details
-
OCRConfig
public OCRConfig()
Constructor.
Method Details
-
getPath
Gets the Tesseract OCR engine executable file path.
- Returns:
- path
-
setPath
Sets the Tesseract OCR engine executable file path.
- Parameters:
- path - Tesseract executable file path (recommended since 2.10.0) or installation path
-
getLanguages
Gets languages to use by OCR.
- Returns:
- languages
-
setLanguages
Sets languages to use by OCR.
- Parameters:
- languages - languages to use by OCR
-
getContentTypes
Gets the regular expression matching content types to restrict OCR to.
- Returns:
- content types
-
setContentTypes
Sets the regular expression matching content types to restrict OCR to.
- Parameters:
- contentTypes - content types
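To illustrate what restricting OCR with a content-type regular expression can look like, here is a minimal sketch built on `java.util.regex` only. `OcrContentTypeFilter` and `accepts` are hypothetical names, not Norconex classes. Two behaviours are assumptions, not documented facts: that the expression must match the whole content type, and that an unset or blank expression means no restriction.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration (not part of Norconex Importer). */
final class OcrContentTypeFilter {

    // null means "no restriction" (an assumption made for this sketch)
    private final Pattern pattern;

    OcrContentTypeFilter(String contentTypesRegex) {
        boolean blank = contentTypesRegex == null || contentTypesRegex.trim().isEmpty();
        this.pattern = blank ? null : Pattern.compile(contentTypesRegex);
    }

    /** Returns true if a document of this content type would be sent to OCR. */
    boolean accepts(String contentType) {
        if (pattern == null) {
            return true;
        }
        // Full-match semantics are assumed here; the library may behave differently.
        return contentType != null && pattern.matcher(contentType).matches();
    }

    public static void main(String[] args) {
        OcrContentTypeFilter filter =
                new OcrContentTypeFilter("image/.*|application/pdf");
        for (String type : new String[] {"image/png", "application/pdf", "text/html"}) {
            System.out.println(type + " -> " + filter.accepts(type));
        }
    }
}
```

An expression such as `image/.*|application/pdf` is the sort of value one might pass to `setContentTypes` to limit OCR to images and PDF files.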
-
isEmpty
public boolean isEmpty()
-
equals
-
hashCode
public int hashCode()
-
toString
-