Package com.norconex.importer.parser
Class OCRConfig
java.lang.Object
    com.norconex.importer.parser.OCRConfig
public class OCRConfig extends Object
OCR configuration details. OCR relies on the open-source Tesseract OCR product, which must already be installed on your system.
Since 2.10.0, it is recommended to specify the full path to the Tesseract executable file (as opposed to its installation directory).
- Since: 2.1.0
- Author: Pascal Essiembre
-
Constructor Summary
Constructor    Description
OCRConfig()    Constructor.
-
Method Summary
Modifier and Type    Method                                  Description
boolean              equals(Object other)
String               getContentTypes()                       Gets the regular expression matching content types to restrict OCR to.
String               getLanguages()                          Gets languages to use by OCR.
String               getPath()                               Gets the Tesseract OCR engine executable file path.
int                  hashCode()
boolean              isEmpty()
void                 setContentTypes(String contentTypes)    Sets the regular expression matching content types to restrict OCR to.
void                 setLanguages(String languages)          Sets languages to use by OCR.
void                 setPath(String path)                    Sets the Tesseract OCR engine executable file path.
String               toString()
-
Method Detail
-
getPath
public String getPath()
Gets the Tesseract OCR engine executable file path.
- Returns: path
-
setPath
public void setPath(String path)
Sets the Tesseract OCR engine executable file path.
- Parameters: path - Tesseract executable file path (recommended since 2.10.0 over the installation directory)
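As a rough illustration of the recommendation to supply an executable file path rather than an installation directory, the sketch below checks whether a value names an existing file before it would be passed to setPath. The class TesseractPathCheck and its method pointsToFile are hypothetical helpers, not part of this API, and the check itself is only an assumption about what a caller may want to verify.

```java
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TesseractPathCheck {

    // Hypothetical helper (not part of OCRConfig): returns true only when
    // the value names an existing regular file. Null, blank, missing,
    // malformed, and directory paths all yield false.
    static boolean pointsToFile(String path) {
        if (path == null || path.trim().isEmpty()) {
            return false;
        }
        try {
            Path p = Paths.get(path);
            return Files.isRegularFile(p);
        } catch (InvalidPathException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The JVM temp directory is a directory, so it is not accepted
        // as an executable file path by this check.
        String candidate = System.getProperty("java.io.tmpdir");
        System.out.println(pointsToFile(candidate));
    }
}
```

A caller could run such a check first and fall back to logging a warning when a directory is supplied; whether OCRConfig itself validates the value is not stated here.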
-
getLanguages
public String getLanguages()
Gets the languages used by OCR.
- Returns: languages
-
setLanguages
public void setLanguages(String languages)
Sets the languages used by OCR.
- Parameters: languages - languages used by OCR
-
getContentTypes
public String getContentTypes()
Gets the regular expression matching content types to restrict OCR to.
- Returns: content types
-
setContentTypes
public void setContentTypes(String contentTypes)
Sets the regular expression matching content types to restrict OCR to.
- Parameters: contentTypes - content types
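To illustrate how a content-type regular expression could restrict OCR, here is a minimal sketch using java.util.regex. The class OcrContentTypeFilter, the method shouldOcr, and the sample expression are hypothetical. Whether consumers of OCRConfig apply a full match, and how they treat an unset value, are assumptions rather than documented behavior.

```java
import java.util.regex.Pattern;

public class OcrContentTypeFilter {

    // Hypothetical helper (not part of OCRConfig): decides whether a
    // document's content type passes the configured regular expression.
    // Assumption: a null or empty expression means "no restriction".
    // Assumption: the expression must match the whole content type.
    static boolean shouldOcr(String contentTypesRegex, String contentType) {
        if (contentType == null) {
            return false;
        }
        if (contentTypesRegex == null || contentTypesRegex.isEmpty()) {
            return true;
        }
        return Pattern.matches(contentTypesRegex, contentType);
    }

    public static void main(String[] args) {
        // Sample expression limiting OCR to images and PDF documents.
        String regex = "image/.*|application/pdf";
        System.out.println(shouldOcr(regex, "image/png"));
        System.out.println(shouldOcr(regex, "text/html"));
    }
}
```

A value of this shape is what would be passed to setContentTypes; the expression shown is only an example.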
-
isEmpty
public boolean isEmpty()
-