Class OCRConfig

java.lang.Object
com.norconex.importer.parser.OCRConfig

public class OCRConfig extends Object

OCR configuration details. OCR relies on the open-source Tesseract OCR product, which must already be installed on your system.

Since 2.10.0, it is recommended to specify the full path to the Tesseract executable file (as opposed to its installation directory).
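The recommendation above can be checked before configuring OCR. The following is a minimal sketch using only the JDK; the class TesseractPathCheck, its method isExecutableFile, and the default location /usr/bin/tesseract are hypothetical names chosen for illustration and are not part of this API.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the Importer API.
final class TesseractPathCheck {

    private TesseractPathCheck() {
    }

    // True only when the path is an existing regular file that the
    // current process is allowed to execute. Directories return false.
    static boolean isExecutableFile(Path path) {
        return Files.isRegularFile(path) && Files.isExecutable(path);
    }

    public static void main(String[] args) {
        // Placeholder location (an assumption); pass the real path as
        // the first argument on your system.
        Path candidate = Paths.get(
                args.length > 0 ? args[0] : "/usr/bin/tesseract");
        if (isExecutableFile(candidate)) {
            // At this point the value could be handed to
            // OCRConfig.setPath(candidate.toString()).
            System.out.println("Executable file: " + candidate);
        } else {
            System.out.println("Not an executable file: " + candidate);
        }
    }
}
```

A path that points to the installation directory fails this check, which makes the mistake visible before any document is parsed.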

Since:
2.1.0
Author:
Pascal Essiembre
  • Constructor Details

    • OCRConfig

      public OCRConfig()
      Constructor.
  • Method Details

    • getPath

      public String getPath()
      Gets the Tesseract OCR engine executable file path.
      Returns:
      path
    • setPath

      public void setPath(String path)
      Sets the Tesseract OCR engine executable file path.
      Parameters:
      path - Tesseract executable file path
    • getLanguages

      public String getLanguages()
      Gets the languages to be used by OCR.
      Returns:
      languages
    • setLanguages

      public void setLanguages(String languages)
      Sets the languages to be used by OCR.
      Parameters:
      languages - the languages to be used by OCR
    • getContentTypes

      public String getContentTypes()
      Gets the regular expression matching content types to restrict OCR to.
      Returns:
      content types
    • setContentTypes

      public void setContentTypes(String contentTypes)
      Sets the regular expression matching content types to restrict OCR to.
      Parameters:
      contentTypes - content types
    • isEmpty

      public boolean isEmpty()
    • equals

      public boolean equals(Object other)
      Overrides:
      equals in class Object
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object
    • toString

      public String toString()
      Overrides:
      toString in class Object
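
The regular expression accepted by setContentTypes can be tried out in isolation before it is used. The sketch below relies only on java.util.regex and assumes the expression must match the whole content type string; how OCRConfig applies the expression internally is not stated here. The class ContentTypeRegexDemo, its matches method, and the sample expression are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the Importer API.
final class ContentTypeRegexDemo {

    private ContentTypeRegexDemo() {
    }

    // Assumption: the expression is matched against the entire
    // content type string, not just a part of it.
    static boolean matches(String regex, String contentType) {
        return Pattern.compile(regex).matcher(contentType).matches();
    }

    public static void main(String[] args) {
        // Sample expression: any image type, or PDF.
        String regex = "image/.*|application/pdf";
        String[] samples = {"image/png", "application/pdf", "text/html"};
        for (String type : samples) {
            System.out.println(type + " -> " + matches(regex, type));
        }
    }
}
```

An expression validated this way could then be passed as a string to setContentTypes.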