Class TextReader

  • All Implemented Interfaces:
    Closeable, AutoCloseable, Readable

    public class TextReader
    extends Reader
    Reads text from an input stream, splitting it at a natural boundary whenever the text is too large. It first tries to split after the last paragraph. If there are no paragraphs, it tries to split after the last sentence. If no sentence can be detected, it splits on the last word. If no words are found, it returns everything it could read, up to the maximum read size in characters. The default maximum number of characters read before splitting is 10 million. Passing -1 as the maxReadSize disables batch reading and reads the entire text at once.
    Since:
    1.6.0
    Author:
    Pascal Essiembre
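
The fallback order described above (paragraph, then sentence, then word, then everything read) can be sketched as follows. This is only an illustration, not the library's implementation: the class name SplitSketch, its methods, and the delimiter patterns are all assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration of the split fallback order; not the library's code. */
final class SplitSketch {
    // Assumed delimiters: a blank line ends a paragraph; '.', '!' or '?'
    // followed by whitespace ends a sentence; any whitespace ends a word.
    private static final Pattern PARAGRAPH = Pattern.compile("\\n\\s*\\n");
    private static final Pattern SENTENCE = Pattern.compile("[.!?]\\s+");
    private static final Pattern WORD = Pattern.compile("\\s+");

    /** Returns the index just past the last match of p in text, or -1 if there is none. */
    static int lastEnd(Pattern p, CharSequence text) {
        Matcher m = p.matcher(text);
        int end = -1;
        while (m.find()) {
            end = m.end();
        }
        return end;
    }

    /** Picks where to cut a full buffer: paragraph, then sentence, then word, else the whole buffer. */
    static int splitIndex(CharSequence buffer) {
        for (Pattern p : new Pattern[] {PARAGRAPH, SENTENCE, WORD}) {
            int end = lastEnd(p, buffer);
            if (end > 0) {
                return end;
            }
        }
        // No delimiter found: return everything that was read.
        return buffer.length();
    }
}
```

In this sketch, buffer.subSequence(0, SplitSketch.splitIndex(buffer)) would be the chunk to return and the remainder would be kept for the next read. For the input "Hello world. Foo bar" the cut lands just after "Hello world. ".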
    • Field Detail

    • Constructor Detail

      • TextReader

        public TextReader(Reader reader)
        Creates a new text reader that reads a maximum of 10 million characters at a time when readText() is called.
        Parameters:
        reader - a Reader
      • TextReader

        public TextReader(Reader reader,
                          int maxReadSize)
        Creates a new text reader that reads at most maxReadSize characters at a time when readText() is called.
        Parameters:
        reader - a Reader
        maxReadSize - maximum number of characters to read at once with readText(), or -1 to read the entire text at once
      • TextReader

        public TextReader(Reader reader,
                          int maxReadSize,
                          boolean removeTrailingDelimiter)
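        Creates a new text reader that reads at most maxReadSize characters at a time when readText() is called, optionally removing the trailing delimiter.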
        Parameters:
        reader - a Reader
        maxReadSize - maximum number of characters to read at once with readText(), or -1 to read the entire text at once
        removeTrailingDelimiter - whether to remove the trailing delimiter
    • Method Detail

      • readText

        public String readText()
                        throws IOException
        Reads the next chunk of text, up to the maximum read size specified. Whenever possible, it breaks long text after a paragraph, sentence, or word before returning. See the class documentation.
        Returns:
        text read
        Throws:
        IOException - problem reading text.
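
The batch reading that readText() describes can be pictured with the rough sketch below. ChunkSketch and readChunk are hypothetical names, not part of this API. Two simplifications are assumed: only the word-level split is shown, and null is returned once the input is exhausted, which is a convention chosen for the example because the documentation above does not state what readText() returns at that point.

```java
import java.io.IOException;
import java.io.Reader;

/** Hypothetical stand-in for a chunking reader; not the library's implementation. */
final class ChunkSketch {
    private final Reader reader;
    private final int maxReadSize;
    // Text read so far but not yet returned, including leftovers from the previous chunk.
    private final StringBuilder carry = new StringBuilder();
    private boolean eof;

    ChunkSketch(Reader reader, int maxReadSize) {
        this.reader = reader;
        this.maxReadSize = maxReadSize;
    }

    /** Returns the next chunk, or null once the input is exhausted (an assumed convention). */
    String readChunk() throws IOException {
        char[] buf = new char[1024];
        // Fill the buffer up to maxReadSize characters; a negative size means "no limit".
        while (!eof && (maxReadSize < 0 || carry.length() < maxReadSize)) {
            int want = maxReadSize < 0
                    ? buf.length
                    : Math.min(buf.length, maxReadSize - carry.length());
            int n = reader.read(buf, 0, want);
            if (n < 0) {
                eof = true;
            } else {
                carry.append(buf, 0, n);
            }
        }
        if (carry.length() == 0) {
            return null;
        }
        int cut = carry.length();
        if (!eof) {
            // Simplification: cut after the last space; paragraph and sentence checks are omitted.
            int lastSpace = carry.lastIndexOf(" ");
            if (lastSpace >= 0) {
                cut = lastSpace + 1;
            }
        }
        String chunk = carry.substring(0, cut);
        carry.delete(0, cut);
        return chunk;
    }
}
```

A caller would typically loop on readChunk() until it returns null, processing each chunk in turn. With a maximum of -1 the loop body runs once, because the whole text comes back in a single chunk.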