Package com.norconex.commons.lang.io
Class TextReader
- java.lang.Object
-
- java.io.Reader
-
- com.norconex.commons.lang.io.TextReader
-
- All Implemented Interfaces:
Closeable, AutoCloseable, Readable
public class TextReader extends Reader
Reads text from an input stream, splitting it wisely whenever the text is too large. It first tries to split after the last paragraph. If there are no paragraphs, it tries to split after the last sentence. If no sentence can be detected, it splits on the last word. If no words are found, it returns all it could read up to the maximum read size in characters. The default maximum number of characters to be read before splitting is 10 million. Passing -1 as the maxReadSize disables reading in batches and reads the entire text all at once.
- Since:
- 1.6.0
- Author:
- Pascal Essiembre
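The fallback order described above (paragraph, then sentence, then word, then a hard cut at the maximum read size) can be sketched in a few lines. This is only an illustration of the idea, not the TextReader source: the class TextSplitSketch and its methods splitIndex and chunk are hypothetical names, and the delimiter rules are assumptions (a paragraph ends with a blank line, a sentence ends with '.', '!' or '?' followed by whitespace, a word ends with whitespace). The sketch works on an in-memory String rather than a Reader and always keeps the trailing delimiter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the TextReader implementation.
class TextSplitSketch {

    // Returns the index just past the preferred split point of a full buffer:
    // last paragraph break, else last sentence end, else last whitespace,
    // else the end of the buffer.
    static int splitIndex(String buffer) {
        // 1. Paragraph: assumed here to end with a blank line ("\n\n").
        int idx = buffer.lastIndexOf("\n\n");
        if (idx > 0) {
            return idx + 2;
        }
        // 2. Sentence: assumed to end with '.', '!' or '?' plus whitespace.
        for (int i = buffer.length() - 2; i >= 0; i--) {
            char c = buffer.charAt(i);
            if ((c == '.' || c == '!' || c == '?')
                    && Character.isWhitespace(buffer.charAt(i + 1))) {
                return i + 2;
            }
        }
        // 3. Word: split after the last whitespace character.
        for (int i = buffer.length() - 1; i > 0; i--) {
            if (Character.isWhitespace(buffer.charAt(i))) {
                return i + 1;
            }
        }
        // 4. Nothing to split on: return everything that was read.
        return buffer.length();
    }

    // Breaks text into chunks of at most maxReadSize characters.
    // A maxReadSize of -1 returns the whole text as a single chunk.
    static List<String> chunk(String text, int maxReadSize) {
        List<String> chunks = new ArrayList<>();
        if (maxReadSize == -1) {
            chunks.add(text);
            return chunks;
        }
        if (maxReadSize < 1) {
            throw new IllegalArgumentException("maxReadSize must be -1 or positive");
        }
        int pos = 0;
        while (pos < text.length()) {
            int end = Math.min(pos + maxReadSize, text.length());
            String buffer = text.substring(pos, end);
            // Only search for a split point when more text remains.
            int cut = (end < text.length()) ? splitIndex(buffer) : buffer.length();
            chunks.add(buffer.substring(0, cut));
            pos += cut;
        }
        return chunks;
    }

    public static void main(String[] args) {
        for (String part : chunk("One two. Three four five", 12)) {
            System.out.println("[" + part + "]");
        }
    }
}
```

Because each chunk ends at the chosen split point and the next one resumes right after it, concatenating the chunks in order reproduces the original text.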
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
- static int DEFAULT_MAX_READ_SIZE
-
Constructor Summary
Constructors (Constructor, Description):
- TextReader(Reader reader): Create a new text reader, reading a maximum of 10 million characters at a time when readText() is called.
- TextReader(Reader reader, int maxReadSize): Constructor.
- TextReader(Reader reader, int maxReadSize, boolean removeTrailingDelimiter): Constructor.
-
Method Summary
Methods (Modifier and Type, Method, Description):
- void close()
- int read(char[] cbuf, int off, int len)
- String readText(): Reads the next chunk of text, up to the maximum read size specified.
Methods inherited from class java.io.Reader
mark, markSupported, nullReader, read, read, read, ready, reset, skip, transferTo
-
Field Detail
-
DEFAULT_MAX_READ_SIZE
public static final int DEFAULT_MAX_READ_SIZE
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
TextReader
public TextReader(Reader reader)
Create a new text reader, reading a maximum of 10 million characters at a time when readText() is called.
- Parameters:
reader - a Reader
-
TextReader
public TextReader(Reader reader, int maxReadSize)
Constructor.
- Parameters:
reader - a Reader
maxReadSize - maximum number of characters to read at once with readText().
-
TextReader
public TextReader(Reader reader, int maxReadSize, boolean removeTrailingDelimiter)
Constructor.
- Parameters:
reader - a Reader
maxReadSize - maximum number of characters to read at once with readText().
removeTrailingDelimiter - whether to remove the trailing delimiter
-
-
Method Detail
-
read
public int read(char[] cbuf, int off, int len) throws IOException
- Specified by:
read in class Reader
- Throws:
IOException
-
readText
public String readText() throws IOException
Reads the next chunk of text, up to the maximum read size specified. It tries as much as possible to break long text into paragraphs, sentences or words before returning. See class documentation.
- Returns:
- text read
- Throws:
IOException - problem reading text.
-
close
public void close() throws IOException
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Specified by:
close in class Reader
- Throws:
IOException
-
-