Class CachedInputStream

java.lang.Object
java.io.InputStream
com.norconex.commons.lang.io.CachedInputStream
All Implemented Interfaces:
ICachedStream, Closeable, AutoCloseable

public class CachedInputStream extends InputStream implements ICachedStream

InputStream wrapper that can be re-read any number of times. This class caches the content of the wrapped input stream the first time it is read, and subsequent reads use the cache.

To create new instances of CachedInputStream, use the CachedStreamFactory class. Reusing the same factory ensures that all CachedInputStream instances it creates share the same combined maximum memory. Invoking one of the newInputStream(...) methods on this class has the same effect.

To re-use this InputStream, you must first call rewind() on it. Once the stream has been fully read, read calls return -1 as expected, and keep returning -1 until you rewind or dispose.

Reading the stream again starts from the first byte, (re)using the internal cache.
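The read, end-of-stream, and rewind cycle described above can be sketched with a small memory-only wrapper. RewindableStreamSketch and its readTwice helper are hypothetical names used for illustration only; this is not the Norconex implementation, which also handles file-based caching and disposal. The sketch assumes Java 9 or later for readAllBytes().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical, memory-only stand-in; NOT the Norconex implementation. */
final class RewindableStreamSketch extends InputStream {
    private final InputStream source;   // wrapped stream, consumed only once
    private final ByteArrayOutputStream cache = new ByteArrayOutputStream();
    private byte[] cached;               // set once the source is exhausted
    private int pos;                     // read position within the cache

    RewindableStreamSketch(InputStream source) {
        this.source = source;
    }

    @Override
    public int read() throws IOException {
        if (cached != null) {
            // Replay from the cache; stays at -1 until rewind() is called.
            return pos < cached.length ? (cached[pos++] & 0xFF) : -1;
        }
        int b = source.read();
        if (b == -1) {
            cached = cache.toByteArray();
            pos = cached.length;
            return -1;
        }
        cache.write(b);
        return b;
    }

    /** Drains whatever is left of the source, then restarts from byte 0. */
    void rewind() throws IOException {
        while (cached == null) {
            read();
        }
        pos = 0;
    }

    /** Reads the content fully, probes the end, rewinds, and reads again. */
    static String readTwice(String text) {
        try {
            RewindableStreamSketch in = new RewindableStreamSketch(
                    new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
            String first = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            int atEnd = in.read();   // still at end of stream before rewind
            in.rewind();
            String second = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            return first + "|" + atEnd + "|" + second;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The second read is served entirely from the byte array, so the wrapped stream is never touched again after it has been exhausted.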

Calling InputStream.close() has no effect, and the cached data remains available for subsequent reads.

To explicitly dispose of resources allocated to the cache, you can use the dispose() method. Attempting to read a disposed instance will throw an IOException. It is recommended you explicitly dispose of CachedInputStream instances to speed up the release of resources. Otherwise, resources are de-allocated automatically when the instance is finalized.
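The difference between close() and dispose() can be illustrated with a simplified stand-in. DisposableStreamSketch is a hypothetical class written for this example; it only mirrors the contract stated above (close() is a no-op, reading after dispose() throws IOException) and makes no claim about how the library releases its resources.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical stand-in illustrating the close()/dispose() contract only. */
final class DisposableStreamSketch extends InputStream {
    private byte[] cache;   // becomes null once disposed
    private int pos;

    DisposableStreamSketch(byte[] content) {
        this.cache = content.clone();
    }

    @Override
    public int read() throws IOException {
        if (cache == null) {
            throw new IOException("Stream was disposed.");
        }
        return pos < cache.length ? (cache[pos++] & 0xFF) : -1;
    }

    @Override
    public void close() {
        // Intentionally a no-op: the cache stays available after close().
    }

    void rewind() {
        pos = 0;
    }

    void dispose() {
        cache = null;   // release the cached content
    }

    /** close() followed by rewind() still allows reading the first byte. */
    static boolean readableAfterClose() {
        try {
            DisposableStreamSketch in = new DisposableStreamSketch(new byte[] {1, 2});
            in.read();
            in.close();
            in.rewind();
            return in.read() == 1;
        } catch (IOException e) {
            return false;
        }
    }

    /** Reading after dispose() is expected to fail with an IOException. */
    static boolean failsAfterDispose() {
        DisposableStreamSketch in = new DisposableStreamSketch(new byte[] {1, 2});
        in.dispose();
        try {
            in.read();
            return false;
        } catch (IOException e) {
            return true;
        }
    }
}
```

With the real class, calling dispose() in a finally block once the stream is no longer needed is one way to get the prompt release recommended above.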

The internal cache stores read bytes in memory, up to the configured maximum cache size. If content exceeds the cache limit, the cache transforms itself into a fast file-based cache of unlimited size. Default memory cache size is 128 KB.
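The switch from memory to file can be pictured with the sketch below. SpillingCacheSketch, its 8-byte limit, and the temporary file prefix are all assumptions made for the demonstration; the library's actual thresholds, file naming, and buffering strategy are not shown in this page.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in: buffers in memory, then spills to a temp file. */
final class SpillingCacheSketch {
    private final int maxMemoryBytes;
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();
    private Path file;              // non-null once the cache has spilled
    private OutputStream fileOut;

    SpillingCacheSketch(int maxMemoryBytes) {
        this.maxMemoryBytes = maxMemoryBytes;
    }

    void write(byte[] data) throws IOException {
        if (file == null && memory.size() + data.length > maxMemoryBytes) {
            // Limit exceeded: move what was buffered so far to disk.
            file = Files.createTempFile("cache-sketch-", ".tmp");
            fileOut = Files.newOutputStream(file);
            memory.writeTo(fileOut);
            memory = null;
        }
        if (file == null) {
            memory.write(data, 0, data.length);
        } else {
            fileOut.write(data);
        }
    }

    boolean isInMemory() {
        return file == null;
    }

    /** Closes and deletes the temp file, if one was created. */
    void dispose() throws IOException {
        if (fileOut != null) {
            fileOut.close();
        }
        if (file != null) {
            Files.deleteIfExists(file);
        }
        memory = null;
    }

    /** Returns isInMemory() before and after crossing an 8-byte limit. */
    static boolean[] demo() {
        try {
            SpillingCacheSketch cache = new SpillingCacheSketch(8);
            cache.write(new byte[6]);
            boolean before = cache.isInMemory();
            cache.write(new byte[6]);
            boolean after = cache.isInMemory();
            cache.dispose();
            return new boolean[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The isInMemory() method documented below reports which of the two modes a real instance is currently in.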

Starting with 1.6.0, mark(int) is supported. The mark limit is always unlimited so the method argument is ignored.
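The mark and reset semantics can be tried out with the JDK's ByteArrayInputStream, which happens to share the two behaviors documented here: the mark limit argument has no meaning, and reset() without a prior mark returns to the beginning. The sketch uses it purely as a stand-in, on the assumption that CachedInputStream behaves as its documentation states; MarkResetSketch is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Illustrates unlimited mark and reset-to-start using a JDK stream. */
final class MarkResetSketch {

    static int[] demo() {
        try {
            InputStream in = new ByteArrayInputStream(new byte[] {10, 20, 30, 40});
            in.read();                          // consumes 10
            in.reset();                         // no mark set: back to the start
            int afterBareReset = in.read();     // reads 10 again
            in.mark(0);                         // argument ignored; mark before 20
            in.read();                          // consumes 20
            in.read();                          // consumes 30, past a limit of 0
            in.reset();                         // back to the marked position
            int afterMarkedReset = in.read();   // reads 20 again
            return new int[] {afterBareReset, afterMarkedReset};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```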

Since:
1.5.0
Author:
Pascal Essiembre
  • Method Details

    • cache

      public static CachedInputStream cache(InputStream is, CachedStreamFactory streamFactory)
      Casts to CachedInputStream if the argument is already of that type; otherwise creates a new CachedInputStream from the input stream argument using the given stream factory (or defaults if null).
      Parameters:
      is - input stream
      streamFactory - a stream factory
      Returns:
      a cached input stream
      Since:
      2.0.0
    • cache

      public static CachedInputStream cache(InputStream is)
      Casts to CachedInputStream if the argument is already of that type; otherwise creates a new CachedInputStream from the input stream argument using default CachedStreamFactory settings.
      Parameters:
      is - input stream
      Returns:
      a cached input stream
      Since:
      2.0.0
    • markSupported

      public boolean markSupported()
      Always true since 1.6.0.
      Overrides:
      markSupported in class InputStream
      Returns:
      true
    • mark

      public void mark(int readlimit)
      The read limit value is ignored; the mark limit is always unlimited. Supported since 1.6.0.
      Overrides:
      mark in class InputStream
      Parameters:
      readlimit - any value (ignored)
    • reset

      public void reset() throws IOException
      If no mark has previously been set, it resets to the beginning. Supported since 1.6.0.
      Overrides:
      reset in class InputStream
      Throws:
      IOException
    • isInMemory

      public boolean isInMemory()
      Whether the content read so far by this instance is cached in memory. Otherwise, file-based caching is used.
      Returns:
      true if caching is in memory.
    • isEmpty

      public boolean isEmpty() throws IOException
      Returns true if this input stream is empty (zero-length content). Unless the stream has been fully read at least once, this method is more efficient than checking if length() is zero.
      Returns:
      true if empty
      Throws:
      IOException - problem checking if empty
      Since:
      2.0.0
    • read

      public int read() throws IOException
      Specified by:
      read in class InputStream
      Throws:
      IOException
    • read

      public int read(byte[] b, int off, int len) throws IOException
      Overrides:
      read in class InputStream
      Throws:
      IOException
    • enforceFullCaching

      public void enforceFullCaching() throws IOException
      If not already fully cached, forces the inner input stream to be fully cached.
      Throws:
      IOException - could not enforce full caching
    • rewind

      public void rewind()
      Rewinds this stream so it can be read again from the beginning. If this input stream was not fully read at least once, it will be fully read first, so its entirety is cached properly.
    • dispose

      public void dispose() throws IOException
      Throws:
      IOException
    • available

      public int available() throws IOException
      Overrides:
      available in class InputStream
      Throws:
      IOException
    • getCacheDirectory

      public final Path getCacheDirectory()
      Gets the cache directory where temporary cache files are created.
      Specified by:
      getCacheDirectory in interface ICachedStream
      Returns:
      the cache directory
    • isCacheEmpty

      public boolean isCacheEmpty()
      Returns true if there was nothing to cache (no writing was performed) or if the stream was closed.
      Returns:
      true if empty
    • isDisposed

      public boolean isDisposed()
    • getMemCacheSize

      public long getMemCacheSize()
      Specified by:
      getMemCacheSize in interface ICachedStream
    • length

      public int length()

      Gets the length of the cached input stream. The length represents the number of bytes that were read from this input stream, after it was read entirely at least once.

      Note: Invoking this method when this stream is only partially read (on a first read) will force it to read entirely and cache the inner input stream it wraps. To prevent an unnecessary read cycle, it is always best to invoke this method after this stream was fully read through normal use first.

      Returns:
      the byte length
      Since:
      1.6.1
    • newInputStream

      public CachedInputStream newInputStream(Path file)
      Creates a new CachedInputStream using the same factory settings that were used to create this instance.
      Parameters:
      file - file to create the input stream from
      Returns:
      cached input stream
    • newInputStream

      public CachedInputStream newInputStream(InputStream is)
      Creates a new CachedInputStream using the same factory settings that were used to create this instance.
      Parameters:
      is - input stream
      Returns:
      cached input stream
    • getStreamFactory

      public CachedStreamFactory getStreamFactory()
    • finalize

      protected void finalize() throws Throwable
      Overrides:
      finalize in class Object
      Throws:
      Throwable
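
The isEmpty() and length() entries above note that an emptiness check can be cheaper than computing the length of a stream that has not been fully read. One plausible reason, sketched here with the JDK's PushbackInputStream, is that emptiness can be decided by peeking at a single byte, whereas a length requires reading everything. EmptyCheckSketch and its methods are hypothetical; this page does not describe how the library actually implements isEmpty().

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;

/** Hypothetical sketch: decide emptiness by peeking at one byte. */
final class EmptyCheckSketch {

    /** Reads one byte and pushes it back, so no content is lost. */
    static boolean isEmpty(PushbackInputStream in) throws IOException {
        int b = in.read();
        if (b == -1) {
            return true;
        }
        in.unread(b);
        return false;
    }

    /** Returns the emptiness flag and the next byte read after the check. */
    static String probe(byte[] content) {
        try {
            PushbackInputStream in =
                    new PushbackInputStream(new ByteArrayInputStream(content));
            boolean empty = isEmpty(in);
            return empty + ":" + in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```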