Read input stream twice

11

166

How do you read the same InputStream twice? Is it possible to copy it somehow?

I need to get an image from the web, save it locally, and then return the saved image. I thought it would be faster to reuse the same stream instead of opening a new stream to the downloaded content and reading it again.

Fiora answered 29/2, 2012 at 14:50 Comment(1)
Maybe use mark and reset. - Tolentino
148

You can use org.apache.commons.io.IOUtils.copy to copy the contents of the InputStream to a byte array, and then repeatedly read from the byte array using a ByteArrayInputStream. E.g.:

ByteArrayOutputStream baos = new ByteArrayOutputStream();
org.apache.commons.io.IOUtils.copy(in, baos);
byte[] bytes = baos.toByteArray();

// either
while (needToReadAgain) {
    ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
    yourReadMethodHere(bais);
}

// or
ByteArrayInputStream bais = new ByteArrayInputStream(bytes);
while (needToReadAgain) {
    bais.reset();
    yourReadMethodHere(bais);
}
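
For content that may be too large to hold in a byte array, one possible variant is to spool the stream to a file once and then open the file as often as needed, which also matches the "save it locally" part of the question. This is only a sketch: the class name SpoolToFile and the method spool are made up for illustration, and a temp file stands in for the real target path.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of any library.
class SpoolToFile {
    /** Streams the content to target, closes the source, and returns the number of bytes written. */
    static long spool(InputStream in, Path target) throws IOException {
        try (InputStream source = in) {
            return Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        Path saved = Files.createTempFile("spooled-", ".bin");
        try {
            spool(new ByteArrayInputStream(new byte[] { 1, 2, 3 }), saved);
            // Each call to newInputStream opens an independent stream over the saved bytes.
            try (InputStream first = Files.newInputStream(saved);
                 InputStream second = Files.newInputStream(saved)) {
                System.out.println(first.read() + " " + second.read()); // prints "1 1"
            }
        } finally {
            Files.deleteIfExists(saved);
        }
    }
}
```

The trade-off is disk I/O instead of heap usage; for small payloads the byte-array approach above is simpler.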
Conley answered 29/2, 2012 at 14:59 Comment(8)
I think this is the only valid solution, as mark isn't supported for all stream types. - Fiora
@Paul Grime: IOUtils.toByteArray calls the copy method internally as well. - Branchia
As @Branchia says, this solution is not valid for me, since the input is read internally and can't be reused. - Radu
@Extreme, if in your case you don't have control over how the InputStream is read (it is read internally as you mention) then you might be out of luck. Do you have any access to the InputStream before the 'internal' read? I think the point @Branchia made was different and was just concerning the API. - Conley
I know this comment is late, but in the first option, if you read the InputStream into a byte array, doesn't that mean you're loading all the data into memory? That could be a big problem if you're loading something like big files. - Rejuvenate
@jaxkodex, yes, that is correct. If you as a developer know more about the actual type of streams you're dealing with, then you can write more appropriate, custom behaviour. The supplied answer is a general abstraction. - Conley
One could use IOUtils.toByteArray(InputStream) to get the byte array in one call. - Revile
This isn't a good answer whatsoever; it only works for tiny InputStreams. They can be literally infinite, or large enough that caching them isn't feasible, or 'slow' (and this doesn't do anything until the entire InputStream has been read). Consider infinite sources from a network, large files, teeing the InputStream of a Process, and so on. This is one of those 'SO is kinda broken' answers. This answer should go away, but with 150 votes and an accepted mark, it never will. Unfortunate. - Arthralgia
40

Depending on where the InputStream is coming from, you might not be able to reset it. You can check whether mark() and reset() are supported using markSupported().

If they are, call mark() before the first read and then reset() on the InputStream to return to the marked position. If not, you need to read the InputStream from the source again.
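
As a rough illustration of that check, the sketch below falls back to a BufferedInputStream when the stream itself reports no mark support. The names MarkResetSketch, ensureMarkSupported and readTwice are invented for this example, and readNBytes(int) assumes Java 11 or later.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of any library.
class MarkResetSketch {
    /** Returns the stream itself if it supports mark/reset, otherwise a buffered wrapper that does. */
    static InputStream ensureMarkSupported(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    /** Reads up to readLimit bytes, rewinds to the mark, and reads the same bytes again. */
    static byte[][] readTwice(InputStream source, int readLimit) throws IOException {
        InputStream in = ensureMarkSupported(source);
        in.mark(readLimit);                        // remember the current position
        byte[] first = in.readNBytes(readLimit);   // never reads past the read limit
        in.reset();                                // back to the marked position
        byte[] second = in.readNBytes(readLimit);
        return new byte[][] { first, second };
    }
}
```

Note that the caller must keep using the stream returned by ensureMarkSupported; reading from the original stream afterwards would bypass the buffer.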

Hackneyed answered 29/2, 2012 at 14:54 Comment(2)
InputStream doesn't support 'mark' - you can call mark on an IS but it does nothing. Likewise, calling reset on an IS will throw an exception. - Kellen
@Kellen InputStream subclasses like BufferedInputStream do support 'mark'. - Mir
18

If your InputStream supports mark, you can mark() it and then reset() it. If it doesn't, you can wrap it in a java.io.BufferedInputStream, like this:

    InputStream bufferedInputStream = new BufferedInputStream(yourInputStream);
    bufferedInputStream.mark(readLimit); // readLimit: maximum number of bytes you will read before calling reset()
    // read your bufferedInputStream
    bufferedInputStream.reset();
    // read it again
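
The value passed to mark() matters. The sketch below, with made-up names (ReadLimitSketch, canResetAfter) and a deliberately tiny buffer, is meant to show that reading past the read limit can invalidate the mark, while a read limit larger than the buffer lets the buffer grow. The first outcome reflects how I would expect the OpenJDK BufferedInputStream to behave; the contract only says the mark may become invalid.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class, not part of any library.
class ReadLimitSketch {
    /** Marks the stream, reads bytesToRead single bytes, and reports whether reset() still succeeds. */
    static boolean canResetAfter(int bufferSize, int readLimit, int bytesToRead) throws IOException {
        byte[] data = new byte[64]; // arbitrary sample content
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(data), bufferSize);
        in.mark(readLimit);
        for (int i = 0; i < bytesToRead; i++) {
            in.read();
        }
        try {
            in.reset();
            return true;
        } catch (IOException e) {
            return false; // the mark was invalidated
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(canResetAfter(4, 4, 5));   // read limit exceeded with a full buffer
        System.out.println(canResetAfter(4, 16, 10)); // within the read limit, buffer grows as needed
    }
}
```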
Emory answered 29/1, 2014 at 14:59 Comment(2)
A buffered input stream can only mark back to the buffer size, so if the source doesn't fit, you can't go all the way back to the beginning. - Zina
@L.Blanc Sorry, but that doesn't seem correct. Take a look at BufferedInputStream.fill(): in the "grow buffer" section, the new buffer size is compared only to marklimit and MAX_BUFFER_SIZE. - Falmouth
11

To split an InputStream in two without loading all the data into memory, and then process the two copies independently:

  1. Create a pair of OutputStreams, specifically PipedOutputStreams.
  2. Connect each PipedOutputStream to a PipedInputStream; these PipedInputStreams are the InputStreams that get returned.
  3. Connect the source InputStream to the OutputStreams just created, so that everything read from the source InputStream is written to both OutputStreams. There is no need to implement this yourself, because TeeInputStream (commons.io) already does it.
  4. In a separate thread, read the whole source InputStream; as a side effect, the input data is transferred to the target InputStreams.

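    public static final List<InputStream> splitInputStream(InputStream input)
        throws IOException
    {
        Objects.requireNonNull(input);

        PipedOutputStream pipedOut01 = new PipedOutputStream();
        PipedOutputStream pipedOut02 = new PipedOutputStream();

        List<InputStream> inputStreamList = new ArrayList<>();
        inputStreamList.add(new PipedInputStream(pipedOut01));
        inputStreamList.add(new PipedInputStream(pipedOut02));

        TeeOutputStream tout = new TeeOutputStream(pipedOut01, pipedOut02);

        // closeBranch = true: closing tin also closes tout, which signals end-of-stream to both pipes
        TeeInputStream tin = new TeeInputStream(input, tout, true);

        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.submit(() -> {
            // Drain with a small buffer; readAllBytes() would accumulate the whole stream in memory
            try (InputStream in = tin) {
                byte[] buffer = new byte[8192];
                while (in.read(buffer) != -1) {
                    // nothing to do: the tee writes every chunk to both pipes
                }
            }
            return null;
        });
        executor.shutdown(); // no new tasks are accepted; the drain task still runs to completion

        return Collections.unmodifiableList(inputStreamList);
    }


Remember to close the returned InputStreams after consuming them, and to shut down the executor thread that drains the TeeInputStream. The two returned streams also have to be consumed concurrently (for example, each on its own thread): a pipe has a small fixed buffer, so the draining thread blocks as soon as either pipe is full and nobody is reading from it.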

If you need to split the stream into more than two InputStreams, replace TeeOutputStream in the previous code fragment with your own implementation, which encapsulates a List<OutputStream> and extends OutputStream:

public final class TeeListOutputStream extends OutputStream {
    private final List<? extends OutputStream> branchList;

    public TeeListOutputStream(final List<? extends OutputStream> branchList) {
        Objects.requireNonNull(branchList);
        this.branchList = branchList;
    }

    @Override
    public synchronized void write(final int b) throws IOException {
        for (OutputStream branch : branchList) {
            branch.write(b);
        }
    }

    @Override
    public void flush() throws IOException {
        for (OutputStream branch : branchList) {
            branch.flush();
        }
    }

    @Override
    public void close() throws IOException {
        for (OutputStream branch : branchList) {
            branch.close();
        }
    }
}
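
For readers without commons-io, here is a stdlib-only sketch of the same pipe-and-tee idea. It is an illustration under assumptions, not the answer's implementation: the names PipeSplitSketch, split and readBothConcurrently are invented, and the pump and both consumers share one small thread pool. It also shows why the copying has to run on its own thread: a pipe only moves data while some other thread is writing into it.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of any library.
class PipeSplitSketch {
    /** Copies source into two pipes on a pump thread and returns the two read ends. */
    static InputStream[] split(InputStream source, ExecutorService pump) throws IOException {
        PipedOutputStream outA = new PipedOutputStream();
        PipedOutputStream outB = new PipedOutputStream();
        InputStream[] ends = { new PipedInputStream(outA), new PipedInputStream(outB) };
        pump.submit(() -> {
            // Closing the write ends is what lets the readers see end-of-stream.
            try (InputStream in = source; OutputStream a = outA; OutputStream b = outB) {
                byte[] chunk = new byte[4096];
                int n;
                while ((n = in.read(chunk)) != -1) {
                    a.write(chunk, 0, n); // blocks while pipe A is full and unread
                    b.write(chunk, 0, n); // blocks while pipe B is full and unread
                    a.flush();            // wake up readers waiting on the pipes
                    b.flush();
                }
            }
            return null;
        });
        return ends;
    }

    /** Consumes both read ends on separate threads so that neither pipe stalls the pump. */
    static byte[][] readBothConcurrently(byte[] data) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            InputStream[] ends = split(new ByteArrayInputStream(data), pool);
            Future<byte[]> first = pool.submit(() -> ends[0].readAllBytes());
            Future<byte[]> second = pool.submit(() -> ends[1].readAllBytes());
            return new byte[][] { first.get(), second.get() };
        } finally {
            pool.shutdown();
        }
    }
}
```

The consumers here call readAllBytes() only to keep the example short; a real consumer would process each stream incrementally, otherwise the memory saving is lost.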
Lengthen answered 28/4, 2019 at 13:54 Comment(2)
Could you please explain step 4 a bit more? Why do we have to trigger the reading manually? Why doesn't reading from either PipedInputStream trigger reading of the source InputStream? And why do we make that call asynchronously? - Turk
To close the TeeOutputStream, I added tin.close() in the thread: Executors.newSingleThreadExecutor().submit(() -> { try { tin.readAllBytes(); tin.close(); } catch (IOException ioException) { ioException.printStackTrace(); } }); - Kaohsiung
9

You can wrap the input stream in a PushbackInputStream. PushbackInputStream lets you unread ("write back") bytes that were already read, so you can do this:

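import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;

public class StreamTest {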
  public static void main(String[] args) throws IOException {
    byte[] bytes = new byte[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

    InputStream originalStream = new ByteArrayInputStream(bytes);

    byte[] readBytes = getBytes(originalStream, 3);
    printBytes(readBytes); // prints: 1 2 3

    readBytes = getBytes(originalStream, 3);
    printBytes(readBytes); // prints: 4 5 6

    // now let's wrap it with PushbackInputStream

    originalStream = new ByteArrayInputStream(bytes);

    InputStream wrappedStream = new PushbackInputStream(originalStream, 10); // 10 means that at most 10 bytes can be "written back" to the stream

    readBytes = getBytes(wrappedStream, 3);
    printBytes(readBytes); // prints 1 2 3

    ((PushbackInputStream) wrappedStream).unread(readBytes, 0, readBytes.length);

    readBytes = getBytes(wrappedStream, 3);
    printBytes(readBytes); // prints 1 2 3


  }

  private static byte[] getBytes(InputStream is, int howManyBytes) throws IOException {
    System.out.print("Reading stream: ");

    byte[] buf = new byte[howManyBytes];

    int next = 0;
    for (int i = 0; i < howManyBytes; i++) {
      next = is.read();
      if (next >= 0) { // read() returns -1 at end of stream; 0 is a valid byte value
        buf[i] = (byte) next;
      }
    }
    return buf;
  }

  private static void printBytes(byte[] buffer) throws IOException {
    System.out.print("Reading stream: ");

    for (int i = 0; i < buffer.length; i++) {
      System.out.print(buffer[i] + " ");
    }
    System.out.println();
  }


}

Please note that PushbackInputStream keeps an internal byte buffer, so it really does hold the "written back" bytes in memory.
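
A typical use of this is peeking at a file signature before handing the untouched stream to a parser. The sketch below is one way to do it; PeekSketch, peek and looksLikeGzip are hypothetical names, and readNBytes(int) assumes Java 11 or later. The two-byte gzip signature 0x1f 0x8b is used purely as a sample format check.

```java
import java.io.IOException;
import java.io.PushbackInputStream;

// Hypothetical helper, not part of any library.
class PeekSketch {
    /** Returns up to n leading bytes and pushes them back, so the next read starts at the same position. */
    static byte[] peek(PushbackInputStream in, int n) throws IOException {
        byte[] head = in.readNBytes(n); // may be shorter than n at end of stream
        in.unread(head);                // needs a pushback buffer of at least head.length bytes
        return head;
    }

    /** True if the stream starts with the two-byte gzip signature 0x1f 0x8b. */
    static boolean looksLikeGzip(PushbackInputStream in) throws IOException {
        byte[] head = peek(in, 2);
        return head.length == 2 && (head[0] & 0xff) == 0x1f && (head[1] & 0xff) == 0x8b;
    }
}
```

The stream has to be created with a pushback capacity of at least the number of peeked bytes, for example new PushbackInputStream(source, 2); the default capacity is a single byte.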

Knowing this approach, we can go further and combine it with FilterInputStream. FilterInputStream stores the original input stream as a delegate. This allows us to define a new class that "unreads" the original data automatically. The class is defined as follows:

public class TryReadInputStream extends FilterInputStream {
  private final int maxPushbackBufferSize;

  /**
  * Creates a <code>TryReadInputStream</code> that wraps <code>in</code>
  * in a <code>PushbackInputStream</code>, so that bytes read by
  * <code>tryRead</code> can be pushed back.
  *
  * @param in the underlying input stream
  * @param maxPushbackBufferSize the maximum number of bytes that a single
  *           <code>tryRead</code> call can read and push back
  */
  public TryReadInputStream(InputStream in, int maxPushbackBufferSize) {
    super(new PushbackInputStream(in, maxPushbackBufferSize));
    this.maxPushbackBufferSize = maxPushbackBufferSize;
  }

  /**
   * Reads up to <code>length</code> bytes from the input stream into the given buffer. The bytes read
   * are still available in the stream afterwards.
   *
   * @param buffer the destination buffer into which the data is read
   * @param offset the start offset in the destination <code>buffer</code>
   * @param length how many bytes to read from the stream into <code>buffer</code>. Must not exceed
   *        <code>maxPushbackBufferSize</code>, or an IOException is thrown
   *
   * @return number of bytes read
   * @throws java.io.IOException in case length is greater than <code>maxPushbackBufferSize</code>
   */
  public int tryRead(byte[] buffer, int offset, int length) throws IOException {
    validateMaxLength(length);

    // NOTE: reading byte by byte instead of calling read(buffer, offset, length),
    // because a bulk read may return fewer bytes than requested, while read() returns exactly one byte or -1 at end of stream

    int bytesRead = 0;

    int nextByte = 0;

    for (int i = 0; (i < length) && (nextByte >= 0); i++) {
      nextByte = read();
      if (nextByte >= 0) {
        buffer[offset + bytesRead++] = (byte) nextByte;
      }
    }

    if (bytesRead > 0) {
      ((PushbackInputStream) in).unread(buffer, offset, bytesRead);
    }

    return bytesRead;

  }

  public byte[] tryRead(int maxBytesToRead) throws IOException {
    validateMaxLength(maxBytesToRead);

    ByteArrayOutputStream baos = new ByteArrayOutputStream(); // as ByteArrayOutputStream to dynamically allocate internal bytes array instead of allocating possibly large buffer (if maxBytesToRead is large)

    // NOTE: reading byte by byte instead of calling read(byte[], int, int),
    // because a bulk read may return fewer bytes than requested, while read() returns exactly one byte or -1 at end of stream

    int nextByte = 0;

    for (int i = 0; (i < maxBytesToRead) && (nextByte >= 0); i++) {
      nextByte = read();
      if (nextByte >= 0) {
        baos.write((byte) nextByte);
      }
    }

    byte[] buffer = baos.toByteArray();

    if (buffer.length > 0) {
      ((PushbackInputStream) in).unread(buffer, 0, buffer.length);
    }

    return buffer;

  }

  private void validateMaxLength(int length) throws IOException {
    if (length > maxPushbackBufferSize) {
      throw new IOException(
        "Trying to read more bytes than maxPushbackBufferSize. Max bytes: " + maxPushbackBufferSize + ". Trying to read: " +
        length);
    }
  }

}

This class has two methods. The first reads into an existing buffer (its definition is analogous to public int read(byte b[], int off, int len) of the InputStream class). The second returns a new buffer (this may be more efficient if the number of bytes to read is unknown).

Now let's see our class in action:

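// NOTE: getBytes and printBytes are the helper methods from StreamTest above; copy them into this class (or make them accessible) to compile
public class StreamTest2 {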
  public static void main(String[] args) throws IOException {
    byte[] bytes = new byte[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };

    InputStream originalStream = new ByteArrayInputStream(bytes);

    byte[] readBytes = getBytes(originalStream, 3);
    printBytes(readBytes); // prints: 1 2 3

    readBytes = getBytes(originalStream, 3);
    printBytes(readBytes); // prints: 4 5 6

    // now let's use our TryReadInputStream

    originalStream = new ByteArrayInputStream(bytes);

    InputStream wrappedStream = new TryReadInputStream(originalStream, 10);

    readBytes = ((TryReadInputStream) wrappedStream).tryRead(3); // NOTE: no manual call to "unread"(!) because TryReadInputStream handles this internally
    printBytes(readBytes); // prints 1 2 3

    readBytes = ((TryReadInputStream) wrappedStream).tryRead(3); 
    printBytes(readBytes); // prints 1 2 3

    readBytes = ((TryReadInputStream) wrappedStream).tryRead(3);
    printBytes(readBytes); // prints 1 2 3

    // we can also call normal read which will actually read the bytes without "writing them back"
    readBytes = getBytes(wrappedStream, 3);
    printBytes(readBytes); // prints 1 2 3

    readBytes = getBytes(wrappedStream, 3);
    printBytes(readBytes); // prints 4 5 6

    readBytes = ((TryReadInputStream) wrappedStream).tryRead(3); // now we can try read next bytes
    printBytes(readBytes); // prints 7 8 9

    readBytes = ((TryReadInputStream) wrappedStream).tryRead(3); 
    printBytes(readBytes); // prints 7 8 9


  }



}
Verdha answered 25/5, 2015 at 11:41 Comment(0)
7

Whatever implementation of InputStream you are using, you can check the result of InputStream#markSupported(), which tells you whether or not you can use mark() / reset().

If you can, mark the stream before you read, then call reset() to go back to the marked position.

If you can't, you'll have to open the stream again.

Another solution would be to convert the InputStream to a byte array, then iterate over the array as many times as you need. You can find several solutions in the post "Convert InputStream to byte array in Java", using third-party libraries or not. Caution: if the content read is too big, you might run into memory problems.

Finally, if what you need is to read an image, then use:

BufferedImage image = ImageIO.read(new URL("http://www.example.com/images/toto.jpg"));

Using ImageIO#read(java.net.URL) also allows you to use cache.
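
If the server rejects requests without a browser-like User-Agent (a caveat raised in the comments), one option is to open the connection yourself. The sketch below also covers the question's save-then-return flow by buffering the bytes once. All names (ImageFetchSketch, openWithUserAgent, saveAndDecode) are made up for illustration, the header value is whatever the caller supplies, and the whole image is assumed to fit in memory.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.imageio.ImageIO;

// Hypothetical helper, not part of any library.
class ImageFetchSketch {
    /** Opens the URL with an explicit User-Agent request header. */
    static InputStream openWithUserAgent(URL url, String userAgent) throws IOException {
        URLConnection connection = url.openConnection();
        connection.setRequestProperty("User-Agent", userAgent);
        return connection.getInputStream();
    }

    /** Buffers the stream once, saves the bytes to target, and decodes the image from the same bytes. */
    static BufferedImage saveAndDecode(InputStream in, Path target) throws IOException {
        byte[] bytes;
        try (InputStream source = in) {
            bytes = source.readAllBytes();
        }
        Files.write(target, bytes);
        // ImageIO.read returns null if no registered reader understands the data.
        return ImageIO.read(new ByteArrayInputStream(bytes));
    }
}
```

A call could then look like saveAndDecode(openWithUserAgent(new URL("http://www.example.com/images/toto.jpg"), "Mozilla/5.0"), Path.of("toto.jpg")), where the user-agent string and the target file name are placeholders.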

Conditional answered 29/2, 2012 at 14:54 Comment(2)
A word of warning when using ImageIO#read(java.net.URL): some web servers and CDNs might reject bare calls (i.e. without a User-Agent that makes the server believe the call comes from a web browser) made by ImageIO#read. In that case, using URL.openConnection(), setting the user agent on that connection, and using ImageIO.read(InputStream) will, most of the time, do the trick. - Eluvium
InputStream is not an interface. - Fsh
4

In case anyone is running a Spring Boot app and wants to read the response body of a RestTemplate call (which is why I wanted to read a stream twice), there is a clean(er) way of doing this.

First of all, you need to use Spring's StreamUtils to copy the stream to a String:

String text = StreamUtils.copyToString(response.getBody(), Charset.defaultCharset());

But that's not all. You also need to use a request factory that can buffer the stream for you, like so:

ClientHttpRequestFactory factory = new BufferingClientHttpRequestFactory(new SimpleClientHttpRequestFactory());
RestTemplate restTemplate = new RestTemplate(factory);

Or, if you're using the factory bean, then (this is Kotlin but nevertheless):

@Bean
@Scope(ConfigurableBeanFactory.SCOPE_PROTOTYPE)
fun createRestTemplate(): RestTemplate = RestTemplateBuilder()
  .requestFactory { BufferingClientHttpRequestFactory(SimpleClientHttpRequestFactory()) }
  .additionalInterceptors(loggingInterceptor)
  .build()

Source: https://objectpartners.com/2018/03/01/log-your-resttemplate-request-and-response-without-destroying-the-body/

Bink answered 4/12, 2018 at 13:41 Comment(0)
3

How about:

if (!stream.markSupported()) {
    // let's replace the stream object
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    IOUtils.copy(stream, baos);
    stream.close();
    stream = new ByteArrayInputStream(baos.toByteArray());
    // now the stream supports 'mark' and 'reset'
}
Ashby answered 4/6, 2017 at 15:40 Comment(1)
That's a terrible idea. You put the entire stream contents in memory like that. - Brush
2

Convert the InputStream into bytes and pass them to the save-file function, where you wrap them in an InputStream again. In the original function, also use the bytes for the other tasks.

Olivares answered 29/2, 2012 at 14:57 Comment(1)
I say bad idea on this one: the resulting array could be huge and will rob the device of memory. - Hackneyed
0

If you are using RestTemplate to make HTTP calls, simply add an interceptor. The response body is cached by the implementation of ClientHttpResponse, so the InputStream can be retrieved from the response as many times as we need.

ClientHttpRequestInterceptor interceptor = new ClientHttpRequestInterceptor() {

    @Override
    public ClientHttpResponse intercept(HttpRequest request, byte[] body,
            ClientHttpRequestExecution execution) throws IOException {
        ClientHttpResponse response = execution.execute(request, body);

        // additional work before returning response
        return response;
    }
};

// Add the interceptor to the RestTemplate instance
restTemplate.getInterceptors().add(interceptor);
Restaurant answered 20/2, 2020 at 18:34 Comment(0)
0
ByteArrayInputStream ins = new ByteArrayInputStream("Hello".getBytes());
System.out.println("ins.available() at begining:: " + ins.available());
ins.mark(0);
// Read input stream for some operations
System.out.println("ins.available() after reading :: " + ins.available());
ins.reset();
System.out.println("ins.available() after resetting :: " + ins.available());
// ins is ready for reading once again.
Pontine answered 21/12, 2020 at 9:22 Comment(2)
The output of the above statements is: ins.available() at begining:: :: 1028 ins.available() after reading :: 0 ins.available() after resetting :: 1028 - Pontine
This solution only works if the stream is fully controlled by the code. If the ByteArrayInputStream is passed to some other method and that method closes the BAIS, then the underlying input stream also gets closed and cannot be reset. - Allargando
