writeTo PipedOutputStream just hangs
My goal is to:

  1. read a file from S3,
  2. change its metadata, and
  3. push it back out to S3.

The AWS Java SDK doesn't accept an OutputStream for uploads, so I have to convert the OutputStream from step 2 into an InputStream. For this I decided to use a PipedInputStream.

However, my code just hangs at the writeTo(out) step. The code runs in a Grails application, and CPU usage stays low while it hangs:

import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.PutObjectRequest;
import com.amazonaws.services.s3.model.S3Object;
import org.apache.commons.imaging.formats.jpeg.xmp.JpegXmpRewriter;

AmazonS3Client client = nfile.getS3Client() //get S3 client
S3Object object1 = client.getObject(
                  new GetObjectRequest("test-bucket", "myfile.jpg")) //get the object. 

InputStream isNew1 = object1.getObjectContent(); //create input stream
ByteArrayOutputStream os = new ByteArrayOutputStream();
PipedInputStream inpipe = new PipedInputStream();
final PipedOutputStream out = new PipedOutputStream(inpipe);

try {
   String xmpXml = "<x:xmpmeta>" +
    "\n<Lifeshare>" +
    "\n\t<Date>"+"some date"+"</Date>" +
    "\n</Lifeshare>" +
    "\n</x:xmpmeta>";/
   JpegXmpRewriter rewriter = new JpegXmpRewriter();
   rewriter.updateXmpXml(isNew1,os, xmpXml); //This is step2

   try {
      new Thread(new Runnable() {
         public void run() {
            try {
               // write the buffered ByteArrayOutputStream contents to the PipedOutputStream
               println "starting writeto"
               os.writeTo(out);
               println "ending writeto"
            } catch (IOException e) {
               // logging and exception handling should go here
            }
         }
      }).start();

         ObjectMetadata metadata = new ObjectMetadata();
         metadata.setContentLength(1024); //just testing
         client.putObject(new PutObjectRequest("test-bucket", "myfile_copy.jpg", inpipe, metadata));

         os.close();
         out.close();
   } catch (IOException e) {
         // logging and exception handling should go here
   }

}
finally {
   isNew1.close()
   os.close()
   out.close()
}

The above code just prints "starting writeto" and then hangs; it never prints "ending writeto".

Update: By putting the writeTo in a separate thread, the file is now being written to S3, but only 1024 bytes of it are written, so the file is incomplete. How can I write everything from the OutputStream to S3?

Gomorrah answered 26/10, 2016 at 19:11 Comment(2)
You write to the PipedOutputStream: os.writeTo(out), but do you read from the PipedInputStream which is connected to it? Subtlety
Yes, I do. I did not show that code since the program was hanging before it, but for completeness I've added it now. Gomorrah
When you call os.writeTo(out), it tries to write the entire buffer to out, and since nobody is reading from the other side of the pipe (i.e. inpipe) yet, the pipe's internal buffer fills up and the writing thread blocks.

You have to set up the reader before you write the data, and also make sure the two run in separate threads (see the javadoc on PipedOutputStream).
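
In code, the idea looks roughly like this: a sketch reusing the variable names from the question (client, os, the bucket, and the key are assumed to be set up as above), not a drop-in fix:

PipedInputStream inpipe = new PipedInputStream();
final PipedOutputStream out = new PipedOutputStream(inpipe);

// Writer side runs in its own thread, so the reader below can drain the pipe.
new Thread(new Runnable() {
    public void run() {
        try {
            os.writeTo(out);
            out.close(); // signals end-of-stream; without this the reader waits forever
        } catch (IOException e) {
            // logging and exception handling should go here
        }
    }
}).start();

// Reader side runs on the current thread; putObject() consumes inpipe.
ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentLength(os.size()); // the exact number of buffered bytes
client.putObject(new PutObjectRequest("test-bucket", "myfile_copy.jpg", inpipe, metadata));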

Subtlety answered 26/10, 2016 at 19:49 Comment(11)
I moved the code for consuming (the S3Object) above the writeTo, but the result is the same... Gomorrah
Please try using the TransferManager instead of AmazonS3Client, so that the upload happens in a separate thread (that is important, as you cannot read from and write to the pipe in the same thread). Subtlety
Basically your client.putObject() will block trying to read from inpipe (because no data has been written to it yet). Subtlety
OK, I put the writeTo in a different thread, and after that I use PutObjectRequest. Now the file is being written to S3, but its size is only 1024 bytes, so it is incomplete. Please see the updated code. Gomorrah
How can I write the entire file to S3? Gomorrah
Do you really need this line? metadata.setContentLength(1024); //just testing Subtlety
Without that line I get a "Write end dead" error from java.io.PipedInputStream.read, called via com.amazonaws.internal.SdkFilterInputStream.read. Perhaps the number needs to change from 1024 to something else, but I'm not sure where to get that number from. Gomorrah
Changing that number to 2048 writes a file to S3 with size 2048. So I think what I need is the actual length of the OutputStream. Gomorrah
Please remove metadata.setContentLength(1024) (you do not need it), and move out.close() inside your writing thread. What most probably happens is that the thread terminates while putObject() is still working, hence the "write end dead" exception. Subtlety
I changed it to metadata.setContentLength(os.size()), which worked fine. Thanks a ton for your help. Gomorrah
This answer would benefit from code outlining the idea, rather than theory. Headmost
Per Bharal's request: I just worked through this issue myself thanks to the comments above, so I'm adding sample code. Hope it helps someone out!

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PipedInputStream;
import java.io.PipedOutputStream;

public void doSomething() throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    baos.write("some bytes to stick in the stream".getBytes());

    InputStream inStr = toInputStream(baos);
}

public InputStream toInputStream(ByteArrayOutputStream orgOutStream) throws IOException {
    PipedInputStream in = new PipedInputStream();
    // The two ends must be connected; an unconnected pair throws
    // "Pipe not connected" on the first read or write.
    PipedOutputStream out = new PipedOutputStream(in);

    try {
        // writeTo must run in its own thread, otherwise it blocks as soon
        // as the pipe's internal buffer fills up.
        new Thread(() -> {
            try {
                orgOutStream.writeTo(out);
                out.close(); // signal end-of-stream to the reader
            } catch (IOException e) {
                e.printStackTrace();
            }
        }).start();
    } finally {
        orgOutStream.close();
    }
    return in;
}

The real trick is to make sure the write to the pipe happens in a separate thread.
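
For instance, the returned stream can then feed the S3 upload from the question (hypothetical usage; client, the bucket name, and the key are placeholders):

InputStream s3Input = toInputStream(baos);
ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentLength(baos.size()); // still valid after close(); close() is a no-op on ByteArrayOutputStream
client.putObject(new PutObjectRequest("test-bucket", "myfile_copy.jpg", s3Input, metadata));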

Simplify answered 10/7, 2020 at 23:46 Comment(0)
This situation happens when the OutputStream starts being written to before anybody reads from the InputStream. The default pipe buffer is 1024 bytes, so 1024 bytes get written to the OutputStream, then nobody reads from the InputStream, and writing stops. I had a similar situation and could not start reading earlier, so I enlarged the pipe buffer with new PipedInputStream(100 * 1024). I do not know the size of a specific file before creating the InputStream, but my files are all under 100 KB, so that is okay for me.
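
A minimal sketch of the idea (payload is a placeholder byte array, assumed to be smaller than the 100 KB buffer):

// Buffer large enough to hold the entire payload, so the writer never
// blocks waiting for a reader to drain the pipe.
PipedInputStream in = new PipedInputStream(100 * 1024);
PipedOutputStream out = new PipedOutputStream(in);

out.write(payload); // returns immediately because the payload fits in the buffer
out.close();        // end-of-stream; 'in' can now be read at leisure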

Mystery answered 21/8, 2024 at 17:51 Comment(0)
