s3.getObject().createReadStream() : How to catch the error?

I am trying to write a program that gets a zip file from S3, unzips it, then uploads the contents back to S3. But I have found two exceptions that I cannot catch:

1. StreamContentLengthMismatch: Stream content length mismatch. Received 980323883 of 5770104761 bytes. This occurs irregularly.

2. NoSuchKey: The specified key does not exist. This happens when I pass a wrong key.

When either of these exceptions occurs, the program crashes.

I'd like to catch and handle both exceptions correctly and prevent the crash.

   const unzipUpload = () => {
        return new Promise((resolve, reject) => {
            let rStream = s3.getObject({Bucket: 'bucket', Key: 'hoge/hoge.zip'})
                .createReadStream()
                    .pipe(unzip.Parse())
                    .on('entry', function (entry) {
                        if(entry.path.match(/__MACOSX/) == null){

                            // pause
                            if(currentFileCount - uploadedFileCount > 10) rStream.pause()

                            currentFileCount += 1
                            var fileName = entry.path;
                            let up = entry.pipe(uploadFromStream(s3,fileName))

                            up.on('uploaded', e => {
                                uploadedFileCount += 1
                                console.log(currentFileCount, uploadedFileCount)

                                //resume
                                if(currentFileCount - uploadedFileCount <= 10) rStream.resume()

                                if(uploadedFileCount === allFileCount) resolve()
                                entry.autodrain()
                            }).on('error', e => {
                                reject(e)
                            })
                        }

                    }).on('error', e => {
                        console.log("unzip error")
                        reject(e)
                    }).on('finish', e => {
                        allFileCount = currentFileCount
                    })
            rStream.on('error', e=> {
                console.log(e)
                reject(e)
            })
        })
    }

    function uploadFromStream(s3,fileName) {
        var pass = new stream.PassThrough();

        var params = {Bucket: "bucket", Key: "hoge/unzip/" + fileName, Body: pass};
        let request = s3.upload(params, function(err, data) {
            if(err) pass.emit('error', err)
            else pass.emit('uploaded')
        })
        request.on('httpUploadProgress', progress => {
            console.log(progress)
        })

        return pass
    }

This is the library I use when unzipping. https://github.com/mhr3/unzip-stream

Help me!!

Amphichroic answered 5/5, 2017 at 7:43 Comment(0)

If you'd like to catch the NoSuchKey error thrown by createReadStream, you have two options:

  1. Check whether the key exists before reading it.
  2. Catch the error from the stream.

First:

s3.getObjectMetadata(key)
  .promise()
  .then(() => {
    // This will not throw error anymore
    s3.getObject().createReadStream();
  })
  .catch(error => {
    if (error.statusCode === 404) {
      // Catching NoSuchKey
    }
  });

The only case you won't catch is when the file is deleted in the split second between the getObjectMetadata response and the createReadStream call.

Second:

s3.getObject().createReadStream().on('error', error => {
    // Catching NoSuchKey & StreamContentLengthMismatch
});

This is a more generic approach and will catch all other errors, like network problems.

Vanpelt answered 5/5, 2017 at 8:6 Comment(5)
Thank you!! Your first idea is an innovative idea for me. For the second idea, something did not work. – Amphichroic
Hey, glad it helped you. I noticed you're new to Stack Overflow, so if you feel the answer solves your problem, mark it as 'accepted' (green checkmark). – Vanpelt
Your second solution doesn't work; it will not catch a NoSuchKey error. I haven't found a way to catch this error, though, so it seems that solution 1 is the only way here. – Defunct
@Defunct thanks for noticing! I updated my 2nd example so it handles the error as well! – Vanpelt
I don't believe that getObjectMetadata() is a valid method on the Node.js S3 SDK. I think what you're looking for is s3.headObject({ Bucket: <bucket>, Key: <key> }): docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/… – Sharasharai

You need to listen for the emitted error earlier. Your error handler is only looking for errors during the unzip part.

A simplified version of your script.

s3.getObject(params)
.createReadStream()
.on('error', (e) => {
  // handle aws s3 error from createReadStream
})
.pipe(unzip)
.on('data', (data) => {
  // retrieve data
})
.on('end', () => {
  // stream has ended
})
.on('error', (e) => {
  // handle error from unzip
});

This way, you do not need to make an additional call to AWS to find out if the object exists.

Defunct answered 3/1, 2018 at 19:12 Comment(3)
This should work, but it doesn't for some reason. Errors from node_modules/aws-sdk/lib/request.js:31 always escape the event listener and kill the process. – Hole
I am using similar code in a loop. I am getting (node:12533) MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 error listeners added. Use emitter.setMaxListeners() to increase limit. Is there a way to close the pipe? – Anthem
It will close automatically once it has completed. If your loop is non-blocking and you have many items in the array you are looping over, you might be creating too many listeners. If it is non-blocking, refactor it and see if you get the same problem. If your loop is blocking, check whether your packages can be updated, as it could be a bug in a dependency. – Defunct
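The MaxListenersExceededWarning in the comments above typically means handlers are being attached inside a non-blocking loop faster than the streams finish. One way to avoid it, sketched with a hypothetical per-item task in place of the real S3 streaming step:

```javascript
// Sketch: process items one at a time so each stream (and its listeners)
// is torn down before the next begins. processItem is a hypothetical
// stand-in for the real download/unzip/upload step.
async function processSequentially(items, processItem) {
  const results = [];
  for (const item of items) {
    results.push(await processItem(item)); // wait before starting the next
  }
  return results;
}

const done = processSequentially([1, 2, 3], async n => n * 2);
```

In the real program, `processItem` would create the S3 read stream, pipe it, and resolve when the upload completes, so only one set of listeners exists at a time.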

You can listen for events (like error, data, end) on the stream you receive back. Read more on events

function getObjectStream (filePath) {
  return s3.getObject({
    Bucket: bucket,
    Key: filePath
  }).createReadStream()
}

let readStream = getObjectStream('/path/to/file.zip')
readStream.on('error', function (error) {
  // Handle your error here.
})

Tested for "No Key" error.

it('should not be able to get stream of unavailable object', function (done) {
  let filePath = 'file_not_available.zip'

  let readStream = s3.getObjectStream(filePath)
  readStream.on('error', function (error) {
    expect(error instanceof Error).to.equal(true)
    expect(error.message).to.equal('The specified key does not exist.')
    done()
  })
})

Tested for success.

it('should be able to get stream of available object', function (done) {
  let filePath = 'test.zip'
  let receivedBytes = 0

  let readStream = s3.getObjectStream(filePath)
  readStream.on('error', function (error) {
    expect(error).to.equal(undefined)
  })
  readStream.on('data', function (data) {
    receivedBytes += data.length
  })
  readStream.on('end', function () {
    expect(receivedBytes).to.equal(3774)
    done()
  })
})

Lisbethlisbon answered 28/10, 2017 at 0:48 Comment(0)

Setting .on('error', () => {}) after createReadStream() will not catch the errors produced by getObject (NoSuchKey, StreamContentLengthMismatch); you need to set it before the createReadStream() call, on the request object returned by getObject().

For example

s3.getObject().on('error', error => {
    // Catching StreamContentLengthMismatch or NoSuchKey errors.
}).createReadStream()
Tarsal answered 16/3, 2023 at 14:11 Comment(0)

To prevent a crash, you can first request the object's head metadata asynchronously. This does not return the whole object, only its metadata, so it takes less time. Try this one!

const AWS = require('aws-sdk');

const s3bucket = new AWS.S3({
  accessKeyId: 'client id',
  secretAccessKey: 'secret key'
});

const isObjectExists = async () => {
  const params = {
    Bucket: 'your bucket name',
    Key: 'path to object'
  };
  try {
    // headObject returns only the metadata; .promise() lets us await it.
    await s3bucket.headObject(params).promise();
    return true;
  } catch (err) {
    return false; // headObject threw, e.g. the object does not exist
  }
};

const yourFunction = async () => {
  if (await isObjectExists()) {
    s3bucket.getObject({Bucket: 'your bucket name', Key: 'path to object'}).createReadStream(); // works smoothly
  }
};
Lyophobic answered 5/2, 2021 at 10:27 Comment(2)
While this code may answer the question, providing additional context regarding why and/or how this code answers the question improves its long-term value. – Logarithmic
@Logarithmic Correct. Since we are accessing only the object's metadata, the promise returns quickly, and this solution can be used to check whether the object would cause the crash so it can be handled easily. (The downvote for not writing a description is just not right; my solution works smoothly with the latest versions of the aws-sdk library.) The upvote is much needed. – Lyophobic
