I want to create a Lambda function that gets a zip file (which may contain several CSV files) from S3, unzips it, and uploads the contents back to S3. Since Lambda is limited in memory and disk size, I have to stream the data from S3 and back into it. I use Python (boto3); see my code below:
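import io
import sys
import zipfile

import boto3

# Assumed setup: s3 is a boto3 resource; bucket_name and key point at the zip file
s3 = boto3.resource("s3")

count = 0
obj = s3.Object(bucket_name, key)
# Read the zip file body from S3 into an in-memory buffer
buffer = io.BytesIO(obj.get()["Body"].read())
print(buffer)
z = zipfile.ZipFile(buffer)
for x in z.filelist:
    with z.open(x) as foo2:
        print(sys.getsizeof(foo2))
        line_counter = 0
        out_buffer = io.BytesIO()
        # Copy the archive member line by line into an output buffer
        for f in foo2:
            out_buffer.write(f)
            # out_buffer.writelines(f)
            line_counter += 1
        print(line_counter)
        print(foo2.name)
        s3.Object(bucket_name, "output/" + foo2.name + "_output").upload_fileobj(out_buffer)
        out_buffer.close()
z.close()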
The result is that empty files are created in the bucket. For example, if input.zip contains 1.csv and 2.csv, I get two empty CSV files with the corresponding names in the bucket. Also, I'm not sure whether this actually streams the files or just downloads the whole zip file. Thanks.
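
A possible explanation, offered as a sketch rather than a confirmed diagnosis: after the write loop, the position of out_buffer sits at the end of the data, so anything that reads from it afterwards sees zero bytes unless the buffer is rewound with seek(0) first. On the streaming question, obj.get()["Body"].read() pulls the entire zip into memory, and zipfile needs a seekable file because the archive's central directory is stored at the end. The helper below is hypothetical: unzip_and_upload and its upload callback are names introduced here for illustration, and the callback stands in for the S3 upload so the logic can be run without AWS.

```python
import io
import shutil
import zipfile


def unzip_and_upload(zip_fileobj, upload, prefix="output/"):
    """Hand each member of a zip archive to upload(fileobj, key).

    zip_fileobj must be seekable; upload is a hypothetical callback that
    stands in for the real S3 upload. Returns the list of keys used.
    """
    keys = []
    with zipfile.ZipFile(zip_fileobj) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries have no content to upload
            key = prefix + info.filename + "_output"
            with archive.open(info) as member:
                out_buffer = io.BytesIO()
                # Copy the decompressed member in chunks, not all at once
                shutil.copyfileobj(member, out_buffer)
            # Rewind so the uploader reads from the start, not the end
            out_buffer.seek(0)
            upload(out_buffer, key)
            out_buffer.close()
            keys.append(key)
    return keys
```

With the names from the question, this might be called as unzip_and_upload(buffer, lambda f, k: s3.Object(bucket_name, k).upload_fileobj(f)). Each decompressed member is still held in memory one at a time here; passing the opened member directly to upload_fileobj may avoid that copy, but that is untested in this sketch.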