Openstack Swift Bulk Operations (Archive Auto Extraction) Best Practices

I tested uploading objects into Swift with a multi-threaded app that issued one request per object. With 20 threads, I averaged ~6 objects per second per thread, and the math showed that the full upload was going to take quite a long time. I turned to bulk operations and now have a multi-threaded app that uploads tar.gz archives, each containing the files laid out under their respective containers. It works, but it is slower than the individual object requests: I am running 10 threads, each uploading a tar.gz with 4000 objects, and they manage closer to ~2 objects per second per thread. It seems I must be doing something wrong.
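To put the two approaches side by side, the aggregate rates implied by those figures can be worked out as in the sketch below. The thread counts and per-thread rates come from the post; `TOTAL_OBJECTS` is a purely hypothetical figure, since the actual number of objects is not stated.

```python
def aggregate_rate(threads: int, per_thread_rate: float) -> float:
    """Total objects per second across all upload threads."""
    return threads * per_thread_rate


def hours_to_upload(total_objects: int, rate: float) -> float:
    """Hours needed to upload total_objects at `rate` objects per second."""
    return total_objects / rate / 3600


individual = aggregate_rate(20, 6)  # 20 threads, ~6 objects/s each
bulk = aggregate_rate(10, 2)        # 10 threads, ~2 objects/s each

TOTAL_OBJECTS = 10_000_000  # hypothetical; not given in the post

print(f"individual requests: {individual:.0f} objects/s, "
      f"{hours_to_upload(TOTAL_OBJECTS, individual):.1f} h")
print(f"bulk archives:       {bulk:.0f} objects/s, "
      f"{hours_to_upload(TOTAL_OBJECTS, bulk):.1f} h")
```

On these numbers the bulk setup delivers about one sixth of the aggregate throughput of the individual requests, so the gap is not just a per-thread effect of running fewer threads.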

Swift appears to receive each archive within 5 to 10 seconds, but then spends 300 to 1600 seconds uncompressing it and placing the objects in their containers. I am not certain of this; it is based on watching the network traffic on the machine that is uploading to Swift.
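As a rough check, those timings can be turned into an effective per-archive rate. This is a minimal sketch that assumes the transfer and extraction phases run back to back and that each archive holds 4000 objects, as described in the post; the function names are illustrative only.

```python
def effective_rate(objects: int, transfer_s: float, extract_s: float) -> float:
    """Objects per second for one archive, counting transfer plus extraction."""
    return objects / (transfer_s + extract_s)


def transfer_share(transfer_s: float, extract_s: float) -> float:
    """Fraction of the total request time spent moving bytes over the network."""
    return transfer_s / (transfer_s + extract_s)


OBJECTS_PER_ARCHIVE = 4000

best = effective_rate(OBJECTS_PER_ARCHIVE, 5, 300)     # fastest observed case
worst = effective_rate(OBJECTS_PER_ARCHIVE, 10, 1600)  # slowest observed case

print(f"best case:  {best:.1f} objects/s")
print(f"worst case: {worst:.1f} objects/s")
print(f"largest possible transfer share: {transfer_share(10, 300):.1%}")
```

The slow end of that range is in line with the ~2 objects per second per thread observed, and the network transfer accounts for only a few percent of each request at most, which points at server-side extraction rather than upload bandwidth.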

Thinking of factors that might impact performance:

  • objects created in a single container versus each object going to a different container
  • number of objects per bulk operation
  • number of concurrent bulk operations
  • bulk file type (tar versus tar.gz) and gzip compression level (full or none)
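To compare these factors in a controlled way, the archive layout and compression can be made parameters of the upload tool. The sketch below uses only the Python standard library to build an archive in memory and to form the bulk-extraction URL. It assumes the PUT goes to the account root, in which case Swift treats the first path component of each tar member as the container name; `build_archive`, `bulk_url`, and the example storage URL are hypothetical names for illustration, and the actual HTTP upload is left out.

```python
import io
import tarfile
from urllib.parse import urlencode


def build_archive(objects, gzip_level=None):
    """Return tar bytes for a {"container/object": payload} mapping.

    gzip_level=None writes a plain tar; 0-9 writes tar.gz at that level.
    """
    buf = io.BytesIO()
    if gzip_level is None:
        tar = tarfile.open(fileobj=buf, mode="w")
    else:
        tar = tarfile.open(fileobj=buf, mode="w:gz", compresslevel=gzip_level)
    with tar:
        for path, payload in objects.items():
            info = tarfile.TarInfo(name=path)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def bulk_url(storage_url, gzip_level=None):
    """Account-level URL with the extract-archive query parameter."""
    fmt = "tar" if gzip_level is None else "tar.gz"
    return f"{storage_url.rstrip('/')}?{urlencode({'extract-archive': fmt})}"


# One container versus one container per object is just a matter of paths.
objects = {
    "container-a/obj1.txt": b"hello" * 100,
    "container-b/obj2.txt": b"world" * 100,
}

for level in (None, 1, 9):
    data = build_archive(objects, gzip_level=level)
    print(level, len(data), bulk_url("https://swift.example/v1/AUTH_test", level))
```

Holding the object set fixed while varying one parameter at a time (paths, objects per archive, thread count, compression) should make it clearer which factor the extraction time is most sensitive to.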

What are the best practices for this kind of operation?

Tamandua answered 2/8, 2017 at 16:0

Comment (1)
I'm having the same problem. I have lots of files to import into object storage (hosted by OVH), and however I approach it, I end up with <50 KB/s of bandwidth to transfer 100 GB of data (spread across a few hundred thousand files). Maybe that's just not doable in a reasonable amount of time? - Dissipated
