No, there is no way to do this through HDFS.
In general, when I have this problem, I copy the data into a random temp location and then move it into place once the copy is complete. This works well because mv is practically instantaneous, while the copy takes much longer. That way, if you check that no one else is writing to the destination and then mv, the window during which you need any kind of "lock" is much shorter:
1. Generate a random number.
2. Put the data into a new folder in hdfs://tmp/$randomnumber.
3. Check to see if the destination is OK (`hadoop fs -ls`, perhaps).
4. `hadoop fs -mv` the data to the latest directory.
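If you are driving this from code rather than from the shell, the same staging-and-rename flow looks roughly like the sketch below, using the HDFS `FileSystem` Java API. Treat it as a sketch only: the local source file, the `/tmp` staging prefix, and the `/data/latest` destination are placeholders, and error handling is minimal.

```java
import java.util.UUID;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class StagedHdfsWrite {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // 1. Generate a random name for the staging directory.
        Path staging = new Path("/tmp/" + UUID.randomUUID());
        fs.mkdirs(staging);

        // 2. Copy the data into the staging directory (the slow part).
        //    The local source path here is a placeholder.
        fs.copyFromLocalFile(new Path("file:///local/path/data.csv"), staging);

        // 3. Check that the destination is OK (here: does not already exist).
        Path dest = new Path("/data/latest");   // placeholder destination
        if (fs.exists(dest)) {
            throw new IllegalStateException("Destination already exists: " + dest);
        }

        // 4. Rename (mv) the staged directory into place. A rename within one
        //    HDFS namespace is a quick metadata-only operation on the NameNode.
        if (!fs.rename(staging, dest)) {
            throw new IllegalStateException("Rename failed for " + staging);
        }

        fs.close();
    }
}
```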
There is a slim chance that between steps 3 and 4 someone might still clobber something. If that really makes you nervous, perhaps you can implement a simple lock in ZooKeeper; Curator can help you with that.
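For example, Curator's `InterProcessMutex` recipe (from the curator-recipes artifact) gives you a distributed lock with very little code. The sketch below is a minimal, hypothetical example: the ZooKeeper connect string, the lock path, and the timeout are all placeholders you would replace with your own.

```java
import java.util.concurrent.TimeUnit;

import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class HdfsMoveLock {
    public static void main(String[] args) throws Exception {
        // Placeholder ZooKeeper connect string and retry policy.
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "zk1:2181,zk2:2181,zk3:2181",
                new ExponentialBackoffRetry(1000, 3));
        client.start();

        // All writers must agree on the same lock path for this to work.
        InterProcessMutex lock = new InterProcessMutex(client, "/locks/latest-dir");
        if (lock.acquire(30, TimeUnit.SECONDS)) {
            try {
                // Do the destination check and the hadoop fs -mv (or fs.rename)
                // here, while no other writer holding the same lock can interleave.
            } finally {
                lock.release();
            }
        }

        client.close();
    }
}
```

Any process that wraps its check-and-mv in the same lock path is serialized against the others, which closes the race between steps 3 and 4.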