distcp Questions
0
I run a distcp command to copy the HDFS location of a table to another cluster.
The copy is scheduled to run every 8 hours.
I run the 'msck repair table' command, but not always after the copy.
I ha...
3
Solved
I have a huge bucket of S3 files that I want to put on HDFS. Given the number of files involved, my preferred solution is to use 'distributed copy'. However, for some reason I can't get hadoop distcp ...
3
Solved
On our cluster we have set up dynamic resource pools.
The rules are set so that YARN first looks at the specified queue, then at the username, then at the primary group ...
However with a distcp ...
Mauriciomaurie asked 5/11, 2015 at 9:25
1
Solved
I have the following folders in HDFS:
hdfs://x.x.x.x:8020/Air/BOOK/AE/DOM/20171001/2017100101
hdfs://x.x.x.x:8020/Air/BOOK/AE/INT/20171001/2017100101
hdfs://x.x.x.x:8020/Air/BOOK/BH/INT/20171001/...
Nonferrous asked 19/10, 2017 at 15:20
2
Solved
I copied some files from one directory to another using
hadoop distcp -Dmapreduce.job.queuename=adhoc /user/comverse/data/$CURRENT_DATE_NO_DASH_*/*rcr.gz /apps/hive/warehouse/arstel.db/fair_usage/...
1
I would like to copy data from our on-premises Hadoop cluster to S3. I can do it unencrypted. I can also run s3cmd put with client-side encryption. How do I do distcp with client-side encryption?
1
Solved
I am using the AWS .NET SDK to run an s3distcp job on EMR to concatenate all files in a folder with the --groupBy arg. But whatever "groupBy" arg I have tried, the job either fails every time or just copies the files ...
Malady asked 14/7, 2016 at 12:23
1