How to download an entire bucket in GCP?

I have a problem downloading an entire bucket in GCP. How should I download the whole bucket? I ran this command in the GCP Cloud Shell environment:

gsutil -m cp -R gs://my-uniquename-bucket ./C:\Users\Myname\Desktop\Bucket

and I get an error message: "CommandException: Destination URL must name a directory, bucket, or bucket subdirectory for the multiple source form of the cp command. CommandException: 7 files/objects could not be transferred."

Could someone please point out the mistake in this command?

Lindie answered 27/10, 2019 at 17:50 Comment(1)
Most likely the directory doesn't exist; the error message is misleading. - Tsan

To download an entire bucket, you must first install the Google Cloud SDK,

then run this command:

gsutil -m cp -R gs://project-bucket-name path/to/local

where path/to/local is a path on your machine's local storage.
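
For example, using the bucket name and Windows desktop path from the question (illustrative values; create the destination folder first if it doesn't exist):

gsutil -m cp -R gs://my-uniquename-bucket "C:\Users\Myname\Desktop\Bucket"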

Yasui answered 26/3, 2021 at 12:10 Comment(0)

The error lies in the destination URL, as the error message indicates.

I ran this command in the GCP Cloud Shell environment

Remember that you are running the command from Cloud Shell, not from a local terminal or the Windows Command Line. It is throwing that error because it cannot find the Windows path you specified. If you inspect Cloud Shell's file system, you'll see it resembles a Unix environment, so you can specify the destination like this instead: ~/bucketfiles/. Even a simple gsutil -m cp -R gs://bucket-name.appspot.com ./ will work, since Cloud Shell resolves ./ as the current directory.
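
A minimal sketch of that in Cloud Shell, using the bucket name from the question and the illustrative ~/bucketfiles/ directory mentioned above:

mkdir -p ~/bucketfiles
gsutil -m cp -R gs://my-uniquename-bucket ~/bucketfiles/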

A workaround for this issue is to run the command from your Windows Command Line instead. You would have to install the Google Cloud SDK beforehand.

Alternatively, this can also be done in Cloud Shell, albeit with an extra step:

  1. Download the bucket objects by running gsutil -m cp -R gs://bucket-name ~/, which downloads them into the home directory of the Cloud Shell environment
  2. Transfer the downloaded files from the ~/ (home) directory of Cloud Shell to the local machine, either through the user interface or by running gcloud alpha cloud-shell scp (see the sketch below)
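
A minimal sketch of step 2, run from a local terminal with the Cloud SDK installed (the paths are illustrative; --recurse is the flag gcloud's scp commands take for copying a directory):

gcloud alpha cloud-shell scp --recurse cloudshell:~/bucket-name localhost:~/Downloads
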
Conwell answered 27/10, 2019 at 18:37 Comment(7)
Yes, the issue lies within the 'destination URL'. I still can't figure out what the syntax should be since I'm not a programmer... I tried playing with the command in Cloud Shell and it seems to work very well for copying files from bucket to bucket, but what should the destination URL look like when copying objects to a local machine? - Lindie
As I described above, this can be done much more easily if the command is run from your local command line with the Cloud SDK installed. However, it can also be done in Cloud Shell. The first step (what you described) is to use gsutil cp to download the bucket objects. To put it simply, from Cloud Shell's perspective, Cloud Shell is the local machine when running gsutil cp. Thus, the destination URL will always point somewhere in Cloud Shell's VM instance. However, once the files have been downloaded to the Cloud Shell VM, you can then easily transfer them to your local machine - Conwell
by running [gcloud alpha cloud-shell scp](https://cloud.google.com/sdk/gcloud/reference/alpha/cloud-shell/scp), which allows you to transfer files between Cloud Shell and the local machine. Alternatively, you can do this through the Cloud Shell UI by clicking the "three-dots" button followed by "Download Files". - Conwell
In your case, you can run gsutil -m cp -R gs://bucket-name ./, which will download the bucket objects into the ./ directory. In Linux, ./ represents the current directory, so if you run the command while in the ~ (home) directory, the files will be downloaded there. - Conwell
Thanks @JKleinne! In the end I used the Google Cloud SDK installed locally; I had just missed the gcloud init step. Otherwise it works awesome! I'm just curious: say you upload data to Google Storage Coldline and forget about it. Do you think that after 15-20 years you would be able to retrieve the data using the same tools as today? - Lindie
No worries. Coldline storage's optimal use case is mainly archival data and/or any data that you plan to access at most once a year (see here for more info). However, I can't say whether the tools for retrieving and uploading data will remain the same, especially since Google always aims to improve its products and services. - Conwell
If this or any answer has solved your question, please consider accepting it by clicking the check mark. This indicates to the wider community that you've found a solution and gives some reputation to both the answerer and yourself. There is no obligation to do this. - Conwell

Your destination path is invalid:

./C:\Users\Myname\Desktop\Bucket

Change to:

/Users/Myname/Desktop/Bucket

C: is a drive designator on Windows, and a drive designator cannot appear inside a relative path, so ./C: is not valid.

Irruptive answered 27/10, 2019 at 18:30 Comment(2)
Yes, I tried the change you suggested, John, but it gives me the same error... - Lindie
@Lindie - I updated my answer for Linux (Cloud Shell). - Irruptive

There is no one-button solution for downloading a full bucket to your local machine through the Cloud Shell.

The best option in an environment like yours (using only the Cloud Shell interface, without gcloud installed on your local system) is to follow this series of steps:

  • Download the whole bucket into the Cloud Shell environment
  • Zip the contents of the bucket
  • Upload the zipped file back to the bucket
  • Download that file through the browser
  • Clean up:
    • Delete the local files (local in the context of the Cloud Shell)
    • Delete the zipped bucket file
  • Unzip the bucket locally

This has the advantage that you only need to download a single file to your local machine.

This might seem like a lot of steps for a non-developer, but it's actually pretty simple:

First, run this on the Cloud Shell:

# Stage the bucket contents in a temporary directory
mkdir /tmp/bucket-contents/
gsutil -m cp -R gs://my-uniquename-bucket /tmp/bucket-contents/
# Zip everything from inside that directory, then return to where we were
pushd /tmp/bucket-contents/
zip -r /tmp/zipped-bucket.zip .
popd
# Upload the single zip file back to the bucket so the browser can fetch it
gsutil cp /tmp/zipped-bucket.zip gs://my-uniquename-bucket/zipped-bucket.zip

Then, download the zipped file through this link: https://storage.cloud.google.com/my-uniquename-bucket/zipped-bucket.zip

Finally, clean up:

rm -rf /tmp/bucket-contents
rm /tmp/zipped-bucket.zip
gsutil rm gs://my-uniquename-bucket/zipped-bucket.zip

After these steps, you'll have a zipped-bucket.zip file in your local system that you can unzip with the tool of your choice.
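
For example, with the unzip command-line tool (the target directory name is just a suggestion):

unzip zipped-bucket.zip -d bucket-contents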

Note that this might not work if your bucket holds more data than the Cloud Shell environment can store, but you can repeat the same steps on individual folders instead of the whole bucket to keep the size manageable, as sketched below.
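
A sketch of that per-folder variant (folder-one is a hypothetical folder name; gsutil ls shows what actually exists at the top level):

gsutil ls gs://my-uniquename-bucket/
gsutil -m cp -R gs://my-uniquename-bucket/folder-one /tmp/bucket-contents/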

Guggle answered 14/11, 2019 at 17:50 Comment(1)
This is really thorough, thanks! - Korfonta

The following command worked for me in the macOS terminal, with the gcloud tool installed locally:

gsutil -m cp -r gs://cloud-storage-bucket-name ~/Downloads

Make sure the ~/Downloads directory exists; otherwise you might get the following misleading error:

CommandException: Destination URL must name a directory, bucket, or bucket
subdirectory for the multiple source form of the cp command.
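
To be safe, you can create the directory first; mkdir -p does nothing if it already exists:

mkdir -p ~/Downloads
gsutil -m cp -r gs://cloud-storage-bucket-name ~/Downloads
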
Tsan answered 15/5 at 19:08 Comment(0)
