List buckets that match a bucket label with gsutil

I have my Google Cloud Storage buckets labeled.

I can't find anything in the docs on how to run gsutil ls so that it lists only the buckets with a specific label. Is this possible?

Kenrick answered 23/1, 2019 at 16:30 Comment(1)
Such a pain. You interact with every other GCP resource via gcloud, which has advanced filtering capabilities. - Natalyanataniel

Currently it is not possible to do what you want in a single step. You can do it in three steps:

  1. Get all the buckets in your GCP project.
  2. Get the labels of every bucket.
  3. Run gsutil ls on every bucket that meets your criteria.

Here is Python 3 code that does this.

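import subprocess

# Step 1: list every bucket in the current project (one gs:// URL per line)
out = subprocess.getoutput("gsutil ls")

for bucket in out.splitlines():
    if not bucket.strip():
        continue  # skip blank lines in the output
    # Step 2: fetch the labels of this bucket as text
    label = subprocess.getoutput("gsutil label get " + bucket)
    # Step 3: plain substring match; replace YOUR_LABEL with your own label
    if "YOUR_LABEL" in label:
        gsout = subprocess.getoutput("gsutil ls " + bucket)
        print("Files in " + bucket + ":\n")
        print(gsout)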
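
The substring test in the loop also matches when YOUR_LABEL is only part of a longer key or value. A stricter check is sketched below. It assumes that gsutil label get prints the labels as a JSON object and prints something that is not JSON when a bucket has no labels. The helper names parse_labels and has_label are hypothetical, not part of gsutil or of the answer above.

```python
import json


def parse_labels(label_output):
    """Return the labels as a dict, or {} when the text is not a JSON object."""
    try:
        parsed = json.loads(label_output)
    except json.JSONDecodeError:
        # Assumed case: a bucket without labels produces a plain-text message
        return {}
    return parsed if isinstance(parsed, dict) else {}


def has_label(label_output, key, value):
    """True only when the exact key maps to the exact value."""
    return parse_labels(label_output).get(key) == value
```

In the loop, the condition would then read has_label(label, "YOUR_KEY", "YOUR_VALUE") instead of the substring test.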
Analyze answered 24/1, 2019 at 14:26 Comment(0)

Just had a use case where I wanted to list all buckets with a specific label. The accepted answer using subprocess was noticeably slow for me. Here is my solution using the Python client library for Cloud Storage:

from google.cloud import storage


def list_buckets_by_label(label_key, label_value):
    # List out buckets in your default project
    client = storage.Client()
    buckets = client.list_buckets() # Iterator

    # Only return buckets where the label key/value match inputs
    output = list()
    for bucket in buckets:
        if bucket.labels.get(label_key) == label_value:
            output.append(bucket.name)
    return output
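
If you need to match on several labels at once, the same loop extends naturally. The following is only a sketch under assumptions: buckets_matching and list_buckets_by_labels are hypothetical names, and the filter is split out so that it can be tried with stand-in objects that have nothing but name and labels attributes.

```python
from types import SimpleNamespace


def buckets_matching(buckets, required_labels):
    """Return names of buckets whose labels contain every required key/value pair."""
    names = []
    for bucket in buckets:
        labels = bucket.labels or {}
        if all(labels.get(k) == v for k, v in required_labels.items()):
            names.append(bucket.name)
    return names


def list_buckets_by_labels(required_labels):
    # Imported here so the filter above can be tried without the library installed
    from google.cloud import storage

    client = storage.Client()  # uses your default project and credentials
    return buckets_matching(client.list_buckets(), required_labels)


# Stand-in objects for a quick local check (not real buckets)
fake_buckets = [
    SimpleNamespace(name="logs", labels={"env": "prod", "team": "data"}),
    SimpleNamespace(name="scratch", labels={"env": "dev"}),
    SimpleNamespace(name="unlabeled", labels={}),
]
matches = buckets_matching(fake_buckets, {"env": "prod", "team": "data"})
```

Against a real project you would call list_buckets_by_labels({"env": "prod", "team": "data"}), with your own keys and values in place of these made-up ones.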
Hazard answered 18/6, 2020 at 21:4 Comment(0)

A bash-only solution:

function get_labeled_bucket {
  # list all of the buckets for the current project
  for b in $(gsutil ls); do
    # find the one with your label
    if gsutil label get "${b}" | grep -q '"key": "value"'; then
      # and return its name
      echo "${b}"
    fi
  done
}

The '"key": "value"' part is just a string; replace it with your own key and value. Call the function with LABELED_BUCKET=$(get_labeled_bucket)

In my opinion, making a bash function return more than one value is more trouble than it is worth. If you need to work with multiple buckets, I would replace the echo with the code that needs to run.

Folk answered 18/6, 2019 at 5:30 Comment(0)
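To list the objects in a single bucket under a given prefix:

from google.cloud import storage

client = storage.Client()
# 'bucketname' and the prefix are placeholders; substitute your own
for blob in client.list_blobs('bucketname', prefix='xjc/folder'):
    print(blob)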


Pent answered 28/1, 2020 at 6:39 Comment(0)
