Is there a way to filter by tier in Azure Blob Storage?

I would like to list all the files stored in a particular tier. This is what I tried:

az storage fs file list \
  --file-system 'cold-backup' \
  --query "[?contains(properties.blobTier, 'Cold')==\`true\`].properties.blobTier"

But it doesn't work. I also tried with "blobTier" only. No luck.

This is the error I get:

Invalid jmespath query supplied for '--query': In function contains(), invalid type for value: None, expected one of: ['array', 'string'], received: "null"

Sweet answered 6/5, 2021 at 18:6 Comment(0)

The command az storage fs file list is for ADLS Gen2 file systems, and its output has no blobTier property, so you cannot query on it. That is why contains() receives null and the query fails. Also, the tier value should be Cool, not Cold.

To list the files filtered by blobTier, use az storage blob list instead. It targets Blob storage, but it also works against an ADLS Gen2 file system.

Sample:

az storage blob list --account-name '<storage-account-name>' --account-key 'xxxxxx' --container-name 'cold-backup' --query "[?properties.blobTier=='Cool']"

If you want to output only the blobTier, use --query "[?properties.blobTier=='Cool'].properties.blobTier" in the command instead.

Fishtail answered 7/5, 2021 at 1:37 Comment(0)

The accepted answer works perfectly fine. However, if you have a lot of files, the results are going to be paginated: the CLI returns a NextMarker, which has to be passed to the subsequent call with the --marker parameter. With a huge number of files, this has to be scripted, for example in PowerShell. Also, az storage blob list makes --container-name mandatory, which means only one container can be queried at a time.
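
For what it's worth, the one-container-at-a-time limitation can be scripted around in a few lines of shell. This is only a minimal sketch under some assumptions: list_blobs_in_tier is a made-up helper name, --auth-mode login assumes you are signed in with az login and have a data-reader role on the account (use --account-key as in the accepted answer otherwise), and --num-results "*" is, as far as I can tell from the CLI help, the way to ask az storage blob list to return everything instead of stopping at the default page of 5000 results.

```shell
# Sketch: print "container<TAB>blob" for every blob in the given tier,
# across all containers of one storage account.
# list_blobs_in_tier is a hypothetical helper, not part of the az CLI.
list_blobs_in_tier() {
  account="$1"   # storage account name
  tier="$2"      # tier to match, e.g. Cool, Hot or Archive

  # Enumerate the container names, one per line.
  az storage container list \
      --account-name "$account" --auth-mode login \
      --query "[].name" --output tsv |
  while IFS= read -r container; do
    # --num-results "*" asks the CLI to return all results for this container.
    # stdin is redirected so the inner call cannot consume the container list.
    az storage blob list \
        --account-name "$account" --auth-mode login \
        --container-name "$container" \
        --num-results "*" \
        --query "[?properties.blobTier=='${tier}'].name" --output tsv \
        < /dev/null |
    while IFS= read -r blob; do
      printf '%s\t%s\n' "$container" "$blob"
    done
  done
}
```

Called as list_blobs_in_tier '<storage-account-name>' Cool, it prints one container and blob name per line, separated by a tab. It still makes one listing call per container, so for very large accounts the inventory approach is probably the better fit.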

Blob Inventory

I have a ton of files and many containers, and I found an alternative method that worked best for me. Under Data management there is an option called Blob Inventory. It generates a report of all the blobs across all the containers in a storage account. The report can be customized to include the fields of your choice, for example Name, Access Tier, Blob Type, etc. There are also include and exclude filters to limit which blobs are reported.

The report will be generated in CSV or Parquet format and stored in the container of your choice at a daily or weekly frequency. The only downside is that the report can't be generated on-demand (only scheduled).

Further, if you wish to run SQL on the inventory report (CSV or Parquet file), you can simply use DBeaver to do this.
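
If a full SQL tool is more than you need, a CSV report can also be filtered by tier from the shell. The sketch below is an assumption-laden illustration: filter_inventory_by_tier is a made-up helper name, it assumes the report has a header row with columns named Name and AccessTier (check the header of your own report, since the columns depend on the fields you selected), and it splits on plain commas, so it will not handle blob names that themselves contain commas or quoted fields.

```shell
# Sketch: print the names of all blobs in a given tier from an inventory CSV.
# filter_inventory_by_tier is a hypothetical helper; the column names
# "Name" and "AccessTier" are assumptions about the report header.
filter_inventory_by_tier() {
  tier="$1"   # tier to match, e.g. Cool
  file="$2"   # path to the downloaded inventory CSV

  awk -F',' -v tier="$tier" '
    # Drop a trailing carriage return in case the file uses CRLF line endings.
    { sub(/\r$/, "") }
    # Header row: remember which columns hold the name and the tier.
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        if ($i == "Name")       n = i
        if ($i == "AccessTier") t = i
      }
      next
    }
    # Data rows: print the name when the tier matches.
    n && t && $t == tier { print $n }
  ' "$file"
}
```

For example, filter_inventory_by_tier Cool inventory.csv prints one blob name per line, where inventory.csv stands for wherever you downloaded the report.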

June answered 11/10, 2022 at 13:18 Comment(0)
