I came across this question because I had a very similar case. There still isn't a great way to do this, but I recently found a tip: use gsutil rsync and turn its -x flag into an inclusion filter rather than an exclusion filter by adding a negative lookahead.
For example, the command below would copy all json files found in any subdirectory of the current directory, while preserving their paths in the bucket:
gsutil -m rsync -r -x '^(?!.*\.json$).*' . gs://mybucket
This can be extended to include several extensions. For example, this command would copy all json, yaml and yml files it finds:
gsutil -m rsync -r -x '^(?!.*\.(json|yaml|yml)$).*' . gs://mybucket
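To avoid hand-editing the pattern each time, the extension list can be turned into the regex by a small shell function. This is only a sketch: `build_ext_rx` is a hypothetical helper name, not part of gsutil, and it assumes the extensions contain no regex metacharacters.

```shell
# Hypothetical helper: build an -x pattern that excludes everything
# except files ending in one of the given extensions.
build_ext_rx() {
  local IFS='|'   # "$*" joins the arguments with the first character of IFS
  printf '^(?!.*\\.(%s)$).*\n' "$*"
}

build_ext_rx json yaml yml
# prints: ^(?!.*\.(json|yaml|yml)$).*
```

It could then be used as `gsutil -m rsync -r -x "$(build_ext_rx json yaml yml)" . gs://mybucket`.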
By itself this is not very useful when you have a specific list of files, but let's build on it, using the youtube-dl repo (https://github.com/ytdl-org/youtube-dl.git) as an example.
Let's take all md files from the repo and pretend they are our file list. The last file is in a subdirectory:
find * -name "*.md"
CONTRIBUTING.md
README.md
docs/supportedsites.md
We use * instead of . as the starting point so that find prints the names without a leading ./, which means less processing later.
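# Note: this snippet assumes zsh. In bash, read at the end of a pipeline
# runs in a subshell by default, so flist would stay empty.
# Read file paths into a variable
# If the path list is stored in a file, use
# cat file|read -d '' flist
find * -name "*.md"|read -d '' flist
# Join the paths into an alternation that gsutil accepts as a file list in the -x parameter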
rx="^(?\!($(echo $flist|tr '\n' '|')$)).*"
# Preview rx variable (just for clarity)
echo $rx
^(?!(CONTRIBUTING.md|README.md|docs/supportedsites.md|$)).*
# Run sync in dry mode
gsutil -m rsync -n -r -x $rx . gs://mybucket
...
Would copy file://./CONTRIBUTING.md to gs://mybucket/CONTRIBUTING.md
Would copy file://./README.md to gs://mybucket/README.md
Would copy file://./docs/supportedsites.md to gs://mybucket/docs/supportedsites.md
While a little involved, this allows using the -m flag for speed while preserving paths.
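For shells other than zsh, the same idea can be sketched as a function that reads the path list from stdin. This is a variation, not the exact snippet above: the hypothetical helper `build_list_rx` escapes regex metacharacters (so the dot in `README.md` matches only a literal dot) and anchors every entry with `$`, which also removes the trailing empty alternative. The file name `filelist.txt` in the comment is an assumption for illustration.

```shell
# Hypothetical helper: read paths (one per line) on stdin and print a
# negative-lookahead pattern that matches every path NOT in the list.
build_list_rx() {
  local alternation
  # Escape regex metacharacters, then join the lines with "|".
  alternation=$(sed -e 's/[][\.*^$+?(){}|]/\\&/g' | paste -sd '|' -)
  printf '^(?!(%s)$).*\n' "$alternation"
}

printf '%s\n' CONTRIBUTING.md README.md docs/supportedsites.md | build_list_rx
# prints: ^(?!(CONTRIBUTING\.md|README\.md|docs/supportedsites\.md)$).*

# Possible dry run, assuming the list is stored in filelist.txt:
# gsutil -m rsync -n -r -x "$(build_list_rx < filelist.txt)" . gs://mybucket
```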
With some more processing it should be very possible to
- remove the empty newline from the find result
- handle paths beginning with ./
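Both cleanup steps could be handled by a small filter applied to the path list before the regex is built. A minimal sketch, assuming one path per line; `normalize_paths` is a hypothetical name:

```shell
# Hypothetical helper: normalize a path list read from stdin.
normalize_paths() {
  # Strip a leading "./" from each path, then drop empty lines.
  sed -e 's|^\./||' -e '/^$/d'
}

printf '%s\n' ./CONTRIBUTING.md '' ./docs/supportedsites.md | normalize_paths
# prints:
# CONTRIBUTING.md
# docs/supportedsites.md
```

With this in place, `find . -name "*.md" | normalize_paths` should produce the same list as `find * -name "*.md"`, as long as no file or directory name at the top level starts with a dot.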
-r does is copy the tree of a specified directory to a target, not any intervening directories for a specified file. Effectively, gsutil cp has no idea about a "root"/"working" directory for the source. – Pris