There are some related prior questions (1, 2, 3), but none of them does quite what I want, and I can't get the code example that Jenny Bryan posted in 2018 to work.
I have a folder shared with me that contains some large files. The files are nested, so I want to recurse into the sub-directories and get all files from each. In my case there are only two layers, but it would be nice to have an approach that works for an arbitrary number of layers.
The most obvious command to try is simply telling it to download the folder, hoping it will figure out the substructure:
#load the libraries
library(tidyverse)
library(googledrive)
#folder link to id
#hidden for privacy reasons
jp_folder = "https://drive.google.com/drive/folders/XXXXX"
folder_id = drive_get(as_id(jp_folder))
#download in entirety
drive_download(folder_id)
Unfortunately, this doesn't work, apparently because drive_download() cannot deal with folders:
> drive_download(folder_id)
Error: Not a recognized Google MIME type:
* application/vnd.google-apps.folder
Here's my attempt at avoiding this issue by going into each subdir:
#load the libraries
library(tidyverse)
library(googledrive)
#folder link to id
#hidden for privacy reasons
jp_folder = "https://drive.google.com/drive/folders/XXXXX"
#get the id data frame
folder_id = drive_get(as_id(jp_folder))
#find files in folder
files = drive_ls(folder_id)
#loop dirs and download files inside them
for (i in seq_along(files$name)) {
i_dir = drive_ls(files$id[i])
#download files
walk(i_dir$id, ~ drive_download(as_id(.x)))
}
The files object seems fine (replacing the strings with fillers):
# A tibble: 6 x 3
  name  id                                drive_resource
* <chr> <chr>                             <list>
1 A     AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA <named list [32]>
2 B     BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB <named list [32]>
3 C     CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC <named list [31]>
4 D     DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD <named list [31]>
5 E     EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE <named list [31]>
6 F     FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF <named list [31]>
However, trying to list the contents of a subdir inside the loop throws this error:
> i_dir = drive_ls(files$id[i])
Error: 'path' does not identify at least one Drive file.
What's wrong here?
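One possible explanation, offered as a sketch rather than a confirmed diagnosis: googledrive treats a bare character string as a path, so files$id[i] (a plain <chr> in the tibble above) may be looked up by name unless it is wrapped in as_id(). The code below assumes that and walks the folder tree recursively to any depth. The names list_drive_children and collect_files are hypothetical helpers, not part of googledrive, and the folder check assumes each drive_resource entry carries a mimeType field equal to the folder type shown in the first error message.

```r
# Hypothetical helper: list the direct children of one Drive folder.
# Assumes drive_resource holds a mimeType field for every entry.
list_drive_children = function(id) {
  # as_id() marks the string as a file id instead of a path
  kids = googledrive::drive_ls(googledrive::as_id(id))
  mime = vapply(kids$drive_resource, function(r) r$mimeType, character(1))
  data.frame(
    name = kids$name,
    id = as.character(kids$id),
    is_folder = mime == "application/vnd.google-apps.folder",
    stringsAsFactors = FALSE
  )
}

# Hypothetical helper: collect every non-folder file below `id`.
# `list_children` can be swapped for a stub, so the recursion can be
# checked without touching Drive.
collect_files = function(id, list_children = list_drive_children, path = ".") {
  kids = list_children(id)
  out = data.frame(id = character(0), path = character(0),
                   stringsAsFactors = FALSE)
  for (i in seq_len(nrow(kids))) {
    here = file.path(path, kids$name[i])
    if (kids$is_folder[i]) {
      # recurse into the sub-folder, carrying the local path along
      out = rbind(out, collect_files(kids$id[i], list_children, here))
    } else {
      out = rbind(out, data.frame(id = kids$id[i], path = here,
                                  stringsAsFactors = FALSE))
    }
  }
  out
}

# Usage sketch (needs authentication and network access, so left commented):
# all_files = collect_files(jp_folder)
# for (i in seq_len(nrow(all_files))) {
#   dir.create(dirname(all_files$path[i]), recursive = TRUE, showWarnings = FALSE)
#   googledrive::drive_download(googledrive::as_id(all_files$id[i]),
#                               path = all_files$path[i])
# }
```

If this reading is right, the smallest change to the loop above would be drive_ls(as_id(files$id[i])); the recursive version only adds handling for deeper nesting and recreates the folder layout locally.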
library(stringr) minimizes loads since tidyverse is huge (especially if you need to install it). – Phage