How to read a file from Google Drive using R

I would like to read a dataset from Google Drive into R, as indicated in the screenshot.

Neither

url <- "https://drive.google.com/file/d/1AiZda_1-2nwrxI8fLD0Y6e5rTg7aocv0"
temp <- tempfile()
download.file(url, temp)
bank <- read.table(unz(temp, "bank-additional.csv"))
unlink(temp)

nor

library(RCurl)
bank_url <- dowload.file(url, "bank-additional.csv", method = 'curl')

works.

I have been working on this for many hours. Any hints or solutions would be really appreciated.

Heaveho answered 17/12, 2017 at 2:40 Comment(1)
How about using the googledrive library from the tidyverse? googledrive.tidyverse.org - Powder
S
14

Try

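temp <- tempfile(fileext = ".zip")
# mode = "wb" writes the zip as binary, which matters on Windows
download.file("https://drive.google.com/uc?authuser=0&id=1AiZda_1-2nwrxI8fLD0Y6e5rTg7aocv0&export=download",
  temp, mode = "wb")
# unzip() returns the extracted paths; the 14th is the CSV in this archive
out <- unzip(temp, exdir = tempdir())
bank <- read.csv(out[14], sep = ";")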
str(bank)
# 'data.frame': 4119 obs. of  21 variables:
 # $ age           : int  30 39 25 38 47 32 32 41 31 35 ...
 # $ job           : Factor w/ 12 levels "admin.","blue-collar",..: 2 8 8 8 1 8 1 3 8 2 ...
 # $ marital       : Factor w/ 4 levels "divorced","married",..: 2 3 2 2 2 3 3 2 1 2 ...
 # <snip>

The URL should correspond to the URL that you use to download the file using your browser.
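
As a rough sketch of that idea, the share link from the question can be turned into a download URL of the same general form as the one above. The helper name drive_direct_url is hypothetical, and the URL pattern is simply copied from this answer (without the authuser parameter), so treat it as an assumption that may not hold for every file:

```r
# Hypothetical helper: pull the file ID out of a share link and build
# a download URL in the same form as the one used in this answer.
drive_direct_url <- function(share_url) {
  # File IDs sit after "/d/" in share links such as .../file/d/<ID>/view
  id <- sub(".*/d/([A-Za-z0-9_-]+).*", "\\1", share_url)
  # sub() returns its input unchanged when the pattern does not match
  if (identical(id, share_url)) stop("No file ID found in: ", share_url)
  paste0("https://drive.google.com/uc?export=download&id=", id)
}

drive_direct_url("https://drive.google.com/file/d/1AiZda_1-2nwrxI8fLD0Y6e5rTg7aocv0")
# "https://drive.google.com/uc?export=download&id=1AiZda_1-2nwrxI8fLD0Y6e5rTg7aocv0"
```

The result can then be passed to download.file in place of the hard-coded URL.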

As @Mako212 points out, you can also make use of the googledrive package, substituting drive_download for download.file:

library(googledrive)
temp <- tempfile(fileext = ".zip")
dl <- drive_download(
  as_id("1AiZda_1-2nwrxI8fLD0Y6e5rTg7aocv0"), path = temp, overwrite = TRUE)
out <- unzip(temp, exdir = tempdir())
bank <- read.csv(out[14], sep = ";")
Skellum answered 17/12, 2017 at 3:58 Comment(4)
Thanks for your reply. The second method works perfectly. But for the first method, when I run out <- unzip(temp, exdir = tempdir()), I get the warning message "In unzip(temp, exdir = tempdir()) : internal error in 'unz' code". - Heaveho
googledrive will ask you to authenticate with your Google account, so this answer will not work if you don't or can't authenticate. - Fonzie
When I try this I get "warning error 1 in extracting from zip file" and out is NULL. Also, the downloaded file is far too small to be right, which probably causes the problem. The link to Google Drive works outside of R, though. - Hierarchize
As previous commenters noted, this gives a warning for the download.file option and cannot connect for the googledrive option. - Gains
D
7
  • The Google Drive share link is not the direct file link, so both download.file and RCurl (and the first method in the accepted answer) only download the web page that displays the file, not the file itself. You can open the downloaded file in an editor and see that it is an HTML file.

  • You can find the actual direct link to the file with this. With the direct link, all the regular download methods will work.

  • For very detailed discussions about getting the direct link or downloading the file, see this question.

  • The Google Drive API requires the client to sign in, so the googledrive package also asks you to sign in to Google if you are not already signed in.
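
One way to check whether a download is a real zip archive rather than an HTML page is to look at its first bytes. This is a minimal sketch, assuming the usual zip local-file signature "PK" followed by bytes 0x03 0x04; the helper name looks_like_zip is hypothetical:

```r
# Hypothetical helper: TRUE if the file starts with the zip signature "PK\003\004".
looks_like_zip <- function(path) {
  magic <- readBin(path, what = "raw", n = 4L)
  identical(magic, as.raw(c(0x50, 0x4b, 0x03, 0x04)))
}

# Simulate what happens when a web page is saved under a .zip name
html_file <- tempfile(fileext = ".zip")
writeLines("<!DOCTYPE html><html></html>", html_file)
looks_like_zip(html_file)  # FALSE
```

Running the check right after download.file, and before unzip, gives a clearer failure than the unzip warning.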

Doubleganger answered 6/10, 2018 at 1:10 Comment(0)
I
1

You can do all this with the googledrive package.

It's a two-step process: you first find the folder in order to get its ID, and then query for all files that have that folder as a parent.

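library(googledrive)
# Step 1: find the folder to get its ID
dir <- drive_find(pattern = 'my_folder', type = 'folder')
# Step 2: list every file whose parent is that folder
query <- paste('"', dir$id, '"', ' in parents', sep = '')
drive_find(q = query)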

Note that drive_find may return multiple folders if you have several folders named "my_folder" in different parts of Drive, so you may need to make the query more specific (e.g. by searching within a parent folder). I would suggest adding a check that only one folder is returned, such as nrow(dir) == 1. You can also change the pattern to a regex that matches only the exact folder name. In that case, replace the first drive_find command with

drive_find(pattern='^my_folder$', type='folder')
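
Putting the exact-match pattern and the single-folder check together, a sketch might look like the following. The names parents_query and list_folder_files are hypothetical, the wrapper needs the googledrive package and an authenticated session, and the query here uses single quotes around the ID (the form I believe the Drive API search examples use) rather than the double quotes shown earlier in this answer:

```r
# Hypothetical helper: build the Drive search clause for a folder's children.
parents_query <- function(folder_id) {
  sprintf("'%s' in parents", as.character(folder_id))
}

# Hypothetical wrapper; requires googledrive and a signed-in Google account.
list_folder_files <- function(folder_name) {
  # Anchor the pattern so only an exact folder-name match is returned
  dir <- googledrive::drive_find(
    pattern = paste0("^", folder_name, "$"), type = "folder"
  )
  # Stop early unless exactly one folder matched
  stopifnot(nrow(dir) == 1)
  googledrive::drive_find(q = parents_query(dir$id))
}
```

Calling list_folder_files("my_folder") would then return the files in that folder, or stop with an error if the name is missing or ambiguous.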

You can find more details on the parameters for drive_find in the documentation.

Insectile answered 6/10, 2021 at 16:57 Comment(0)
