How to download an .xlsx file in R and load the data into a dataframe?

I'm trying to download an .xlsx file from the EIA and I'm getting the following error.

The error is: "Error: ZipException (Java): invalid entry size (expected 2385 but got 2390 bytes)"

I have tried the following code:

library(XLConnect)
tmp = tempfile(fileext = ".xlsx")
download.file(url = "http://www.eia.gov/petroleum/drilling/xls/dpr-data.xlsx", destfile = tmp)
readWorksheetFromFile(file = tmp, sheet = "Eagle Ford Region", header = FALSE, startRow = 9, endRow = 151)

I have also tried the other recommendations in "Read Excel file into R with XLConnect package from URL".

Floccus answered 4/3, 2015 at 17:14

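Use binary mode (mode = "wb") when downloading files that are not plain text. An .xlsx file is a ZIP archive, and on Windows the default text mode can alter its bytes, which leads to the ZipException above: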

download.file(url = "http://www.eia.gov/petroleum/drilling/xls/dpr-data.xlsx", destfile = tmp, mode="wb")

This will solve the issue.
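
Putting the fix together with the code from the question, a minimal sketch might look like the following. The names is_zip_file and read_remote_xlsx are hypothetical helpers, not part of XLConnect. The signature check is an extra safeguard that only catches a download that is not a ZIP archive at all (for example an HTML error page); it does not detect text-mode corruption, which is what mode = "wb" prevents.

```r
# Hypothetical helper: TRUE if the file starts with the ZIP signature
# "PK\003\004", which every .xlsx file should.
is_zip_file <- function(path) {
  sig <- readBin(path, what = "raw", n = 4)
  identical(sig, as.raw(c(0x50, 0x4b, 0x03, 0x04)))
}

# Hypothetical wrapper: download in binary mode, then read one sheet.
# Extra arguments (header, startRow, endRow, ...) are passed on to
# XLConnect::readWorksheetFromFile().
read_remote_xlsx <- function(url, sheet, ...) {
  tmp <- tempfile(fileext = ".xlsx")
  on.exit(unlink(tmp))  # remove the temporary file when done
  download.file(url = url, destfile = tmp, mode = "wb")
  if (!is_zip_file(tmp)) {
    stop("Downloaded file is not a ZIP archive; the download may have failed.")
  }
  XLConnect::readWorksheetFromFile(file = tmp, sheet = sheet, ...)
}

# Usage, with the arguments from the question:
# df <- read_remote_xlsx(
#   "http://www.eia.gov/petroleum/drilling/xls/dpr-data.xlsx",
#   sheet = "Eagle Ford Region", header = FALSE, startRow = 9, endRow = 151
# )
```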

Bumboat answered 4/3, 2015 at 17:44

I'm really late to the party, but I spent a lot of time stuck on this same error, and this didn't work for me. If you are only downloading the file so that you can load it from disk with read_xlsx, a simpler solution is to skip the manual download step and import straight from the URL:

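# install.packages("rio")
library(rio)

# url must hold the address of the .xlsx file, e.g. the one from the question
url = "http://www.eia.gov/petroleum/drilling/xls/dpr-data.xlsx"
data = rio::import(url)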
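
To pick a specific sheet, as in the question, something like the sketch below may work. import_sheet is a hypothetical wrapper, not a rio function. It relies on rio's `which` argument to select the sheet and assumes that additional arguments such as skip and col_names are forwarded to readxl::read_excel(), which rio uses for .xlsx files by default; check this against your installed rio version.

```r
# Hypothetical wrapper: import one sheet of a remote workbook with rio.
# `which` picks the sheet; extra arguments are assumed to be forwarded
# to readxl::read_excel().
import_sheet <- function(url, sheet, ...) {
  rio::import(url, which = sheet, ...)
}

# Usage with the workbook from the question (skip = 8 starts at row 9):
# eagle_ford <- import_sheet(
#   "http://www.eia.gov/petroleum/drilling/xls/dpr-data.xlsx",
#   sheet = "Eagle Ford Region", skip = 8, col_names = FALSE
# )
```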

Cheers

Housekeeping answered 16/4, 2019 at 17:59
