Extract bz2 file in R
P

5

39

I have a bunch of .csv.bz2 files that I have to download, extract, and read in R. I downloaded one file and want to extract it to the current working directory and then read it. I tried unz(filename, filename.csv), but it does not seem to work. How can I do that?

I heard somewhere that bz2 files can be read directly without decompressing them first. How can I do that?

Pitiable answered 20/9, 2014 at 12:31 Comment(0)
R
40

You can use either of these two commands:

  1. read.csv() command: with this command you can supply the name of the compressed CSV file directly.

    read.csv("file.csv.bz2")

  2. read.table() command: this is the generic version of read.csv(). You can set the delimiter and other options that read.csv() sets automatically. You don't need to decompress the file separately; the command does it for you.

    read.table("file.csv.bz2", header = TRUE, sep = ",", quote = "\"", ...)
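
As a minimal, self-contained sketch of both calls, the snippet below first writes a small bz2-compressed CSV to a temporary file and then reads it back without any explicit decompression step. The data frame and the temporary path are made up for illustration; with real data you would pass your own file name.

```r
# Illustrative data only (not from the question).
df <- data.frame(id = 1:3, name = c("a", "b", "c"))

# Write it as a bz2-compressed CSV through a bzfile() connection.
path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(path, open = "w")
write.csv(df, con, row.names = FALSE)
close(con)

# read.csv() detects the compression and decompresses on the fly.
via_csv <- read.csv(path)

# read.table() does the same, but header and separator are set explicitly.
via_table <- read.table(path, header = TRUE, sep = ",")

print(via_csv)
```

Both calls should give the same data frame; the only difference is which defaults you have to spell out.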

Revolver answered 23/5, 2015 at 10:27 Comment(0)
M
27

Like this:

readcsvbz2file <- read.csv(bzfile("file.csv.bz2"))
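
A bzfile() connection is not tied to read.csv(); it can be passed to other functions that read from connections. The sketch below, which uses a made-up temporary file, peeks at the first lines with readLines() and then parses the same file as CSV.

```r
# Write a small bz2-compressed text file to a temporary path (illustrative data).
path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(path, open = "w")
writeLines(c("id,value", "1,10", "2,20", "3,30"), con)
close(con)

# Peek at the first two lines without parsing the file as CSV.
con <- bzfile(path, open = "r")
first_lines <- readLines(con, n = 2)
close(con)
print(first_lines)

# The same kind of connection can be handed to read.csv(),
# which opens it and closes it again when it is done.
dat <- read.csv(bzfile(path))
print(dat)
```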
Munniks answered 22/9, 2014 at 19:39 Comment(2)
bzfile() is not necessary; read.csv() can handle compressed files automatically. So it's just read.csv("file.csv.bz2"). Here is an example (first section "Loading the Data"). Kalat
bzfile() is a more general solution because it is useful for other formats. Thanks. Quieten
L
11

You can use the very fast fread from data.table, which can read bz2-compressed files directly. Note that recent versions of data.table rely on the R.utils package being installed for this.

require(data.table)
fread("file.csv.bz2")
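
A hedged, self-contained version of the same idea is sketched below. It assumes a reasonably recent data.table with R.utils installed, and it skips the fread() call when either package is missing; the temporary file and its contents are invented for the example.

```r
# Create a small bz2-compressed CSV in a temporary location (illustrative data).
path <- tempfile(fileext = ".csv.bz2")
con <- bzfile(path, open = "w")
write.csv(data.frame(id = 1:3, value = c(10, 20, 30)), con, row.names = FALSE)
close(con)

# fread() needs data.table, and its .bz2 support relies on R.utils.
have_pkgs <- requireNamespace("data.table", quietly = TRUE) &&
  requireNamespace("R.utils", quietly = TRUE)

if (have_pkgs) {
  dt <- data.table::fread(path)  # returns a data.table
  print(dt)
} else {
  message("data.table and/or R.utils not installed; skipping fread() example")
}
```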
Lanham answered 17/3, 2015 at 15:53 Comment(0)
A
8

Basically, you need to type:

library(R.utils)
bunzip2("dataset.csv.bz2", "dataset.csv", remove = FALSE, skip = TRUE)

dataset <- read.csv("dataset.csv")

See documentation here: bunzip2 {R.utils}.
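
If you would rather not depend on R.utils, the extraction step can also be sketched in base R by copying bytes from a bzfile() connection to an ordinary file. The helper name bunzip2_base and its arguments are hypothetical, written only for this illustration; this is not how R.utils implements bunzip2. It also shows why unz() from the question does not help here: unz() is meant for zip archives, not for bz2 streams.

```r
# Hypothetical helper (not part of any package): decompress a .bz2 file to disk.
bunzip2_base <- function(src, dest = sub("\\.bz2$", "", src), chunk = 1e6) {
  inn <- bzfile(src, open = "rb")      # decompressing binary input connection
  on.exit(close(inn), add = TRUE)
  out <- file(dest, open = "wb")       # plain binary output file
  on.exit(close(out), add = TRUE)
  repeat {
    bytes <- readBin(inn, what = "raw", n = chunk)
    if (length(bytes) == 0) break      # end of the compressed stream
    writeBin(bytes, out)
  }
  invisible(dest)
}

# Demo on a made-up temporary file.
src <- tempfile(fileext = ".csv.bz2")
con <- bzfile(src, open = "w")
write.csv(data.frame(x = 1:5, y = letters[1:5]), con, row.names = FALSE)
close(con)

dest <- bunzip2_base(src)   # writes the decompressed .csv next to the source
dataset <- read.csv(dest)
print(dataset)
```

Reading in chunks keeps memory use bounded for large files; the compressed original is left in place, similar to remove = FALSE above.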

Apothecary answered 27/9, 2015 at 13:50 Comment(0)
M
4

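According to the read.table documentation, a compressed file can be read directly. For a comma-separated file, set the header and separator explicitly:

read.table("file.csv.bz2", header = TRUE, sep = ",")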
Mexicali answered 22/9, 2014 at 19:34 Comment(0)
