Reading fractions in csv file with R

I have a text file of numerical data, with headers, where some numbers are entered as fractions, some are entered as integers, and some are entered as floats, e.g.:

col1name, col2name, col3name, col4name    
1, 2, 3, 4
0.5, 0.6, 0.7, 0.8
1/2, 2/3, 3/4, 4/5
1, 0.2, 3/3, 4

When I use read.csv, how do I have these expressions evaluated and stored as numbers?

Thanks...

Indubitability answered 11/10, 2016 at 18:14 Comment(2)
Can you post a sample of your text file somewhere that we can download? Then we'd be able to see exactly the structure of the data you're trying to import and provide tailored code. – Scotch
@eipi I edited the example to show the structure of the file better. – Indubitability

First, import your data as character strings. Using the toy example from your question, we can do this with:

txt = "1, 2, 3, 0.3, 2/5, 0.75, 1/3"
# Read the text as comma-separated fields; keep strings as characters, not factors
dat = read.table(text = txt, sep = ",", stringsAsFactors = FALSE)

Once the data are stored as character strings, we can use eval(parse()) to evaluate the expressions as if they had been typed in at the console. Given several expressions at once, eval() returns only the value of the last one, so we wrap the call in sapply() to apply it to each element of your data in turn:

answer = sapply(dat, function(x) eval(parse(text = x)))
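
To see why the sapply() wrapper matters, here is a minimal sketch; the vector x is made up for illustration and is not part of the original data. parse() turns a character vector into several expressions, and eval() on the whole set returns only the final value.

```r
x <- c("1/2", "0.75", "3")

# eval() evaluates every parsed expression but returns only the last value
last_only <- eval(parse(text = x))

# Evaluating one string at a time keeps every value
all_values <- sapply(x, function(s) eval(parse(text = s)), USE.NAMES = FALSE)
```

Here last_only holds a single number, while all_values has one number per input string.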

We can extend this to multi-row data by applying the same method to one column at a time, for example:

txt = "col1name, col2name, col3name, col4name
1, 2, 3, 4
0.5, 0.6, 0.7, 0.8
1/2, 2/3, 3/4, 4/5
1, 0.2, 3/3, 4"

dat = read.table(text = txt, sep = ",", stringsAsFactors = FALSE, header = TRUE)
answer = apply(dat, 2, function(this.col) sapply(this.col, function(x) eval(parse(text = x))))
#      col1name  col2name col3name col4name
# [1,]      1.0 2.0000000     3.00      4.0
# [2,]      0.5 0.6000000     0.70      0.8
# [3,]      0.5 0.6666667     0.75      0.8
# [4,]      1.0 0.2000000     1.00      4.0
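
If you would rather not evaluate file contents as R code, a different technique is to split each entry on "/" and divide. The sketch below is an alternative to eval(parse()), not the method above; frac_to_num is a hypothetical helper name, and it assumes every entry is either a plain number or a single numerator/denominator pair.

```r
# Hypothetical helper: convert strings such as "3/4", "0.5" or "2" to numbers
frac_to_num <- function(x) {
  parts <- strsplit(trimws(x), "/", fixed = TRUE)
  vapply(parts, function(p) {
    p <- as.numeric(p)
    # Two pieces mean numerator/denominator; otherwise it is a plain number
    if (length(p) == 2) p[1] / p[2] else p[1]
  }, numeric(1))
}

txt <- "col1name, col2name, col3name, col4name
1, 2, 3, 4
0.5, 0.6, 0.7, 0.8
1/2, 2/3, 3/4, 4/5
1, 0.2, 3/3, 4"

# Read every column as character so the fractions survive the import
dat <- read.csv(text = txt, colClasses = "character", strip.white = TRUE)

# Convert column by column; dat[] keeps the data frame structure
dat[] <- lapply(dat, frac_to_num)
```

Because nothing is passed to eval(), a stray entry in the file cannot execute arbitrary code, and the result stays a data frame with numeric columns.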
Jessie answered 11/10, 2016 at 20:6 Comment(6)
Works for the toy example, but not for a text file with more than one row of data (it keeps only the last row). – Indubitability
Then you need to use this on each row in turn. This is still the way to do it. – Jessie
I assumed there would be - but that's the beauty of computers. They are very good at repetitive tasks. Joking aside, I'll update the answer to show how. – Jessie
Folks, we have a winner. Thanks – Indubitability
I simplified slightly by operating by columns rather than rows. That way we don't need t() to transpose at the end – Jessie
A similar alternative could be sapply(dat, function(x) eval(as.call(c(c, parse(text = x))))) – Fragrance

I would strongly suggest using fread() from the data.table package. It's very fast and robust in almost all situations.

library(data.table)
input.file <- fread("file_name.csv")

If your values are still not in the format you are looking for, you can use as.integer() or as.numeric() (both return NA for strings such as "1/2", so they will not convert the fraction entries by themselves):

input.file$`Column Name To Change` <- as.numeric(input.file$`Column Name To Change`)
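
One way to handle a column that mixes plain numbers and fractions, sketched here with a made-up vector x rather than a real file: try as.numeric() first, then fall back to eval(parse()) only for the entries it could not convert.

```r
x <- c("1", "0.2", "3/3", "4/5")

# as.numeric() parses plain numbers only; "3/3" and "4/5" become NA (with a warning)
plain <- suppressWarnings(as.numeric(x))

# Evaluate only the entries that failed to convert
is_frac <- is.na(plain) & !is.na(x)
plain[is_frac] <- sapply(x[is_frac], function(s) eval(parse(text = s)),
                         USE.NAMES = FALSE)
```

The same steps could be applied to a column of input.file after reading it, provided that column was imported as character.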

Hope this helps!

Explicative answered 11/10, 2016 at 18:23 Comment(2)
@BenS. How are the fractions entered into the csv? For instance, in order to enter mine and keep them there I added an apostrophe before typing the fraction 2/5. – Yates
@Richard They're not from Excel files. Just a plain text file that looks like the line I gave as an example. – Indubitability
