R: read.table and missing values

When I load a tab-delimited data file in R, I get this error message:

Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : line 3 did not have 5 elements

Here's my data:

KEY ID      code1   code2   name
1   sadsa   32423   344     ffsadsa
2   vdffsfs 21344   234     fsadfgg
3   3e4dsa  21321   #N/A    #N/A
4   dcxzc   23421   #N/A    #N/A
5   xzzcc   21223   124     erfsacf
6   sdas    21321   464     fsadfsa
7   assdad  32132   455     fsadfda

I can see that the error is caused by the "#N/A" values in my data. I have tried read.table options such as na.strings and comment.char = "#", but it still did not work.

Is there any way to keep the actual text (#N/A), or at least replace it with N/A, when loading the data in R?

Plectron asked 7/8, 2018 at 2:40 Comment(5)
In read.table, you can specify na.strings = "#NA". - Immunogenic
Yes @akrun, I have tried this: data = read.table("raw.dat", header=TRUE, sep="\t", stringsAsFactors=FALSE, quote = "", row.names = NULL, na.strings = "#NA"). It still gives the error message "Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, : line 3 did not have 5 elements". - Plectron
If the value is #N/A, use that exact string. I omitted the / in my earlier comment. - Immunogenic
Hi @akrun, I have changed it to this: data = read.table("raw.dat", header=TRUE, sep="\t", stringsAsFactors=FALSE, quote = "", row.names = NULL, na.strings = "#N/A"). Still the same error message. - Plectron
You may also have to set the argument comment.char = "" to stop "#" from being interpreted as a comment. - Wycoff
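
The suggestions in these comments can be combined into a single call. The following is a minimal sketch, assuming the file really is tab-delimited. The sample rows are rebuilt inline from the question with explicit tabs, and the variable names (`sample_lines`, `data`, `raw`) are illustrative only.

```r
# A few rows from the question, rebuilt with explicit tab separators.
sample_lines <- c(
  "KEY\tID\tcode1\tcode2\tname",
  "1\tsadsa\t32423\t344\tffsadsa",
  "3\t3e4dsa\t21321\t#N/A\t#N/A",
  "5\txzzcc\t21223\t124\terfsacf"
)
sample_text <- paste(sample_lines, collapse = "\n")

# comment.char = "" stops "#" from starting a comment, so the row keeps
# all five fields; na.strings = "#N/A" converts the marker to NA.
data <- read.table(
  text = sample_text,
  header = TRUE,
  sep = "\t",
  quote = "",
  comment.char = "",
  na.strings = "#N/A",
  stringsAsFactors = FALSE
)

# Leaving na.strings at its default ("NA") keeps the literal text "#N/A";
# code2 is then read as a character column instead of a numeric one.
raw <- read.table(
  text = sample_text,
  header = TRUE,
  sep = "\t",
  quote = "",
  comment.char = "",
  stringsAsFactors = FALSE
)

print(data)
print(raw)
```

With a real file, `text = sample_text` would be replaced by the file path, for example "raw.dat" as in the comments above.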

You can try read.table with fill = TRUE.

read.table(file = "raw.dat", header = TRUE, sep = "\t", fill = TRUE)

If this does not work, I suggest trying readLines instead of read.table.

readLines(...)
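
One way the readLines route might look is sketched below. This is an assumption about what the answer intends, not the answerer's own code: the lines are read as plain text, the "#N/A" marker is replaced, and the cleaned text is then parsed. The temporary file is a hypothetical stand-in for the asker's data, and the names `path`, `raw_lines`, `clean_lines`, and `parsed` are illustrative.

```r
# Hypothetical stand-in for the data file: write a few sample rows,
# tab-separated, to a temporary file.
path <- tempfile(fileext = ".dat")
writeLines(
  c(
    "KEY\tID\tcode1\tcode2\tname",
    "1\tsadsa\t32423\t344\tffsadsa",
    "3\t3e4dsa\t21321\t#N/A\t#N/A",
    "5\txzzcc\t21223\t124\terfsacf"
  ),
  path
)

# Read the file as plain text, one element per line.
raw_lines <- readLines(path)

# Replace the literal marker. fixed = TRUE matches "#N/A" as plain text
# rather than as a regular expression.
clean_lines <- gsub("#N/A", "NA", raw_lines, fixed = TRUE)

# Parse the cleaned text. No "#" remains, so the default comment.char is
# harmless, and the default na.strings ("NA") turns the markers into NA.
parsed <- read.table(
  text = clean_lines,
  header = TRUE,
  sep = "\t",
  quote = "",
  stringsAsFactors = FALSE
)

print(parsed)
unlink(path)
```

Replacing with "N/A" instead of "NA" would keep the values as text, at the cost of code2 being read as a character column.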
Shufu answered 7/8, 2018 at 9:29
