Use of fread() from data.table causes R session to abort

I am working on a project for a MOOC, and was tinkering around with the data.table package in RStudio. Use of the fread() function to import the data files initially worked fine:

fread("UCI HAR Dataset/features.txt")->features
fread("UCI HAR Dataset/test/y_test.txt")->ytest

However, when I tried to run the following line of code, I received a pop-up that said "R Session Aborted: R encountered a fatal error. The session was terminated."

fread("UCI HAR Dataset/test/X_test.txt")->xtest

I don't understand what the problem is. I checked the file names and paths to make sure I had correctly spelled and capitalized everything, and it all checks out. The equivalent code using read.table() works fine and does not cause R to abort. I also tried renaming the file to "x_test.txt", but the same issue occurred.

According to ?fread, the function will only work with "regular delimited files." As far as I can tell, the file is a "regular delimited file", in that all rows have the same number of columns. There are no cells containing "NA" when I use read.table() instead; I checked using anyNA(). Is there a quick way to determine whether a file is "regularly" delimited or not? Is there something else about the original file that could be causing the problem?


UPDATE

After further research and searching through the issues reported on the developer's GitHub, I think that my problem lies in having two white spaces at the beginning of each row, which is discussed here. I am unsure why R aborted instead of giving me a warning. The latest development version of data.table (1.9.5) doesn't cause the session to abort under the same conditions, though.
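
If the leading white space is indeed the trigger, one possible workaround on an affected version is to strip it before parsing. The following is only a sketch under that assumption: the sample rows and the names src, dst and cleaned are made up for illustration, and base R's read.table() stands in for fread() so the snippet runs without data.table installed.

```r
# Hypothetical stand-in for X_test.txt: every row begins with two spaces.
src <- tempfile(fileext = ".txt")
writeLines(c("  0.25 -0.5 1.0",
             "  0.75  0.5 2.0"), src)

# Read the raw lines and drop only the leading white space.
lines <- readLines(src)
cleaned <- trimws(lines, which = "left")

# Write a cleaned copy; this is the file one would then pass to fread().
dst <- tempfile(fileext = ".txt")
writeLines(cleaned, dst)

# Parse with base R here so the sketch does not depend on data.table.
xtest <- read.table(dst, header = FALSE)
print(dim(xtest))

unlink(c(src, dst))
```

Writing a cleaned copy rather than editing the original keeps the downloaded data untouched, which makes it easy to retry fread() on the original once a fixed version is installed.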

Margoriemargot answered 9/6, 2015 at 1:19 Comment(2)
That is the sort of error that one should report to the package maintainer, hopefully with a test case that reliably causes the abend. There is little chance that we can replicate your observations. - Christoperchristoph
It is probably because of special characters such as \r, or nested quotation marks. fread fails on things like that; open your document in Emacs or another text editor where you can see the special characters. - Billie

Although I do believe you should have contacted the package maintainer first in any situation where the R session aborted (and it was not due to your mucking with C code), I can offer a strategy for your last request. It is not really specific to fread, but I've found it useful with regular-reads(). I'm going to assume that this is a comma-separated file, but if it's whitespace-separated you could change sep="," to sep="".

filcnts <- count.fields("UCI HAR Dataset/test/X_test.txt", sep=",")
table(filcnts)

That should be a single-item table, meaning every line has the same number of fields. If not, try switching parameters such as quote, sep, blank.lines.skip, or comment.char.
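
As a minimal sketch of this check on a whitespace-separated file, the snippet below writes a few rows to a temporary file and counts the fields per line with sep="". The sample rows and the names tmp, cnts, tab and is_regular are made up for illustration; they are not taken from the UCI HAR data.

```r
# Hypothetical sample data: three rows, each starting with two spaces,
# loosely imitating the layout described in the question.
tmp <- tempfile(fileext = ".txt")
writeLines(c("  1.5 2.5 3.5",
             "  4.5 5.5 6.5",
             "  7.5 8.5 9.5"), tmp)

# sep = "" treats any run of white space as a single separator.
cnts <- count.fields(tmp, sep = "")
tab <- table(cnts)
print(tab)

# A regular file has exactly one distinct field count and no NA counts
# (count.fields() can return NA for lines inside a multi-line quoted string).
is_regular <- length(tab) == 1 && !anyNA(cnts)
print(is_regular)

unlink(tmp)
```

When the table has more than one entry, which(cnts != as.integer(names(tab)[which.max(tab)])) is one way to list the line numbers whose field count differs from the most common one.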

Christoperchristoph answered 9/6, 2015 at 1:30 Comment(1)
Thanks for recommending that I contact the developer first. Installing the in-development version of data.table (as recommended in their bug report instructions) actually gave me a meaningful error message when using fread() instead of crashing the program. - Margoriemargot
