The following consistently crashes my R session.
I tested it on two machines, Ubuntu and Mac OS X, with similar results on both.
Brief description:
Calling write.table on a data.frame with a factor column of all NAs.
The original data set is rather large, so I isolated the offending column and created a similar vector, named PROBLEM_DATA below, which causes the same crash.
Interestingly, sometimes R crashes outright; other times it simply throws the following error:
Error in write.table(x, file, nrow(x), p, rnames, sep, eol, na, dec, as.integer(quote), :
'getCharCE' must be called on a CHARSXP
Any thoughts as to the cause of the crash or should it be submitted as a bug?
Offending data and call:
PROBLEM_DATA <- structure(114:116, .Label = c("String1", "String2", "String3", "String4", "String5", "String6",
"String7", "String8", "String9", "String10", "String11", "String12", "String13", "String14", "String15"), class = "factor")
# This will cause a crash
write.table(PROBLEM_DATA, file=path.expand("~/test.csv"))
# This will also crash
write.table(PROBLEM_DATA, file=path.expand("~/test.csv"), fileEncoding="UTF-8")
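One thing worth noting about the vector above: its integer codes are 114:116, but it has only 15 levels, so every code points past the end of the levels. That would explain why it prints as all NA, and it may well be what write.table trips over, though I have not confirmed that. Below is a minimal sketch, in base R only, for detecting and neutralising such a factor before writing it out. The helper names has_invalid_codes and repair_factor are made up for this example, and turning the out-of-range codes into NA is an assumption about what the data should look like.

```r
# Hypothetical helper: TRUE if any integer code of factor `f` falls outside 1..nlevels(f).
has_invalid_codes <- function(f) {
  stopifnot(is.factor(f))
  codes <- as.integer(f)  # underlying integer codes, attributes dropped
  any(!is.na(codes) & (codes < 1L | codes > nlevels(f)))
}

# Hypothetical helper: rebuild the factor, turning out-of-range codes into NA.
# Assumption: treating those entries as missing is acceptable for the data at hand.
repair_factor <- function(f) {
  stopifnot(is.factor(f))
  lv <- levels(f)
  codes <- as.integer(f)
  codes[!is.na(codes) & (codes < 1L | codes > length(lv))] <- NA_integer_
  factor(lv[codes], levels = lv)
}

# Same shape as PROBLEM_DATA: codes 114:116, but only 15 levels.
bad <- structure(114:116, levels = paste0("String", 1:15), class = "factor")

# Scan a set of columns (a data.frame is a list of columns, so the same call applies).
cols <- list(ok = factor(c("a", "b", "a")), bad = bad)
flagged <- vapply(cols, has_invalid_codes, logical(1))

# Repair the offending column and write it to a temporary file.
fixed <- repair_factor(bad)
out_file <- tempfile(fileext = ".csv")
write.table(data.frame(x = fixed), file = out_file)
```

This only works around the symptom. It does not explain how the factor ended up with codes that do not match its levels in the first place.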
SESSION INFO OF EACH MACHINE
UBUNTU
R version 2.15.3 (2013-03-01)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=C LC_COLLATE=C
[5] LC_MONETARY=C LC_MESSAGES=C LC_PAPER=C LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=C LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] gdata_2.12.0 ggplot2_0.9.3 stringr_0.6.1 RMySQL_0.9-3 DBI_0.2-5
[6] data.table_1.8.8
loaded via a namespace (and not attached):
[1] MASS_7.3-23 RColorBrewer_1.0-5 colorspace_1.2-0 dichromat_1.2-4
[5] digest_0.5.2 grid_2.15.3 gtable_0.1.1 gtools_2.7.0
[9] labeling_0.1 munsell_0.4 plyr_1.7.1 proto_0.3-9.2
[13] reshape2_1.2.1 scales_0.2.3 tools_2.15.3
Mac OS X
R version 2.15.3 (2013-03-01)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
> head(PROBLEM_DATA)
[1] <NA> <NA> <NA>
15 Levels: String1 String2 String3 String4 String5 String6 String7 ... String15
> write.table(PROBLEM_DATA, file=path.expand("~/test.csv"))
 *** caught segfault ***
address 0x1, cause 'memory not mapped'
Traceback:
 1: write.table(PROBLEM_DATA, file = path.expand("~/test.csv"))
Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
– Theocritus
PD=structure(11:12,.Label=c("Foo","Bar"),class="factor"). I say check the changelog and nightly R and then report as a bug. – Quinquevalent
as.numeric(PROBLEM_DATA) as well as as.numeric(as.character(PROBLEM_DATA)) (per the R_FAQ). You end up with a bunch of levels which have the same (nonexistent) name. – Spavined
gdata package. Perhaps related to https://mcmap.net/q/371864/-reordering-factor-gives-different-results-depending-on-which-packages-are-loaded/892313 ? In general, crashes are always bugs. The question is if it is a bug in gdata or base R. – Coeval
rbindlist at some point in creating the larger DT? – Rune