While trying to convert a column of character strings that represent numbers (e.g. "0.1234") to numeric with as.numeric, some of the values come back as NA, with the warning 'NAs introduced by coercion'. The strings that come back as NA don't seem to be any different from the ones that are converted correctly. Does anyone know what the problem might be?
I have already looked for non-numeric characters (such as ',') that might be hiding inside some of the values. I did find strings containing '-' (e.g. "-0.123") that were turned into NA, but these account for only some of the NAs. I also looked for spaces inside the strings; that doesn't seem to be the problem either.
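A minimal sketch of this kind of check, for anyone who wants to reproduce it. The vector `x` is a hypothetical stand-in for the real column, and the invisible characters in it (a non-breaking space and a zero-width space) are only an assumption about what might be hiding in the data; nothing here confirms that they are the actual cause.

```r
# Hypothetical sample: the second and third values carry invisible characters
# (assumed for illustration only, not taken from the real data).
x <- c("0.833250539", "0.266656189\u00a0", "\u200b-0.625103904", "0.97240334")

# Flag values that are not a plain decimal number with an optional leading minus.
bad <- !grepl("^-?[0-9]*\\.?[0-9]+$", x)
x[bad]

# Inspect the flagged values: stored length and the underlying code points.
nchar(x[bad])
lapply(x[bad], utf8ToInt)

# Drop everything except digits, the decimal point, and the minus sign,
# then convert the cleaned strings.
cleaned <- gsub("[^0-9.-]", "", x)
y <- as.numeric(cleaned)
```

If `utf8ToInt` shows code points outside the range for digits, '.', and '-', the string contains something that is not visible when the vector is printed.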
data$y
[1] "0.833250539" "0.820323535" "0.462284612" "0.792943985" "0.860587952" "0.729665177" "0.461503956" "0.625871118"
[9] "0.740999346" "0.962727964" "0.971089266" "0.869004848" "0.828651766" "0.900648732" "0.970326033" "0.898123286"
[17] "0.911640765" "0.902442126" "0.843392097" "0.763421844" "0.892426243" "0.380433624" "0.925017633" "0.725470821"
[25] "0.699924767" "0.689061225" "0.907462936" "0.888064239" "0.913547115" "-0.625103904" "0.897385961" "0.889727462"
[33] "0.90127339" "0.947012474" "0.948883588" "0.845845512" "0.97866966" "0.796247738" "0.864627056" "0.266656189"
[41] "0.894915463" "0.969690678" "0.771365656" "0.88304436" "0.954039006" "0.836952199" "0.731558669" "0.907224294"
[49] "0.622059127" "0.887742343" "0.917550343" "0.97240334" "0.902841957" "0.617403052" "0.82926708" "0.674903846"
[57] "0.947132958" "0.929213613" "-0.297844476" "0.871767367"
y = as.numeric(data$y)
Warning message: NAs introduced by coercion
y
[1] 0.8332505 0.8203235 0.4622846 0.7929440 0.8605880 0.7296652 0.4615040 0.6258711 0.7409993 0.9627280 0.9710893 0.8690048 0.8286518
[14] 0.9006487 0.9703260 0.8981233 0.9116408 0.9024421 0.8433921 0.7634218 0.8924262 0.3804336 0.9250176 0.7254708 0.6999248 0.6890612
[27] 0.9074629 0.8880642 0.9135471 NA 0.8973860 0.8897275 0.9012734 0.9470125 0.9488836 0.8458455 0.9786697 0.7962477 0.8646271
[40] NA 0.8949155 0.9696907 NA 0.8830444 0.9540390 0.8369522 NA 0.9072243 0.6220591 0.8877423 0.9175503 NA
[53] 0.9028420 0.6174031 0.8292671 0.6749038 0.9471330 NA NA 0.8717674
dput command? If your data frame is too large, then just a subset that includes some of the values that are returned as NA. – Mensch
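A minimal sketch of how such a subset could be produced with dput. The data frame here is a hypothetical stand-in built for illustration (its second value contains a ',' so that it fails to convert), and the name `failed` is introduced only for this example.

```r
# Hypothetical stand-in for the real data frame (values are illustrative).
data <- data.frame(y = c("0.833250539", "0,266656189", "-0.625103904"),
                   stringsAsFactors = FALSE)

# Positions of the values that fail to convert; the coercion warning is
# suppressed for this check only.
failed <- which(is.na(suppressWarnings(as.numeric(data$y))))

# Print a copy-pasteable representation of the first value plus the failing ones.
dput(data$y[sort(unique(c(1L, failed)))])
```

The dput output can be pasted into the question so that others can recreate exactly the same strings.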