I would like to add random NAs to a data.frame in R. So far I've looked into these questions:
R: Randomly insert NAs into dataframe proportionaly
How do I add random NAs into a data frame
add random missing values to a complete data frame (in R)
Many solutions were provided there, but I couldn't find one that complies with these 5 conditions:
- Add NA truly at random, not the same number in every row or every column
- Work with every class of variable that one can encounter in a data.frame (numeric, character, factor, logical, ts, ...), so the output must have the same format as the input data.frame or matrix.
- Guarantee an exact number or proportion [note] of NA in the output (many solutions result in a smaller number of NA since several are generated at the same place)
- Be computationally efficient for big datasets.
- Add the proportion/number of NA independently of already present NA in the input.
Does anyone have an idea? I have already tried to write a function to do this (in an answer to the first link), but it doesn't comply with points 3 and 4. Thanks.
[note] The exact proportion, to within +/- 1 NA from rounding, of course.
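To make the five conditions concrete, here is a minimal sketch of one way they might be met. It is not taken from any of the linked answers. The function name `add_random_na` and its arguments `n` and `prop` are hypothetical, and it rests on two assumptions: `prop` is interpreted relative to the total number of cells, and new NA are only placed in cells that are not already NA, so exactly `n` new NA are added. The idea is to sample cell positions without replacement (which avoids collisions) and assign NA column by column (which keeps each column's class). It has not been checked against unusual column types such as ts or list columns.

```r
# Hypothetical helper: add exactly n new NA to a data.frame or matrix.
# Assumption: prop, if given, is a share of ALL cells (nrow * ncol).
add_random_na <- function(x, n = NULL, prop = NULL) {
  stopifnot(is.data.frame(x) || is.matrix(x))
  stopifnot(xor(is.null(n), is.null(prop)))
  nr <- nrow(x)
  if (is.null(n)) n <- round(prop * nr * ncol(x))

  # Column-major linear indices of the cells that are not NA yet
  available <- which(!is.na(x))
  if (n > length(available)) stop("not enough non-NA cells left")

  # Sampling without replacement gives exactly n distinct cells
  idx <- available[sample.int(length(available), n)]

  if (is.matrix(x)) {
    x[idx] <- NA
    return(x)
  }

  # Translate linear indices to (row, column) and assign per column,
  # so that each column keeps its own class (factor, character, ...)
  rows <- (idx - 1L) %% nr + 1L
  cols <- (idx - 1L) %/% nr + 1L
  rows_by_col <- split(rows, cols)
  for (j in names(rows_by_col)) {
    x[[as.integer(j)]][rows_by_col[[j]]] <- NA
  }
  x
}

# Small usage example with mixed column classes and pre-existing NA
set.seed(1)
df <- data.frame(
  a = c(1, NA, 3, 4),
  b = letters[1:4],
  c = factor(c("x", "y", "x", "y")),
  d = c(TRUE, FALSE, NA, TRUE),
  stringsAsFactors = FALSE
)
out <- add_random_na(df, n = 5)
sum(is.na(out)) - sum(is.na(df))  # 5 by construction
```

Because the positions are drawn in one vectorised call and the loop only runs over columns, the cost should grow roughly with the number of cells rather than with the number of NA requested, though that has not been benchmarked here.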
sum(is.na(df) / (nrow(df)*ncol(df)) )
and checking if it's in an acceptable range; if not, do the NA adding again. – Sean