Count NAs per row in dataframe [duplicate]
L

2

46

I've got a dataframe that has a batch ID and the results of six tests performed on each batch. The data looks like this:

batch_id  test1  test2  test3  test4  test5  test6
001       0.121     NA  0.340  0.877  0.417  0.662
002       0.229  0.108     NA  0.638     NA  0.574

(there are a few hundred rows in this dataframe, only one row per batch_id)

I'm looking for a way to count how many NAs there are for each batch_id (for each row). I feel like this should be do-able with a few lines of R code at the most, but I'm having trouble actually coding it. Any ideas?

Lauro answered 14/6, 2016 at 0:54 Comment(5)
@BenBolker Generally, I have the impression that answers to recent posts are often more appropriate, modern, or efficient than those in the alleged duplicates - especially if the duplicate post is several years old (not the case here). In this specific case, however, I'm not even sure that we're dealing with a duplicate since the linked question specifically asked for a dplyr solution, unlike the OP of this post.Tientiena
OK, although this particular question isn't that old (Feb of this year) and the answers (esp. @windrunn3r.1990's answer) overlap a lot. Should I/we vote to reopen?Geomancy
@BenBolker I did not see the question you linked to when I searched for a solution. The answer to that question by Justin is what I was looking for. Should I delete my question?Lauro
No, duplicates are fine as long as they're marked as such.Geomancy
@BenBolker OK. Should I select one of the answers to the question I posted? Tim Biegeleisen posted a solution that works well, so I feel that he should get some credit.Lauro
P
45

You could add a new column to your data frame containing the number of NA values per batch_id:

df$na_count <- apply(df, 1, function(x) sum(is.na(x)))
Perigordian answered 14/6, 2016 at 0:58 Comment(1)
Thanks. That works. I ended up using this, which is a bit simpler: df$na_count <- apply(is.na(df), 1, sum)Lauro
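For anyone comparing the two forms above, here is a minimal sketch that runs both on the two sample rows from the question and checks that they agree. The data frame is rebuilt by hand for illustration, and the names count_fun and count_sum are made up for this example.

```r
# Sample rows copied from the question; batch_id is stored as character
# so the leading zeros are kept.
df <- data.frame(
  batch_id = c("001", "002"),
  test1 = c(0.121, 0.229),
  test2 = c(NA, 0.108),
  test3 = c(0.340, NA),
  test4 = c(0.877, 0.638),
  test5 = c(0.417, NA),
  test6 = c(0.662, 0.574)
)

# Form from the answer: apply an anonymous function to each row.
count_fun <- apply(df, 1, function(x) sum(is.na(x)))

# Form from the comment: build the logical NA matrix first, then sum each row.
count_sum <- apply(is.na(df), 1, sum)

# Both give 1 NA for batch 001 and 2 NAs for batch 002.
df$na_count <- count_sum
```

Both versions also look at batch_id, so a missing ID would add to the count. If only the test columns should be counted, subsetting them first (for example df[, -1]) is one option.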
M
105

You can count the NAs in each row with this command:

rowSums(is.na(dat))

where dat is the name of your data frame.
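As a quick illustration, here is a sketch using the two sample rows from the question, rebuilt by hand. The na_count column name is borrowed from the other answer and is not required.

```r
# Sample rows copied from the question; batch_id is stored as character
# so the leading zeros are kept.
dat <- data.frame(
  batch_id = c("001", "002"),
  test1 = c(0.121, 0.229),
  test2 = c(NA, 0.108),
  test3 = c(0.340, NA),
  test4 = c(0.877, 0.638),
  test5 = c(0.417, NA),
  test6 = c(0.662, 0.574)
)

# is.na() on a data frame returns a logical matrix of the same shape;
# rowSums() then counts the TRUE values in each row.
na_per_row <- rowSums(is.na(dat))

# Optionally store the counts next to the data:
# 1 for batch 001 and 2 for batch 002.
dat$na_count <- na_per_row
```

Because rowSums() works on the whole matrix at once, there is no per-row function call, which is what the comments mean by "vectorized".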

Morbilli answered 14/6, 2016 at 1:30 Comment(2)
This solution is excellent and vectorized. Thank you.Preston
this solution should be selectedTimaru
