I am running a sum function of one hive table in Hue, and get a return value of NaN.
Here is my code:
select sum(v1) from hivedb.tb1;
I don't know why it is giving me a NaN result. I checked if any of my v1 values are null:
select * from hivedb.tb1 where v1 is null;
, and it turns out no record has null value. The table has 100 million rows, so I can not do a manual check for each record.
- Does anybody know why I am getting a NaN result?
- And if it is because I have some abnormal value in some rows, how can I find them?
Any help is appreciated. Thank you in advance!
UPDATE 1 I manually screened the first 1000 rows, and luckily spotted some abnormal values of NaN in tb1. It is resulted from some rounding error from the previous steps. So my question 1 is probably answered. Please feel free to comment on it, if you think there could be other reasons.
I still don't know how to use an efficient way to spot the rows with NaN values. So I am still looking forward to any answers to my question #2. Please feel free to share. I appreciate your help.
UPDATE 2 The problem is solved with help in the accepted answer below, in the discussion section. There are multiple ways to deal with it.
- Use a condition selection of v1+1 >v1. It will select rows with non NaN values.
- Use a condition selection of cast(v1 as String) ='NaN'. It will select rows with NaN values.
v1
, by the way? And did you trysum(cast(v1 as Double))
? – Technique