How to compare 2 dataframes in python unittest using assert methods

I'm writing a unit test for a method that returns a DataFrame, but when I check the output using:

self.assertEqual(mock_df, result)

I get a ValueError:

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

For now I'm comparing individual properties, which serves the purpose,

self.assertEqual(mock_df.size, result.size)
self.assertEqual(mock_df.col_a.to_list(), result.col_a.to_list())
self.assertEqual(mock_df.col_b.to_list(), result.col_b.to_list())
self.assertEqual(mock_df.col_c.to_list(), result.col_c.to_list())

but I'm curious how to assert that two DataFrames are equal.

Cohin answered 12/10, 2021 at 5:9 Comment(0)
import unittest
import pandas as pd

class TestDataFrame(unittest.TestCase):
    def test_dataframe(self):
        df1 = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
        # equals() compares dtypes as well as values, so both frames use ints here;
        # a float column [3.0, 4.0] would make equals() return False
        df2 = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
        self.assertEqual(True, df1.equals(df2))

if __name__ == '__main__':
    unittest.main()
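
One caveat with `DataFrame.equals`: it compares dtypes as well as values, so two frames that print identically can still compare unequal. A minimal sketch of that behaviour follows; the frames and the names `ints` and `floats` are made up for illustration and are not from the question.

```python
import pandas as pd

# Same values, but column 'b' is int64 in one frame and float64 in the other
ints = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
floats = pd.DataFrame({'a': [1, 2], 'b': [3.0, 4.0]})

# equals() requires corresponding columns to share a dtype
same_values_different_dtype = ints.equals(floats)

# Casting 'b' to float64 first makes the two frames match
after_cast = ints.astype({'b': 'float64'}).equals(floats)

print(same_values_different_dtype, after_cast)
```

If a comparison fails even though the data looks the same, comparing `df1.dtypes` with `df2.dtypes` is a quick first check.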
Peculium answered 12/10, 2021 at 5:19 Comment(3)
Thanks, Mahi. It works in some cases, but in a few I get an AssertionError even though the data looks the same and is indexed properly. - Cohin
I also had trouble with this approach. I found an alternative that worked and posted it as a separate answer. - Wireless
I think you can simplify slightly using self.assertTrue(df1.equals(df2)) - Assai

The accepted answer from @Mahi did not work for me. It failed for two DataFrames that should have been equal, and I'm not sure why.

As I discovered here under "DataFrame equality", there are some functions built into Pandas for testing.

The following worked for me. I tested it several times, but not exhaustively, to make sure it would work repeatedly.

import unittest
import pandas as pd

class TestSomething(unittest.TestCase):
    def test_method(self):
        #... create dataframes df1 and df2...
        pd.testing.assert_frame_equal(df1, df2)

Here is the related pandas reference for the above function.
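
For completeness, here is a fuller sketch of how this could look inside a test case. The class name, test names, and frame contents are my own placeholders (the `col_a`/`col_b` columns just echo the question), so treat it as an illustration rather than a definitive pattern. It assumes a reasonably recent pandas where `pd.testing.assert_frame_equal` accepts `check_dtype`.

```python
import unittest
import pandas as pd

class TestFrameEqual(unittest.TestCase):
    def test_identical_frames(self):
        expected = pd.DataFrame({'col_a': [1, 2], 'col_b': ['x', 'y']})
        result = pd.DataFrame({'col_a': [1, 2], 'col_b': ['x', 'y']})
        # Passes silently when values, dtypes, index and columns all match
        pd.testing.assert_frame_equal(expected, result)

    def test_ignore_dtype(self):
        expected = pd.DataFrame({'col_a': [1, 2]})
        result = pd.DataFrame({'col_a': [1.0, 2.0]})
        # check_dtype=False tolerates int64 vs float64 when the values agree
        pd.testing.assert_frame_equal(expected, result, check_dtype=False)

    def test_mismatch_raises(self):
        expected = pd.DataFrame({'col_a': [1, 2]})
        result = pd.DataFrame({'col_a': [1, 3]})
        # A value mismatch raises AssertionError, which unittest reports as a failure
        with self.assertRaises(AssertionError):
            pd.testing.assert_frame_equal(expected, result)
```

Saved in a test module, this can be run with `python -m unittest`. Because `assert_frame_equal` raises a plain `AssertionError`, it needs no `self.assert...` wrapper to register as a test failure.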

Wireless answered 29/6, 2022 at 0:56 Comment(1)
assert_frame_equal is good for testing. If the test fails, it also shows the differences between the two dataframes. - Phillipphillipe
