Python - TypeError: expecting string or bytes object
Asked Answered
W

2

6

After much research I cannot figure out why I receive this error in my code.

I'm trying to export a Pandas Dataframe to my Oracle table. I have successfully done this hundreds of times on other data tables but this one keeps producing errors.

Here is my Dataframe, which I read in with pd.read_excel and appended three of my own columns with simple df['column_name'] = variable commands:

S USTAINABLE H ARVEST S ECTOR| QUOTA LISTING APRIL 16 2013 Unnamed: 1  \
1                                                DATE           TRADE ID   
2                                            04/02/13             130014   
3                                                   0                  0   
4                                                   0                  0   
5                                                   0                  0   
6                                 FY13 QUOTA – TO BUY                  0   
7                                                DATE           TRADE ID   
8                                             3/26/13             130006   
9                                              4/9/13             130012   
10                                            3/26/13             130007   
11                                            3/26/13             130001   
12                                            3/26/13             130009   
13                                             4/9/13             130013   
14                                            3/26/13             130010   
15                                            3/26/13             130008   
16                                            3/26/13             130011   
17                                                  1                  0   

         Unnamed: 2     Unnamed: 3                     Unnamed: 4 email_year  \
1   AVAILABLE STOCK         AMOUNT                      BUY PRICE       2013   
2        WINTER SNE          12000            TRADE IN RETURN FOR       2013   
3                 0              0                   HADDOCK GOM,       2013   
4                 0              0             YELLOWTAIL GOM, OR       2013   
5                 0              0                 WITCH - OFFERS       2013   
6                 0              0                              0       2013   
7     DESIRED STOCK         AMOUNT                      BUY PRICE       2013   
8           COD GBE            ANY                         OFFERS       2013   
9           COD GBW  UP TO 100,000                            0.3       2013   
10          COD GBW            ANY                         OFFERS       2013   
11          COD GOM        INQUIRE                            1.5       2013   
12        WINTER GB            ANY                         OFFERS       2013   
13       WINTER SNE  UP TO 100,000                            0.3       2013   
14       WINTER SNE            ANY                         OFFERS       2013   
15    YELLOWTAIL GB            ANY                         OFFERS       2013   
16   YELLOWTAIL GOM            ANY  TRADE FOR GB STOCKS -\nOFFERS       2013   
17                0              0                              0       2013   

   email_month email_day  
1            4        16  
2            4        16  
3            4        16  
4            4        16  
5            4        16  
6            4        16  
7            4        16  
8            4        16  
9            4        16  
10           4        16  
11           4        16  
12           4        16  
13           4        16  
14           4        16  
15           4        16  
16           4        16  
17           4        16  

My code fails on the export line cursor.executemany(sql_query, exported_data) with the error:

Traceback (most recent call last):
  File "Z:\Code\successful_excel_pdf_code.py", line 74, in <module>
    cursor.executemany(sql_query, exported_data)
TypeError: expecting string or bytes object

Here is my relevant code:

df = pd.read_excel(file_path)


df = df.fillna(0)
df = df.ix[1:]


cursor = con.cursor()
exported_data = [tuple(x) for x in df.values]
#exported_data = [str(x) for x in df.values]
#print("exported_data:", exported_data)

sql_query = ("INSERT INTO FISHTABLE(date_posted, stock_id, species, pounds, advertised_price, email_year, email_month, email_day, sector_name, ask)" "VALUES(:1, :2, :3, :4, :5, :6, :7, :8, 'Sustainable Harvest Sector', '1')")

cursor.executemany(sql_query, exported_data)

con.commit() #commit to database

cursor.close()
con.close()

Here is a printout of exported_data:

[('DATE', 'TRADE ID', 'AVAILABLE STOCK', 'AMOUNT', 'BUY PRICE', '2013', '4', '16'), ('04/02/13', 130014, 'WINTER SNE', 12000, 'TRADE IN RETURN FOR', '2013', '4', '16'), (0, 0, 0, 0, 'HADDOCK GOM,', '2013', '4', '16'), (0, 0, 0, 0, 'YELLOWTAIL GOM, OR', '2013', '4', '16'), (0, 0, 0, 0, 'WITCH - OFFERS', '2013', '4', '16'), ('FY13 QUOTA – TO BUY', 0, 0, 0, 0, '2013', '4', '16'), ('DATE', 'TRADE ID', 'DESIRED STOCK', 'AMOUNT', 'BUY PRICE', '2013', '4', '16'), ('3/26/13', 130006, 'COD GBE', 'ANY', 'OFFERS', '2013', '4', '16'), ('4/9/13', 130012, 'COD GBW', 'UP TO 100,000', 0.3, '2013', '4', '16'), ('3/26/13', 130007, 'COD GBW', 'ANY', 'OFFERS', '2013', '4', '16'), ('3/26/13', 130001, 'COD GOM', 'INQUIRE', 1.5, '2013', '4', '16'), ('3/26/13', 130009, 'WINTER GB', 'ANY', 'OFFERS', '2013', '4', '16'), ('4/9/13', 130013, 'WINTER SNE', 'UP TO 100,000', 0.3, '2013', '4', '16'), ('3/26/13', 130010, 'WINTER SNE', 'ANY', 'OFFERS', '2013', '4', '16'), ('3/26/13', 130008, 'YELLOWTAIL GB', 'ANY', 'OFFERS', '2013', '4', '16'), ('3/26/13', 130011, 'YELLOWTAIL GOM', 'ANY', 'TRADE FOR GB STOCKS -\nOFFERS', '2013', '4', '16'), (1, 0, 0, 0, 0, '2013', '4', '16')]

1) I thought the error could be from a lot of NaNs being scattered throughout the Dataframe, so I replaced them with 0's and it still fails.

2) I then thought the error could be from trying to export the first couple rows which held no valuable information, so I deleted the first row with df = df.ix[1:] but it still fails.

3) I also thought it could be failing because of the values in my email_year/month/day columns, so I changed them all to strings before putting them into my Dataframe, but it still fails.

4) I tried changing the exported_data command to a str instead of a tuple but that only changed the error to cx_Oracle.DatabaseError: ORA-01036: illegal variable name/number. Also, it has always worked fine as a tuple when exporting other Dataframes.

5) I thought the error could be from my Oracle columns not allowing either numbers or letters, but they are all set to all VarChar2 so that isn't the cause of the error either.

I'd appreciated any help solving this, thanks.

Weltpolitik answered 6/3, 2017 at 20:44 Comment(6)
Please add the full error traceback to your question!Coax
Just added it allWeltpolitik
Can you create a standalone test case that can be run? The problem is likely due to changing types. For a particular column, one row may be an string and for the next row it may be an integer. That is one reason for that error to be raised.Heaveho
I actually thought that that could be a reason the error occurred....but I don't know how to isolate or capture single columns from this Dataframe in particular because the columns are not named and trying print("row 1:", ['DATE'] failsWeltpolitik
@AnthonyTuininga so I've been able to print each column individually, all the unnamed columns and the first one that starts with S USTAINABLE, and I've confirmed they're all dtype: object. I don't know if that's good or bad or even significant at allWeltpolitik
@Weltpolitik all data frame columns are necessarily of dtype object. to see if the conversion worked, try printing some rows to console (look for e.g. '1234')Bogus
H
12

Based on the export data noted above, the problem you are experiencing is due to the fact that the data in one row is not the same type as the data in subsequent rows. In your case, in one row you have the value '04/02/13' (as a string) and in the next row you have the value 0 (as an integer). You will need to make sure that the data type is consistent for the column across all rows.

Heaveho answered 8/3, 2017 at 15:46 Comment(3)
Ohh ok that makes enough sense... is there an easy enough way to do that? To change all datatypes of a Dataframe to the same one?Weltpolitik
So I figured it out and was able to change all the values to strings. Who knows why creating the Dataframe made some values strings and others integers. But thanks a ton for pointing out what was wrong with it, now I can proceed with exporting a ton of dataWeltpolitik
Data type should be consistent across all rows is important.Verleneverlie
S
0

Like it was said before, the problem could come from a column that has values from different types. To catch the problematic columns you can try this command :

for col in df.columns:
if len(df[col].apply(type).drop_duplicates().tolist())>1:
    print(col)
Soft answered 25/10, 2023 at 12:49 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.