pyarrow add column to pyarrow table

I have a pyarrow table named final_table with shape (6132, 7), and I want to add a column to it:

 list_ = ['IT'] * 6132
 final_table.append_column('COUNTRY_ID', list_)

but I am getting the following error: ArrowInvalid: Added column's length must match table's length. Expected length 6132 but got length 12264

Aggrade answered 11/8, 2020 at 3:44 Comment(0)

According to the documentation:

Append column at end of columns.

Parameters
field (str or Field) – If a string is passed then the type is deduced from the column data.

column (Array, list of Array, or values coercible to arrays) – Column data.

Returns
pyarrow.Table – New table with the passed column added.

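I think pyarrow is assuming that you're providing a list of Arrays, so each 'IT' string is treated as a separate chunk rather than as a single value. That would be consistent with the reported length of 12264, which is exactly 2 × 6132 (two characters per string). To avoid the confusion, pass an Arrow array instead: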

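import pyarrow as pa

# Build a small example table with explicitly typed columns
col_a = pa.array([1, 2, 3], pa.int32())
col_b = pa.array(["X", "Y", "Z"], pa.string())

table = pa.Table.from_arrays(
    [col_a, col_b],
    schema=pa.schema([
        pa.field('a', col_a.type),
        pa.field('b', col_b.type),
    ])
)

# Pass a pa.array with one value per row so the data is not read as a list of Arrays.
# append_column returns a new table, so reassign the result.
table = table.append_column('COUNTRY_ID', pa.array(['IT'] * len(table), pa.string()))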
Screwball answered 11/8, 2020 at 8:40 Comment(0)
