It's possible to append row groups to an already existing Parquet file using fastparquet.
Here is my SO answer on the same topic.
From the fastparquet docs:
append: bool (False) or ‘overwrite’ If False, construct data-set from
scratch; if True, add new row-group(s) to existing data-set. In the
latter case, the data-set must exist, and the schema must match the
input data.
from fastparquet import write
write('output.parquet', df, append=True)
EXAMPLE UPDATE:
Here is a Python script. On the first run, it creates a file with one row group. On subsequent runs, it appends row groups to the same Parquet file.
import os.path
import pandas as pd
from fastparquet import write

df = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]})
file_path = "C:\\Users\\nsuser\\dev\\write_parq_row_group.parquet"

if not os.path.isfile(file_path):
    # First run: the file does not exist yet, so create it.
    write(file_path, df)
else:
    # Later runs: add a new row group; the schema must match the existing file.
    write(file_path, df, append=True)