How can I open a .snappy.parquet file in python?
U

3

10

How can I open a .snappy.parquet file in Python 3.5? So far, I have used this code:

import numpy
import pyarrow

filename = "/Users/T/Desktop/data.snappy.parquet" 
df = pyarrow.parquet.read_table(filename).to_pandas()

But it gives this error:

AttributeError: module 'pyarrow' has no attribute 'compat'

P.S. I installed pyarrow this way:

pip install pyarrow
Upheave answered 5/10, 2018 at 1:2 Comment(0)
G
8

I had the same issue and managed to solve it by following the solution proposed in https://github.com/dask/fastparquet/issues/366.

1) Install python-snappy using conda install (for some reason, I couldn't install it with pip install).

2) Add the snappy_decompress function.

from fastparquet import ParquetFile
import snappy
def snappy_decompress(data, uncompressed_size):
    return snappy.decompress(data)
pf = ParquetFile('filename') # filename includes .snappy.parquet extension
dff=pf.to_pandas()
Godparent answered 21/2, 2020 at 17:50 Comment(1)
Add the snappy_decompress function → why? The pf = ParquetFile('filename') and dff=pf.to_pandas() lines do all the work by themselves. The import snappy line and the snappy_decompress function do nothing in this example. Dollar
A
8

You can use pandas to read .snappy.parquet files into a pandas DataFrame.

import pandas as pd
filename = "/Users/T/Desktop/data.snappy.parquet"
df = pd.read_parquet(filename)
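
To check that the Parquet engine on your machine handles snappy compression, a small round trip can help. The following is only a sketch: the helper name roundtrip_snappy_parquet and the temporary file name are made up for illustration, and it assumes pandas plus the pyarrow engine are installed (fastparquet can be passed as the engine instead).

```python
import os
import tempfile


def roundtrip_snappy_parquet(df, engine="pyarrow"):
    """Write df to a temporary .snappy.parquet file and read it back.

    Hypothetical helper for illustration; `engine` is passed straight
    through to pandas ("pyarrow" or "fastparquet").
    """
    # Imported here so a missing pandas shows up as a plain ImportError
    # at call time rather than when this module is loaded.
    import pandas as pd

    with tempfile.TemporaryDirectory() as tmp:
        # The ".snappy" part of the name is only a convention; the codec
        # is chosen by the `compression` argument below.
        path = os.path.join(tmp, "data.snappy.parquet")
        df.to_parquet(path, engine=engine, compression="snappy")
        # Reading needs no compression argument: the codec is recorded
        # in the Parquet file metadata.
        return pd.read_parquet(path, engine=engine)
```

If the returned DataFrame has the same columns and values as the one passed in, reading your own .snappy.parquet file with pd.read_parquet should work the same way.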
Argentiferous answered 25/3, 2022 at 10:56 Comment(0)
C
4

The error AttributeError: module 'pyarrow' has no attribute 'compat' is sadly a bit misleading. To execute the to_pandas() function on a pyarrow.Table instance, you need pandas installed. The above error is a symptom of the missing requirement.

pandas is not a hard requirement of pyarrow, as most of its functionality is usable with just Python built-ins and NumPy. Thus users who do not need pandas can work with pyarrow without having pandas installed.
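
To make the missing requirement visible before calling to_pandas(), one option is to check for the packages up front. This is a minimal sketch, not part of pyarrow: the helper names missing_packages and read_parquet_as_dataframe are hypothetical, and the pip command in the error message assumes a pip-based install as in the question.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def read_parquet_as_dataframe(filename):
    """Read a Parquet file with pyarrow and convert it to a pandas DataFrame."""
    missing = missing_packages(["pyarrow", "pandas"])
    if missing:
        # Fail with an explicit message instead of a misleading AttributeError.
        raise ImportError("install first: pip install " + " ".join(missing))
    # Import the submodule explicitly rather than relying on a bare
    # `import pyarrow` to make pyarrow.parquet available.
    import pyarrow.parquet as pq

    return pq.read_table(filename).to_pandas()
```

With both packages installed, read_parquet_as_dataframe("/Users/T/Desktop/data.snappy.parquet") would return the DataFrame the question is after.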

Capriola answered 5/10, 2018 at 7:16 Comment(0)
