I have some Parquet files that I've written in Python using PyArrow (Apache Arrow):
pyarrow.parquet.write_table(table, "example.parquet")
Now I want to read these files (and preferably get an Arrow Table) using a Java program.
In Python, I can simply use the following to get an Arrow Table from my Parquet file:
table = pyarrow.parquet.read_table("example.parquet")
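For context, the complete round trip on the Python side is only a few lines. The sample data below is just illustrative, not my real table:

import pyarrow as pa
import pyarrow.parquet as pq

# Build a small Arrow Table (stand-in for my real data).
table = pa.table({"id": [1, 2, 3], "value": [1.5, 2.5, 3.5]})

# Write it to a local Parquet file ...
pq.write_table(table, "example.parquet")

# ... and read it straight back into an Arrow Table.
table = pq.read_table("example.parquet")
print(table.schema)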
Is there an equivalent and easy solution in Java?
I couldn't really find any good, working examples or any useful documentation for Java (only for Python), and some examples don't list all of the required Maven dependencies. I also don't want to use a Hadoop file system; I just want to work with local files.
Note: I also found out that I can't use Apache Avro, because my Parquet files contain column names with the symbols [, ] and $, which are invalid characters in Apache Avro.
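To illustrate, column names in the style below (hypothetical stand-ins for my real ones) are perfectly fine for PyArrow and Parquet, but as far as I can tell they violate Avro's naming rules (Avro names may only contain letters, digits and underscores), so any Avro-based Parquet reader for Java is not an option for me:

import pyarrow as pa
import pyarrow.parquet as pq

# Hypothetical column names containing [, ] and $: legal in Parquet/Arrow,
# but illegal in Avro.
table = pa.table({
    "price[$]": [9.99, 19.99],
    "tags[0]": ["a", "b"],
})
pq.write_table(table, "weird_names.parquet")

# Reads back fine with PyArrow, column names intact.
print(pq.read_table("weird_names.parquet").column_names)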
Also, if your solution uses Maven, please include the required Maven dependencies.
I am on Windows and using Eclipse.
Update (November 2020): I never found a suitable solution and just stuck with Python for my use case.