I have an Apache Parquet file. I know I can run parquet file.parquet in my shell and view it in the terminal, but I would like a GUI tool that shows Parquet files in a more user-friendly format. Does such a program exist?
There is the Tad utility, which is cross-platform. It lets you open Parquet files, pivot them, and export them to CSV. It uses DuckDB as its backend. More information is available on the DuckDB page.
GUI option for Windows, Linux, and Mac
You can now use DBeaver to:
- view Parquet data
- view metadata and statistics
- run SQL queries on one or multiple files (glob expressions are supported)
- generate new Parquet files
DBeaver uses the DuckDB driver to operate on Parquet files. DuckDB also supports features such as projection and predicate pushdown.
Simply create an in-memory DuckDB instance in DBeaver and run queries as described in this document. Right now Parquet and CSV are supported.
Here is a YouTube video that explains the same process: https://youtu.be/j9_YmAKSHoA
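For anyone who wants to see what such queries look like, here is a rough sketch in Python. It is not the DBeaver workflow itself, only the same kind of DuckDB SQL wrapped in two hypothetical helpers (parquet_queries and run_preview are names made up for this example, and the file paths are placeholders). read_parquet, parquet_metadata, DESCRIBE and COPY ... (FORMAT PARQUET) are DuckDB SQL features. Passing a glob to parquet_metadata the same way as to read_parquet is an assumption here, and run_preview assumes the third-party duckdb package is installed.

```python
def parquet_queries(path_glob: str, out_path: str = "filtered.parquet") -> dict:
    """Build DuckDB SQL statements for common Parquet tasks (hypothetical helper)."""
    # Escape single quotes so the paths are safe inside SQL string literals.
    src = path_glob.replace("'", "''")
    dst = out_path.replace("'", "''")
    return {
        # Look at the first rows of one file, or of many files via a glob.
        "preview": f"SELECT * FROM read_parquet('{src}') LIMIT 10;",
        # Column names and types as DuckDB sees them.
        "schema": f"DESCRIBE SELECT * FROM read_parquet('{src}');",
        # Row-group level metadata and statistics (glob support assumed).
        "metadata": f"SELECT * FROM parquet_metadata('{src}');",
        # Write the query result out as a new Parquet file.
        "export": (
            f"COPY (SELECT * FROM read_parquet('{src}')) "
            f"TO '{dst}' (FORMAT PARQUET);"
        ),
    }


def run_preview(path_glob: str):
    """Run the preview query in an in-memory DuckDB database and return the rows."""
    import duckdb  # third-party package (pip install duckdb), imported lazily

    con = duckdb.connect()  # no database path given, so this is in-memory
    try:
        return con.execute(parquet_queries(path_glob)["preview"]).fetchall()
    finally:
        con.close()


if __name__ == "__main__":
    # Print the statements; they can be pasted into any DuckDB SQL editor.
    for name, sql in parquet_queries("data/*.parquet").items():
        print(f"-- {name}\n{sql}\n")
```

The printed statements should also work as-is in a DBeaver SQL editor connected to an in-memory DuckDB database, once the placeholder paths are replaced with real ones.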
Check out this utility. It works on all Windows versions: https://github.com/mukunku/ParquetViewer
There is a GUI tool for viewing Parquet and other binary data formats such as ORC and Avro. It is a pure Java application, so it runs on Linux, Mac, and Windows. Please check Bigdata File Viewer for details.
It supports complex data types such as array, map, and struct, and you can save the file you have read in CSV format.
Actually, I found a Windows 10 specific solution. However, I'm working on Linux Mint 18, so I would like a Linux (or ideally cross-platform) GUI tool. Is there another GUI tool?
https://www.channels.elastacloud.com/channels/parquet-net/how-about-viewing-parquet-files
There is a WebAssembly viewer that works fully offline: https://aloneguid.github.io/parquet-online/
JetBrains IDEs (IntelliJ, PyCharm, etc.) have a plugin for this, if you have a professional version: https://plugins.jetbrains.com/plugin/12494-big-data-tools
There is the Tab Lab Parquet Viewer. It lets you view and filter Parquet files. You can also make graphs and query Parquet files with SQL.
- TAD - https://github.com/antonycourtney/tad
- Bigdata File Viewer - https://github.com/Eugene-Mark/bigdata-file-viewer
- DBeaver
- qStudio - https://www.timestored.com/qstudio/parquet-file-viewer
- ParquetViewer - https://github.com/mukunku/ParquetViewer
From simplest to most powerful it would be: TAD < ParquetViewer < qStudio / DBeaver
If you want the data in a spreadsheet, Row Zero has native Parquet support for up to billions of rows. You can upload Parquet files from your computer or import them directly from Amazon S3. It also lets you filter, sort, pivot, and graph the data and export it to CSV or Snowflake.