GUI tools for viewing/editing Apache Parquet

10

33

I have some Apache Parquet files. I know I can run parquet file.parquet in my shell and view the contents in the terminal, but I would like a GUI tool that shows Parquet files in a more user-friendly format. Does such a program exist?

Bethina answered 19/3, 2018 at 16:3 Comment(1)
Please see my answer here on how to use DBeaver to view Parquet files. - Feldspar
32

There is the Tad utility, which is cross-platform. It lets you open Parquet files, pivot them, and export them to CSV. It uses DuckDB as its backend. More info is on the DuckDB page.

GitHub: https://github.com/antonycourtney/tad

Civet answered 13/10, 2022 at 22:49 Comment(2)
This is exactly what I'm looking for, thanks @Gabe. It works on macOS too! - Voyage
Great suggestion! Running on macOS Ventura 13.1 for viewing Parquet files. - Presentational
25

GUI option for Windows, Linux, and macOS

You can now use DBeaver to

  • view Parquet data
  • view metadata and statistics
  • run SQL queries on one or multiple files (glob expressions are supported)
  • generate new Parquet files

DBeaver uses the DuckDB driver to perform operations on Parquet files. DuckDB also supports features such as projection and predicate pushdown.

Simply create an in-memory DuckDB instance in DBeaver and run the queries as described in this document. Right now Parquet and CSV are supported.

Here is a YouTube video that explains the same: https://youtu.be/j9_YmAKSHoA
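
As a rough sketch of the kind of queries this answer refers to, the Python snippet below builds DuckDB SQL for three of the tasks listed: previewing rows with read_parquet (a glob works too), inspecting metadata with parquet_metadata, and writing a new file with COPY ... TO. The file names and helper function names are hypothetical, and the statements in the linked document may differ. The generated SQL can be pasted into a DBeaver SQL editor connected to an in-memory DuckDB instance.

```python
"""Build DuckDB SQL for common Parquet tasks (all paths here are hypothetical)."""


def quote(path: str) -> str:
    """Quote a path as a SQL string literal, doubling any single quotes."""
    return "'" + path.replace("'", "''") + "'"


def preview_sql(path_or_glob: str, limit: int = 10) -> str:
    """Rows from one file, or from many files matched by a glob pattern."""
    return f"SELECT * FROM read_parquet({quote(path_or_glob)}) LIMIT {int(limit)}"


def metadata_sql(path: str) -> str:
    """Row-group and column-chunk metadata, including statistics."""
    return f"SELECT * FROM parquet_metadata({quote(path)})"


def export_sql(select_sql: str, out_path: str) -> str:
    """Write the result of a query to a new Parquet file."""
    return f"COPY ({select_sql}) TO {quote(out_path)} (FORMAT PARQUET)"


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    big_rows = preview_sql("data/*.parquet", limit=1000)
    for query in (
        preview_sql("data/*.parquet"),
        metadata_sql("data/part-0.parquet"),
        export_sql(big_rows, "subset.parquet"),
    ):
        # Paste into DBeaver, or, if the optional duckdb package is installed,
        # run with: duckdb.connect().execute(query).fetchall()
        print(query + ";")
```

Keeping the queries as plain strings means the same text works in DBeaver, the DuckDB CLI, or the duckdb Python package.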

Stratocracy answered 10/10, 2022 at 5:7 Comment(0)
9

Check out this utility. It works on all Windows versions: https://github.com/mukunku/ParquetViewer

Odds answered 24/6, 2018 at 13:29 Comment(6)
Thanks for your suggestion. I've tried it, but this type of utility doesn't work for Parquet files with a complex, JSON-like structure. It works with Parquet files that have a plain, CSV-like structure. - Adjacency
I tried this one as well. In my Parquet file it seems to corrupt every second row by inserting an incorrect 0 value in the first column and shifting all the correct values down a row for each 0 it inserts. I tried the BigDataFileViewer, which can view my files correctly, but only if you open the file twice. The first time it throws an error about incorrect magic numbers in the tail, but it then seemingly works correctly when you open the file a second time. The schema and table data seem correct. - Kendrick
@Kendrick maybe open an issue ticket on the repo with the example file? - Odds
@Odds logged a ticket at github.com/mukunku/ParquetViewer/issues/20 just now. - Kendrick
I like it: it's fast and simple, and I just want to (pre)view tabular Parquet files. - Farver
Tevis has been heavily tested with large Parquet files with very complex and nested data structures, and it works perfectly fine. Give it a try! Demo here. - Extortioner
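
The "incorrect magic numbers in the tail" error mentioned in the comments refers to the Parquet file layout: a regular Parquet file starts and ends with the 4-byte marker PAR1, and the 4 bytes just before the trailing marker hold the footer length as a little-endian integer. Before filing a bug against a viewer, it can help to confirm that the file itself is framed correctly. The following is a minimal standard-library sketch of such a check. The function name is made up for illustration, and files with an encrypted footer (which end in PARE) are not handled.

```python
import os
import struct

MAGIC = b"PAR1"


def check_parquet_framing(path):
    """Return (ok, message) after checking head/tail magic and footer length."""
    size = os.path.getsize(path)
    # Smallest possible framing: 4-byte magic + 4-byte length + 4-byte magic.
    if size < 12:
        return False, "file too small to be Parquet"
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(-8, os.SEEK_END)
        tail = f.read(8)
    if head != MAGIC:
        return False, "bad magic at start of file"
    if tail[4:] != MAGIC:
        return False, "bad magic in tail of file"
    # Footer length is a 4-byte little-endian unsigned integer.
    footer_len = struct.unpack("<I", tail[:4])[0]
    if footer_len > size - 12:
        return False, "footer length exceeds file size"
    return True, f"framing looks fine, footer is {footer_len} bytes"
```

A file that passes this check can still have a corrupt footer or corrupt data pages; the sketch only rules out truncation and files that are not Parquet at all.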
5

There is a GUI tool for viewing Parquet as well as other binary data formats such as ORC and Avro. It is a pure Java application, so it runs on Linux, macOS, and Windows. Please check Bigdata File Viewer for details.

It supports complex data types such as array, map, and struct, and you can save the file you have opened in CSV format.

Golter answered 9/2, 2020 at 17:49 Comment(1)
Currently the tool does not work without fiddling with Java, since JavaFX seems to be missing. See github.com/Eugene-Mark/bigdata-file-viewer/issues/25 - Petrina
4

Actually, I found a Windows 10-specific solution. However, I'm working on Linux Mint 18, so I would like a Linux (or ideally cross-platform) GUI tool. Is there some other GUI tool?

https://www.channels.elastacloud.com/channels/parquet-net/how-about-viewing-parquet-files

Bethina answered 8/4, 2018 at 14:18 Comment(1)
Is there anything similar for Windows 8? - Powerhouse
2

There is a WebAssembly viewer which works fully offline: https://aloneguid.github.io/parquet-online/

Slay answered 28/3, 2023 at 14:30 Comment(1)
Link gives 404? - Angelia
1

JetBrains IDEs (IntelliJ, PyCharm, etc.) have a plugin for this, if you have a professional version: https://plugins.jetbrains.com/plugin/12494-big-data-tools

Chiapas answered 23/3, 2022 at 9:12 Comment(1)
They only display basic data types and fall short on anything a bit more complex, such as structures, lists, or maps. - Slay
0

There is the Tab Lab Parquet Viewer. It lets you view and filter Parquet files. You can also make graphs and query Parquet with SQL.

Tab Lab Parquet Viewer gif

Battalion answered 19/10, 2023 at 20:26 Comment(0)
0

From simplest to most powerful it would be: TAD < ParquetViewer < qStudio / DBeaver

Philippa answered 29/5 at 8:22 Comment(0)
0

If you want the data in a spreadsheet, Row Zero has native Parquet support for up to billions of rows. You can upload Parquet files from your computer or import them directly from Amazon S3. It also lets you filter/sort/pivot/graph and export to CSV/Snowflake.

Row Zero importing a parquet file of SSA baby name data

Junejuneau answered 7/7 at 14:26 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.