What are the fundamental architectural, SQL compliance, and data use scenario differences between Presto and Impala?


Asked 7/11, 2013 at 16:16 Answered 31/1, 2020 at 22:36

Can someone with experience give a succinct summary of the differences between Presto and Impala from these perspectives?

Fundamental architecture design
SQL compliance
Real-world latency
Any SPOF or fault-tolerance functionality
Structured and unstructured data use scenario performance

Burning answered 7/11, 2013 at 16:16 Comment(1)

OK, since no one has answered this question yet, I would like to add some comments from my own findings. The largest difference I can see so far (possibly not very accurate, given how little has been published about Presto): Impala uses a push-down approach while Presto uses a connector approach. That means Impala runs the optimized query fragments on the nodes where the data resides in HDFS, while Presto's connector approach works more or less like HAWQ or SQL-H, importing the data from HDFS into the query engine. – Burning 14/11, 2013 at 16:17

Apache Impala is a query engine for HDFS/Hive systems only.

PrestoDB, as well as the community version Trino, on the other hand, are generic query engines that support HDFS as just one of many choices. There is a long list of connectors available, and Hive/HDFS support is only one of them. This also means that you can query different data sources in the same system, at the same time, even within a single query.
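To make the "different data sources at the same time" point concrete, here is a minimal sketch of what a cross-source query can look like. Presto and Trino address tables as `catalog.schema.table`, where each catalog is backed by a connector. The catalog, schema, table, and column names below (`hive.sales.orders`, `mysql.crm.customers`, and so on) are made up for illustration and assume that a Hive catalog and a MySQL catalog have been configured. The snippet only builds and prints the SQL text; it does not connect to a cluster.

```python
def qualified(catalog: str, schema: str, table: str) -> str:
    """Build a fully qualified Presto/Trino table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def federated_join_sql() -> str:
    """Return SQL that joins a Hive/HDFS table with a MySQL table in one query.

    All catalog, schema, table, and column names are hypothetical.
    """
    orders = qualified("hive", "sales", "orders")        # data stored in HDFS via the Hive connector
    customers = qualified("mysql", "crm", "customers")   # data stored in a MySQL database
    return (
        "SELECT c.name, sum(o.total) AS revenue\n"
        f"FROM {orders} AS o\n"
        f"JOIN {customers} AS c ON o.customer_id = c.id\n"
        "GROUP BY c.name"
    )


if __name__ == "__main__":
    print(federated_join_sql())
```

The resulting statement would be submitted to Presto or Trino through whatever client you normally use. The point is that both tables appear in one query even though they live in different systems.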

Hare answered 31/1, 2020 at 22:36 Comment(0)
