Presto performance tuning, queries are much slower when performed in parallel - McMap

Asked 12/7, 2018 at 9:20 Answered 20/7, 2018 at 8:16

sql performance presto sqlperformance

I have a Presto cluster with 12 workers that is queried by Java applications. The cluster can run 30 queries concurrently (any beyond that are queued).

The applications may send around 80-100 distinct queries, which I expect the cluster to handle.

Problem: When queries are performed sequentially they complete significantly faster than when they are performed in parallel.

For instance, if I run 100 queries sequentially, each takes 1-12 seconds and all of them finish in around 2 minutes. But if I start them all in parallel, it takes around 8-12 minutes for all of them to finish. In extreme cases it takes up to 30 minutes.

If I look at the Presto console, I see that most of the queries are blocked and only 1-3 are actually in the Running state.

Unfortunately, I can't post any of the queries. They usually access different schemas (up to 6 in one query) and are full of joins and nested queries. At the same time, most of them are written following Presto best practices.

Question: How can I improve performance? Or at least, which areas should I investigate to find the root cause?

Here are some metrics for one of the slowest queries (maybe the numbers will tell you something).

Resource Utilization Summary

CPU Time            8.42m
Scheduled Time      26.04m
Blocked Time        4.77d
Input Rows          298M
Input Data          9.94GB
Raw Input Rows      323M
Raw Input Data      4.34GB
Peak Memory         10.18GB
Memory Pool         reserved
Cumulative Memory   181G seconds

Timeline

Parallelism         477
Scheduled Time/s    1.47K
Input Rows/s        281K
Input Bytes/s       9.60MB
Memory Utilization  0B
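A quick back-of-the-envelope check puts the figures from the Resource Utilization Summary side by side. The sketch below only converts the reported values to minutes and divides them; the class and method names are made up for illustration. It assumes Blocked Time is accumulated across all parallel drivers of the query (the Parallelism figure suggests several hundred), so it is a cumulative total rather than wall-clock time.

```java
public class BlockedTimeRatio {
    // Values copied from the Resource Utilization Summary, converted to minutes.
    public static final double CPU_MINUTES = 8.42;
    public static final double SCHEDULED_MINUTES = 26.04;
    public static final double BLOCKED_MINUTES = 4.77 * 24 * 60; // 4.77 days

    /** Plain ratio of two durations expressed in the same unit. */
    public static double ratio(double numerator, double denominator) {
        return numerator / denominator;
    }

    public static void main(String[] args) {
        System.out.printf("blocked minutes:     %.1f%n", BLOCKED_MINUTES);
        System.out.printf("blocked / cpu:       %.0f%n", ratio(BLOCKED_MINUTES, CPU_MINUTES));
        System.out.printf("blocked / scheduled: %.0f%n", ratio(BLOCKED_MINUTES, SCHEDULED_MINUTES));
    }
}
```

The blocked total comes out several hundred times larger than the CPU time, which is consistent with the console showing most queries blocked while only a few are running.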

Datha answered 12/7, 2018 at 9:20 Comment(2)

Do your queries access the same tables? Do you read or write data? Even when you cannot post an actual query you use yourself, you could come up with something "like" it for a Minimal, Complete, and Verifiable example. – Buckshot 15/7, 2018 at 8:31

@N.Wouda Yes, they often access the same table. I only read data (Presto is a read-only SQL engine on top of a NoSQL database). I'm not sure I can post "similar" queries. As I said, there are around 100 of them, they are all different, and in many cases they don't fit on a single screen. I also found no correlation between a query's complexity and its performance. Different executions show different slow queries. – Datha 15/7, 2018 at 10:18

It seems like I figured out the issue myself.

Presto is a distributed SQL query engine, and the key word here is distributed. When you run a query, Presto spreads it efficiently across the workers and executes it at high speed.

Running many queries in parallel and expecting Presto to work out how to parallelize them efficiently is most likely a misuse. That is more of a relational-database approach, which unfortunately doesn't work in Presto.

Datha answered 20/7, 2018 at 8:16 Comment(4)

Did you also look into optimising this? I have seen some tweak-able parameters in Presto tuning documentation which talk about optimising concurrent queries. – Tinkle 21/3, 2020 at 5:17

@Tinkle I didn't work much with Presto after I posted this question. I suppose a lot has changed since then. – Datha 21/3, 2020 at 12:55

@SashaShpota Hi, I am building a similar JDBC application. I have 4 nodes on AWS EMR with Presto installed, 3 of which are workers. I need to run 300 queries; currently I create 15 threads, and each thread runs 20 queries. It takes around 3 hours to complete. Is this the right approach, or should I use a single thread that executes all 300 queries? – Cheeks 13/1, 2021 at 5:8

@Cheeks I am not enough of a Presto expert to advise you. Shortly after I posted this question, we switched to a different architecture and I have not used Presto since. For our application, we found experimentally that 4 threads gave optimal performance; with more or fewer threads it got slower. Maybe try experimenting to see what number of threads works best in your particular case. It's been more than two years, and I am pretty sure Presto has been improved in some way. Maybe there is already something new that can help you solve the issue. – Datha 13/1, 2021 at 10:9
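The experiment suggested in the last comment could be sketched roughly as follows: run the same batch of queries on fixed-size thread pools of different sizes and compare the wall-clock time. This is only a sketch under stated assumptions. `ThreadCountExperiment`, `timeBatch` and the `runQuery` callback are hypothetical names, and the callback here just sleeps; in a real application it would execute the SQL over JDBC against the cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

public class ThreadCountExperiment {

    /** Runs every query on a fixed pool of `threads` workers; returns elapsed wall-clock milliseconds. */
    public static long timeBatch(List<String> queries, int threads, Consumer<String> runQuery)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        long start = System.nanoTime();
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (String sql : queries) {
                pending.add(pool.submit(() -> runQuery.accept(sql)));
            }
            for (Future<?> f : pending) {
                f.get(); // waits for completion; a failed query surfaces as ExecutionException
            }
        } finally {
            pool.shutdown();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        List<String> queries = new ArrayList<>();
        for (int i = 0; i < 20; i++) {
            queries.add("SELECT " + i); // placeholder SQL, not real workload
        }

        // Stand-in for a real JDBC call (assumption): replace with code that runs the SQL.
        Consumer<String> fakeRunner = sql -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        for (int threads : new int[] {1, 2, 4, 8, 16}) {
            System.out.println(threads + " threads: "
                    + timeBatch(queries, threads, fakeRunner) + " ms");
        }
    }
}
```

With the sleeping stand-in, more threads always finish sooner; against a real cluster the curve may flatten or reverse once the workers are saturated, which is the point at which to stop adding threads.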
