How to disable using cache results in Redshift Query?
Asked Answered
S

2

9

I am interested in performance testing my query in Redshift.

I would like to disable the query from using any cached results from prior queries. In other words, I would like the query to run from scratch. Is it possible to disable cached results only for the execution of my query?

I would not like to disable cached results for the entire database/all queries.

Sabir answered 21/2, 2019 at 10:47 Comment(1)
I'm sure you know this, but you also need to ensure that you are aware of compile time and memory usage. even with "SET enable_result_cache_for_session TO OFF;" you still will get far slower performance on your first execution than on subsequent. In normal circumstances (unless your sql is generated and is highly dynamic) you should benchmark excluding the first run.Scathing
P
21
SET enable_result_cache_for_session TO OFF;

From enable_result_cache_for_session - Amazon Redshift:

Specifies whether to use query results caching. If enable_result_cache_for_session is on, Amazon Redshift checks for a valid, cached copy of the query results when a query is submitted. If a match is found in the result cache, Amazon Redshift uses the cached results and doesn’t execute the query. If enable_result_cache_for_session is off, Amazon Redshift ignores the results cache and executes all queries when they are submitted.

Prevent answered 21/2, 2019 at 11:18 Comment(0)
D
4

Ran across this during a benchmark today and wanted to add an alternative to this. The benchmark tool I was using has a setup and teardown, but they don't run in the same session/transaction, so the enable_result_cache_for_session setting was having no effect. So I had to get a little clever.

From the Redshift documentation:

Amazon Redshift uses cached results for a new query when all of the following are true:

  • The user submitting the query has access permission to the objects used in the query.

  • The table or views in the query haven't been modified.

  • The query doesn't use a function that must be evaluated each time it's run, such as GETDATE.

  • The query doesn't reference Amazon Redshift Spectrum external tables.

  • Configuration parameters that might affect query results are unchanged.

  • The query syntactically matches the cached query.

In my case, I just added a GETDATE() column to the query to force it to not use the result cache on each run.

Disassemble answered 3/12, 2022 at 6:8 Comment(1)
Simple workaround that worked perfectly for me without having to change any cluster state! +1Penetralia

© 2022 - 2024 — McMap. All rights reserved.