I'm doing some database benchmarking in Python using the cx_Oracle module. I'm running 150 unique queries and timing the execution of each one, with something like this:
import time

c = connection.cursor()
starttime = time.time()
c.execute('SELECT /*+ NOCACHE */ COUNT(*) AS ROWCOUNT FROM (' + sql + ')')
endtime = time.time()
runtime = endtime - starttime
Each query is passed in through the variable sql, and the queries vary significantly in length, runtime, and the tables they access. That said, all of them exhibit the following behavior:
1st run: very slow (relatively)
2nd run: significantly faster (takes anywhere from 1/2 - 1/5 the time)
3rd run: marginally faster than 2nd run
All subsequent runs >= 4: approximately equal to 3rd run
I need the cache disabled to get accurate results, but the first few runs are really throwing off my data; it's as if NOCACHE isn't working at all. What's going on here?
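One common way to sidestep the warm-up effect described above (not something from the original post) is to run each query several times, discard the first few executions, and report the median of the rest. Below is a minimal sketch of such a harness; it is demonstrated with the standard-library sqlite3 module purely so it runs without an Oracle server, since a cx_Oracle cursor exposes the same execute/fetchall shape:

```python
# Hypothetical timing harness: run a query several times, throw away the
# warm-up executions, and take the median of the remaining timings so that
# parse and cache effects don't dominate the measurement.
import sqlite3
import statistics
import time

def time_query(cursor, query, runs=5, discard=2):
    """Execute `query` `runs` times; return the median runtime in seconds
    after discarding the first `discard` warm-up executions."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        cursor.execute(query)
        cursor.fetchall()  # include fetch time in the measurement
        timings.append(time.perf_counter() - start)
    return statistics.median(timings[discard:])

# Demo against an in-memory sqlite3 database (stand-in for cx_Oracle).
conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE t (x INTEGER)')
cur.executemany('INSERT INTO t VALUES (?)', [(i,) for i in range(1000)])
median_s = time_query(cur, 'SELECT COUNT(*) FROM t')
```

With cx_Oracle you would pass the connection's cursor and the wrapped NOCACHE query in exactly the same way.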
Edit: Allan answered my question, but for anyone who might be interested, I did a little more research and came across these two pages which were also helpful:
How are you building sql? The way you're currently passing in the query doesn't allow using bind variables, i.e. if the queries contain input parameters, these will be embedded into the query literal, which will cause a hard parse to precede each execution, which I assume is causing the lag in #1. – Voiceless

alter system flush buffer_cache; would only help make realistic tests in a data warehouse. Unfortunately, this makes your testing much more difficult - without perfectly matching queries, data, and environments it's difficult to replicate cache behavior. – Sundae
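To illustrate Voiceless's bind-variable point: embedding a value into the SQL string makes every distinct value a distinct statement text, each needing its own hard parse, while a bound parameter keeps the statement text constant. The sketch below uses sqlite3 only so it runs without an Oracle instance; cx_Oracle accepts the same :name placeholder style (the table and column names here are made up for the example):

```python
# Literal concatenation vs. bind variables. With bind variables the database
# sees one statement text regardless of the parameter value, so a parsed
# version can be reused across executions.
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE orders (id INTEGER, amount REAL)')
cur.executemany('INSERT INTO orders VALUES (?, ?)',
                [(1, 10.0), (2, 20.0), (3, 30.0)])

# Literal embedded in the SQL text: every distinct value is a new statement.
cur.execute('SELECT COUNT(*) FROM orders WHERE amount > ' + str(15))

# Bind variable: one statement text, reusable for any cutoff value.
cur.execute('SELECT COUNT(*) FROM orders WHERE amount > :cutoff',
            {'cutoff': 15})
result = cur.fetchone()[0]  # 2 (the rows with amounts 20.0 and 30.0)
```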
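For completeness, the cache-flushing approach Sundae mentions can be issued from Python as well. A hedged sketch follows: both statements require the ALTER SYSTEM privilege, so this only works from a suitably privileged account, and (as the comment notes) it still only approximates truly cold-cache behavior. The function name is made up for the example:

```python
# Hypothetical helper: flush Oracle's caches between timed runs.
# FLUSH BUFFER_CACHE empties cached data blocks; FLUSH SHARED_POOL
# discards parsed SQL. The NOCACHE hint alone does not empty either.
def flush_oracle_caches(cursor):
    """Issue the two ALTER SYSTEM flushes; needs a privileged session."""
    cursor.execute('ALTER SYSTEM FLUSH BUFFER_CACHE')
    cursor.execute('ALTER SYSTEM FLUSH SHARED_POOL')

# Usage between benchmark iterations (assuming a cx_Oracle connection):
#     flush_oracle_caches(connection.cursor())
```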