I run the exact same Python function twice: once as a PostgreSQL PL/Python function, and once outside PostgreSQL as a regular Python script.
Surprisingly, calling the PL/Python function with
    select * from pymax7(20000);
takes 65 seconds on average, while running the regular Python script with
    python myscript.py 20000
takes 48 seconds on average. Both averages were computed over 10 runs of the query and of the script.
Should such a difference be expected? How does Python inside the PostgreSQL RDBMS (PL/Python) compare with Python outside it in terms of performance?
I'm running PostgreSQL 9.1 and Python 2.7 on 64-bit Ubuntu 12.04.
PostgreSQL PL/Python:
CREATE FUNCTION pymax7 (b integer)
  RETURNS float
AS $$
  a = 0
  for i in range(b):
    for ii in range(b):
      a = (((i+ii)%100)*149819874987)
  return a
$$ LANGUAGE plpythonu;
Python:
import time
import sys

def pymax7 (b):
    a = 0
    for i in range(b):
        for ii in range(b):
            a = (((i+ii)%100)*149819874987) # keeping Python busy
    return a

def main():
    numIterations = int(sys.argv[1])
    start = time.time()
    print pymax7(numIterations)
    end = time.time()
    print "Time elapsed in Python:"
    print str((end - start)*1000) + ' ms'

if __name__ == "__main__":
    main()
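For reference, the 10-run average on the script side could be computed along the lines of the sketch below. The helper name average_seconds and the small input size used in the demo are assumptions made for illustration only; they are not part of the script above, and the sketch only times the plain Python side, not the PL/Python call.

```python
import time


def pymax7(b):
    # Same busy loop as in the script above.
    a = 0
    for i in range(b):
        for ii in range(b):
            a = (((i + ii) % 100) * 149819874987)
    return a


def average_seconds(func, arg, runs=10):
    # Hypothetical helper: call func(arg) `runs` times and
    # return the mean wall-clock time in seconds.
    timings = []
    for _ in range(runs):
        start = time.time()
        func(arg)
        timings.append(time.time() - start)
    return sum(timings) / len(timings)


if __name__ == "__main__":
    # A small input keeps the demo fast; the question uses 20000.
    print("Average over 10 runs: %.6f s" % average_seconds(pymax7, 200))
```

Timing inside the process like this excludes interpreter start-up, so it will not match exactly what timing the whole python myscript.py command would report.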