I'm making a Monte-Carlo simulation. And as a part of this task I generate samples uniformly distributed over an interval (0,100)
.
generate = lambda: uniform(0,100)
The iterations stop when all the closest generated points' pairs meet the criteria.
check = lambda a,b: True if (b-a)<5 else False
I need to have some structure to effectively keep all the generated points and be able to go through them in ascending order to perform check
on all the subsequent pairs.
There is a heapq
module in Python which supports a very effective heap structure. And I decided to use it.
I faced a problem. I have found no traversal procedure supported by this module. The only way I found to access the values of the heap in ascending order is to use heapq.heappop
. But it deletes the values from the heap.
I found the workaround for this and just copied the heap object into the new one and iterated with heappop
over the new one. But I don't think it's quite effective to copy the whole structure in memory one every iteration.
Is there any other way I can go to do what I'm trying to do more effectively?
The simplified code for illustration.
import heapq
from random import uniform
from itertools import tee, izip, count
from copy import copy
def pairwise(iterable): #get values from iterator in pairs
a, b = tee(iterable)
next(b, None)
return izip(a, b)
check = lambda a,b: True if (b-a)<5 else False
generate = lambda: uniform(0,100)
def iterate_heap(heap):
heap = copy(heap) #Here I have to copy the heap to be able to traverse
try:
while True:
yield heapq.heappop(heap)
except IndexError:
return
def trial():
items = []
for i in count():
item = generate()
heapq.heappush(items, item)
it = iterate_heap(items)
it = pairwise(it)
if i>0 and all(check(a,b) for a,b in it): #if i==0 then 'it' returns no values and 'all' returns True
return i
print "The solution is reached. It took %d iterations." % trial()
paiwise
function is from recipe from here.
Update:
In this implementation with heappop
the complexity on each iteration is O(n*log(n))
:
Copying heap: O(n)
Adding a new value to the heap: O(log(n))
Traversing: n
elements * O(log(n))
on popping each value from heap -> O(n*log(n))
.
Result: O(n+log(n)+n*log(n)) = O(n*log(n)
But I expect the traversal to be O(n)
, so the resultant complexity would be O(n)
.
By the way, if we use just sorted list, we would need to sort the list on each adding, so O(n*log(n))
, but the traversal would be n*O(1) -> O(n)
. So, the resultant complexity is still O(n*log(n))
.
I have found a solution. It's to use bisect
module. Finding the place to add would be O(log(n))
. Adding to the list is of O(n)
(because of the implementation all the values after the insertion in place have to be moved). Traversing is O(n)
. So, the resultant complexity is O(n)
.
Still, I wounder, if there is a way to solve this task using heaps in Python.
True if (b-a)<5 else False
is equivalent to just(b - a) < 5
. – HaematinicO(log(n))
complexity, then I insert it withO(n)
complexity. Then I just go through the list from left to right (because the values are already sorted) and do some calculations with value pairs. All these complexities are added up (not multiplied) since these operations are not nested (they are consecutive). – Impolite