Segment tree implementation in Python
Asked Answered
G

2

8

I am solving this problem using segment tree but I get time limit error. Below is my raw code for range minimum query and by changing min to max in my code the above problem can be solved . I don't know how I can improve the performance of my code. Can you help me with its performance issues ?

t = [None] * 2 * 7      # n is length of list


def build(a, v, start, end):
    '''
    A recursive function that constructs Segment Tree for list a.
    v is the starting node
    start and end are the index of array
    '''

    n = len(a)
    if start == end:
        t[v] = a[start]
    else:
        mid = (start + end) / 2
        build(a, v * 2, start, mid)     # v*2 is left child of parent v
        # v*2+1 is the right child of parent v
        build(a, v * 2 + 1, mid + 1, end)
        t[v] = min(t[2 * v], t[2 * v + 1])
    return t

print build([18, 17, 13, 19, 15, 11, 20], 1, 0, 6)

inf = 10**9 + 7


def range_minimum_query(node, segx, segy, qx, qy):
    '''
    returns the minimum number in range(qx,qy)
    segx and segy represent the segment index

    '''
    if qx > segy or qy < segx:      # query out of range
        return inf
    elif segx >= qx and segy <= qy:  # query range inside segment range
        return t[node]
    else:
        return min(range_minimum_query(node * 2, segx, (segx + segy) / 2, qx, qy), range_minimum_query(node * 2 + 1, ((segx + segy) / 2) + 1, segy, qx, qy))

print range_minimum_query(1, 1, 7, 1, 3)

# returns 13

Can this be implemented iteratively ?

Ganglion answered 11/12, 2016 at 11:18 Comment(6)
can you help me with [the code's] performance issues? Do you want hints to solve the problem yourself, or do you want analysed & coded solutions? Where do segments come into the picture? (Did you read the description of the segment tag?) (upvoted for supplying docstrings - consider renaming rmq to reflect range minimum query out of this context.) My 2cents: Your problem is not recursive vs. iterative.Recaption
@Recaption I want analysed and coded solutions . While adding tags I wrote segment trees but it broke into tree and segment tags (sorry for that) .Ganglion
(sorry for that - I take it you know how to edit tags.) Anybody heard of a priority search tree?Recaption
For any performance issues, I recommend the Python "thread" module. It allows running multiple things at the same time.Laure
If this was tagged segment-tree, you might get doubts expressed whether build() does build a segment tree: it is more usual to have explicit coordinates for the boundaries of the atomic intervals than any valid index, and to have segments of the set overlap . Your choice to have end inclusive seems unconventional, and the comment query range inside segment range backwards. (I have no idea why your code should be more than a factor of, say, two from optimum.)Recaption
if the data is homogeneous, integers here, replace list with array.arrayBehindhand
C
13

Choice of Language

First, you probably will never pass the grader if you use python. If you look at the status of all past solutions here, http://www.spoj.com/status/GSS1/start=0 , you will see that every almost every accepted solution has been written in C++. I do not think you have a choice but to use C++. Notice how the time limit is 0.115s-0.230s. That's an "only-for-C/C++" time limit. For problems that will accept solutions in other languages, the time limit will be a "round" number like 1 second. Python is about 2-4x slower than C++ in this type of environment.

Segment Tree Implementation Issues...?

Second, I'm not sure if your code is actually constructing a segment tree. Specifically, I don't understand why this line is there:

t[v]=min(t[2*v],t[2*v+1]) 

I'm pretty sure a node in a segment tree stores the sum of its children, so if your implementation is close to correct, I think it should instead read

t[v] = t[2*v] + t[2*v+1]

If your code is "correct", then I would question how you are finding the maximum interval sum within the range [x_i, y_i] if you don't even store the interval sums.

Iterative Segment Tree

Third, a segment tree can be implemented iteratively. Here is a tutorial in C++: http://codeforces.com/blog/entry/18051 .

Segment tree isn't supposed to be fast enough for this...

Finally, I don't understand how a segment tree is going to help you for this problem. A segment tree lets you query the sum of a range in log(n). This problem asks for the maximum possible sum of any range. I have not heard of a segment tree that allows for "range minimum query" or "range maximum query."

A naive solution will be O(n^3) (try all n^2 possible start and end points and compute the sum in O(n) operations) for 1 query. And, if you use a segment tree, you can get the sum in O(log(n)) instead of O(n). That only speeds you up to O(n^2 log(n)), which cannot work for N = 50000.

Alternative algorithm

I think you should be looking at this instead, which runs in O(n) per query: http://www.geeksforgeeks.org/largest-sum-contiguous-subarray/ . Write it in C/C++ and be efficient with IO as one commenter suggested.

Cliffcliffes answered 19/12, 2016 at 2:26 Comment(5)
I am solving the range minimum query problem : " We have an array arr[0 . . . n-1]. We should be able to efficiently find the minimum value from index qs (query start) to qe (query end) where 0 <= qs <= qe <= n-1. " And my code is doing the same . Segment tree is best algorithm to solve this problem . Read this geeksforgeeks.org/segment-tree-set-1-range-minimum-queryGanglion
That algorithm gives you the minimum value in an interval. It does not give you the minimum/maximum continuous sum in an interval, which the SPOJ problem asks for.Cliffcliffes
I do agree, that python will most probably time out, but without knowing the maximal value for M it is hard to tell with certainty. Yet I do think, that the problem can be solved with help of a segment tree, but not the version we discuss here - changing min to max just doesn't cut it.Hangdog
@Hangdog Now that I think about it again, it does seem like there may be a way to combine segment tree with Kadane's algorithm. Just plain Kadane's algorithm will take O(M*n). Perhaps there is a way to construct a segment tree and apply the idea in Kadane's algorithm so we can query for the maximum contiguous subarray sum in log(n). This would give a complexity similar to O(n + M*log(n)).Cliffcliffes
"I do not think you have a choice but to use C++" is still valid when we have Cython and PyPy?Behindhand
R
2

You can give a try with generators as you can go around a lot of limitations. However you did not provide a dataset which shows your performances issues clearly - can you provide a problematic dataset ?

Here you can try :

t=[None]*2*7
inf=10**9+7

def build_generator(a, v, start, end):
    n = len(a)

    if start == end:
        t[v] = a[start]
    else:
        mid = (start + end) / 2
        next(build_generator(a, v * 2, start, mid))
        next(build_generator(a, v * 2 + 1, mid + 1, end))
        t[v] = min(t[2 * v], t[2 * v + 1])
    yield t



def range_minimum_query_generator(node,segx,segy,qx,qy):
    if qx > segy or qy < segx:
        yield inf
    elif segx >= qx and segy <= qy:
        yield t[node]
    else:
        min_ = min(
            next(range_minimum_query_generator(node*2,segx,(segx+segy)/2,qx,qy)),
            next(range_minimum_query_generator(node*2+1,((segx+segy)/2)+1,segy,qx,qy))
        )
        yield min_

next(build_generator([18,17,13,19,15,11,20],1,0,6))
value = next(range_minimum_query_generator(1, 1, 7, 1, 3))
print(value)

EDIT

In fact that may not fix your issues. There is another way to workaround any recursion limits (as described by D. Beazley in its tutorial on generators - https://www.youtube.com/watch?v=D1twn9kLmYg&t=9588s around the timecode 2h00)

Rosemari answered 16/12, 2016 at 22:9 Comment(1)
Thanks for the video reference. Insane stuff!Salesgirl

© 2022 - 2024 — McMap. All rights reserved.