remove extra parenthesis

Asked 2/11, 2012 at 23:36 Answered 3/11, 2012 at 0:4

c++string dictionary expression parentheses

Question:

Remove extra parentheses from the string.
e.g.

    ((a+b))*c       => (a+b)*c  
    (a+b)+c         => (a+b)+c  
    ((a+b)/(c+d))   => ((a+b)/(c+d))   
    (a+(((b-c)))*d) => (a+(b-c)*d)  and so on.

I have come up with the following solution:
Approach: I scan through the string and remember (using a map) the index of a opening parenthesis and whether it's extra or not (by default it's extra). If I find a closing parenthesis I check the corresponding opening parenthesis from map and if it's extra then delete both.

void removeExtraParentheses(string& S){
  map<int, bool> pmap;
  for(int i = 0; i < S.size(); i++){
    map<int, bool>::iterator it;
    if(S.at(i) == '('){
        pmap[i] = true;
    }
    else if(S.at(i) == ')'){
        it = pmap.end();
        it--;
        if(!(*it).second){
            pmap.erase(it);
        }
        else{
            S.erase(S.begin() + i);
            S.erase(S.begin() + (*it).first);
            pmap.erase(it);
            i = i - 2;
        }
    }
    else{
        if(!pmap.empty()){
            it = pmap.end();
            it--;
            (*it).second= false;
        }
    }
  }
}

Time complexity: O(n2)
Space: O(n)
I'm not too happy with my solution because I'm using extra storage and doing it in quadratic time.

Could we do this in O(n) time and O(1) space? If not what is the best one can do?

Janitor answered 2/11, 2012 at 23:36 Comment(7)

when you say "extra parens" do you just mean places where there is (( or do you mean places where the math does not require them? for example (a+b)+c also yields a+b+c since addition is commutative. if not, just do a string replace of "(((" "((" with "(", and ")))" and "))" with ")". – Yan 2/11, 2012 at 23:43

I think the second sample input I gave clears your doubt. I'm not sure " just do a string replace of "(((" "((" with "(", and ")))" and "))" with ")" " works – Janitor 2/11, 2012 at 23:45

@FrankThomas First, the examples shows that math doesn't factor in and only double parens are replaced. Second, that's going to fail. Think what it would do to ((a + b) + (c + d)) – Sordello 2/11, 2012 at 23:46

@jasper, I believe it would return (a+b) + (c+d). good point, those too could be removed. – Yan 2/11, 2012 at 23:49

@FrankThomas You've raised an interesting issue. For floating point types (and potentially for signed integral types as well, but it's not the case on any of the usual architectures), addition is not associative, and (a + b) + c is the same as a + b + c, but in the case of a + (b + c), you need the parentheses. Or perhaps his problem is underspecified: on a Windows machine, or on most Unices, in the last case, you don't need the parentheses for int, but on some exotic machine, you might. – Muriah 2/11, 2012 at 23:53

@FrankThomas Re (a + b) + (c + d): the first can clearly be removed. The second can't at least with floating point. In other words, (a + b) + (c + d) gives the same results as a + b + (c +d), but different results than (a + b) + c + d. – Muriah 2/11, 2012 at 23:56

@FrankThomas I meant that judging by the samples, he does not take any operator precedence in account, after all he doesn't remove anything in (a+b)+c (if that was about special border cases like @JamesKenze is suggesting, he would have mentioned it). As such, I believe he only wants to remove actual double pairs, in which case your plan doesn't work because it removes something from ((a + b) + (c + d)) (or ((a + b) + (c + d)) * e if you prefer) – Sordello 3/11, 2012 at 0:16

Build an expression tree, then regenerate the expression with minimum parentheses. Any parentheses in the original which aren't in the generates are unnecessary.

A simple, almost correct solution would be to assign a precedence to each operator. You then need parentheses any time a node directly under the node you are working on is an operator which has a lower precedence than that of the node; e.g if you are at a * (multiplication) node, and one of the two child nodes has is a + (addition) node. Plus some logic to handle left or right binding: if you're at a + node, and the right hand node is also a + node, you need parentheses.

This is only partially correct, because there are a few constructs in C++ which can't really be mapped to a precedence grammar: some of the type casting constructs, or the ternary operator, come to mind. At least in the case of the ternary operator, however, the special handling isn't that difficult.

EDIT:

With regards to your big-O questions: this obviously isn't O(1) in space, since you need to construct the entire expression in memory. I don't think an O(1) for memory is possible, since potentially, you can have unlimited nesting, and you can't tell whether the parentheses are necessary or not until an indeterminate time later. I've not actually analysed it, but I think it is O(n) in time. The upper bound on the number of nodes is n (the length of the string), and you need to visit each node exactly once.

Muriah answered 2/11, 2012 at 23:45 Comment(5)

+1: Judging by the submitted code, the OP may be unfamiliar with operator precedence parsing, so this may be above his/her head, but it is none-the-less the correct answer. I wouldn't worry about C++ casting constructs etc; the OPs question seems only to deal with id and binary-op expression (id is itself a unary expression, but I'm not sure if they understand that or not either). Also, I see no unary-minus in there, and I hope for their sake it is not implied. – Mortify 2/11, 2012 at 23:54

@Mortify I suspect you're right with regards to what the OP may know. But this looks very much like a homework assignment, and the goal is probably that he implement a simple parser to solve it. If he doesn't know operator precedence parsing as a technique or a term, he probably does (or he should, if he's been following the course) have at least an intuitive idea about precedence and know how to write a simple parser which takes it into account. – Muriah 3/11, 2012 at 0:5

@Mortify And FWIW, I never mentioned operator precedence parsing; I just said "build an expression tree" (which can be done very easily with recursive descent). pokey909 didn't talk about parsing, but his pseudo-code is true precedence parsing. Which I really like (even though it won't handle cases like the ?: operator correctly). – Muriah 3/11, 2012 at 0:11

I'm not sure how you would reduce your expression tree without a precedence-decision. I would certainly need one. On that I think you're both right. pokeys answer wasn't up when I made my original comment, but now that I see it I concur with you. – Mortify 3/11, 2012 at 0:17

@Mortify C++ doesn't have a precedence grammar, and there are a few odd corners (probably not relevant here) where it cannot be expressed in a precedence grammar. There is no precedence in C++ (but it may be helpful to present it using precedence, as long as the odd corners aren't forgotten). – Muriah 5/11, 2012 at 10:43

More or less found on the web...

Given input: ((A+B)*C) Expected output: (A+B)*C

Assumptions:

Peek (queue) just tells the element in front end of queue without deleting it.
precedence( ) is a function which looks a precedence table of operators

Pseudo code below:

Convert infix expression to RPN (e.g. Shunting-yard algo O(n))

AB+C*
Insert only operators in queue Q

(front)+ -------- *(rear)
Parse postfix expression
If operand, push to stack 'S'
If operator
- y=Delete(Q)
- If precedence(y) > precedence(peek(Q)), then Push (S, "Pop(S) y Pop(S)")
- If precedence(y) < precedence(peek(Q)), then Push (S, "( Pop(S) y Pop(S) )")
Final result on top of S

All should be O(n).

Baras answered 2/11, 2012 at 23:58 Comment(0)

I thought I'd take a stab a this. This is the solution to the problem that sprang to mind for me. Do note that this is pseudo-code and isn't meant to be run directly.

(Actually, it's rather C++-ish, but it's been a while since I last wrote actual C++ and I didn't feel like putting effort into getting everything right when my intention was to get across an algorithm.)

queue<tuple<int, bool> > q = new queue<tuple<int, bool> >();

for (int i = 0; i < str.length; i++)
{
    char a = str.charAt(i);

    if (a == '(')
    {
        q.push(new tuple<int, bool>(i, false));
    }
    else if (a == ')')
    {
        if (q.top().bool)
        {
            // remove top.int and i from string
        }

        q.pop();
        q.top().bool = true;
    }
    else
    {
        q.top().bool = false;
    }
}

It does the job in O(n) and uses O(n) space (or actually, the amount of space used is actually based on the deepest nesting level present in the string, but it's guaranteed to be lower than n)

(Do note that // remove top.int and i from string can't actually be done in O(1). However, if you get a little creative you can do something similar in O(1). For example, you could actually build a list of characters for the output and store an iterator instead of the int, then you can remove the two characters in O(1). At the end you can build your final string by iterating over the list in O(n). An alternative solution would be to actually work on an array (or vector) of DummyOrCharacters, that are either a dummy and contain nothing or contain a character. Once again, you can replace a character by a dummy in O(1). Once again, you'll iterate over the structure and build your output string in O(n))

Sordello answered 3/11, 2012 at 0:4 Comment(0)

Hot tags

Godot Unity Godot Help Programming Godot 4.X GUI GDScript 3D 2D Physics CSharp Godot 3.X VR XR Projects C++

Recommended topics

Hot tags