Is it possible to always eliminate goto's?
Asked Answered
W

5

6

While making everthing with goto's is easy (as evidenced by f.ex. IL), I was wondering if it is also possible to eliminate all goto statements with higher level expressions and statements - say - using everything that's supported in Java.

Or if you like: what I'm looking for is 'rewrite rules' that will always work, regardless of the way the goto is created.

It's mostly intended as a theoretical question, purely as interest; not as a good/bad practices.

The obvious solution that I've thought about is to use something like this:

while (true)
{
    switch (state) {
       case [label]: // here's where all your goto's will be
          state = [label];
          continue;
       default:
          // here's the rest of the program.
    }
}

While this will probably work and does fits my 'formal' question, I don't like my solution a single bit. For one, it's dead ugly and for two, it basically wraps the goto into a switch that does exactly the same as a goto would do.

So, is there a better solution?


Update 1

Since a lot of people seem to think the question is 'too broad', I'm going to elaborate a bit more... the reason I mentioned Java is because Java doesn't have a 'goto' statement. As one of my hobby projects, I was attempting to transform C# code into Java, which is proving to be quite challenging (partly because of this limitation in Java).

That got me thinking. If you have f.ex. the implementation of the 'remove' method in Open addressing (see: http://en.wikipedia.org/wiki/Open_addressing - note 1), it's quite convenient to have the 'goto' in the exceptional case, although in this particular case you could rewrite it by introducing a 'state' variable. Note that this is just one example, I've implemented code generators for continuations, which produce tons and tons of goto's when you're attempting to decompile them.

I'm also not sure if rewriting in this matter will always eliminate the 'goto' statement and if it is allowed in every case. While I'm not looking for a formal 'proof', some evidence that elimination is possible in this matter would be great.

So about the 'broadness', I challenge all the people that think there are 'too many answers' or 'many ways to rewrite a goto' to provide an algorithm or an approach to rewriting the general case please, since the only answer I found so far is the one I've posted.

Welton answered 16/9, 2014 at 20:2 Comment(12)
goto is a hack that can improve performance in edge cases (from a practical perspective: tokenizers, IL execution engines, and kernels). It's never unreplaceable in any case.Marni
Are you aware of that while, for and other looping constructs internally uses goto? So you want to avoid using loops? Trying to avoid switch statement makes no sense. Well, you could refactor the switch to use polymorphism but there should be atleast one.Unwish
@SriramSakthivel Many languages do happen to implement those constructs through the use of jump statements, but that isn't a conceptual requirement; there are other ways of implementing those constructs, they just aren't as commonly used in mainstream programming languages.Squirm
@Squirm I'm talking about c# as that's what question is about. And I'd like to know what are the other strategies followed to implement loops? Any references would be helpful.Unwish
@SriramSakthivel C# is merely a specs for code, to which anyone can write an implementation of. The fact that the Microsoft implementation of the C# language happens to have a particular implementation detail does not make that property a fact of the C# language itself.Squirm
@Squirm You got me there :)Unwish
@HABO You're diverting. Turing machines are based on the fact that you need a 'state machine', which is quite similar to what my 'switch' solution does. The reason I mentioned Java is because it has all the higher level statements, but lacks a 'goto'. If your point is that I shouldn't have tagged it with C#, then I suppose you have a point - but that doesn't get us closer to a solution.Welton
@NielsKeurentjes That's a easy statement, one that I found very difficult to 'formalize' in an algorithm... So go ahead: give me the algorithm that replaces goto's in the general case? That was after all my question. :-)Welton
You will always need goto or jump statements to control the flow of your program. But that's why we have keywords for flow control, like if, else, for, etc... They replace every scenario where you might need an explicit goto.Prowl
@StefandeBruijn goto in itself is not evil. It's the lack of transparency of its operation and the control flow that is considered 'better to avoid'. Every while could also be written in a for, but neither of them has the intrinsic capability to generate true spaghetti, that's why both of them exist as a convenience. The flexibility of goto also implies that formalization is going to be very hard, if not impossible, as it's essentially an encapsulation of an assembly-level construct. I doubt you're going to fully succeed there :)Marni
@NielsKeurentjes I'm well aware of best practices and what the use of goto is. Dijstra's wrote a paper on 'go to considered harmful' quite a while back that you might find relevant. Still, while what you're saying might be true, I'm not looking for easy solutions or unfounded arguments - I'm researching if it is possible and I'm trying to look for evidence that supports if it's possible (or not) before drawing any conclusions. BTW Patrice's solution sounds good so far.Welton
Both your question and the answer already got my upvote yesterday - I find the theoretical discussion fascinating as well :)Marni
C
13

This 1994 paper: Taming Control Flow: A Structured Approach to Eliminating Goto Statements proposes an algorithm to eradicate all goto statements in a C program. The method is applicable to any program written in C# or any language that uses common constructs like if/switch/loop/break/continue (AFAIK, but I don't see why it wouldn't).

It begins with the two simplest transformations:

  • Case 1

    Stuff1();
    if (cond) goto Label;
    Stuff2();
    
    Label:
    Stuff3();
    

    becomes:

    Stuff1();
    if (!cond)
    {
      Stuff2();
    }
    Stuff3();
    
  • Case 2

    Stuff1();
    Label:
    Stuff2();
    Stuff3();
    if (cond) goto Label;
    

    becomes:

    Stuff1();
    do
    {
      Stuff2();
      Stuff3();
    } while (cond);
    

and builds on that to examine each complex case and apply iterative transformations that lead to those trivial cases. It then rounds off with the ultimate gotos/labels eradication algorithm.

This is a very interesting read.

UPDATE: Some other interesting papers on the subject (not easy to get your hands on, so I copy direct links here for reference):

A Formal Basis for Removing Goto Statements

A Goto-Elimination Method And Its Implementation For The McCat C Compiler

Casta answered 17/9, 2014 at 8:37 Comment(7)
Thanks, this is along the lines I was looking for! I'll read the paper and see if this answers my question.Welton
Patrice, the paper is a great read and definitely a very good step towards the solution I'm looking for; either way it is elegant and answers my question. I'll have to test it though; I'm not entirely sure what kind of code it will produce (think f.ex. about logical operators), and wonder if a lot more AST optimizations are necessary, such as inlining (copy'ing) pieces of code, wrapping pieces of code in helper functions, etc. Either way it's definitely worth implementing. Do you happen to know if there is anything on that as well?Welton
I like the paper. I wonder what the transformation would make of Duff's Device (en.wikipedia.org/wiki/Duff%27s_device).Rattlebrained
@PeterSchneider That's easy: it doesn't change anything there, since it only works on 'goto' statements. :-)Welton
@Stefan No, I'm not aware of any implementation (except of course the authors' one). Let me know if you start something.Casta
@PatriceGahide I'm going to build it on short notice, somewhere in the next few days probably, along with a few more flow analysis things that I have in mind; I'm quite interested to see what the output is.Welton
@PatriceGahide It's not always very pretty, but the mechanism seems to work. That said, the idea is more generally applicable and I think it's possible to add more rules along this line of reasoning, so I'm going to build on this. Thanks again.Welton
B
3

A situation where a goto can be avoided, but I think it is better to use it:

When I need to exit a inner loop and the loop:

for(int i = 0; i < list.Count; i++)
{
    // some code that initializes inner
    for(int j = 0; j < inner.Count; j++)
    {
       // Some code.
       if (condition) goto Finished;
    }
}
Finished:
// Some more code.

To avoid the goto you should do something like this:

for(int i = 0; i < list.Count; i++)
{
    // some code that initializes inner
    bool conditon = false;
    for(int j = 0; j < inner.Count; j++)
    {
       // Some code that might change condition
       if (condition) break;
    }
    if (condition) break;
}
// Some more code.

I think it looks much nicer with the goto statement.

The second situation is okay if the inner loop was in a different method.

void OuterLoop(list)
{
    for(int i = 0; i < list.Count; i++)
    {
        // some code that initializes inner
        if (InnerLoop(inner)) break;
    }
}
bool InnerLoop(inner)
{
    for(int j = 0; j < inner.Count; j++)
    {
       // Some code that might change condition
       if (condition) return true;
    }
    return false;
}
Boson answered 17/9, 2014 at 9:35 Comment(1)
I like your post but the current version of InnerLoop() always returns false which kindof defeats its purpose. You may want to always return condition (and declare amd initialize it to false in InnerLoop()).Rattlebrained
H
2

I have some practical experience of attempting to take an unstructured program (in COBOL, no less) and render it as structured by removing every instance of GOTO. The original programmer was a reformed Assembly programmer, and though he may have known about the PERFORM statement, he didn't use it. It was GOTO GOTO GOTO. And it was serious spaghetti code -- several hundred lines worth of spaghetti code. I spent a couple of weeks worth of spare time trying to rewrite this monstrous construct, and eventually I had to give up. It was a huge steaming pile of insanity. It worked, though! Its job was to parse user instructions sent in to the mainframe in a textual format, and it did it well.

So, NO, it is not always to possible to completely eliminate GOTO -- if you are using manual methods. This is an edge case, however -- existing code that was written by a man with an apparently twisted programming mind. In modern times, there are tools available which can solve formerly intractable structural problems.

Since that day I have coded in Modula-2, C, Revelation Basic, three flavors of VB, and C# and have never found a situation that required or even suggested GOTO as a solution. For the original BASIC, however, GOTO was unavoidable.

Harping answered 16/9, 2014 at 20:42 Comment(7)
Thanks for the constructive reaction. The same holds for me; I've written COMX BASIC in 1987 and that simply doesn't have any functions, calls, etc - the only alternative is excessive use of spaghetti GOTO's. :-) After that I've rarely used them (Open Addressing is one of the few exceptions). Still, that suggests that somewhere in my mind there's a 'goto eliminination algorithm' - what I'm trying to do here is to formalize that 'algorithm'.Welton
@Stefan I assume that if one was to re-code something from back then in Basic one would still do it differently. Like my C programming style has changed after learning C++, tending towards poor man's OO: One would do poor man's control structures.Rattlebrained
@PeterSchneider I just don't think it's only a matter of experience - if I can rewrite goto statements as OO patterns I'm happy as well. My personal experience is that you learn 'code patterns' that solve specific problems, expressed as 'best practice' in a certain language. If you translate a piece of code, you attempt to deduce the problem that's solved and 'insert' the solution template. In the old days, most of these patterns involved tons of 'goto' statements; the question is: how can we formalize this and does that solution apply to every goto statement that's out there?Welton
So you think that the transformation strategies presented in the paper in @Patrice's post wouldn't work with your old programs? (I think they would.) Perhaps it's difficult to do manually but it should work with a programmed tool.Rattlebrained
@PeterSchneider, I have no reason to doubt they would work. I post this answer to your comment although the comment stream seems to be a conversation between you and Stefan, so I'm sure you didn't address this to me!Harping
No, I was talking to you, comenting on your "So, NO, it is not always to possible to completely eliminate GOTO.". I understand from your comment now that it would be "impossible" (with reasonable effort) to eliminate gotos manually from your old programs. You know that because you tried, as you write in your post. But with machine support similar to the one described in the paper mentioned by Patrice it should be possible. Perhaps you can make your "NO" statement I cited less absolute in order to avoid misunderstandings.Rattlebrained
@PeterSchneider you make a good point, so I have modified my answer correspondingly. Thanks for your input!Harping
R
1

Since you said it's a theoretical question, here is the theoretical answer.

Java is Turing complete so of course, yes. You can express any C# program in Java. You can also express it in Minecraft Redstone or Minesweeper. Compared to these alternatives expressing it in Java should be easy.

An obviously more practical answer though, namely an intelligible algorithm to do the transformation, has been given by Patrice.

Rattlebrained answered 17/9, 2014 at 8:46 Comment(4)
Strictly speaking, the fact that Java is Turing complete doesn't mean you can express any C# program in Java. Elephant is an animal, lion is an animal, but elephant can't wave with his mane. Waving with ears is similar in practice, but it's not the same thing.Stain
@Stain I'm actually more a computer engineer than scientist but: "Given any computer algorithm, a Turing machine capable of simulating that algorithm's logic can be constructed.", and "a programming language is said to be Turing complete or computationally universal if it can be used to simulate any Turing machine". In other words, any algorithm can be expressed in any Turing-complete language.Rattlebrained
@Stain So if you want a metaphor it's rather something like "if two sets A and B both have a bijective projection into the natural numbers then they have the same "amount of elements", or more precisely, the same cardinality, namely $\aleph_0$. (Oh, MathJax doesn't work here, does it...) It's not a common subset, it's a common equivalence.Rattlebrained
Any algorithm - yes. You should be able to port any pure function to any Turing-complete language, and it will give you the same result. With exception of side-effects (time & place of execution, resources used, external calls, etc.). The problem is some programs (if not most of them) have side-effects and even sometimes rely on them. So, no, you can't express any C# program in Java.Stain
G
1

I don't like my solution a single bit. For one, it's dead ugly and for two, it basically wraps the goto into a switch that does exactly the same as a goto would do.

That's what you're always going to get when looking for a general pattern which can replace any use of goto, something which does exactly the same as a goto would do. What else could it be? Different uses of goto should be replaced with best matching language construct. That's why constructs as switch statements and for loops exist in the first place, to make it easier and less error prone to create such a program flow. The compiler will still generate goto's (or jumps), but it will do so consistently where we will screw up. On top of that we don't have to read what the compiler generates, but we get to read (and write) something that's easier to understand.

You'll find that most compiler constructs exist to generalize a specific use of goto, those constructs where create based on the common patterns of goto usage which existed before it. The paper Patrice Gahide mentions sort of shows that process in reverse. If you are finding goto's in your code you are either looking at one of those patterns, in which case you should replace it with the matching language construct. Or you are looking at essentially unstructured code in which case you should actually structure the code (or leave it alone). Changing it to something unstructured but without goto can only make matters worse. (The same generalization process is still going on by the way, consider how foreach is being added to compilers to generalize a very common usage of for...)

Guileless answered 17/9, 2014 at 10:13 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.