To use goto or not?
Asked Answered
C

9

21

This question may sound cliched, but I am in a situation here.

I am trying to implement a finite state automaton to parse a certain string in C. As I started writing the code, I realised the code may be more readable if I used labels to mark the different states and use goto to jump from one state to another as the case comes.

Using the standard breaks and flag variables is quite cumbersome in this case and hard to keep track of the state.

What approach is better? More than anything else I am worried it may leave a bad impression on my boss, as I am on an internship.

Chloro answered 23/6, 2010 at 21:37 Comment(13)
A must link: xkcd.com/292Andreasandree
I am curious. Can you post a before goto/after goto snapshot of the code?Skysail
We don't know your boss and what prejudices he has, so if you want to look clever, use function pointers (see my answer to other question here: https://mcmap.net/q/129430/-state-machines-tutorials-closed/…)Needlecraft
Finite state machines are the penultimate antithesis of what computer science has been trying to achieve since its inception. There must be a dozen better ways to parse a string. Honestly, try to find one.Innards
@Amardeep: Bah, party pooper.Kinna
@Amardeep- You do realize that "penultimate" means "second from the last", right?Apish
@bta: Yes, I do. The ultimate one is the if-then-elseif-elseif-elseif-elseif-elseif... construct.Innards
@Needlecraft thanks for the link. That was a very helpful code, and something of a new C programming technique for me. :)Chloro
@Moron I hadn't written the code at the time of posting this question. Just as I started thinking about how to write the code, it occurred to me that may be using goto was more intuitive as we naturally think that way when making state machines on paper.Chloro
@Abhinav: I see. Think of it this way: say you designed a FSM and had a digram with maybe 10 states. Now If I forced you to line the states up close together along a straight line and then draw the FSM with arrows going back and forth, do you think it would be readable? Essentially that is what you will be doing with the code if you use gotos and jump around in one method!Skysail
@Kimmo Puputti- Don't you mean "goto xkcd.com/292 " ?Apish
@Apish No, I was afraid to use goto. :)Andreasandree
Possible duplicate of Use GOTO or not?.Lifeless
A
20

Using a goto for implementing a state machine often makes good sense. If you're really concerned about using a goto, a reasonable alternative is often to have a state variable that you modify, and a switch statement based on that:

typedef enum {s0,s1,s2,s3,s4,...,sn,sexit} state;

state nextstate;
int done = 0;

nextstate = s0;  /* set up to start with the first state */
while(!done)
   switch(nextstate)
      {
         case s0:
            nextstate = do_state_0();
            break;
         case s1:
            nextstate = do_state_1();
            break;
         case s2:
            nextstate = do_state_2();
            break;
         case s3:
             .
             .
             .
             .
         case sn:
            nextstate = do_state_n();
            break;
         case sexit:
            done = TRUE;
            break;
         default:
            /*  some sort of unknown state */
            break;
      }
Aerophobia answered 23/6, 2010 at 21:42 Comment(0)
A
38

There is nothing inherently wrong with goto. The reason they are often considered "taboo" is because of the way that some programmers (often coming from the assembly world) use them to create "spaghetti" code that is nearly impossible to understand. If you can use goto statements while keeping your code clean, readable, and bug-free, then more power to you.

Using goto statements and a section of code for each state is definitely one way of writing a state machine. The other method is to create a variable that will hold the current state and to use a switch statement (or similar) to select which code block to execute based on the value of the state variable. See Aidan Cully's answer for a good template using this second method.

In reality, the two methods are very similar. If you write a state machine using the state variable method and compile it, the generated assembly may very well resemble code written using the goto method (depending on your compiler's level of optimization). The goto method can be seen as optimizing out the extra variable and loop from the state variable method. Which method you use is a matter of personal choice, and as long as you are producing working, readable code I would hope that your boss wouldn't think any different of you for using one method over the other.

If you are adding this code to an existing code base which already contains state machines, I would recommend that you follow whichever convention is already in use.

Apish answered 23/6, 2010 at 22:3 Comment(2)
there is wisdom in following whichever convention is already in useHaemolysis
Thanks. That's a pretty useful advice . This state machine problem just came inside my own project, and the code provided by others is really clean, so I will take that approach.Chloro
A
20

Using a goto for implementing a state machine often makes good sense. If you're really concerned about using a goto, a reasonable alternative is often to have a state variable that you modify, and a switch statement based on that:

typedef enum {s0,s1,s2,s3,s4,...,sn,sexit} state;

state nextstate;
int done = 0;

nextstate = s0;  /* set up to start with the first state */
while(!done)
   switch(nextstate)
      {
         case s0:
            nextstate = do_state_0();
            break;
         case s1:
            nextstate = do_state_1();
            break;
         case s2:
            nextstate = do_state_2();
            break;
         case s3:
             .
             .
             .
             .
         case sn:
            nextstate = do_state_n();
            break;
         case sexit:
            done = TRUE;
            break;
         default:
            /*  some sort of unknown state */
            break;
      }
Aerophobia answered 23/6, 2010 at 21:42 Comment(0)
K
15

I'd use a FSM generator, like Ragel, if I wanted to leave a good impression on my boss.

The main benefit of this approach is that you are able to describe your state machine at a higher level of abstraction and don't need to concern yourself of whether to use goto or a switch. Not to mention in the particular case of Ragel that you can automatically get pretty diagrams of your FSM, insert actions at any point, automatically minimize the amount of states and various other benefits. Did I mention that the generated FSMs are also very fast?

The drawbacks are that they're harder to debug (automatic visualization helps a lot here) and that you need to learn a new tool (which is probably not worth it if you have a simple machine and you are not likely to write machines frequently.)

Kinna answered 23/6, 2010 at 21:39 Comment(6)
The problems I've had in using non-trivial code generators are 1) the debugging, and 2) the need to learn a new tool... If the tool gives me something useful that would be difficult to achieve without it, or if I'll use the tool enough to make up for the time investment in learning it (note debuggability as an important factor here), great, I'll do it. I wouldn't bother for a trivial FSM, but might for a more complicated one.Kurt
I wouldn't advocate GOTO because the FSM generators do it. There is a difference between a code generator (which presumably has been tested) creating GOTOs and a programmer doing it. I am not sure which way to go here. But I wouldn't let the fact that the generators use GOTOs affect the decision either way (except of course if I was writing a generator to do it...).Haemolysis
@Aidan: I mostly agree. Although some generators have debugging facilites that alleviate the debugging problem, they do not alleviate the 'learn a new tool' problem. For simple machines they are probably overkill, unless you already know the tool, which might be an incentive to learn it anyway :-)Kinna
@Cervo: Correct. My intention wasn't to advocate the use of goto for user generated code, although rereading the paragraph it does seem that way.Kinna
@Aidan, @Cervo: Edited reflecting your comments.Kinna
@Vinko Thanks, that's really a very useful tool. I will definitely explore it for. Right now I only need to build a small state machine consisting of not more than 6 states, so I might better use a state variable and switch cases approach.Chloro
M
10

I would use a variable that tracks what state you are in and a switch to handle them:

fsm_ctx_t ctx = ...;
state_t state = INITIAL_STATE;

while (state != DONE)
{
    switch (state)
    {
    case INITIAL_STATE:
    case SOME_STATE:
        state = handle_some_state(ctx)
        break;

    case OTHER_STATE:
        state = handle_other_state(ctx);
        break;
    }
}
Mariquilla answered 23/6, 2010 at 21:41 Comment(0)
O
8

Goto isn't neccessary evil, and I have to strongly disagree with Denis, yes goto might be a bad idea in most cases, but there are uses. The biggest fear with goto is so called "spagetti-code", untraceable code paths. If you can avoid that and if it will always be clear how the code behaves and you don't jump out of the function with a goto, there is nothing against goto. Just use it with caution and if you are tempted to use it, really evaluate the situation and find a better solution. If you unable to do this, goto can be used.

Oilcup answered 23/6, 2010 at 21:43 Comment(0)
M
8

Avoid goto unless the complexity added (to avoid) is more confusing.

In practical engineering problems, there's room for goto used very sparingly. Academics and non-engineers wring their fingers needlessly over using goto. That said, if you paint yourself into an implementation corner where a lot of goto is the only way out, rethink the solution.

A correctly working solution is usually the primary objective. Making it correct and maintainable (by minimizing complexity) has many life cycle benefits. Make it work first, and then clean it up gradually, preferably by simplifying and removing ugliness.

Masakomasan answered 23/6, 2010 at 21:47 Comment(0)
K
4

I don't know your specific code, but is there a reason something like this:

typedef enum {
    STATE1, STATE2, STATE3
} myState_e;

void myFsm(void)
{
    myState_e State = STATE1;

    while(1)
    {
        switch(State)
        {
            case STATE1:
                State = STATE2;
                break;
            case STATE2:
                State = STATE3;
                break;
            case STATE3:
                State = STATE1;
                break;
        }
    }
}

wouldn't work for you? It doesn't use goto, and is relatively easy to follow.

Edit: All those State = fragments violate DRY, so I might instead do something like:

typedef int (*myStateFn_t)(int OldState);

int myStateFn_Reset(int OldState, void *ObjP);
int myStateFn_Start(int OldState, void *ObjP);
int myStateFn_Process(int OldState, void *ObjP);

myStateFn_t myStateFns[] = {
#define MY_STATE_RESET 0
   myStateFn_Reset,
#define MY_STATE_START 1
   myStateFn_Start,
#define MY_STATE_PROCESS 2
   myStateFn_Process
}

int myStateFn_Reset(int OldState, void *ObjP)
{
    return shouldStart(ObjP) ? MY_STATE_START : MY_STATE_RESET;
}

int myStateFn_Start(int OldState, void *ObjP)
{
    resetState(ObjP);
    return MY_STATE_PROCESS;
}

int myStateFn_Process(int OldState, void *ObjP)
{
    return (process(ObjP) == DONE) ? MY_STATE_RESET : MY_STATE_PROCESS;
}

int stateValid(int StateFnSize, int State)
{
    return (State >= 0 && State < StateFnSize);
}

int stateFnRunOne(myStateFn_t StateFns, int StateFnSize, int State, void *ObjP)
{
    return StateFns[OldState])(State, ObjP);
}

void stateFnRun(myStateFn_t StateFns, int StateFnSize, int CurState, void *ObjP)
{
    int NextState;

    while(stateValid(CurState))
    {
        NextState = stateFnRunOne(StateFns, StateFnSize, CurState, ObjP);
        if(! stateValid(NextState))
            LOG_THIS(CurState, NextState);
        CurState = NextState;
    }
}

which is, of course, much longer than the first attempt (funny thing about DRY). But it's also more robust - failure to return the state from one of the state functions will result in a compiler warning, rather than silently ignore a missing State = in the earlier code.

Kurt answered 23/6, 2010 at 21:42 Comment(2)
...isn't that just goto in effect?Copyedit
No, because the compiler handles the labels, the jumping, and the stack. It's like suggesting that a million lines of an expressive language like Lisp or Python could be assembler, in effect. Sure, they might do the same task after everything is said and done, but they're not remotely the same.Morgenthaler
K
1

I would recommend you the "Dragon book": Compilers, Principles-Techniques-Tools from Aho, Sethi and Ullman. (It is rather expensive to buy but you for sure will find it in a library). There you will find anything you will need to parse strings and build finite automatons. There is no place I could find with a goto. Usually the states are a data table and transitions are functions like accept_space()

Kristianson answered 23/6, 2010 at 21:43 Comment(0)
H
1

I can't see much of a difference between goto and switch. I might prefer switch/while because it gives you a place guaranteed to execute after the switch (where you could throw in logging and reason about your program). With GOTO you just keep jumping from label to label, so to throw in logging you'd have to put it at every label.

But aside from that there shouldn't be much difference. Either way, if you didn't break it up into functions and not every state uses/initializes all local variables you may end up with a mess of almost spaghetti code not knowing which states changed which variables and making it very difficult to debug/reason about.

As an aside, can you maybe parse the string using a regular expression? Most programming languages have libraries that allow using them. The regular expressions often create an FSM as part of their implementation. Generally regular expressions work for non arbitrarily nested items and for everything else there is a parser generator(ANTLR/YACC/LEX). It is generally much easier to maintain a grammar/regex than the underlying state machine. Also you said you were on an internship, and generally they might give you easier work than say a senior developer, so there is a strong chance that a regex may work on the string. Also regular expressions generally aren't emphasized in college so try using Google to read up on them.

Haemolysis answered 23/6, 2010 at 22:40 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.