table of functions vs switch in golang
Asked Answered
T

3

6

im am writing a simple emulator in go (should i? or should i go back to c?). anyway, i am fetching the instruction and decoding it. at this point i have a byte like 0x81, and i have to execute the right function.

should i have something like this

func (sys *cpu) eval() {
    switch opcode {
    case 0x80:
        sys.add(sys.b)
    case 0x81:
        sys.add(sys.c)
    etc
    }
}

or something like this

var fnTable = []func(*cpu) {
    0x80: func(sys *cpu) {
        sys.add(sys.b)
    },
    0x81: func(sys *cpu) {
        sys.add(sys.c)
    }
}
func (sys *cpu) eval() {
    return fnTable[opcode](sys)
}

1.which one is better?
2.which one is faster?
also
3.can i declare a function inline?
4.i have a cpu struct in which i have the registers etc. would it be faster if i have the registers and all as globals? (without the struct)

thank you very much.

Tameshatamez answered 29/3, 2012 at 15:10 Comment(3)
I'd be surprised if the compiler didn't create a jump table for something like this, particularly if opcode is a byte. I'll be interested to see the results of your benchmarks.Irizarry
Just a note for your next questions: Your question 4 is not really on topic of your general question. IMO, although a question may include several questions on aspects of the general question/topic, such off-topic things should be asked in a separate question.Commutate
Just check the assembly output when in doubt.Brathwaite
P
3
  1. The first version looks better to me, YMMV.

  2. Benchmark it. Depends how good is the compiler at optimizing. The "jump table" version might be faster if the compiler doesn't try hard enough to optimize.

  3. Depends on your definition of what is "to declare function inline". Go can declare and define functions/methods at the top level only. But functions are first class citizens in Go, so one can have variables/parameters/return values and structured types of function type. In all this places a function literal can [also] be assigned to the variable/field/element...

  4. Possibly. Still I would suggest to not keep the cpu state in a global variable. Once you possibly decide to go emulating multicore, it will be welcome ;-)

Pilsen answered 29/3, 2012 at 15:35 Comment(2)
thanks for the answer. should i make a test input file with 10000 operations of 80 and 81 lets say, and time them? or should i have many more opcodes in there? i can do that kind of benchmarking. about the inline thing, i was thinking of inline functions of C, so that the code of the function can be integrated to the places it is used. can i do that? so i should keep the state like i do now, in a struct. sacrifice a bit of speed for some convenience. lastly, should i keep doing it in go, or should i do it in C? i am learning go (as a new systems lang), and that is why i started it in go.Tameshatamez
- The compiler may probably generate different code for switch over [non]contiguous value ranges, ranges covering all values (256 for a byte expression) etc. Best is to experiment when one doesn't know compiler the internals (me not). - Inlining is automatic in Go for functions which fit compiler's criteria (like being small/short or consisting solely of an return expr statement, ...). - The emulator will be probably a bit faster in C (Go does e.g. array bounds checks). But I guess you'll get it working sooner/easier in Go. I would choose Go, at least you're gonna learn something new ;-)Pilsen
N
18

I did some benchmarks and the table version is faster than the switch version once you have more than about 4 cases.

I was surprised to discover that the Go compiler (gc, anyway; not sure about gccgo) doesn't seem to be smart enough to turn a dense switch into a jump table.

Update: Ken Thompson posted on the Go mailing list describing the difficulties of optimizing switch.

Nocturn answered 30/3, 2012 at 2:19 Comment(5)
I've just run some benchmarks of my own in this area and found switch-case to be an order of magnitude faster. My instruction set size is around 30 at the moment. Perhaps gc has been improved recently?Resigned
It's strange to me that a switch would be faster than an array lookup, particularly for large entries. I don't understand that. :/Callimachus
For sufficiently large switches, it's not faster. Smaller switches can be faster because they're better able to take advantage of the CPU's branch predictor and code cache.Nocturn
Another thing is that w/ jump table, you have function calls which can be costly.Nip
@Nip Function JMP cost is miniscule compared the 15-20 cycles lost to potential branch misprediction. Go with the function lookup table in every case where performance matters, especially in critical sections (inner loops).Mimimimic
P
3
  1. The first version looks better to me, YMMV.

  2. Benchmark it. Depends how good is the compiler at optimizing. The "jump table" version might be faster if the compiler doesn't try hard enough to optimize.

  3. Depends on your definition of what is "to declare function inline". Go can declare and define functions/methods at the top level only. But functions are first class citizens in Go, so one can have variables/parameters/return values and structured types of function type. In all this places a function literal can [also] be assigned to the variable/field/element...

  4. Possibly. Still I would suggest to not keep the cpu state in a global variable. Once you possibly decide to go emulating multicore, it will be welcome ;-)

Pilsen answered 29/3, 2012 at 15:35 Comment(2)
thanks for the answer. should i make a test input file with 10000 operations of 80 and 81 lets say, and time them? or should i have many more opcodes in there? i can do that kind of benchmarking. about the inline thing, i was thinking of inline functions of C, so that the code of the function can be integrated to the places it is used. can i do that? so i should keep the state like i do now, in a struct. sacrifice a bit of speed for some convenience. lastly, should i keep doing it in go, or should i do it in C? i am learning go (as a new systems lang), and that is why i started it in go.Tameshatamez
- The compiler may probably generate different code for switch over [non]contiguous value ranges, ranges covering all values (256 for a byte expression) etc. Best is to experiment when one doesn't know compiler the internals (me not). - Inlining is automatic in Go for functions which fit compiler's criteria (like being small/short or consisting solely of an return expr statement, ...). - The emulator will be probably a bit faster in C (Go does e.g. array bounds checks). But I guess you'll get it working sooner/easier in Go. I would choose Go, at least you're gonna learn something new ;-)Pilsen
T
0

If you have the ast of some expression, and you want to eval it for a big amount of data rows, then you may only once compile it into the tree of lambdas, and do not calculate any switches on each iteration at all;

For example, given such ast: {* (a, {+ (b, c)})}

Compile function (in very rough pseudo language) will be something like this:

func (e *evaluator) compile(brunch ast) {
    switch brunch.type {
    case binaryOperator:
        switch brunch.op {
        case *: return func() {compile(brunch.arg0) * compile(brunch.arg1)}
        case +: return func() {compile(brunch.arg0) + compile(brunch.arg1)}
        }
    case BasicLit: return func() {return brunch.arg0}
    case Ident: return func(){return e.GetIdent(brunch.arg0)} 
    }
}

So eventually compile returns the func, that must be called on each row of your data and there will be no switches or other calculation stuff at all. There remains the question about operations with data of different types, that is for your own research ;) This is an interesting approach, in situations, when there is no jump-table mechanism available :) but sure, func call is more complex operation then jump.

Typecase answered 13/5, 2016 at 12:17 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.