How to find out if Prolog performs Tail Call Optimization
Asked Answered
Z

1

6

Using the development version of SWI Prolog (Win x64), I wrote a DCG predicate for a deterministic lexer (hosted on github) (thus all external predicates leave no choice points):

read_token(parser(Grammar, Tables),
       lexer(dfa-DFAIndex, last_accept-LastAccept, chars-Chars0),
       Token) -->
(   [Input],
    {
     dfa:current(Tables, DFAIndex, DFA),
     char_and_code(Input, Char, Code),
     dfa:find_edge(Tables, DFA, Code, TargetIndex)
    }
->  { table:item(dfa_table, Tables, TargetIndex, TargetDFA),
      dfa:accept(TargetDFA, Accept),
      atom_concat(Chars0, Char, Chars),
      NewState = lexer(dfa-TargetIndex,
                       last_accept-Accept,
                       chars-Chars)
    },
    read_token(parser(Grammar, Tables), NewState, Token)
;   {
     (   LastAccept \= none
     ->  Token = LastAccept-Chars0
     ;   (   ground(Input)
         ->  once(symbol:by_type_name(Tables, error, Index, _)),
             try_restore_input(Input, FailedInput, InputR),
             Input = [FailedInput | InputR],
             format(atom(Error), '~w', [FailedInput]),
             Token = Index-Error
         ;   once(symbol:by_type_name(Tables, eof, Index, _)),
             Token = Index-''
         )
     )
    }
).

Now using (;) and -> a lot, I was wondering SWI-Prolog can optimize the recursive read_token(parser(Grammar, Tables), NewState, Token) using Tail-Call-Optimization, or if I have to split up the predicate into several clauses manually. I just don't know how to find out what the interpreter does, especially knowing that TCO is disabled when running the debugger.

Zephyr answered 15/3, 2013 at 13:16 Comment(6)
Does the stack overflow when running the interpreter without debugging?Lanyard
Looking at your code snippet, it seems the necessary condition for TCO is met (there is no choice point open when read_token/3 invokes itself). I realize you are asking whether SWI-Prolog really takes advantage of TCO in this specific case.Lanyard
No, as I'm on x64 bit, the stack size is pretty high. I'm just asking for advice on the programming technique. I have no files big enough for the lexer to cause an Overflow I suppose (as I can debug the method just fine, where no TCO is applied)Zephyr
I could answer with command line options that reduce the stack size. That would be my rough-and-ready approach to testing for TCO.Lanyard
What you are interested in is the local stack size threshold at which your code fails. See discussion at bottom of page How do I enlarge the stacks?.Lanyard
Well. I'm more interested if my way of programming does actively prevent TCO. If so I would change all my predicates, if not, I will Keep it this way.Zephyr
C
6

To answer your question, I first looked for "trivial" goals that might prevent last call optimization. If found some:

     ;   (   ground(Input)
         ->  once(symbol:by_type_name(Tables, error, Index, _)),
             try_restore_input(Input, FailedInput, InputR),
             Input = [FailedInput | InputR],
             format(atom(Error), '~w', [FailedInput]),
             Token = Index-Error
         ;   once(symbol:by_type_name(Tables, eof, Index, _)),
             Token = Index-''
         )

In these two cases, LCO is prevented by those goals alone.

Now, I compliled your rule and looked at the expansion with listing:

?- listing(read_token).
read_token(parser(O, B), lexer(dfa-C, last_accept-T, chars-J), Q, A, S) :-
        (   A=[D|G],
            dfa:current(B, C, E),
            char_and_code(D, K, F),
            dfa:find_edge(B, E, F, H),
            N=G
        ->  table:item(dfa_table, B, H, I),
            dfa:accept(I, L),
            atom_concat(J, K, M),
            P=lexer(dfa-H, last_accept-L, chars-M),
            R=N,
            read_token(parser(O, B),
                       P,
                       Q,
                       R,
                       S)        % 1: looks nice!
        ;   (   T\=none
            ->  Q=T-J
            ;   ground(D)
            ->  once(symbol:by_type_name(B, error, W, _)),
                try_restore_input(D, U, V),
                D=[U|V],
                format(atom(X), '~w', [U]),
                Q=W-X    % 2: prevents LCO
            ;   once(symbol:by_type_name(B, eof, W, _)),
                Q=W-''   % 3: prevents LCO
            ),
            S=A    % 4: prevents LCO
        ).

ad 1) This is the recursive case you most probably are looking for. Here, everything seems nice.

ad 2,3) Already discussed above, maybe you want to exchange goals

ad 4) This is an effect of the precise, steadfast way how {}//1 is handled in DCGs. As a rule of thumb: Implementers rather prefer to be steadfast than to strive for LCO-ness. Please refer to: DCG Expansion: Is Steadfastness ignored?

Please note also that there is a lot more to this than the simple reuse of the call frame. There is a lot of interaction with garbage collection. To overcome all those problems in SWI, an additional GC phase was necessary.

For more, refer to the tiny benchmarks in Precise Garbage Collection in Prolog

So to finally answer your question: Your rule might become optimized; provided there is no choicepoint left prior to the recursive goal.


There is also the real low level approach to this. I never use this for code development: vm_list. The listing shows you ultimately whether or not SWI might consider LCO (provided no choicepoint is there).

i_call and i_callm will never perform LCO. Only i_depart will do. At: 142 i_depart(read_token/5)

?- vm_list(read_token).
========================================================================
read_token/5
========================================================================
   0 s_virgin
   1 i_exit
----------------------------------------
clause 1 ((0x1cc4710)):
----------------------------------------
   0 h_functor(parser/2)
   2 h_firstvar(5)
   4 h_firstvar(6)
   6 h_pop
   7 h_functor(lexer/3)
   9 h_functor((-)/2)
  11 h_const(dfa)
  13 h_firstvar(7)
  15 h_pop
  16 h_functor((-)/2)
  18 h_const(last_accept)
  20 h_firstvar(8)
  22 h_pop
  23 h_rfunctor((-)/2)
  25 h_const(chars)
  27 h_firstvar(9)
  29 h_pop
  30 i_enter
  31 c_ifthenelse(26,118)
  34 b_unify_var(3)
  36 h_list_ff(10,11)
  39 b_unify_exit
  40 b_var(6)
  42 b_var(7)
  44 b_firstvar(12)
  46 i_callm(dfa,dfa:current/3)
  49 b_var(10)
  51 b_firstvar(13)
  53 b_firstvar(14)
  55 i_call(char_and_code/3)
  57 b_var(6)
  59 b_var(12)
  61 b_var(14)
  63 b_firstvar(15)
  65 i_callm(dfa,dfa:find_edge/4)
  68 b_unify_fv(16,11)
  71 c_cut(26)
  73 b_const(dfa_table)
  75 b_var(6)
  77 b_var(15)
  79 b_firstvar(17)
  81 i_callm(table,table:item/4)
  84 b_var(17)
  86 b_firstvar(18)
  88 i_callm(dfa,dfa:accept/2)
  91 b_var(9)
  93 b_var(13)
  95 b_firstvar(19)
  97 i_call(atom_concat/3)
  99 b_unify_firstvar(20)
 101 b_functor(lexer/3)
 103 b_functor((-)/2)
 105 b_const(dfa)
 107 b_argvar(15)
 109 b_pop
 110 b_functor((-)/2)
 112 b_const(last_accept)
 114 b_argvar(18)
 116 b_pop
 117 b_rfunctor((-)/2)
 119 b_const(chars)
 121 b_argvar(19)
 123 b_pop
 124 b_unify_exit
 125 b_unify_fv(21,16)
 128 b_functor(parser/2)
 130 b_argvar(5)
 132 b_argvar(6)
 134 b_pop
 135 b_var(20)
 137 b_var2
 138 b_var(21)
 140 b_var(4)
 142 i_depart(read_token/5)
 144 c_var_n(22,2)
 147 c_var_n(24,2)
 150 c_jmp(152)
 152 c_ifthenelse(27,28)
 155 b_var(8)
 157 b_const(none)
 159 i_call((\=)/2)
 161 c_cut(27)
 163 b_unify_var(2)
 165 h_functor((-)/2)
 167 h_var(8)
 169 h_var(9)
 171 h_pop
 172 b_unify_exit
 173 c_var(10)
 175 c_var_n(22,2)
 178 c_var_n(24,2)
 181 c_jmp(101)
 183 c_ifthenelse(28,65)
 186 b_firstvar(10)
 188 i_call(ground/1)
 190 c_cut(28)
 192 b_functor((:)/2)
 194 b_const(symbol)
 196 b_rfunctor(by_type_name/4)
 198 b_argvar(6)
 200 b_const(error)
 202 b_argfirstvar(22)
 204 b_void
 205 b_pop
 206 i_call(once/1)
 208 b_var(10)
 210 b_firstvar(23)
 212 b_firstvar(24)
 214 i_call(try_restore_input/3)
 216 b_unify_var(10)
 218 h_list
 219 h_var(23)
 221 h_var(24)
 223 h_pop
 224 b_unify_exit
 225 b_functor(atom/1)
 227 b_argfirstvar(25)
 229 b_pop
 230 b_const('~w')
 232 b_list
 233 b_argvar(23)
 235 b_nil
 236 b_pop
 237 i_call(format/3)
 239 b_unify_var(2)
 241 h_functor((-)/2)
 243 h_var(22)
 245 h_var(25)
 247 h_pop
 248 b_unify_exit
 249 c_jmp(33)
 251 b_functor((:)/2)
 253 b_const(symbol)
 255 b_rfunctor(by_type_name/4)
 257 b_argvar(6)
 259 b_const(eof)
 261 b_argfirstvar(22)
 263 b_void
 264 b_pop
 265 i_call(once/1)
 267 b_unify_var(2)
 269 h_functor((-)/2)
 271 h_var(22)
 273 h_const('')
 275 h_pop
 276 b_unify_exit
 277 c_var(10)
 279 c_var_n(23,2)
 282 c_var(25)
 284 b_unify_vv(4,3)
 287 c_var_n(11,2)
 290 c_var_n(13,2)
 293 c_var_n(15,2)
 296 c_var_n(17,2)
 299 c_var_n(19,2)
 302 c_var(21)
 304 i_exit
Cachalot answered 15/3, 2013 at 16:7 Comment(7)
The choicepoint is left no matter if Token (S in your compiled Output) is always unbound, right?Zephyr
The code itself does not produce a superfluous choice point. But maybe table:item or dfa:accept do leave an open choice point. It all depends. You can easily observe this by calling statistics immediately at the end: How much is left on the local stack?Cachalot
oh! I forgot to mention, all the used predicates no not leave any choice pointsZephyr
I like your tip about listing by the way, I always look at the DCG code as the source code, I sometimes Forget about the term_Expansion syntax sugar.Zephyr
You make some good points about using listing and statistics, but I don't understand the conclusion as to choicepoints %2,%3,%4. These are not "open" when read_token calls itself. The call to read_token is on one branch of the if-then-else construct. Prolog commits to the one branch and does not backtrack to the other, so AFAIK last-call optimization will work. Perhaps we need to see the "local stack" statistics to be sure.Lanyard
At %2, %3, %4 it's not about choicepoints. But these are constructs that prevent that the last call to a predicate is an i_depart which is the only place where LCO can happen - statically. Choicepoints are a dynamic issue.Cachalot
@hardmath: The recursive read_token call is an i_depart which means that this might be optimized.Cachalot

© 2022 - 2024 — McMap. All rights reserved.