Get traceback information including SyntaxError from compile()

Basic problem

It appears that SyntaxErrors (and TypeErrors) raised by the compile() function are not included in the stack trace returned by sys.exc_info(), but are printed as part of the formatted output using traceback.print_exc.

Example

For example, given the following code (where filename is the name of a file that contains Python code with the line $ flagrant syntax error):

import sys
from traceback import extract_tb
try:
    with open(filename) as f:
        code = compile(f.read(), filename, "exec")
except:
    print "using sys.exc_info:"
    tb_list = extract_tb(sys.exc_info()[2])
    for f in tb_list:
        print f
    print "using traceback.print_exc:"
    from traceback import print_exc
    print_exc()

I get the following output (where <scriptname> is the name of the script containing the code above):

using sys.exc_info:
('<scriptname>', 6, 'exec_file', 'code = compile(f.read(), filename, "exec")')
using traceback.print_exc:
Traceback (most recent call last):
  File "<scriptname>", line 6, in exec_file
    code = compile(f.read(), filename, "exec")
  File "<filename>", line 3
    $ flagrant syntax error
    ^
SyntaxError: invalid syntax

My questions

I have three questions:

  • Why doesn't the traceback from sys.exc_info() include the frame where the SyntaxError was generated?
  • How does traceback.print_exc get the missing frame info?
  • What's the best way to save the "extracted" traceback info, including the SyntaxError, in a list? Does the last list element (i.e. the info from the stack frame representing where the SyntaxError occurred) need to be manually constructed using filename and the SyntaxError exception object itself?

Example use-case

For context, here's my use-case for trying to get a complete stack-trace extraction.

I have a program that essentially implements a DSL by execing some files containing user-written Python code. (Regardless of whether or not this is a good DSL-implementation strategy, I'm more or less stuck with it.) Upon encountering an error in the user code, I would (in some cases) like the interpreter to save the error for later rather than vomiting up a stack trace and dying. So I have a ScriptExcInfo class specifically made to store this information. Here's (a slightly edited version of) that class's __init__ method, complete with a rather ugly workaround for the problem described above:

def __init__(self, scriptpath, add_frame):
    self.exc, tb = sys.exc_info()[1:]
    self.tb_list = traceback.extract_tb(tb)
    if add_frame:
        # Note: I'm pretty sure that the names of the linenumber and
        # message attributes are undocumented, and I don't think there's
        # actually a good way to access them.
        if isinstance(self.exc, TypeError):
            lineno = -1
            text = self.exc.message
        else:
            lineno = self.exc.lineno
            text = self.exc.text
        # '?' is used as the function name since there's no function context
        # in which the SyntaxError or TypeError can occur.
        self.tb_list.append((scriptpath, lineno, '?', text))
    else:
        # Pop off the frames related to this `exec` infrastructure.
        # Note that there's no straightforward way to remove the unwanted
        # frames for the above case where the desired frame was explicitly
        # constructed and appended to tb_list, and in fact the resulting
        # list is out of order (!!!!).
        while scriptpath != self.tb_list[-1][0]:
            self.tb_list.pop()

Note that by "rather ugly," I mean that this workaround turns what should be a single-argument, 4-line __init__ function into one that requires two arguments and takes up 13 lines.

Pekoe answered 8/12, 2014 at 18:52 Comment(4)
....this can't be THAT hard, can it? Come on, where's Martjin??Pekoe
“Does the last list element (i.e. the info from the stack frame representing where the SyntaxError occurred) need to be manually constructed using filename and the SyntaxError exception object itself?” From the docstring on print_exception: “This differs from print_tb() in the following ways: […] (3) if type is SyntaxError and value has the appropriate format, it prints the line where the syntax error occurred with a caret on the next line indicating the approximate position of the error.”Renege
In other words, yes, it is synthesized in print_exc and not really in the traceback. You will have to inspect the exception object and reconstruct it yourself, or reuse some code from traceback for that purpose.Renege
Huh. I saw that, but I didn't realize it meant that the entire frame is missing from the traceback given by exc_info.Pekoe

The only difference between your two approaches is that print_exc() prints a formatted exception. For a SyntaxError, that means formatting the information stored on the exception, which includes the actual line that caused the problem.

For the traceback itself, print_exc() uses sys.exc_info()[2], the same information you are using already to produce the traceback. In other words, it doesn't get any more information than you already do, but you are ignoring the exception information itself:

>>> import traceback
>>> try:
...     compile('Hmmm, nope!', '<stdin>', 'exec')
... except SyntaxError as e:
...     print ''.join(traceback.format_exception_only(type(e), e))
... 
  File "<stdin>", line 1
    Hmmm, nope!
              ^
SyntaxError: invalid syntax

Here traceback.format_exception_only() is the function that traceback.print_exc() uses to format the exception value; it is part of the documented traceback API. All the information is there for you to extract on the exception itself:

>>> dir(e)
['__class__', '__delattr__', '__dict__', '__doc__', '__format__', '__getattribute__', '__getitem__', '__getslice__', '__hash__', '__init__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__subclasshook__', '__unicode__', 'args', 'filename', 'lineno', 'message', 'msg', 'offset', 'print_file_and_line', 'text']
>>> e.args
('invalid syntax', ('<stdin>', 1, 11, 'Hmmm, nope!\n'))
>>> e.filename, e.lineno, e.offset, e.text
('<stdin>', 1, 11, 'Hmmm, nope!\n')

Also see the documentation of traceback.print_exception():

(3) if type is SyntaxError and value has the appropriate format, it prints the line where the syntax error occurred with a caret indicating the approximate position of the error.

and the SyntaxError documentation:

Instances of this class have attributes filename, lineno, offset and text for easier access to the details. str() of the exception instance returns only the message.
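
A minimal sketch of building the missing entry from those documented attributes, written for Python 3 (the code in this thread is Python 2). The helper names syntax_error_entry and compile_and_collect are hypothetical, and the '?' placeholder for the function name follows the tuple layout used in the question:

```python
import sys
import traceback


def syntax_error_entry(exc):
    """Build an extract_tb-style (filename, lineno, name, text) tuple.

    Hypothetical helper: '?' stands in for the function name because no
    frame exists for code that failed to compile.
    """
    text = exc.text.strip() if exc.text else None
    return (exc.filename, exc.lineno, '?', text)


def compile_and_collect(source, filename):
    """Return (code, entries); entries is empty when compilation succeeds."""
    try:
        return compile(source, filename, "exec"), []
    except SyntaxError as exc:
        tb = sys.exc_info()[2]
        # Real frames first (outermost to innermost), then the synthesized
        # entry for the offending source line.
        entries = [tuple(frame) for frame in traceback.extract_tb(tb)]
        entries.append(syntax_error_entry(exc))
        return None, entries
```

Appending the synthesized entry last keeps the list in the same outermost-to-innermost order that extract_tb() uses for real frames.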

That the line with the syntax error is not included in the traceback is only logical; code with a syntax error cannot be executed, so no execution frame was ever created for it. The exception is raised by the compile() call itself, so the traceback ends at the frame that made that call.
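
The difference is easy to observe. The following sketch, written for Python 3 with a made-up filename user_script.py and a hypothetical helper last_frame_filename, compares the innermost traceback entry for a compile-time error with that for a run-time error:

```python
import sys
import traceback


def last_frame_filename(source, filename):
    """Compile and run source; return the filename of the innermost frame."""
    try:
        code = compile(source, filename, "exec")
        exec(code, {})
    except Exception:
        frames = traceback.extract_tb(sys.exc_info()[2])
        return frames[-1][0]
    return None


# Compile-time failure: the innermost frame is the one that called compile(),
# so its filename is this module's, not the user script's.
syntax_case = last_frame_filename("$ flagrant syntax error\n", "user_script.py")

# Run-time failure: the user code really ran, so it has a frame of its own.
runtime_case = last_frame_filename("1 / 0\n", "user_script.py")
```

Only the run-time case reports user_script.py as the innermost frame; in the compile-time case that filename is available solely through the exception's attributes.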

As such you are stuck with your 'ugly' approach; it is the correct approach for handling SyntaxError exceptions. However, the attributes are documented.
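
If the stored information is only needed for later display, one option is to keep the formatted lines instead of the extracted tuples: traceback.format_exception() returns the same text that print_exc() would write, including the synthesized SyntaxError lines. A sketch for Python 3, where run_script_source and the errors list are hypothetical names and not part of the question's ScriptExcInfo class:

```python
import sys
import traceback


def run_script_source(source, filename, errors):
    """Compile and exec source; on failure, record the error instead of raising.

    Hypothetical helper: errors is a plain list that collects
    (exception, formatted_lines) pairs for later reporting.
    """
    try:
        exec(compile(source, filename, "exec"), {})
    except Exception as exc:
        # format_exception returns a list of strings, the same text that
        # print_exc would have written to stderr.
        lines = traceback.format_exception(*sys.exc_info())
        errors.append((exc, lines))
        return False
    return True


errors = []
ok = run_script_source("$ flagrant syntax error\n", "user_script.py", errors)
report = "".join(errors[0][1])
```

This trades structure for simplicity: the report is ready to print, but individual frames can no longer be filtered or reordered, so it does not replace the tuple-based approach when the infrastructure frames must be stripped.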

Note that exception.message is normally set to exception.args[0], and str(exception) usually gives you the same message (if args is longer you get str(exception.args) instead, although some exception types provide a custom __str__ that often just gives you exception.args[0]).
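
These rules can be checked directly. The sketch below uses Python 3, where the message attribute no longer exists, so it only compares str() with args; KeyError stands in as one example of a type with a custom __str__:

```python
single = ValueError("bad value")
several = ValueError("bad value", 42)
custom = KeyError("missing")

# One argument: str() is that argument.
single_text = str(single)      # 'bad value'
# Several arguments: str() is the string form of the args tuple.
several_text = str(several)    # "('bad value', 42)"
# KeyError overrides __str__ and returns repr(args[0]) instead.
custom_text = str(custom)      # "'missing'"
```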

Jocelin answered 11/12, 2014 at 8:24 Comment(1)
"code with a syntax error cannot be executed, so no execution frame was ever created for it." ...Ah. Of course. Thank you.Pekoe
