How can I parse Python's triple-quote f-strings?

I have this code that parses and evaluates ordinary f-string-style template strings (see the usage example below):

from string import Formatter
import sys


_conversions = {'a': ascii, 'r': repr, 's': str}

def z(template, locals_=None):
    if locals_ is None:
        previous_frame = sys._getframe(1)
        previous_frame_locals = previous_frame.f_locals
        locals_ = previous_frame_locals
        # locals_ = globals()
    result = []
    parts = Formatter().parse(template)
    for part in parts:
        literal_text, field_name, format_spec, conversion = part
        if literal_text:
            result.append(literal_text)
        if not field_name:
            continue
        value = eval(field_name, locals_) #.__format__()
        if conversion:
            value = _conversions[conversion](value)
        if format_spec:
            value = format(value, format_spec)
        else:
            value = str(value)
        result.append(value)
    res = ''.join(result)
    return res

Usage:

a = 'World'
b = 10
z('Hello {a} --- {a:^30} --- {67+b} --- {a!r}')
# "Hello World ---             World              --- 77 --- 'World'"

But it doesn't work if the template string is something like this:

z('''
echo monkey {z("curl -s https://www.poemist.com/api/v1/randompoems | jq --raw-output '.[0].content'")} end | sed -e 's/monkey/start/'
echo --------------
''')

It gives this error:

  File "<string>", line 1
    z("curl -s https
                   ^
SyntaxError: EOL while scanning string literal
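
The traceback follows from how string.Formatter.parse splits a replacement field: it ends the field name at the first ':' or '!' it meets and knows nothing about Python string literals. A minimal sketch of that behaviour, using a made-up URL (example.invalid is a placeholder, not the one from the post):

```python
from string import Formatter

# Hypothetical template with a colon inside a quoted argument.
template = '{z("curl -s https://example.invalid")} end'

parts = list(Formatter().parse(template))
literal_text, field_name, format_spec, conversion = parts[0]

# The colon after "https" is treated as the start of the format spec,
# so the expression is cut in the middle of the string literal.
print(repr(field_name))   # 'z("curl -s https'
print(repr(format_spec))  # '//example.invalid")'
```

eval() then receives only the truncated text z("curl -s https, which is an unterminated string literal.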

I am willing to even copy code from Python's source code to get this to work, if it's not possible normally.

Photoflash answered 13/4, 2020 at 12:34 Comment(5)
If you want to parse Python code, have a look at the ast module. It lets you parse your string as if it were a regular f-string: ast.parse('f"Hello, {a} --- {67+b}"'). Then you can take the generated tree and process it however you want. - Twoway
Colons have a special meaning inside of the {} in format strings. You need to pull the curl part out into a separate variable instead of nesting calls to z(). - Rabbitfish
@Rabbitfish No, it works in triple-quoted f-strings. I have checked. (The : is quoted in them.) - Photoflash
@Twoway Your approach seems great. Is there a way to eval nodes of the parsed ast? For example, turning an _ast.FormattedValue into a string? - Photoflash
@HappyFace, you can call ast.dump(tree_node) to see what attributes each node has. Then walk the tree with a subclass of ast.NodeVisitor and check the attributes of each node. For f"{a}" you can retrieve the string a like this: FormattedValue_node.value.id. For more details, see Python's grammar. - Twoway
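
Following up on the comments: one way to evaluate a single node of the parsed tree is to wrap it in ast.Expression and compile it. This is only a sketch; eval_node and namespace are names made up here, not part of the ast module.

```python
import ast

a = 'World'
b = 10

# mode='eval' makes tree.body the f-string node itself (an ast.JoinedStr).
tree = ast.parse('f"Hello, {a} --- {67+b}"', mode='eval')
joined = tree.body
print(ast.dump(joined))  # shows the Constant / FormattedValue children


def eval_node(node, namespace):
    """Compile one expression node and evaluate it in `namespace`."""
    code = compile(ast.Expression(body=node), filename='<fstring>', mode='eval')
    return eval(code, namespace)


namespace = {'a': a, 'b': b}
# Each FormattedValue keeps its expression in .value.
values = [eval_node(part.value, namespace)
          for part in joined.values
          if isinstance(part, ast.FormattedValue)]
print(values)
```

Conversions (!r, !s, !a) and format specs are stored separately on the node as .conversion and .format_spec, so they still have to be applied by hand.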

Thanks to the tip by @ForceBru, I got this working. The following code parses and processes triple-quoted f-strings (ignore the processing parts):

import ast
import sys

_conversions = {'a': ascii, 'r': repr, 's': str}

def zstring(self, template, locals_=None, getframe=1):
    if locals_ is None:
        previous_frame = sys._getframe(getframe)
        previous_frame_locals = previous_frame.f_locals
        locals_ = previous_frame_locals

    def asteval(astNode):
        if astNode is not None:
            return eval(compile(ast.Expression(astNode), filename='<string>', mode='eval'), locals_)
        else:
            return None

    def eatFormat(format_spec, code):
        res = False
        if format_spec:
            flags = format_spec.split(':')
            res = code in flags
            format_spec = list(filter(lambda a: a != code,flags))
        return ':'.join(format_spec), res


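    # Wrap the template so it parses as a triple-quoted f-string. This breaks if
    # the template contains ''' or ends with a single quote or a backslash.
    p = ast.parse(f"f'''{template}'''")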
    result = []
    parts = p.body[0].value.values
    for part in parts:
        typ = type(part)
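        if typ is ast.Constant:
            # Python 3.8+ parses literal text as ast.Constant
            # (ast.Str is deprecated and was removed in 3.12).
            result.append(part.value)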
        elif typ is ast.FormattedValue:
            # print(part.__dict__)

            value = asteval(part.value)
            conversion = part.conversion
            if conversion >= 0:
                # parser doesn't support custom conversions
                conversion = chr(conversion)
                value = self._conversions[conversion](value)

            format_spec = asteval(part.format_spec) or ''
            # print(f"orig format: {format_spec}")
            format_spec, fmt_eval = eatFormat(format_spec, 'e')
            format_spec, fmt_bool = eatFormat(format_spec, 'bool')
            # print(f"format: {format_spec}")
            if format_spec:
                value = format(value, format_spec)
            if fmt_bool:
                value = boolsh(value)

            value = str(value)
            if not fmt_eval:
                value = self.zsh_quote(value)
            result.append(value)
    cmd = ''.join(result)
    return cmd
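
For readers who want to try the idea without the shell-specific pieces (self, boolsh, zsh_quote and the custom 'e'/'bool' flags), here is a stripped-down sketch for Python 3.8+. The name render and the merged namespace are my own choices, not part of the code above, and the URL is a placeholder. It assumes the template contains no ''' and does not end with a single quote or a backslash, since either would collide with the closing delimiter.

```python
import ast
import sys

_CONVERSIONS = {'a': ascii, 'r': repr, 's': str}


def render(template, namespace=None):
    """Evaluate `template` as if it were the body of a triple-quoted f-string."""
    if namespace is None:
        # Default to the caller's variables (locals shadow globals).
        frame = sys._getframe(1)
        namespace = {**frame.f_globals, **frame.f_locals}

    def evaluate(node):
        code = compile(ast.Expression(body=node), '<template>', 'eval')
        return eval(code, namespace)

    # Assumption: no ''' inside the template, no trailing ' or backslash.
    tree = ast.parse("f'''" + template + "'''", mode='eval')

    result = []
    for part in tree.body.values:          # tree.body is an ast.JoinedStr
        if isinstance(part, ast.Constant):  # literal text
            result.append(part.value)
        elif isinstance(part, ast.FormattedValue):
            value = evaluate(part.value)
            if part.conversion >= 0:        # -1 means "no conversion"
                value = _CONVERSIONS[chr(part.conversion)](value)
            # The format spec is itself a JoinedStr and may contain fields.
            spec = evaluate(part.format_spec) if part.format_spec is not None else ''
            result.append(format(value, spec))
    return ''.join(result)


a = 'World'
b = 10
print(render('Hello {a} --- {a:^30} --- {67+b} --- {a!r}'))
# Nested call with a colon inside the quoted argument:
print(render('echo {render("curl -s https://example.invalid/api")} end'))
```

Because the template is handed to Python's own parser, colons inside string literals are left alone, which is exactly where the Formatter-based version failed.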
Photoflash answered 13/4, 2020 at 15:0 Comment(0)
