Error when parsing YAML file: found character '%' that cannot start any token
I am trying to parse data from a YAML file that contains expressions in jinja2 template syntax. The goal is to delete items from, or add items to, the file.

AddCodesList.yaml

AddCodesList:
  body:
    list:
    {% for elt in customer %}
      - code: {{ elt.code }}
        name: {{ elt.name }}
        country: {{ elt.country }}
    {% endfor %}   
  result:
    json:
      responseCode: {{ responseCode }}
      responseMsg: {{ responseMsg }}
      responseData: {{ responseData }}

parseFile.py

import ruamel.yaml
from ruamel.yaml.util import load_yaml_guess_indent

data,indent,block_seq_indent=load_yaml_guess_indent(open('AddCodesList.yaml'), preserve_quotes=True)

#delete item
del data['body']['list']['code']
#add new item
data['parameters'].insert(2, 'ssl_password','xxxxxx')
#create new file
ruamel.yaml.round_trip_dump(data, open('missingCode.yaml', 'w'), explicit_start=True)

I have the following error when executing the parseFile.py script:

    Traceback (most recent call last):
      File "d:/workspace/TEST/manageItem.py", line 4, in <module>
        data, indent, block_seq_indent = load_yaml_guess_indent(open('AddCodesList.
...
        if self.check_token(ValueToken):
      File "C:\Python34\lib\site-packages\ruamel\yaml\scanner.py", line 1534, in ch
        self.fetch_more_tokens()
      File "C:\Python34\lib\site-packages\ruamel\yaml\scanner.py", line 269, in fet
        % utf8(ch), self.get_mark())
    ruamel.yaml.scanner.ScannerError: while scanning for the next token
    found character '%' that cannot start any token
      in "<unicode string>", line 4, column 6:
            {% for elt in customer %}
             ^ (line: 4)
Canvasback answered 16/2, 2017 at 9:27 Comment(0)

The problem was solved with the following structure:

AddCodesList:
  body:
    list:
    # {% for elt in customer %}
      - code: "{{ elt.code }}"
        name: "{{ elt.name }}"
        country: "{{ elt.country }}"
    # {% endfor %}  
Canvasback answered 16/2, 2017 at 16:4 Comment(0)

In YAML, '{' starts a flow-style mapping, so the '%' that follows it would have to be the first character of the first key of that mapping, and '%' is not allowed as the first character of a token.

Normally you would process the templates of the file first and then apply YAML. You cannot easily reverse that process, as the value for list would have to be a valid YAML construct.

One of the solutions to make it parseable is to change the value for list to valid YAML like:

list:
  - '{% for elt in customer %}'
  - code: '{{ elt.code }}'
    name: '{{ elt.name }}'
    country: '{{ elt.country }}'
  - '{% endfor %}'

or:

list: |
    {% for elt in customer %}
      - code: {{ elt.code }}
        name: {{ elt.name }}
        country: {{ elt.country }}
    {% endfor %} 

but that would mean the file can no longer be templated by jinja2.

You can change the start sequence in jinja2 from {% to something else, but that doesn't help you (i.e. you still would not get valid YAML). The only real solution I see at the moment is to drop jinja2 completely and implement the for loop using some list-like object in Python (one that gets expanded on access).

If it is allowable to always preprocess before applying jinja2, you can change the file to:

AddCodesList:
  body:
    list:
    # {% for elt in customer %}
      - code: '{{ elt.code }}'
        name: '{{ elt.name }}'
        country: '{{ elt.country }}'
    # {% endfor %}   

as that would load, but you might need to change the "# {%" back to just "{%" before running your template engine.
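A minimal sketch of that last step, assuming every block tag was commented out as "# {% ... %}" on a line of its own. The helper names uncomment_jinja_tags and comment_jinja_tags are hypothetical (they are not part of ruamel.yaml or jinja2), and only the standard library re module is used:

```python
import re

# A line holding nothing but a commented-out jinja2 block tag,
# e.g. "    # {% for elt in customer %}".
_COMMENTED_TAG = re.compile(r"^([ \t]*)#[ \t]*(\{%.*?%\})[ \t]*$", re.MULTILINE)
# A line holding nothing but a bare jinja2 block tag.
_BARE_TAG = re.compile(r"^([ \t]*)(\{%.*?%\})[ \t]*$", re.MULTILINE)


def uncomment_jinja_tags(text):
    """Turn '# {% ... %}' lines back into '{% ... %}' before templating."""
    return _COMMENTED_TAG.sub(r"\1\2", text)


def comment_jinja_tags(text):
    """Turn '{% ... %}' lines into YAML comments so the file can be loaded."""
    return _BARE_TAG.sub(r"\1# \2", text)


yaml_text = """\
list:
# {% for elt in customer %}
  - code: '{{ elt.code }}'
# {% endfor %}
"""

# The result is what would be handed to the template engine.
template_text = uncomment_jinja_tags(yaml_text)
print(template_text)
```

Both helpers keep the indentation and drop trailing whitespace on the tag line. Tags that share a line with other content are deliberately left alone.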

Use single quotes, because between those only the single quote itself has a special meaning. With double quotes the pre-processor is more likely to insert something that makes the YAML incorrect, because a backslash starts an escape sequence there. For example, with DOS/Windows-style full file paths, 'C:\yaml\abc.yaml' is correct but "c:\yaml\abc.yaml" will give you an error during YAML parsing.

Rarotonga answered 16/2, 2017 at 10:20 Comment(2)
Thank you Anthon, the last solution seems acceptable with one small improvement so that the YAML parser accepts it: add double quotes around items containing "{{ ... }}", as follows: country: "{{ elt.country }}" - Canvasback
@Canvasback Yes, the quotes are of course necessary. I had not tested that "proposal" and had only looked at the {% lines. Fortunately ruamel.yaml preserves the double quotes when you round-trip with preserve_quotes=True :-) - Rarotonga

What about a pre and post parse substitution?

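# Both functions edit the file given as $1 in place (sed -i).
pre_parse_yml () {
  # quote values containing {{ and comment out {% tags so the file parses as YAML
  sed -i 's/\([-:= ]\)\([^-:= "]*\){{\(.*\)/\1"\2{#{\3"/g; s/{%/#{%/g' "$1"
}
post_parse_yml () {
  # undo both substitutions to get the jinja2 template back
  sed -i 's/"\([^{\n\r]*{\)#\({.*\)"/\1\2/g; s/#{%/{%/g' "$1"
}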
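For readers who would rather stay in Python, the expression-quoting half of this idea can be sketched as below. This is a rough analogue under stated assumptions, not a line-for-line port of the sed commands: pre_parse and post_parse are hypothetical names, the code only handles values that consist entirely of "{{ ... }}" after a "key: ", and post_parse will also unquote expressions that were double-quoted in the original file.

```python
import re

# "key: {{ expr }}" where the whole value is an unquoted jinja2 expression.
_BARE_EXPR = re.compile(r'(:[ \t]+)(\{\{.*?\}\})[ \t]*$', re.MULTILINE)
# The same value after quoting: key: "{{ expr }}".
_QUOTED_EXPR = re.compile(r'(:[ \t]+)"(\{\{.*?\}\})"[ \t]*$', re.MULTILINE)


def pre_parse(text):
    """Wrap bare '{{ ... }}' values in double quotes so YAML sees a string."""
    return _BARE_EXPR.sub(r'\1"\2"', text)


def post_parse(text):
    """Remove the double quotes again before handing the text to jinja2."""
    return _QUOTED_EXPR.sub(r"\1\2", text)


source = "code: {{ elt.code }}\nname: {{ elt.name }}\n"
quoted = pre_parse(source)
print(quoted)
print(post_parse(quoted))
```

Values that are already quoted are left untouched by pre_parse, so running it twice has no further effect.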
Logistician answered 27/6, 2023 at 8:40 Comment(0)