Converting a YAML file to a JSON object in Python

How can I load a YAML file and convert it to a Python JSON object?

My YAML file looks like this:

Section:
    heading: Heading 1
    font: 
        name: Times New Roman
        size: 22
        color_theme: ACCENT_2

SubSection:
    heading: Heading 3
    font:
        name: Times New Roman
        size: 15
        color_theme: ACCENT_2
Paragraph:
    font:
        name: Times New Roman
        size: 11
        color_theme: ACCENT_2
Table:
    style: MediumGrid3-Accent2
Lobster answered 13/6, 2018 at 21:19 Comment(0)
R
42

The PyYAML library is intended for this purpose.

pip install pyyaml

import yaml
import json

with open("example.yaml", "r") as yaml_in, open("example.json", "w") as json_out:
    yaml_object = yaml.safe_load(yaml_in)  # yaml_object will be a list or a dict
    json.dump(yaml_object, json_out)

Notes: PyYAML only supports the pre-2009, YAML 1.1 specification.
ruamel.yaml is an option if YAML 1.2 is required.

pip install ruamel.yaml
Roughrider answered 13/6, 2018 at 21:29 Comment(5)
I agree, it's a clearer answer. I'll leave my answer here since it includes the file handling part; although it was not specifically asked for, it is probably needed more often than not. - Roughrider
I copied the pip install part when you mentioned the other answer being clearer, thanks. - Roughrider
PyYAML's load() is documented to be unsafe, and there is no excuse for using it instead of safe_load() here (or almost anywhere else). You fail to mention that PyYAML only supports the old, pre-2009, YAML 1.1 specification. - Bathulda
I had no idea, I'll include it in the answer, thanks. - Roughrider
For tracking the PyYAML specs: github.com/yaml/pyyaml/issues/116 - Runofthemine
F
37

You can use PyYAML:

pip install PyYAML

And in the IPython console:

In [1]: import yaml

In [2]: document = """Section:
   ...:     heading: Heading 1
   ...:     font: 
   ...:         name: Times New Roman
   ...:         size: 22
   ...:         color_theme: ACCENT_2
   ...: 
   ...: SubSection:
   ...:     heading: Heading 3
   ...:     font:
   ...:         name: Times New Roman
   ...:         size: 15
   ...:         color_theme: ACCENT_2
   ...: Paragraph:
   ...:     font:
   ...:         name: Times New Roman
   ...:         size: 11
   ...:         color_theme: ACCENT_2
   ...: Table:
   ...:     style: MediumGrid3-Accent2"""
   ...:     

In [3]: yaml.safe_load(document)
Out[3]: 
{'Paragraph': {'font': {'color_theme': 'ACCENT_2',
   'name': 'Times New Roman',
   'size': 11}},
 'Section': {'font': {'color_theme': 'ACCENT_2',
   'name': 'Times New Roman',
   'size': 22},
  'heading': 'Heading 1'},
 'SubSection': {'font': {'color_theme': 'ACCENT_2',
   'name': 'Times New Roman',
   'size': 15},
  'heading': 'Heading 3'},
 'Table': {'style': 'MediumGrid3-Accent2'}}
Finecut answered 13/6, 2018 at 21:27 Comment(3)
Not only what the previous comment said, but you're using IPython's console, and not the plain Python console ;) - Alverson
1) Where is the JSON the OP requested? JSON strings have double quotes. 2) PyYAML's load() is documented to be unsafe, and there is no excuse for using it instead of safe_load() here (or almost anywhere else). 3) You fail to mention that PyYAML only supports the old, pre-2009, YAML 1.1 specification. - Bathulda
1. What did you mean about the JSON type in Python? Maybe you can point me to something to read about it. The result is a dict. The other comments are good and interesting, as is your answer, thank you. - Finecut
B
17

There is no such thing as a Python JSON object. JSON is a language-independent file format that has its roots in JavaScript and is supported by many languages.
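To make the distinction concrete, here is a minimal sketch using only the standard library. The small dict is a hand-written stand-in (an assumption, not real loader output) for what a YAML loader would return for the Table entry of the question; the point is that the dict is a Python object, while JSON is the text produced from it.

```python
import json

# Hand-written stand-in for what a YAML loader returns for the "Table" entry.
data = {"Table": {"style": "MediumGrid3-Accent2"}}

as_repr = repr(data)        # Python's own notation: single quotes
as_json = json.dumps(data)  # JSON text: a str with double quotes

print(type(data).__name__, as_repr)
print(type(as_json).__name__, as_json)

# Parsing the JSON text gives back an equal dict.
parsed_back = json.loads(as_json)
```

So "converting YAML to JSON" in Python means loading the YAML into dicts and lists, and then serializing those with the json module.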

If your YAML document adheres to the old 1.1 standard, i.e. pre-2009, you can use PyYAML as suggested by some of the other answers.

If it uses the newer YAML 1.2 specification, which made YAML into a superset of JSON, you should use ruamel.yaml (disclaimer: I am the author of that package, which is a fork of PyYAML).

import ruamel.yaml
import json

in_file = 'input.yaml'
out_file = 'output.json'

yaml = ruamel.yaml.YAML(typ='safe')
with open(in_file) as fpi:
    data = yaml.load(fpi)
with open(out_file, 'w') as fpo:
    json.dump(data, fpo, indent=2)

which generates output.json:

{
  "Section": {
    "heading": "Heading 1",
    "font": {
      "name": "Times New Roman",
      "size": 22,
      "color_theme": "ACCENT_2"
    }
  },
  "SubSection": {
    "heading": "Heading 3",
    "font": {
      "name": "Times New Roman",
      "size": 15,
      "color_theme": "ACCENT_2"
    }
  },
  "Paragraph": {
    "font": {
      "name": "Times New Roman",
      "size": 11,
      "color_theme": "ACCENT_2"
    }
  },
  "Table": {
    "style": "MediumGrid3-Accent2"
  }
}

ruamel.yaml, apart from supporting YAML 1.2, has many PyYAML bugs fixed. You should also note that PyYAML's load() is documented to be unsafe if you don't have full control over the input at all times. PyYAML also loads the scalar number 021 as the integer 17 instead of 21, and converts scalar strings like on, yes, off to boolean values (True, True and False, respectively).
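The YAML 1.1 behaviour described above can be checked with a short sketch, assuming PyYAML is installed. The key names in the sample document are made up for illustration; quoting a value is one way to keep it a string.

```python
import json

import yaml  # PyYAML, which follows YAML 1.1 resolution rules

# Illustrative document; the key names are hypothetical.
document = """\
plain_number: 021
quoted_number: "021"
switch_on: on
answer: yes
switch_off: off
quoted_switch: "on"
"""

loaded = yaml.safe_load(document)

# 021 is read as octal, and on/yes/off become booleans,
# so the JSON output differs from the text in the YAML file.
print(json.dumps(loaded, indent=2))
```

If the values are meant to be strings, quoting them in the YAML source keeps them intact in the resulting JSON.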

Bathulda answered 14/6, 2018 at 7:15 Comment(5)
Thanks for developing a modern and higher-quality package. PyYAML is not abandonware; the latest release (5.4.1) was in Jan 2021. Did you submit PRs only to have them rejected, or did you fork without trying to fix the original? - Vallejo
@Vallejo PRs were not rejected; there was just no answer for several years, and then the project was moved and open PRs and issues were dropped in the process. I haven't seen anything that IMO warrants a major version number change since 3.12, and PyYAML is still on the standard that was superseded in 2009. - Bathulda
Thanks. For some reason I cannot install pyyaml with pip... It looks like it works fine, but I cannot import it in any scripts... but ruamel.yaml works fine and I can import it, so I'm going with that ;) - PS: I really wish you had not named it with a dot in the name... it messes with IDEs no end - Sistrunk
@Sistrunk What IDE are you using that cannot handle Python namespaces? Maybe you should post a question here on StackOverflow; there might be a workaround. (I myself develop using kakoune and its LSP under Linux, and don't have any problems.) - Bathulda
PyCharm (JetBrains), which is a very popular, very good IDE. It could be something in my environment to do with recently installing Anaconda and messing up my PATHs etc... But I don't have any issues with any other packages and I code a lot on many different projects daily... (and I purged Anaconda from PATH) - Sistrunk
G
5

In Python 3 you can use PyYAML.

$ pip3 install pyyaml

Then you load your YAML file and dump it into JSON:

import json
import yaml

with open('./file.yaml') as f:
    # safe_load avoids constructing arbitrary Python objects from the input
    print(json.dumps(yaml.safe_load(f)))

Output:

{"Section": {"heading": "Heading 1", "font": {"name": "Times New Roman", "size": 22, "color_theme": "ACCENT_2"}}, "SubSection": {"heading": "Heading 3", "font": {"name": "Times New Roman", "size": 15, "color_theme": "ACCENT_2"}}, "Paragraph": {"font": {"name": "Times New Roman", "size": 11, "color_theme": "ACCENT_2"}}, "Table": {"style": "MediumGrid3-Accent2"}}
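A single long line like this is hard to read. As a small sketch, json.dumps can also produce indented or fully compact text through its indent, sort_keys and separators parameters. The dict below is a hand-written stand-in for the Paragraph entry, not actual loader output.

```python
import json

# Hand-written stand-in for the loaded "Paragraph" entry of the question.
data = {
    "Paragraph": {
        "font": {"name": "Times New Roman", "size": 11, "color_theme": "ACCENT_2"}
    }
}

# Most compact form: no spaces after separators.
compact = json.dumps(data, separators=(",", ":"))

# Human-readable form: two-space indentation, keys in sorted order.
pretty = json.dumps(data, indent=2, sort_keys=True)

print(compact)
print(pretty)
```

Both strings parse back to the same data; only the layout of the text differs.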
Graehl answered 13/6, 2018 at 21:39 Comment(1)
PyYAML's load() is documented to be unsafe, and there is no excuse for using it instead of safe_load() here (or almost anywhere else). Like many others you fail to mention that PyYAML only supports the old, pre-2009, YAML 1.1 specification. - Bathulda
B
1

For what it's worth, here is a shell alias based on ruamel.yaml that works as a filter:

pip3 install ruamel.yaml
alias yaml2json="python3 -c 'import json, sys, ruamel.yaml as Y; print(json.dumps(Y.YAML(typ=\"safe\").load(sys.stdin), indent=2))'"

Usage:

yaml2json < foo.yaml > foo.json
Brownell answered 17/8, 2022 at 13:5 Comment(0)
L
1

Here is a simple way to do it without saving the JSON to a file:

import yaml
import json

with open("your_yaml_file.yaml") as f:
    yaml_obj = yaml.safe_load(f)      # parsed YAML as Python objects
    json_str = json.dumps(yaml_obj)   # JSON text (a str)
    json_dict = json.loads(json_str)  # JSON text parsed back into Python objects
    print(json_dict)                  # prints the dict, not the JSON text
Lancer answered 29/3, 2023 at 10:12 Comment(1)
If you want to do this, you might as well do json.dump(yaml_obj, sys.stdout). - Espinal
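The suggestion in the comment can be sketched with the standard library alone. Here yaml_obj is a hypothetical stand-in for the result of yaml.safe_load(f). For data made only of string-keyed dicts, lists, strings, numbers, booleans and None, the dumps/loads round trip returns an equal object, so it adds nothing, whereas json.dump writes the JSON text directly to any stream.

```python
import io
import json
import sys

# Hypothetical stand-in for the result of yaml.safe_load(f).
yaml_obj = {"Table": {"style": "MediumGrid3-Accent2"}}

# dumps followed by loads hands back an equal dict for plain data like this.
round_tripped = json.loads(json.dumps(yaml_obj))

# Write JSON text straight to standard output, no intermediate file.
json.dump(yaml_obj, sys.stdout)
sys.stdout.write("\n")

# The same call works with any text stream, for example an in-memory buffer.
buffer = io.StringIO()
json.dump(yaml_obj, buffer)
json_text = buffer.getvalue()
```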
