I needed this functionality so often that I packaged this answer (a small wrapper around ruamel.yaml) into a pip module, ez_yaml.
TLDR
pip install ez_yaml
import ez_yaml
ez_yaml.to_string(obj=your_object , options={})
ez_yaml.to_object(file_path=your_path, options={})
ez_yaml.to_object(string=your_string , options={})
ez_yaml.to_file(your_object, file_path=your_path)
Hacky / Copy-Paste Solution to Original Question
def object_to_yaml_str(obj, options=None):
    #
    # setup yaml part (customize this, probably move it outside this def)
    #
    import ruamel.yaml
    yaml = ruamel.yaml.YAML()
    yaml.version = (1, 2)
    yaml.indent(mapping=3, sequence=2, offset=0)
    yaml.allow_duplicate_keys = True
    # show null
    def my_represent_none(self, data):
        return self.represent_scalar(u'tag:yaml.org,2002:null', u'null')
    yaml.representer.add_representer(type(None), my_represent_none)
    #
    # the to-string part
    #
    if options is None: options = {}
    from io import StringIO
    string_stream = StringIO()
    yaml.dump(obj, string_stream, **options)
    output_str = string_stream.getvalue()
    string_stream.close()
    return output_str
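The "to-string part" above is a general pattern: hand the dumper an in-memory buffer, then read the buffer back. Here is a minimal sketch of that pattern on its own, assuming a dumper that takes `(obj, stream, **options)`. The helper name `dump_to_string` is hypothetical (it is not part of ruamel.yaml or ez_yaml), and the stdlib `json.dump` stands in for `yaml.dump` so the sketch runs without extra installs.

```python
import json
from io import StringIO


def dump_to_string(dump_fn, obj, **options):
    """Run a stream-writing dump function and return what it wrote as a str.

    dump_fn is any callable taking (obj, stream, **options). This helper
    name is hypothetical; it is not part of ruamel.yaml or ez_yaml.
    """
    # The context manager closes the buffer even if dump_fn raises.
    with StringIO() as stream:
        dump_fn(obj, stream, **options)
        return stream.getvalue()


# json.dump is a stand-in for yaml.dump so this runs with the stdlib only.
json_text = dump_to_string(json.dump, {"name": "thingy", "items": [1, 2]}, indent=2)
print(json_text)
```

With ruamel.yaml installed, passing the configured `yaml.dump` as `dump_fn` should work the same way, since it also accepts the object followed by a stream.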
Original Answer (if you want to customize the config/options more)
import ruamel.yaml
from io import StringIO
from pathlib import Path
# setup loader (basically options)
yaml = ruamel.yaml.YAML()
yaml.version = (1, 2)
yaml.indent(mapping=3, sequence=2, offset=0)
yaml.allow_duplicate_keys = True
yaml.explicit_start = False
# show null
def my_represent_none(self, data):
    return self.represent_scalar(u'tag:yaml.org,2002:null', u'null')
yaml.representer.add_representer(type(None), my_represent_none)
# o->s
def object_to_yaml_str(obj, options=None):
    if options is None: options = {}
    string_stream = StringIO()
    yaml.dump(obj, string_stream, **options)
    output_str = string_stream.getvalue()
    string_stream.close()
    return output_str
# s->o
def yaml_string_to_object(string, options=None):
    if options is None: options = {}
    return yaml.load(string, **options)
# f->o
def yaml_file_to_object(file_path, options=None):
    if options is None: options = {}
    as_path_object = Path(file_path)
    return yaml.load(as_path_object, **options)
# o->f
def object_to_yaml_file(obj, file_path, options=None):
    if options is None: options = {}
    as_path_object = Path(file_path)
    with as_path_object.open('w') as output_file:
        return yaml.dump(obj, output_file, **options)
#
# string examples
#
yaml_string = object_to_yaml_str({ (1,2): "hi" })
print("yaml string:", yaml_string)
obj = yaml_string_to_object(yaml_string)
print("obj from string:", obj)
#
# file examples
#
obj = yaml_file_to_object("./thingy.yaml")
print("obj from file:", obj)
object_to_yaml_file(obj, file_path="./thingy2.yaml")
print("saved that to a file")
Rant
I appreciate Mike Night solving the original problem ("I just want it to return the output to the caller") and calling out that Anthon's post fails to answer the question. I'll go further. Anthon, your module is great; round-tripping is impressive, and yours is one of the few implementations ever made. But (and this happens often on Stack Overflow) it is not the job of a library author to make other people's code runtime-efficient. Explicit tradeoffs are great, and an author should help people understand the consequences of their choices. Adding a warning, putting "slow" in the name, and so on can be very helpful. However, the method in the ruamel.yaml documentation, creating an entire inherited class, is not "explicit". It is encumbering and obfuscating: tedious to write, and time-consuming for others to work out what that additional code does and why it exists.
As for performance: the runtime of my program, without YAML, is 2 weeks. A 500,000-line YAML file is read in seconds. Both the 2 weeks and the few seconds are irrelevant to the project, because they are CPU time and the project is billed purely by dev-hours. Many users rightfully care more about dev time than runtime; we are using Python, after all.
Even assuming runtime is critical, the YAML was already a string object because of other operations being performed on it. Forcing it into a stream actually causes more overhead. Removing the need for the string form of the YAML would involve rewriting several major libraries and potentially months of effort, which makes streams a highly impractical choice in this situation.
Even assuming stream input is possible and we are billed by CPU time, optimizing the one-time read of a 500,000-line YAML file would be a ≤0.001% runtime improvement. The extra hour I spent figuring out the answer to this question, and the time others spend trying to understand the point of my boilerplate code, could instead have been spent on one of the C functions that is called 100 times a second for two weeks. Even when we do care about CPU time, the optimized method can still fail to be the best choice.
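A back-of-the-envelope check of that ≤0.001% figure, assuming the read takes about 10 seconds. That number is only illustrative: the stated facts are "seconds" and the 2-week runtime. The sketch takes the most generous case, where the read could be made completely free.

```python
# Assumed figures: the 2-week runtime comes from the text above; the
# 10-second read time is an illustrative guess, not a measurement.
SECONDS_PER_DAY = 24 * 60 * 60
total_runtime_s = 14 * SECONDS_PER_DAY  # 2 weeks = 1,209,600 seconds
yaml_read_s = 10                        # assumed one-time read cost

# Upper bound on the saving: pretend the read could be made free.
max_saving_percent = 100 * yaml_read_s / total_runtime_s
print(f"best-case saving: {max_saving_percent:.5f}% of total runtime")
```

Under these assumptions the best case comes out just under 0.001%, and any read faster than roughly 12 seconds stays below that bound.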
A Stack Overflow post that ignores the question while also suggesting users sink potentially large amounts of time into rewriting their applications is not an answer. Respect others by assuming they generally know what they are doing and are aware of the alternatives. Then offers of potentially more efficient methods will be met with appreciation rather than rejection.
[end rant]