Python serializable objects json [duplicate]

class gpagelet:
    """
    Holds   1) the pagelet xpath, which is a string
            2) the list of pagelet shingles, list
    """
    def __init__(self, parent):
        if not isinstance( parent, gwebpage):
            raise Exception("Parent must be an instance of gwebpage")
        self.parent = parent    # This must be a gwebpage instance
        self.xpath = None       # String
        self.visibleShingles = [] # list of tuples
        self.invisibleShingles = [] # list of tuples
        self.urls = [] # list of string

class gwebpage:
    """
    Holds all the datastructure after the results have been parsed
    holds:  1) lists of gpagelets
            2) loc, string, location of the file that represents it
    """
    def __init__(self, url):
        self.url = url              # Str
        self.netloc = False         # Str
        self.gpagelets = []         # gpagelets instance
        self.page_key = ""          # str

Is there a way to make my classes JSON serializable? What worries me is the recursive reference: each gpagelet points back to its parent gwebpage.

Crutchfield answered 22/9, 2009 at 6:37 Comment(3)
This answer might be helpful: https://mcmap.net/q/41980/-encoding-nested-python-object-in-json - Papagena
Your question title is very vague. You should improve it. - Overreact
Same question with a closed answer: https://mcmap.net/q/111055/-convert-dynamic-python-object-to-json-duplicate - Chen

Write your own encoder and decoder. They can be very simple, e.g. an encoder that just returns the object's __dict__.

For example, here is an encoder that dumps a fully recursive tree structure. You can enhance it or use it as is for your own purposes.

import json

class Tree(object):
    def __init__(self, name, childTrees=None):
        self.name = name
        if childTrees is None:
            childTrees = []
        self.childTrees = childTrees

class MyEncoder(json.JSONEncoder):
    def default(self, obj):
        if not isinstance(obj, Tree):
            return super(MyEncoder, self).default(obj)

        return obj.__dict__

c1 = Tree("c1")
c2 = Tree("c2") 
t = Tree("t",[c1,c2])

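# print() works on Python 2 and 3; sort_keys keeps the key order stable across versions
print(json.dumps(t, cls=MyEncoder, sort_keys=True))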

It prints:

{"childTrees": [{"childTrees": [], "name": "c1"}, {"childTrees": [], "name": "c2"}], "name": "t"}

You can write a decoder in the same way, but there you need some way to tell whether a decoded dict represents your object or not, so you may want to store a type marker in the output as well.
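A minimal sketch of that decoder idea, assuming a marker key named "__type__" (my own choice for illustration, not something the json module defines). The encoder adds the marker, and an object_hook function passed to json.loads uses it to rebuild Tree instances:

```python
import json

class Tree(object):
    def __init__(self, name, childTrees=None):
        self.name = name
        self.childTrees = childTrees if childTrees is not None else []

class TaggedEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Tree):
            # Copy the attributes and add the assumed "__type__" marker.
            data = dict(obj.__dict__)
            data["__type__"] = "Tree"
            return data
        return super(TaggedEncoder, self).default(obj)

def tree_hook(d):
    # Called for every decoded JSON object, innermost objects first,
    # so childTrees already holds Tree instances at this point.
    if d.get("__type__") == "Tree":
        return Tree(d["name"], d["childTrees"])
    return d

original = Tree("t", [Tree("c1"), Tree("c2")])
text = json.dumps(original, cls=TaggedEncoder)
restored = json.loads(text, object_hook=tree_hook)
```

Dicts without the marker are returned unchanged, so ordinary JSON objects in the same document are not affected.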

Chub answered 22/9, 2009 at 8:6 Comment(4)
Documentation for simplejson explicitly says that you should call JSONEncoder.default() to raise TypeError, so I think it would be better to replace your raise with a call to that. - Browbeat
Or even better, implement your own [simple]json.JSONEncoder subclass and override the default method with a version that returns a serializable representation of your objects or calls JSONEncoder.default for all other types. See docs.python.org/library/json.html#json.JSONEncoder. - Freesia
@ChrisArndt isn't that what Anurag's method above does? - Reger
@yourfiendzak My comment is older than the last edit of the answer, so I was probably referring to an earlier version. - Freesia

Indirect answer: instead of using JSON, you could use YAML, which has no problem doing what you want. (JSON is essentially a subset of YAML.)

Example:

import yaml
o1 = gwebpage("url")
o2 = gpagelet(o1)
o1.gpagelets = [o2]
print(yaml.dump(o1))  # print() works on both Python 2 and 3

In fact, YAML nicely handles cyclic references for you.
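If you have to stay with JSON, the cycle has to be broken by hand. Here is a rough sketch, using simplified stand-ins (Page, Pagelet) for the question's classes, that shows the json module rejecting the cycle and an encoder that leaves out the parent back-reference. The class and encoder names are made up for this example:

```python
import json

class Page(object):                 # stand-in for gwebpage
    def __init__(self, url):
        self.url = url
        self.pagelets = []

class Pagelet(object):              # stand-in for gpagelet
    def __init__(self, parent):
        self.parent = parent        # back-reference that creates the cycle
        self.xpath = None

class NoParentEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, Page):
            return obj.__dict__
        if isinstance(obj, Pagelet):
            # Leave out the back-reference; nesting already implies the parent.
            return {k: v for k, v in obj.__dict__.items() if k != "parent"}
        return super(NoParentEncoder, self).default(obj)

page = Page("url")
page.pagelets.append(Pagelet(page))

# Dumping __dict__ blindly follows parent -> pagelets -> parent and is rejected.
try:
    json.dumps(page, default=lambda o: o.__dict__)
    cycle_detected = False
except ValueError:
    cycle_detected = True

encoded = json.dumps(page, cls=NoParentEncoder, sort_keys=True)
```

When decoding, the parent can be restored from the nesting, since every pagelet sits inside the page that owns it.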

Brown answered 22/9, 2009 at 7:48 Comment(5)
Don't ever unpickle data you don't trust! - Usurer
Interesting article, but there is no unpickling in this answer, only pickling (i.e. no load(), but dump()). - Brown
Indeed but it is worth keeping in mind. Besides, why would you pickle something unless you planned to use it later?... - Usurer
Indeed. However, it is perfectly safe to load() the YAML dumped by the code above (it cannot lead to the interpretation of Python code, bar a bug in PyYAML, as the source code shows [no Python code injection…]). - Brown
Yes, we are in agreement: it is safe in this case but not necessarily in all cases. I am being paranoid and extrapolating use from your example. Thus just (what started as) a small comment. - Usurer

I implemented a very simple todict method with the help of https://mcmap.net/q/47561/-iterate-over-object-attributes-in-python-duplicate

  • Iterate over the attributes whose names do not start with __
  • Eliminate methods
  • Manually eliminate attributes that are not needed (in my case, the ones coming from SQLAlchemy)

Then use getattr to build the dictionary.

class User(Base):
    id = Column(Integer, primary_key=True)
    firstname = Column(String(50))
    lastname = Column(String(50))
    password = Column(String(20))
    def props(self):
        # Names of data attributes: skip dunders, SQLAlchemy internals and methods.
        excluded = ['_decl_class_registry', '_sa_instance_state', '_sa_class_manager', 'metadata']
        return [
            a for a in dir(self)
            if not a.startswith('__')
            and a not in excluded
            and not callable(getattr(self, a))
        ]

    def todict(self):
        return {k: getattr(self, k) for k in self.props()}
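
To try the same filtering without a database, here is a small sketch on a plain class. Account and full_name are hypothetical names used only for this example, and there are no SQLAlchemy internals to exclude here:

```python
import json

class Account(object):  # hypothetical plain class, no SQLAlchemy involved
    def __init__(self, firstname, lastname):
        self.firstname = firstname
        self.lastname = lastname

    def full_name(self):  # a method, so it must not end up in the dict
        return self.firstname + " " + self.lastname

    def props(self):
        # Keep names that are neither dunder attributes nor callables.
        return [a for a in dir(self)
                if not a.startswith('__') and not callable(getattr(self, a))]

    def todict(self):
        return {k: getattr(self, k) for k in self.props()}

as_dict = Account("Ada", "Lovelace").todict()
encoded = json.dumps(as_dict, sort_keys=True)
```

The resulting dict only holds plain values, so json.dumps needs no custom encoder for it.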
Alible answered 6/3, 2016 at 21:0 Comment(0)

My solution for this was to extend the dict class and check required/allowed keys by overriding the __init__, update, and __setitem__ methods. Because the result is still a dict, json.dumps can serialize it directly.

class StrictDict(dict):
    # Subclasses override these sets to describe which keys are valid.
    required = set()
    at_least_one_required = set()
    cannot_coexist = set()
    allowed = set()

    def __init__(self, iterable=None, **kwargs):
        super(StrictDict, self).__init__()
        iterable = dict(iterable or {})  # avoids a shared mutable default argument
        keys = set(iterable).union(kwargs)
        name = self.__class__.__name__
        if not keys.issuperset(self.required):
            msg = name + " requires: " + str([str(key) for key in self.required])
            raise AttributeError(msg)
        if self.at_least_one_required and not keys.intersection(self.at_least_one_required):
            msg = name + " requires at least one: " + str([str(key) for key in self.at_least_one_required])
            raise AttributeError(msg)
        # Route every assignment through __setitem__ so the checks apply.
        # items() works on both Python 2 and 3 (iteritems() is Python 2 only).
        for key, val in iterable.items():
            self[key] = val
        for key, val in kwargs.items():
            self[key] = val

    def update(self, E=None, **F):
        # Same validation path as __init__; E may be omitted.
        for key, val in dict(E or {}).items():
            self[key] = val
        for key, val in F.items():
            self[key] = val

    def __setitem__(self, key, value):
        name = self.__class__.__name__
        all_allowed = self.allowed.union(self.required).union(self.at_least_one_required).union(self.cannot_coexist)
        if key not in all_allowed:
            msg = name + " does not allow member '" + str(key) + "'"
            raise AttributeError(msg)
        if key in self.cannot_coexist:
            for item in self.cannot_coexist:
                if key != item and item in self:
                    msg = name + " does not allow members '" + str(key) + "' and '" + str(item) + "' to coexist"
                    raise AttributeError(msg)
        super(StrictDict, self).__setitem__(key, value)

Example usage:

class JSONDoc(StrictDict):
    """
    Class corresponding to JSON API top-level document structure
    http://jsonapi.org/format/#document-top-level
    """
    at_least_one_required={'data', 'errors', 'meta'}
    allowed={"jsonapi", "links", "included"}
    cannot_coexist={"data", "errors"}
    def __setitem__(self, key, value):
        if key == "included" and "data" not in self.keys():
            msg = str(self.__class__.__name__) + " does not allow 'included' member if 'data' member is not present"
            raise AttributeError(msg)
        super(JSONDoc, self).__setitem__(key, value)

json_doc = JSONDoc(
    data={
        "id": 5,
        "type": "movies"
    },
    links={
        "self": "http://url.com"
    }
)
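
Since such an object is a real dict, it can be passed straight to json.dumps. Below is a cut-down sketch of the idea; AllowedKeysDict and Doc are hypothetical names for this example and only implement the "allowed" check, not the full StrictDict above:

```python
import json

class AllowedKeysDict(dict):
    """Cut-down illustration: only keys listed in `allowed` are accepted."""
    allowed = set()

    def __init__(self, **kwargs):
        super(AllowedKeysDict, self).__init__()
        for key, value in kwargs.items():
            self[key] = value  # goes through __setitem__, so keys are checked

    def __setitem__(self, key, value):
        if key not in self.allowed:
            raise AttributeError("%s does not allow member %r" % (type(self).__name__, key))
        super(AllowedKeysDict, self).__setitem__(key, value)

class Doc(AllowedKeysDict):
    allowed = {"data", "links"}

doc = Doc(data={"id": 5, "type": "movies"}, links={"self": "http://url.com"})
encoded = json.dumps(doc, sort_keys=True)  # no custom encoder needed

try:
    doc["bogus"] = 1
    rejected = False
except AttributeError:
    rejected = True
```

The validation happens when a key is set, so by the time the object reaches json.dumps it only contains keys the class permits.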
Simplex answered 25/6, 2015 at 21:3 Comment(0)
