First, and most importantly, do not serialize data twice. Your database is itself a serialization of data, with a rich and expressive set of tools to query, explore, manipulate, and present it. Serializing data to be subsequently placed in a database eliminates the possibility for isolated sub-component updates, sub-component querying & indexing, and couples all writes to mandatory initial reads, for a few of the most significant issues.
Next, Java Script Object Notation (JSON) is a limited subset of the JavaScript language suitable for the representation of static data in service of data interchange. As a subset of the language, this means you can naively eval
it within JS to reconstruct the original object. It is a simple serialization (no advanced features such as internal references, template definition, type extension) with the limitations of the JavaScript language baked in and penalties for the use of strings requiring large amounts of "escaping". The use of end markers also makes it difficult to utilize in purely streaming scenarios, e.g. you can't "finalize" an object until hitting its paired }
, and as such it also has no marker for record separation. Notable examples of other limitations include delivering HTML within JSON requiring excessive escaping, all numbers are floating point (54-bit integer accuracy, rounding errors, …) making it patently unsuitable for the storage or transfer of financial information or use of technologies (e.g. crypto) requiring 64-bit integers, no native date representation, ...
There are some significant differences between JS and Python as languages, and thus in how JSON "JavaScript Object Notation" vs. PLS (Python Literal Syntax) behave. It just so happens that for the purpose of literal definition, most of JavaScript literal syntax is directly compatible with Python, albeit with slightly differing interpretations. The reverse is not true, see the above examples of disparity. If you care about preserving the fidelity of your data for Python, Python literals are more expressive and less "lossy" than their JS equivalents. However, as other answers/comments have noted, repr()
is not a reliable way to generate this representation; Python literal syntax is not meant to be used this way. For the greatest type fidelity I generally recommend YAML serialization, of which JSON is a fully valid subset.
FYI, to address the practical concern of storage of dictionary-like mappings associated with entities, there are entity-attribute-value data models. Arbitrary key-value stores in relational databases FTW, but with power comes responsibility. Use this pattern carefully and only when absolutely needed. (If this is a frequent pattern, look into document stores.)
u'
prefixes? (which is only an issue that would occur in Python 2.x, which is near-EOL). And then simply usejson.loads
, already? – Chartism