Json dumping a dict throws TypeError: keys must be a string
S

9

22

I am attempting to convert the following dict into JSON using json.dumps:

 {
     'post_engaged': 36,
     'post_impressions': 491,
     'post_story': 23,
     'comment_count': 6,
     'created_time': '03:02 AM, Sep 30, 2012',
     'message': 'Specialities of Shaktis and Pandavas. \n While having power, why there isn\\u2019t',
     <built-in function id>: '471662059541196',
     'status_type': 'status',
     'likes_count': 22
 } {
     'post_engaged': 24,
     'text': '30 Sept 2012 Avyakt Murlli ( Dual Voice )',
     'post_story': 8,
     'comment_count': 3,
     'link': 'http:\\/\\/www.youtube.com\\/watch?v=VGmFj8g7JFA&feature=youtube_gdata_player',
     'post_impressions': 307,
     'created_time': '03:04 AM, Sep 30, 2012',
     'message': 'Not available',
     <built-in function id>: '529439300404155',
     'status_type': 'video',
     'likes_count': 7
 } {
     'post_engaged': 37,
     'post_impressions': 447,
     'post_story': 22,
     'comment_count': 4,
     'created_time': '03:11 AM, Sep 30, 2012',
     'message': '30-09-12 \\u092a\\u094d\\u0930\\u093e\\u0924:\\u092e\\u0941\\u0930\\u0932\\u0940 \\u0913\\u0',
     <built-in function id>: '471643246209744',
     'status_type': 'status',
     'likes_count': 20
 } {
     'post_engaged': 36,
     'post_impressions': 423,
     'post_story': 22,
     'comment_count': 0,
     'created_time': '03:04 AM, Sep 29, 2012',
     'message': 'Essence: Sweet children, whenever you have time, earn the true income. Staying i',
     <built-in function id>: '471274672913268',
     'status_type': 'status',
     'likes_count': 20
 } {
     'post_engaged': 16,
     'text': 'Essence Of Murli 29-09-2012',
     'post_story': 5,
     'comment_count': 2,
     'link': 'http:\\/\\/www.youtube.com\\/watch?v=i6OgmbRsJpg&feature=youtube_gdata_player',
     'post_impressions': 291,
     'created_time': '03:04 AM, Sep 29, 2012',
     'message': 'Not available',
     <built-in function id>: '213046588825668',
     'status_type': 'video',
     'likes_count': 5
 }

But it leads me to

TypeError : keys must be a string

The error is likely due to the dict containing keys like:

 <built-in function id>: '213046588825668'

Can someone please guide me on how to remove these elements from the dict?

Slink answered 4/10, 2012 at 19:30 Comment(1)
The only proper solution is to fix the REAL bug - which is to use id (the builtin function) instead of "id" (literal string) as key - at the source (at the place where this dict is built).Sauternes
F
20

You could try to clean it up like this:

for key in list(mydict):  # iterate over a snapshot; the dict is mutated below
  if not isinstance(key, str):
    try:
      mydict[str(key)] = mydict[key]
    except Exception:
      try:
        mydict[repr(key)] = mydict[key]
      except Exception:
        pass
    del mydict[key]

This will try to convert any key that is not a string into a string. Any key that could not be converted into a string or represented as a string will be deleted.
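As a rough illustration of what this cleanup produces for the dict in the question, here is a minimal sketch. The sample `mydict` is made up from one of the question's entries, and the comprehension is a compact stand-in for the loop, not the loop itself:

```python
import json

# Illustrative data: `id` here is the built-in function, as in the question.
mydict = {id: '213046588825668', 'likes_count': 5}

# Same idea as the loop: keep string keys, convert everything else with str().
cleaned = {k if isinstance(k, str) else str(k): v for k, v in mydict.items()}

# The dict now serializes, but the key is the function's string form,
# '<built-in function id>', rather than the intended "id".
encoded = json.dumps(cleaned)
```

The dump succeeds, yet the resulting key is probably not what a consumer of the JSON expects.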

Freezer answered 4/10, 2012 at 19:38 Comment(3)
It's totally not your fault, but this won't do what the asker really wants.Fahy
The only way to get that function object as a key in your dictionary is if you typed id (which is a built-in function) instead of "id" (which is a string literal) somewhere. Instead of trying to work around your existing bug with more code, maybe you should try to figure out where the code building the dictionary messes up?Congenital
While accepted and upvoted, this answer is actually plain wrong, on both the design and the implementation. To anyone reading this answer: don't do this.Sauternes
S
10

I know this is an old question and it already has an accepted answer, but alas the accepted answer is just totally wrong.

The real issue here is that the code that generates the dict uses the builtin id function as a key instead of the literal string "id". So the simple, obvious and only correct solution is to fix this bug at the source: check the code that generates the dict, and replace id with "id".
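A minimal sketch of how this bug typically looks and how the fix changes the result. The `broken_post` and `fixed_post` names and their contents are illustrative assumptions, not the asker's actual code:

```python
import json

# Buggy: `id` without quotes is the built-in function, not a string.
broken_post = {id: '471662059541196', 'likes_count': 22}

try:
    json.dumps(broken_post)
    error = None
except TypeError as exc:
    error = exc  # json rejects keys that are not str, int, float, bool or None

# Fixed: quote the key so it is the literal string "id".
fixed_post = {"id": '471662059541196', 'likes_count': 22}
encoded = json.dumps(fixed_post)
```

With the key quoted, no post-processing of the dict is needed at all.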

Sauternes answered 18/9, 2019 at 15:1 Comment(1)
Might also be worth noting that the json module supports custom serialization classes...Fibrous
H
8

Modifying the accepted answer above, I wrote a function to handle dictionaries of arbitrary depth:

def stringify_keys(d):
    """Convert a dict's keys to strings if they are not."""
    for key in list(d.keys()):  # snapshot of the keys; d is mutated below

        # check inner dict
        if isinstance(d[key], dict):
            value = stringify_keys(d[key])
        else:
            value = d[key]

        # convert nonstring to string if needed
        if not isinstance(key, str):
            try:
                d[str(key)] = value
            except Exception:
                try:
                    d[repr(key)] = value
                except Exception:
                    raise

            # delete old key
            del d[key]
    return d
Harvell answered 26/6, 2018 at 21:31 Comment(0)
C
0

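This error arises when you try to dump a dict with unsupported key types to JSON. Python dictionaries accept any hashable object as a key, but JSON object keys must be strings. json.dumps converts int, float, bool and None keys to strings automatically and raises a TypeError for any other key type.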

Therefore, the solution is simply to encode those keys as strings:

json_ready_dict = {str(k): v for k, v in pure_python_dict.items()}

(Obviously, recursion may be needed for nested dicts.)

Note that although JSON accepts more types for values, you may also need to convert them to avoid the "Object of type 'XXX' is not JSON serializable" TypeError.
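For nested data, the same idea could be sketched recursively as follows. The helper name `with_str_keys` and the sample data are hypothetical, and `default=str` is just one possible way to handle values that json cannot serialize on its own:

```python
import json
from datetime import date

def with_str_keys(obj):
    """Return a copy of obj with every dict key converted via str()."""
    if isinstance(obj, dict):
        return {str(k): with_str_keys(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [with_str_keys(item) for item in obj]
    return obj

# Illustrative data with a tuple key, a date key and a date value.
data = {("a", "b"): {date(2012, 9, 30): [1, 2]}, "when": date(2012, 9, 30)}

# default=str is called for any value json does not know how to encode.
encoded = json.dumps(with_str_keys(data), default=str)
```

Note that the conversion is one-way: loading the JSON back yields string keys such as "('a', 'b')", not the original tuple.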

Chasten answered 26/4, 2023 at 7:44 Comment(0)
E
0

To convert all bytes to strings in a combination of dicts/lists, you can do something like this:

def bytes_to_str(data):
    if isinstance(data, bytes):
        return data.decode()
    elif isinstance(data, list):
        return [ bytes_to_str(x) for x in data ]
    elif isinstance(data, dict):
        return { bytes_to_str(k): bytes_to_str(v) for k, v in data.items() }
    return data

The code above recursively traverses any lists and dicts, and replaces any byte values within with strings. This allows you to do:

serialized_data = json.dumps(bytes_to_str(data))

Note that data.decode() assumes UTF-8 by default and raises a UnicodeDecodeError if the bytes are not valid in that encoding.
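A small sketch of that failure mode and one lossy way around it. The sample bytes are made up, and `errors="replace"` is only appropriate when losing the undecodable bytes is acceptable:

```python
raw = b"caf\xe9"  # Latin-1 encoded bytes, not valid UTF-8

try:
    raw.decode()  # defaults to UTF-8 with errors="strict"
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors="replace" substitutes U+FFFD for undecodable bytes instead of raising.
lossy = raw.decode("utf-8", errors="replace")
```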

Depending on your use-case you might need to support arbitrary binary data, in which case you can first base64-encode the data before decoding from bytes to str.

import base64

def bytes_to_str(data):
    if isinstance(data, bytes):
        return base64.b64encode(data).decode()
    elif isinstance(data, list):
        return [ bytes_to_str(x) for x in data ]
    elif isinstance(data, dict):
        return { bytes_to_str(k): bytes_to_str(v) for k, v in data.items() }
    return data
Ergograph answered 22/10, 2023 at 19:30 Comment(0)
S
-1

Maybe this will help:

import json

your_dict = {("a", "b"): [1, 2, 3, 4]}
# save: store the items as a list of [key, value] pairs
with open("file.json", "w") as f:
    json.dump(list(your_dict.items()), f)

# load: JSON turns tuples into lists, so convert each key back to a tuple
with open("file.json", "r") as f:
    your_dict = {tuple(k): v for k, v in json.load(f)}
Stesha answered 18/9, 2019 at 14:36 Comment(0)
M
-2

Nolan Conaway's answer (the stringify_keys function above) gives this result for bytes keys, for example:

{"b'opening_hours'": {"b'1_from_hour'": 720, "b'1_to_hour'": 1440, "b'1_break_from_hour'": 1440, "b'1_break_to_hour'": 1440, "b'2_from_hour'": 720, "b'2_to_hour'": 1440, "b'2_break_from_hour'": 1440, "b'2_break_to_hour'": 1440, "b'3_from_hour'": 720, "b'3_to_hour'": 1440, "b'3_break_from_hour'": 1440, "b'3_break_to_hour'": 1440, "b'4_from_hour'": 720, "b'4_to_hour'": 1440, "b'4_break_from_hour'": 1440, "b'4_break_to_hour'": 1440, "b'5_from_hour'": 720, "b'5_to_hour'": 1440, "b'5_break_from_hour'": 1440, "b'5_break_to_hour'": 1440, "b'6_from_hour'": 720, "b'6_to_hour'": 1440, "b'6_break_from_hour'": 1440, "b'6_break_to_hour'": 1440, "b'7_from_hour'": 720, "b'7_to_hour'": 1440, "b'7_break_from_hour'": 1440, "b'7_break_to_hour'": 1440}}

while this amended version

import time
import re
import json
from phpserialize import *


class Helpers:
    def stringify_keys(self, d):
        """Convert a dict's keys to strings if they are not."""
        # iterate over a snapshot of the keys; the dict is mutated below
        for key in list(d.keys()):
            # check inner dict
            if isinstance(d[key], dict):
                value = self.stringify_keys(d[key])
            else:
                value = d[key]
            # convert nonstring to string if needed
            if not isinstance(key, str):
                try:
                    # bytes keys decode to plain text instead of "b'...'"
                    d[key.decode("utf-8")] = value
                except Exception:
                    # not bytes (or not valid UTF-8): fall back to repr()
                    d[repr(key)] = value

                # delete old key
                del d[key]
        return d

gives this cleaner output:

{"opening_hours": {"1_from_hour": 720, "1_to_hour": 1440, "1_break_from_hour": 1440, "1_break_to_hour": 1440, "2_from_hour": 720, "2_to_hour": 1440, "2_break_from_hour": 1440, "2_break_to_hour": 1440, "3_from_hour": 720, "3_to_hour": 1440, "3_break_from_hour": 1440, "3_break_to_hour": 1440, "4_from_hour": 720, "4_to_hour": 1440, "4_break_from_hour": 1440, "4_break_to_hour": 1440, "5_from_hour": 720, "5_to_hour": 1440, "5_break_from_hour": 1440, "5_break_to_hour": 1440, "6_from_hour": 720, "6_to_hour": 1440, "6_break_from_hour": 1440, "6_break_to_hour": 1440, "7_from_hour": 720, "7_to_hour": 1440, "7_break_from_hour": 1440, "7_break_to_hour": 1440}}

Macdonald answered 19/3, 2019 at 11:10 Comment(2)
can you add the import for Helpers()?Mycorrhiza
see the edited code, simply it's a recursive function that calls itself.Macdonald
C
-3

Ideally you would want to clean your data so that it complies with the data types supported by JSON.

If you simply want to drop these elements from the dict while serializing, you can use the skipkeys argument, which is described in the json.dump section of the documentation:

If skipkeys is true (default: False), then dict keys that are not of a basic type (str, int, float, bool, None) will be skipped instead of raising a TypeError.

json.dumps(obj, skipkeys=True)

This solution is much cleaner and lets the standard library handle erroneous keys for you.

WARNING: You must fully understand the implications of using such a drastic method as this will result in data loss for non-compliant data types as JSON keys.
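To make the data loss concrete, here is a minimal sketch using one entry shaped like the question's data. The `record` name is illustrative:

```python
import json

# `id` is the built-in function, so this key is not a basic JSON key type.
record = {id: '213046588825668', 'status_type': 'video', 'likes_count': 5}

# The entry keyed by the built-in function is silently dropped.
encoded = json.dumps(record, skipkeys=True)
```

The ID value never reaches the output, and no warning or error is raised about it.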

Cover answered 19/8, 2019 at 2:9 Comment(2)
do not use skipkeys=True if you want the data to be written, since this flag will ignore the non-str keysHeckelphone
That's understood, the OP specifically asked - "Can someone please guide me, with how should I remove these elements from the dict?", I do not see how my solution is incorrect. That's pretty much what the accepted answer does. Also I tried this ` >>> import json >>> d = {4: "s"} >>> json.dumps(d) '{"4": "s"}' `Cover
V
-7

Maybe this will help the next guy:

strjson = json.dumps(str(dic).replace("'",'"'))
Valenba answered 17/9, 2015 at 20:30 Comment(1)
This is rather ugly, and it also dumps a JSON encoding of the already JSON-ish encoded data, so JSON encoding is executed one time too many.Feigin
