Access nested dictionary items via a list of keys?
I have a complex dictionary structure which I would like to access via a list of keys to address the correct item.

dataDict = {
    "a":{
        "r": 1,
        "s": 2,
        "t": 3
        },
    "b":{
        "u": 1,
        "v": {
            "x": 1,
            "y": 2,
            "z": 3
            },
        "w": 3
        }
    }    

maplist = ["a", "r"]

or

maplist = ["b", "v", "y"]

I have written the following code, which works, but I'm sure there is a better and more efficient way to do this, if anyone has an idea.

# Get a given data from a dictionary with position provided as a list
def getFromDict(dataDict, mapList):    
    for k in mapList:
        dataDict = dataDict[k]
    return dataDict

# Set a given data in a dictionary with position provided as a list
def setInDict(dataDict, mapList, value): 
    for k in mapList[:-1]:
        dataDict = dataDict[k]
    dataDict[mapList[-1]] = value
Vitrine answered 4/2, 2013 at 18:4 Comment(0)
S
336

Use reduce() to traverse the dictionary:

from functools import reduce  # forward compatibility for Python 3
import operator

def getFromDict(dataDict, mapList):
    return reduce(operator.getitem, mapList, dataDict)

and reuse getFromDict to find the location to store the value for setInDict():

def setInDict(dataDict, mapList, value):
    getFromDict(dataDict, mapList[:-1])[mapList[-1]] = value

All but the last element of mapList are needed to find the 'parent' dictionary to add the value to; the last element is then used as the key under which the value is set.

Demo:

>>> getFromDict(dataDict, ["a", "r"])
1
>>> getFromDict(dataDict, ["b", "v", "y"])
2
>>> setInDict(dataDict, ["b", "v", "w"], 4)
>>> import pprint
>>> pprint.pprint(dataDict)
{'a': {'r': 1, 's': 2, 't': 3},
 'b': {'u': 1, 'v': {'w': 4, 'x': 1, 'y': 2, 'z': 3}, 'w': 3}}

Note that the Python PEP8 style guide prescribes snake_case names for functions. The above works equally well for lists or a mix of dictionaries and lists, so the names should really be get_by_path() and set_by_path():

from functools import reduce  # forward compatibility for Python 3
import operator

def get_by_path(root, items):
    """Access a nested object in root by item sequence."""
    return reduce(operator.getitem, items, root)

def set_by_path(root, items, value):
    """Set a value in a nested object in root by item sequence."""
    get_by_path(root, items[:-1])[items[-1]] = value

And for completeness' sake, a function to delete a key:

def del_by_path(root, items):
    """Delete a key-value in a nested object in root by item sequence."""
    del get_by_path(root, items[:-1])[items[-1]]
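
If a missing path should yield a fallback value rather than an exception, one option is to wrap the lookup in try/except. This is only a minimal sketch: get_by_path_or is a hypothetical name, and treating KeyError, IndexError and TypeError all as "not found" is an assumption that may not suit every use case.

```python
from functools import reduce
import operator

def get_by_path_or(root, items, default=None):
    """Like get_by_path(), but return default when the path is invalid.

    get_by_path_or is a hypothetical name used for illustration only.
    """
    try:
        return reduce(operator.getitem, items, root)
    except (KeyError, IndexError, TypeError):
        # KeyError: missing dict key; IndexError: list index out of range;
        # TypeError: tried to index into a non-container such as an int.
        return default

data = {"a": {"r": 1}, "b": [10, 20]}
print(get_by_path_or(data, ["a", "r"]))         # 1
print(get_by_path_or(data, ["a", "x"], "n/a"))  # n/a
print(get_by_path_or(data, ["b", 5], "n/a"))    # n/a
```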
Servomechanical answered 4/2, 2013 at 18:7 Comment(5)
Also nested mapped set should create non-existing nodes, imo: lists for integer keys, dictionaries for string keys. - Compendium
@user1353510: different usecases call for different behaviour. The code here doesn't create intermediaries, no. - Servomechanical
@user1353510: for a default value, use try:, except (KeyError, IndexError): return default_value around the current return line. - Servomechanical
@user1353510: See List to nested dictionary in python for the other use-case; using dict.setdefault() rather than dict.__getitem__. - Servomechanical
Can make a nice one-liner to return an empty dict by using a lambda: reduce(lambda a,b: a.get(b,{}), mapList, dataDict) - Cf
P
85

It seems more pythonic to use a for loop. See the quote from What’s New In Python 3.0.

Removed reduce(). Use functools.reduce() if you really need it; however, 99 percent of the time an explicit for loop is more readable.

def nested_get(dic, keys):    
    for key in keys:
        dic = dic[key]
    return dic

def nested_set(dic, keys, value):
    for key in keys[:-1]:
        dic = dic.setdefault(key, {})
    dic[keys[-1]] = value

def nested_del(dic, keys):
    for key in keys[:-1]:
        dic = dic[key]
    del dic[keys[-1]]

Note that the accepted solution doesn't set non-existing nested keys (it raises KeyError). Using the approach above will create non-existing nodes instead.

The code works in both Python 2 and 3.
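
The difference from the reduce()-based setter shows up with a path whose intermediate keys do not exist yet. A small sketch; the sample data and the name strict_set are made up for illustration:

```python
from functools import reduce
import operator

def nested_set(dic, keys, value):
    # setdefault() creates an empty dict for every missing intermediate key
    for key in keys[:-1]:
        dic = dic.setdefault(key, {})
    dic[keys[-1]] = value

def strict_set(dic, keys, value):
    # reduce()-based variant: every intermediate key must already exist
    reduce(operator.getitem, keys[:-1], dic)[keys[-1]] = value

d = {}
nested_set(d, ["a", "b", "c"], 1)
print(d)  # {'a': {'b': {'c': 1}}}

nested_set(d, ["top"], 2)  # a single key works too: keys[:-1] is empty
print(d)  # {'a': {'b': {'c': 1}}, 'top': 2}

try:
    strict_set({}, ["a", "b", "c"], 1)
except KeyError as exc:
    print("KeyError:", exc)  # KeyError: 'a'
```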

Perturbation answered 8/6, 2016 at 13:47 Comment(8)
I prefer this solution - but be careful. If I'm not mistaken, since Python dictionaries are not immutable getFromDict has the potential to destroy the caller's dataDict. I would copy.deepcopy(dataDict) first. Of course, (as written) this behavior is desired in the second function. - Illuse
That's not really anything to do with mutability - it's just a matter of re-assigning the dataDict variable name to a new variable (sub dictionaries) - Cusk
@DylanF Can you explain how that can destroy input? It looks like just rebinding a local variable name to me. - Ceilometer
@Ceilometer I think what I meant was, if you're extracting a mutable object and start changing it, you're changing the object in the original dictionary as well. Looking back at it, I don't know if that's really surprising behavior. Just something to keep in mind. - Illuse
@DylanF I don't understand: can you offer an example where calling getFromDict(dataDict, mapList) actually ends up modifying the original dictionary? It seems impossible to me. - Ceilometer
No. I can give an example of what I think I meant (which is somewhat different from what I said in my original comment). Say dataDict is dataDict = {"a": {"b": "bananas"}}. Note the subdict is mutable. If you access it with newDataDict = getFromDict(dataDict, ["a"]) and then you change newDataDict, e.g. newDataDict["b"] = "apples", then the original dataDict is changed as well to {'a': {'b': 'apples'}} - Illuse
@DylanF OK, I see now. That is not getFromDict itself destroying the caller's dataDict, though? It's from mutating the return value, which was done outside of the function. User can always make a copy if they don't want that, but there's no way to undo a copy made inside the function - so it's more flexible not to copy. - Ceilometer
@Ceilometer Correct. It would have been more precise to say that "improper use of the return value from getFromDict has the potential to alter objects inside the caller's dataDict. You can safeguard against this by copy.deepcopying either the input or output of the function, if so desired." - Illuse
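
The point of this thread can be shown in a few lines: the getter returns a reference to the nested object, not a copy, so mutating the result later is visible in the original. A short sketch using the names from the comments (the sample values are illustrative only):

```python
import copy

def getFromDict(dataDict, mapList):
    for k in mapList:
        dataDict = dataDict[k]  # rebinds the local name only
    return dataDict

dataDict = {"a": {"b": "bananas"}}

sub = getFromDict(dataDict, ["a"])
sub["b"] = "apples"              # mutates the object shared with dataDict
print(dataDict)                  # {'a': {'b': 'apples'}}

safe = copy.deepcopy(getFromDict(dataDict, ["a"]))
safe["b"] = "cherries"           # only the copy changes
print(dataDict)                  # {'a': {'b': 'apples'}}
```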
M
16

Using reduce is clever, but the OP's set method may have issues if the parent keys do not pre-exist in the nested dictionary. Since this is the first SO post I saw for this subject in my google search, I would like to make it slightly better.

The set method in ( Setting a value in a nested python dictionary given a list of indices and value ) seems more robust to missing parental keys. To copy it over:

def nested_set(dic, keys, value):
    for key in keys[:-1]:
        dic = dic.setdefault(key, {})
    dic[keys[-1]] = value

Also, it can be convenient to have a method that traverses the key tree and gets all the absolute key paths, for which I have created:

def keysInDict(dataDict, parent=[]):
    if not isinstance(dataDict, dict):
        return [tuple(parent)]
    else:
        return reduce(list.__add__, 
            [keysInDict(v,parent+[k]) for k,v in dataDict.items()], [])
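
The same traversal can also be written as a generator, which avoids the repeated list concatenation of reduce(list.__add__, ...). This is only a sketch; iter_key_paths is a hypothetical name, not part of the code above, and it relies on yield from (Python 3.3+):

```python
def iter_key_paths(data, parent=()):
    """Yield the key path (as a tuple) of every non-dict leaf in data."""
    if not isinstance(data, dict):
        yield parent
        return
    for key, value in data.items():
        # extend the path with the current key and descend
        yield from iter_key_paths(value, parent + (key,))

dataDict = {"a": {"r": 1, "s": 2}, "b": {"u": 1, "v": {"x": 1}}}
print(list(iter_key_paths(dataDict)))
# [('a', 'r'), ('a', 's'), ('b', 'u'), ('b', 'v', 'x')]
```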

One use of it is to convert the nested tree to a pandas DataFrame, using the following code (assuming that all leaves in the nested dictionary have the same depth).

def dict_to_df(dataDict):
    ret = []
    for k in keysInDict(dataDict):
        v = np.array( getFromDict(dataDict, k), )
        v = pd.DataFrame(v)
        v.columns = pd.MultiIndex.from_product(list(k) + [v.columns])
        ret.append(v)
    return reduce(pd.DataFrame.join, ret)
Mesonephros answered 29/4, 2016 at 4:38 Comment(2)
why arbitrarily limit the 'keys' argument length to 2 or more in nested_set? - Careful
@Careful It is not limited to 2 or more. It will work with a single top-level key, when keys[:-1] will be empty list. - Ceilometer
C
12

This library may be helpful: https://github.com/akesterson/dpath-python

A python library for accessing and searching dictionaries via /slashed/paths ala xpath

Basically it lets you glob over a dictionary as if it were a filesystem.

Chader answered 29/4, 2015 at 6:2 Comment(0)
B
6

How about using recursive functions?

To get a value:

def getFromDict(dataDict, maplist):
    first, rest = maplist[0], maplist[1:]

    if rest: 
        # if `rest` is not empty, run the function recursively
        return getFromDict(dataDict[first], rest)
    else:
        return dataDict[first]

And to set a value:

def setInDict(dataDict, maplist, value):
    first, rest = maplist[0], maplist[1:]

    if rest:
        try:
            if not isinstance(dataDict[first], dict):
                # if the key is not a dict, then make it a dict
                dataDict[first] = {}
        except KeyError:
            # if key doesn't exist, create one
            dataDict[first] = {}

        setInDict(dataDict[first], rest, value)
    else:
        dataDict[first] = value
Bangalore answered 8/12, 2017 at 22:56 Comment(0)
O
4

Solved this with recursion:

def get(d,l):
    if len(l)==1: return d[l[0]]
    return get(d[l[0]],l[1:])

Using your example:

dataDict = {
    "a":{
        "r": 1,
        "s": 2,
        "t": 3
        },
    "b":{
        "u": 1,
        "v": {
            "x": 1,
            "y": 2,
            "z": 3
        },
        "w": 3
        }
}
maplist1 = ["a", "r"]
maplist2 = ["b", "v", "y"]
print(get(dataDict, maplist1)) # 1
print(get(dataDict, maplist2)) # 2
Orestes answered 5/3, 2018 at 13:38 Comment(1)
very nice, I added an extra if condition to handle missing keys: def get(d,l, default_val=None): if l[0] not in d: return default_val elif len(l)==1: return d[l[0]] else: return get(d[l[0]],l[1:]) - Igal
C
4

Check out NestedDict from the ndicts package (I am the author), it does exactly what you ask for.

from ndicts import NestedDict

data_dict = {
    "a":{
        "r": 1,
        "s": 2,
        "t": 3
        },
    "b":{
        "u": 1,
        "v": {
            "x": 1,
            "y": 2,
            "z": 3
        },
        "w": 3
        }
}  

nd = NestedDict(data_dict)

You can now access keys using comma separated values.

>>> nd["a", "r"]
    1
>>> nd["b", "v"]
    {"x": 1, "y": 2, "z": 3}
Christabelle answered 7/3, 2022 at 14:7 Comment(0)
P
3

Instead of taking a performance hit each time you want to look up a value, how about flattening the dictionary once and then simply looking up a key like b:v:y?

def flatten(mydict,sep = ':'):
  new_dict = {}
  for key,value in mydict.items():
    if isinstance(value,dict):
      # pass sep down so nested levels use the same separator
      _dict = {sep.join([key, _key]):_value for _key, _value in flatten(value, sep).items()}
      new_dict.update(_dict)
    else:
      new_dict[key]=value
  return new_dict

dataDict = {
"a":{
    "r": 1,
    "s": 2,
    "t": 3
    },
"b":{
    "u": 1,
    "v": {
        "x": 1,
        "y": 2,
        "z": 3
    },
    "w": 3
    }
}    

flat_dict = flatten(dataDict)
print(flat_dict)
{'b:w': 3, 'b:u': 1, 'b:v:y': 2, 'b:v:x': 1, 'b:v:z': 3, 'a:r': 1, 'a:s': 2, 'a:t': 3}

This way you can simply look up items using flat_dict['b:v:y'], which will give you 2.

And instead of traversing the dictionary on each lookup, you may be able to speed this up by flattening the dictionary and saving the output so that a lookup from cold start would mean loading up the flattened dictionary and simply performing a key/value lookup with no traversal.
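
A sketch of that idea, assuming string keys that never contain the separator and JSON-serialisable leaf values. The helper name flatten_paths and the file name flat.json are placeholders, not part of the answer:

```python
import json
import os
import tempfile

def flatten_paths(mydict, sep=":"):
    """Flatten nested dicts into a single dict keyed by sep-joined paths."""
    flat = {}
    for key, value in mydict.items():
        if isinstance(value, dict):
            # pass sep down so nested levels use the same separator
            for sub_key, sub_value in flatten_paths(value, sep).items():
                flat[sep.join([key, sub_key])] = sub_value
        else:
            flat[key] = value
    return flat

dataDict = {"a": {"r": 1}, "b": {"u": 1, "v": {"x": 1, "y": 2, "z": 3}}}

# Flatten once and persist the result ...
path = os.path.join(tempfile.mkdtemp(), "flat.json")
with open(path, "w") as fh:
    json.dump(flatten_paths(dataDict), fh)

# ... then a cold start only needs to load the file and do a plain lookup.
with open(path) as fh:
    flat_dict = json.load(fh)

print(flat_dict["b:v:y"])  # 2
```

The trade-off is that the flattened copy goes stale as soon as the nested dictionary changes, so it fits read-mostly data best.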

Plop answered 1/3, 2017 at 2:7 Comment(1)
I find this flatten to be very helpful and general purpose. I've modified your code ever so slightly to improve it to include "children" of dict (like defaultdict). Also, I have made the separator configurable. - Gyrocompass
K
3

You can use pydash:

import pydash as _
_.get(dataDict, ["b", "v", "y"], default='Default')

or

import pydash 
data = {'a': {'b': {'c': [0, 0, {'d': [0, {1: 2}]}]}}}
pydash.get(data, 'a.b.c.2.d.1.[1]')  # ref https://pydash.readthedocs.io/en/latest/deeppath.html#deep-path-strings

https://pydash.readthedocs.io/en/latest/api.html

Kierakieran answered 20/5, 2020 at 19:0 Comment(2)
Awesome lib, thanks for sharing this! - Dormouse
This must be the accepted answer in my view - Mo
G
2

It's satisfying to see these answers settle on two plain functions for setting and getting nested attributes. These solutions are way better than using nested trees: https://gist.github.com/hrldcpr/2012250

Here's my implementation.

Usage:

To set a nested attribute, call sattr(my_dict, 1, 2, 3, 5), which is equivalent to my_dict[1][2][3] = 5

To get a nested attribute, call gattr(my_dict, 1, 2)

def gattr(d, *attrs):
    """
    This method receives a dict and a list of attributes and returns the innermost value of the given dict
    """
    try:
        for at in attrs:
            d = d[at]
        return d
    except(KeyError, TypeError):
        return None


def sattr(d, *attrs):
    """
    Adds "val" to dict in the hierarchy mentioned via *attrs
    For ex:
    sattr(animals, "cat", "leg","fingers", 4) is equivalent to animals["cat"]["leg"]["fingers"]=4
    This method creates necessary objects until it reaches the final depth
    This behaviour is also known as autovivification and plenty of implementation are around
    This implementation addresses the corner case of replacing existing primitives
    https://gist.github.com/hrldcpr/2012250#gistcomment-1779319
    """
    for attr in attrs[:-2]:
        if type(d.get(attr)) is not dict:
            d[attr] = {}
        d = d[attr]
    d[attrs[-2]] = attrs[-1]
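
The corner case mentioned in the docstring, an existing non-dict value sitting in the middle of the path, is where a plain setdefault()-based setter fails. A small sketch contrasting the two behaviours; set_with_setdefault and set_overwriting are hypothetical names used only for this illustration:

```python
def set_with_setdefault(d, keys, value):
    for key in keys[:-1]:
        d = d.setdefault(key, {})  # returns the existing value if present
    d[keys[-1]] = value

def set_overwriting(d, keys, value):
    for key in keys[:-1]:
        if not isinstance(d.get(key), dict):
            d[key] = {}            # replace a missing or primitive value
        d = d[key]
    d[keys[-1]] = value

animals = {"cat": 4}               # "cat" holds a primitive, not a dict

try:
    set_with_setdefault(animals, ["cat", "leg"], 4)
except TypeError as exc:
    print("TypeError:", exc)       # an int does not support item assignment

set_overwriting(animals, ["cat", "leg"], 4)
print(animals)                     # {'cat': {'leg': 4}}
```

Note that overwriting silently discards the old primitive value, which may or may not be what a caller wants.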
Gravel answered 3/12, 2018 at 0:56 Comment(0)
K
1

Pure Python style, without any import:

def nested_set(element, value, *keys):
    if type(element) is not dict:
        raise AttributeError('nested_set() expects dict as first argument.')
    if len(keys) < 2:
        raise AttributeError('nested_set() expects at least three arguments, not enough given.')

    _keys = keys[:-1]
    _element = element
    for key in _keys:
        _element = _element[key]
    _element[keys[-1]] = value

example = {"foo": { "bar": { "baz": "ok" } } }
keys = ['foo', 'bar']
nested_set(example, "yay", *keys)
print(example)

Output

{'foo': {'bar': 'yay'}}
Karrykarst answered 16/2, 2018 at 14:0 Comment(0)
P
1

An alternative way if you don't want to raise errors if one of the keys is absent (so that your main code can run without interruption):

def get_value(your_dict, *keys):
    curr = your_dict
    for k in keys:
        if not isinstance(curr, dict):
            # the path goes deeper than the data
            return None
        curr = curr.get(k)
        if curr is None:
            break
    return curr

In this case, if any of the input keys is not present, None is returned, which can be used as a check in your main code to perform an alternative task.
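
One caveat with None as the "missing" signal is that a stored None cannot be told apart from an absent key. If that matters, a private sentinel object can be used instead. A minimal sketch; _MISSING and get_value_or are hypothetical names, and the sample configuration is invented:

```python
_MISSING = object()  # unique marker that can never be a stored value

def get_value_or(your_dict, *keys, default=None):
    """Follow keys through nested dicts; return default if any step fails."""
    current = your_dict
    for k in keys:
        if not isinstance(current, dict):
            return default          # path goes deeper than the data
        current = current.get(k, _MISSING)
        if current is _MISSING:
            return default          # key absent at this level
    return current

cfg = {"db": {"password": None, "port": 5432}}
print(get_value_or(cfg, "db", "port"))                       # 5432
print(get_value_or(cfg, "db", "password", default="unset"))  # None
print(get_value_or(cfg, "db", "user", default="unset"))      # unset
```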

Piecemeal answered 16/3, 2018 at 1:30 Comment(0)
N
1

Very late to the party, but posting in case this may help someone in the future. For my use case, the following function worked the best. It works for pulling any data type out of a dictionary.

dict is the dictionary containing our value

list is a list of "steps" towards our value

def getnestedvalue(dict, list):

    length = len(list)
    try:
        for depth, key in enumerate(list):
            if depth == length - 1:
                output = dict[key]
                return output
            dict = dict[key]
    except (KeyError, TypeError):
        return None

    return None
Nudity answered 9/12, 2018 at 8:43 Comment(0)
J
1

Multipurpose and simple function to get a field value from a nested dictionary or list:

def key_chain(data, *args, default=None):
    for key in args:
        if isinstance(data, dict):
            data = data.get(key, default)
        elif isinstance(data, (list, tuple)) and isinstance(key, int):
            try:
                data = data[key]
            except IndexError:
                return default
        else:
            return default
    return data

It returns the default value if any key is missed and supports integer keys for lists and tuples. In your case you can call it like

key_chain(dataDict, *maplist)

or

key_chain(dataDict, "b", "v", "y")

More examples of usage https://gist.github.com/yaznahar/26bd3442467aff5d126d345cca0efcad

Joubert answered 12/3, 2023 at 13:36 Comment(1)
Thanks for this interesting contribution, I'm amazed that 10yrs later there are still new approaches that can be found about this question :) Impressive. - Vitrine
M
1

Correct me if I'm wrong but none of the (many) answers here handle the case where you want to return a default value if the key is not found. Also, this function handles the case where you try to search deeper than the depth of the dictionary.

def deep_get(d, keys, default=None):
    if keys:
        if isinstance(d, dict):
            return deep_get(d.get(keys[0], default), keys[1:], default)
        else:
            return default
    else:
        return d

# Tests
d = {'A': 1, 'B': {'a': 5, 'b': 6}}
assert deep_get(d, ['A']) == 1
assert deep_get(d, ['B', 'b']) == 6
assert deep_get(d, ['C']) is None
assert deep_get(d, ['C'], -1) == -1
assert deep_get(d, ['A', 'b'], -1) == -1
assert deep_get(d, ['B', 'a', 'b'], -1) == -1
assert deep_get({}, ['A'], -1) == -1
assert deep_get(None, ['A'], -1) == -1
Maurya answered 11/12, 2023 at 20:8 Comment(0)
T
0

If you also want the ability to work with arbitrary json including nested lists and dicts, and nicely handle invalid lookup paths, here's my solution:

from functools import reduce


def get_furthest(s, path):
    '''
    Gets the furthest value along a given key path in a subscriptable structure.

    subscriptable, list -> any
    :param s: the subscriptable structure to examine
    :param path: the lookup path to follow
    :return: a tuple of the value at the furthest valid key, and whether the full path is valid
    '''

    def step_key(acc, key):
        s = acc[0]
        if isinstance(s, str):
            return (s, False)
        try:
            return (s[key], acc[1])
        except LookupError:
            return (s, False)

    return reduce(step_key, path, (s, True))


def get_val(s, path):
    val, successful = get_furthest(s, path)
    if successful:
        return val
    else:
        raise LookupError('Invalid lookup path: {}'.format(path))


def set_val(s, path, value):
    get_val(s, path[:-1])[path[-1]] = value
Takamatsu answered 6/3, 2018 at 21:30 Comment(0)
B
0

How about checking and then setting a dict element without processing all the indexes twice?

Solution:

def nested_yield(nested, keys_list):
    """
    Get current nested data by send(None) method. Allows change it to Value by calling send(Value) next time
    :param nested: list or dict of lists or dicts
    :param keys_list: list of indexes/keys
    """
    if not len(keys_list):  # assign to 1st level list
        if isinstance(nested, list):
            while True:
                nested[:] = yield nested
        else:
            raise IndexError('Only lists can take element without key')


    last_key = keys_list.pop()
    for key in keys_list:
        nested = nested[key]

    while True:
        try:
            nested[last_key] = yield nested[last_key]
        except IndexError as e:
            print('no index {} in {}'.format(last_key, nested))
            yield None

Example workflow:

ny = nested_yield(nested_dict, nested_address)
data_element = ny.send(None)
if data_element:
    # process element
    ...
else:
    # extend/update nested data
    ny.send(new_data_element)
    ...
ny.close()

Test

>>> cfg = {'Options': [[1,[0]],[2,[4,[8,16]]],[3,[9]]]}
>>> ny = nested_yield(cfg, ['Options',1,1,1])
>>> ny.send(None)
[8, 16]
>>> ny.send('Hello!')
'Hello!'
>>> cfg
{'Options': [[1, [0]], [2, [4, 'Hello!']], [3, [9]]]}
>>> ny.close()
Buckshot answered 15/8, 2018 at 9:46 Comment(0)
J
0

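I'd rather use a simple recursive function:

def get_value_by_path(data, maplist):
    if not maplist:
        return data
    # only the first key is looked up at this level; the rest are
    # handled by the recursive call
    key = maplist[0]
    if key in data:
        return get_value_by_path(data[key], maplist[1:])
    return None  # key not found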
Jocose answered 11/7, 2022 at 11:0 Comment(0)
D
-1

A method based on concatenating strings and passing the result to eval:

def get_sub_object_from_path(dict_name, map_list):
    for i in map_list:
        _string = "['%s']" % i
        dict_name += _string
    value = eval(dict_name)
    return value
#Sample:
_dict = {'new': 'person', 'time': {'for': 'one'}}
map_list = ['time', 'for']
print(get_sub_object_from_path("_dict",map_list))
#Output:
#one
Dasteel answered 29/12, 2018 at 5:10 Comment(0)
C
-1

Extending @DomTomCat and others' approach, these functional (i.e., they return modified data via deepcopy without affecting the input) setter and mapper functions work for nested dicts and lists.

setter:

from copy import deepcopy

def set_at_path(data0, keys, value):
    data = deepcopy(data0)
    if len(keys)>1:
        if isinstance(data,dict):
            # recurse into the matching key only; other entries are kept as-is
            return {k:(set_at_path(v,keys[1:],value) if k==keys[0] else v) for k,v in data.items()}
        if isinstance(data,list):
            return [set_at_path(x[1],keys[1:],value) if x[0]==keys[0] else x[1] for x in enumerate(data)]
    else:
        data[keys[-1]]=value
        return data

mapper:

def map_at_path(data0, keys, f):
    data = deepcopy(data0)
    if len(keys)>1:
        if isinstance(data,dict):
            return {k:(map_at_path(v,keys[1:],f) if k==keys[0] else v) for k,v in data.items()}
        if isinstance(data,list):
            return [map_at_path(x[1],keys[1:],f) if x[0]==keys[0] else x[1] for x in enumerate(data)]
    else:
        data[keys[-1]]=f(data[keys[-1]])
        return data
Careful answered 1/5, 2019 at 19:0 Comment(0)
H
-1

I use this

def get_dictionary_value(dictionary_temp, variable_dictionary_keys):
     try:
          if(len(variable_dictionary_keys) == 0):
               return str(dictionary_temp)

          variable_dictionary_key = variable_dictionary_keys[0]
          variable_dictionary_keys.remove(variable_dictionary_key)

          return get_dictionary_value(dictionary_temp[variable_dictionary_key] , variable_dictionary_keys)

     except Exception as variable_exception:
          logging.error(variable_exception)
 
          return ''

Harneen answered 20/4, 2021 at 17:8 Comment(1)
Code only answers are discouraged. Please provide a summary of how your answer solves the problem and why it may be preferable to the other answers provided. - Hendrickson
N
-4

You can make use of the eval function in python.

def nested_parse(nest, map_list):
    nestq = "nest['" + "']['".join(map_list) + "']"
    return eval(nestq, {'__builtins__':None}, {'nest':nest})

Explanation

For your example query: maplist = ["b", "v", "y"]

nestq will be "nest['b']['v']['y']" where nest is the nested dictionary.

The eval builtin function executes the given string. However, it is important to be careful about possible vulnerabilities that arise from use of eval function. Discussion can be found here:

  1. https://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html
  2. https://www.journaldev.com/22504/python-eval-function

In the nested_parse() function, I have made sure that no __builtins__ globals are available and only local variable that is available is the nest dictionary.

Neal answered 5/2, 2020 at 7:28 Comment(0)
