Intro
With the benefit of a decade of Python evolution (I am working on 3.11), and after commenting on the top-voted solution, I developed a variant that, I later realised, retraces the steps of many of the proposed solutions. Still, my goal was to add context to the returned results, in both depth and breadth.
XPath solution
My solution delivers both a sort of XPath to each node found and, on demand, some surrounding context. To this end, it accepts an iterable of keys to be output together with the target key. For readability, it yields its results as key: value dictionary entries:
def extract(var, key, context_keys=(), xpath=''):
    if isinstance(var, dict):
        if key in var:
            # The match itself, plus any requested sibling keys as context.
            yield {f'{xpath}.{key}': var[key]} | {f'{xpath}.{k}': v for k, v in var.items() if k in context_keys}
        for subkey, value in var.items():
            yield from extract(value, key, context_keys, f'{xpath}.{subkey}')
    elif isinstance(var, list):
        for i, elem in enumerate(var):
            yield from extract(elem, key, context_keys, f'{xpath}[{i}]')
With this, looking for 'id' would retrieve:
[
  {
    ".id": "abcde"
  },
  {
    ".nestedlist[0].id": "qwerty"
  },
  {
    ".nestedlist[0].nestednestedlist[0].id": "xyz"
  },
  {
    ".nestedlist[0].nestednestedlist[1].id": "fghi"
  },
  {
    ".nestedlist[0].anothernestednestedlist[0].id": "asdf"
  },
  {
    ".nestedlist[0].anothernestednestedlist[1].id": "yuiop"
  }
]
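For anyone wanting to reproduce a listing like the one above, here is a minimal runnable sketch. The sample `d` is an assumption, reconstructed from the paths in the output; the keys other than `id` are illustrative filler, and `json.dumps` stands in for whatever pretty-printer produced the original display.

```python
import json


def extract(var, key, context_keys=(), xpath=''):
    if isinstance(var, dict):
        if key in var:
            # The match itself, plus any requested sibling keys as context.
            yield {f'{xpath}.{key}': var[key]} | {
                f'{xpath}.{k}': v for k, v in var.items() if k in context_keys
            }
        for subkey, value in var.items():
            yield from extract(value, key, context_keys, f'{xpath}.{subkey}')
    elif isinstance(var, list):
        for i, elem in enumerate(var):
            yield from extract(elem, key, context_keys, f'{xpath}[{i}]')


# Hypothetical sample data, shaped to match the paths shown in the output.
d = {
    "id": "abcde",
    "key1": "blah",
    "nestedlist": [
        {
            "id": "qwerty",
            "nestednestedlist": [
                {"id": "xyz", "keyA": "blah blah blah"},
                {"id": "fghi", "keyZ": "blah blah blah"},
            ],
            "anothernestednestedlist": [
                {"id": "asdf", "keyQ": "blah blah"},
                {"id": "yuiop", "keyW": "blah"},
            ],
        }
    ],
}

# extract() is a generator, so materialise it before printing.
results = list(extract(d, 'id'))
print(json.dumps(results, indent=2))
```

Note that the `|` dictionary merge requires Python 3.9 or later.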
Since every key is a distinct XPath, the entries can be merged into a single dictionary, with either reduce or a comprehension, giving this:
{
  ".id": "abcde",
  ".nestedlist[0].id": "qwerty",
  ".nestedlist[0].nestednestedlist[0].id": "xyz",
  ".nestedlist[0].nestednestedlist[1].id": "fghi",
  ".nestedlist[0].anothernestednestedlist[0].id": "asdf",
  ".nestedlist[0].anothernestednestedlist[1].id": "yuiop"
}
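Both ways of merging can be sketched as follows. The `entries` list is a shortened, hard-coded stand-in for the generator's output, and the names `merged_reduce` and `merged_comp` are only illustrative.

```python
from functools import reduce

# Stand-in for the per-match dictionaries yielded by the generator.
entries = [
    {".id": "abcde"},
    {".nestedlist[0].id": "qwerty"},
    {".nestedlist[0].nestednestedlist[0].id": "xyz"},
]

# Option 1: fold the dictionaries together with the | merge operator (Python 3.9+).
# The {} initial value keeps reduce from raising on an empty sequence.
merged_reduce = reduce(lambda acc, elem: acc | elem, entries, {})

# Option 2: flatten with a nested dict comprehension.
merged_comp = {path: value for entry in entries for path, value in entry.items()}

print(merged_reduce == merged_comp)  # True
```

Because each path is unique, neither approach can overwrite an earlier entry.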
The context could be used for something like this:
>>> pp(reduce(lambda acc, elem: acc | elem, extract(d, 'id', ('keyA',))))
{
  ".id": "abcde",
  ".nestedlist[0].id": "qwerty",
  ".nestedlist[0].nestednestedlist[0].id": "xyz",
  ".nestedlist[0].nestednestedlist[0].keyA": "blah blah blah", # <-- HERE
  ".nestedlist[0].nestednestedlist[1].id": "fghi",
  ".nestedlist[0].anothernestednestedlist[0].id": "asdf",
  ".nestedlist[0].anothernestednestedlist[1].id": "yuiop"
}
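One caveat about the context argument: the function checks membership with `in`, so a bare string is searched as a substring instead of being treated as a collection of key names. A small sketch of the difference, independent of the function itself (the variable names are illustrative):

```python
context_as_str = 'keyA'
context_as_tuple = ('keyA',)

# With a string, `in` is a substring test, so partial names match too.
print('key' in context_as_str)     # True
print('keyA' in context_as_str)    # True

# With a tuple, `in` compares whole elements.
print('key' in context_as_tuple)   # False
print('keyA' in context_as_tuple)  # True
```

Passing a tuple, list, or set of key names avoids accidental matches on keys such as `key` or `A`.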
I'm working on a solution that returns an object with the same structure as the original but containing only the wanted keys. If I get it to work, I'll add it here.
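In the meantime, one possible shape for such a function is sketched below. This is not the solution promised above, only a guess at the idea: the name `prune`, its signature, and the choice to drop branches that contain no wanted key are all assumptions.

```python
def prune(var, wanted):
    """Return a copy of var with only the wanted keys and the containers leading to them.

    Returns None when nothing under var matches.
    """
    if isinstance(var, dict):
        kept = {}
        for k, v in var.items():
            if k in wanted:
                kept[k] = v  # keep a wanted key together with its whole value
            else:
                sub = prune(v, wanted)
                if sub is not None:
                    kept[k] = sub  # keep a container only if something survives inside
        return kept or None
    if isinstance(var, list):
        kept = [sub for elem in var if (sub := prune(elem, wanted)) is not None]
        return kept or None
    return None  # scalars carry no keys


# Hypothetical sample, unrelated to the data used elsewhere in this answer.
sample = {
    "id": 1,
    "x": "a",
    "items": [{"id": 2, "y": 0}, {"z": 3}, {"deep": {"id": 4, "w": 5}}],
}
print(prune(sample, {"id"}))
# {'id': 1, 'items': [{'id': 2}, {'deep': {'id': 4}}]}
```

A trade-off of this sketch is that list elements without a match are dropped, so list indices in the pruned copy may differ from those in the original.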