Search a list of dictionaries in Python
D

26

786

Given:

[
  {"name": "Tom", "age": 10},
  {"name": "Mark", "age": 5},
  {"name": "Pam", "age": 7}
]

How do I search by name == "Pam" to retrieve the corresponding dictionary below?

{"name": "Pam", "age": 7}
Dow answered 28/12, 2011 at 8:25 Comment(0)
D
975

You can use a generator expression:

>>> dicts = [
...     { "name": "Tom", "age": 10 },
...     { "name": "Mark", "age": 5 },
...     { "name": "Pam", "age": 7 },
...     { "name": "Dick", "age": 12 }
... ]

>>> next(item for item in dicts if item["name"] == "Pam")
{'age': 7, 'name': 'Pam'}

If you need to handle the item not being there, then you can do what user Matt suggested in his comment and provide a default using a slightly different API:

next((item for item in dicts if item["name"] == "Pam"), None)

And to find the index of the item, rather than the item itself, you can enumerate() the list:

next((i for i, item in enumerate(dicts) if item["name"] == "Pam"), None)
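
For reference, here is a minimal sketch that puts the three variants above side by side, including what happens when nothing matches. The name "Sam" is only an assumed example of a value that is not in the list.

```python
dicts = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7},
    {"name": "Dick", "age": 12},
]

# First matching dictionary, or None if there is no match.
match = next((item for item in dicts if item["name"] == "Pam"), None)

# No match: the default is returned instead of an exception being raised.
missing = next((item for item in dicts if item["name"] == "Sam"), None)

# Index of the first match rather than the item itself.
index = next((i for i, item in enumerate(dicts) if item["name"] == "Pam"), None)

# Without a default, an exhausted generator makes next() raise StopIteration.
try:
    next(item for item in dicts if item["name"] == "Sam")
    raised = False
except StopIteration:
    raised = True

print(match, missing, index, raised)
```

Note that the default is the second argument to next(), so the generator expression needs its own parentheses.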
Dragoman answered 28/12, 2011 at 8:31 Comment(11)
Just to save anyone else a little time, if you need a default value in the event "Pam" just ain't in the list: next((item for item in dicts if item["name"] == "Pam"), None)Harbard
What about [item for item in dicts if item["name"] == "Pam"][0]?Infeld
@Moberg, that's still a list comprehension, so it will iterate over the whole input sequence regardless of the position of the matching item.Horrific
@FrédéricHamidi, how do you make this case insensitive? For example, I still want to get Pam's age even if the input string is "pam" or "PAM" not just "Pam."Knute
@EazyC, I personally would apply upper() to both strings before comparing them, although of course there are many other ways to do it.Horrific
Suppose I wanted to get the index of the matching dictionary, would there be a neater solution than dicts.index(next(item for item in dicts if item["name"] == "Pam"))?Canaveral
This will raise a StopIteration error if no item in the list matches.Rakish
@Siemkowski: then add enumerate() to generate a running index: next(i for i, item in enumerate(dicts) if item["name"] == "Pam").Exo
How do you search all items in the list, not just the first match?Counsellor
@Counsellor If you create an iterable object before the next... you can perform subsequent matches (not only the first match), like this: mylist = iter(["apple", "small-banana", "cherry", "medium-banana"]) x = next((x for x in mylist if "banana" in x), None) print(x) x = next((x for x in mylist if "banana" in x), None) print(x)Thermography
A minimal modification to avoid a "KeyError" if the key does not exist: next((item for item in dicts if item.get("name") == "Pam"), None)Oversee
M
328

This looks to me the most pythonic way:

people = [
{'name': "Tom", 'age': 10},
{'name': "Mark", 'age': 5},
{'name': "Pam", 'age': 7}
]

filter(lambda person: person['name'] == 'Pam', people)

result (returned as a list in Python 2):

[{'age': 7, 'name': 'Pam'}]

Note: In Python 3, a filter object is returned. So the python3 solution would be:

list(filter(lambda person: person['name'] == 'Pam', people))
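
As a rough sketch of how the two spellings relate in Python 3: filter() returns a lazy iterator, so it has to be consumed with list() (or next()) before it can be indexed or passed to len(). The equivalent list comprehension is shown for comparison.

```python
people = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7},
]

# In Python 3 this is a lazy filter object, not a list.
lazy = filter(lambda person: person["name"] == "Pam", people)
is_list = isinstance(lazy, list)

# Consuming the iterator produces the actual matches.
matches = list(lazy)

# The same result written as a list comprehension.
same = [person for person in people if person["name"] == "Pam"]

print(is_list, matches == same)
```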
Maggiore answered 18/8, 2014 at 22:46 Comment(8)
Is worth noting that this answer returns a list with all matches for 'Pam' in people, alternatively we could get a list of all the people that are not 'Pam' by changing the comparison operator to !=. +1Bourguiba
Also worth mentioning that the result is a filter object, not a list - if you want to use things like len(), you need to call list() on the result first. Or: #19182688Maisel
@Maisel this is what my Python 2.7 says: people = [ {'name': "Tom", 'age': 10}, {'name': "Mark", 'age': 5}, {'name': "Pam", 'age': 7} ] r = filter(lambda person: person['name'] == 'Pam', people) type(r) list So r is a listMaggiore
@Maggiore my bad, I've only used the function in Python 3. Have suggested an edit that the result is specific to Python 2.Maisel
List comprehensions are considered more Pythonic than map/filter/reduce: https://mcmap.net/q/55300/-google-python-style-guide-closedShelah
Get the first match: next(filter(lambda x: x['name'] == 'Pam', dicts))Xylograph
Worked like a charm in Python 3.Subversion
Upon testing this I found that for my (relatively small) list of objects, using the filter was on average an order of magnitude slower than a for loop and if statement.Voice
E
107

@Frédéric Hamidi's answer is great. In Python 3.x the generator method .next() was renamed to .__next__(), so the built-in next() function should be used instead. Thus a slight modification:

>>> dicts = [
...     { "name": "Tom", "age": 10 },
...     { "name": "Mark", "age": 5 },
...     { "name": "Pam", "age": 7 },
...     { "name": "Dick", "age": 12 }
... ]
>>> next(item for item in dicts if item["name"] == "Pam")
{'age': 7, 'name': 'Pam'}

As mentioned in the comments by @Matt, you can add a default value as such:

>>> next((item for item in dicts if item["name"] == "Pam"), False)
{'name': 'Pam', 'age': 7}
>>> next((item for item in dicts if item["name"] == "Sam"), False)
False
>>>
Elisabeth answered 13/8, 2015 at 12:48 Comment(1)
This is the best answer for Python 3.x. If you need a specific element from the dicts, like age, you can write: next((item.get('age') for item in dicts if item["name"] == "Pam"), False)Garboard
T
82

You can use a list comprehension:

def search(name, people):
    return [element for element in people if element['name'] == name]
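
A list comprehension can also combine several conditions with `and`. The sketch below assumes a second helper, here called search2 (the name used in the comments), that takes the extra value as its own parameter; the second "Pam" entry is made-up sample data.

```python
def search2(name, age, people):
    # Keep only the entries where both fields match.
    return [
        element for element in people
        if element["name"] == name and element["age"] == age
    ]


people = [
    {"name": "Tom", "age": 10},
    {"name": "Pam", "age": 7},
    {"name": "Pam", "age": 31},  # hypothetical duplicate name
]

print(search2("Pam", 7, people))
print(search2("Pam", 99, people))
```

As with the single-condition version, the result is always a list, and it is empty when nothing matches.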
Tham answered 28/12, 2011 at 8:32 Comment(5)
This is nice because it returns all matches if there is more than one. Not exactly what the question asked for, but it's what I needed! Thanks!Ric
Note also this returns a list!Lockout
Is it possible to pass two conditions? such as if element['name'] == name and element['age'] == age? I tried it out, but doesn't seem to work, says element is undefined on the second condition.Tingaling
@Tingaling yes, it is possible. Don't forget to add an argument age to the function def search2(name, age, people): and don't forget to pass this argument, as well =). I've just tried two conditions and it works!Holiday
This returns a list regardless of if the value is present.Kinsman
A
62

I tested various methods to go through a list of dictionaries and return the dictionaries where key x has a certain value.

Results:

  • Speed: list comprehension > generator expression >> normal list iteration >>> filter.
  • All methods scale linearly with the number of dicts in the list (10x list size -> 10x time).
  • The number of keys per dictionary does not significantly affect speed for large numbers (thousands) of keys. Please see this graph I calculated: https://i.stack.imgur.com/j2nXV.jpg (method names see below).

All tests done with Python 3.6.4, W7x64.

from random import randint
from timeit import timeit


list_dicts = []
for _ in range(1000):     # number of dicts in the list
    dict_tmp = {}
    for i in range(10):   # number of keys for each dict
        dict_tmp[f"key{i}"] = randint(0,50)
    list_dicts.append( dict_tmp )



def a():
    # normal iteration over all elements
    for dict_ in list_dicts:
        if dict_["key3"] == 20:
            pass

def b():
    # use 'generator'
    for dict_ in (x for x in list_dicts if x["key3"] == 20):
        pass

def c():
    # use 'list'
    for dict_ in [x for x in list_dicts if x["key3"] == 20]:
        pass

def d():
    # use 'filter'
    for dict_ in filter(lambda x: x['key3'] == 20, list_dicts):
        pass

Results:

1.7303 # normal list iteration 
1.3849 # generator expression 
1.3158 # list comprehension 
7.7848 # filter
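
The listing above defines the four candidates but does not show the timing calls. One way they could be timed is sketched below; the repeat count of 200 is an assumption, and absolute numbers will differ by machine and Python version, so no expected output is shown.

```python
from random import randint
from timeit import timeit

# 1000 dicts with 10 keys each, as in the setup above.
list_dicts = [
    {f"key{i}": randint(0, 50) for i in range(10)}
    for _ in range(1000)
]


def a():
    # normal iteration over all elements
    for dict_ in list_dicts:
        if dict_["key3"] == 20:
            pass


def b():
    # generator expression
    for dict_ in (x for x in list_dicts if x["key3"] == 20):
        pass


def c():
    # list comprehension
    for dict_ in [x for x in list_dicts if x["key3"] == 20]:
        pass


def d():
    # filter with a lambda
    for dict_ in filter(lambda x: x["key3"] == 20, list_dicts):
        pass


# timeit accepts a callable; number is how many times it is called.
timings = {func.__name__: timeit(func, number=200) for func in (a, b, c, d)}

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f}")
```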
Archegonium answered 24/2, 2018 at 0:51 Comment(3)
I added function z() that implements next as pointed by Frédéric Hamidi above. Here are the results from Py profile.Streaky
Does anyone know why a list comprehension c() would be that much faster than simply iterating over the list a()Maffick
@knowledge_seeker, this might not be the best analogy but think of generators like indexes in a database and lists like query results in a database. It is much faster to sift through necessary pieces of data to get the end result instead of getting the result of a result of a result, etc. Hope this makes sense.Exciter
E
45
people = [
{'name': "Tom", 'age': 10},
{'name': "Mark", 'age': 5},
{'name': "Pam", 'age': 7}
]

def search(name):
    for p in people:
        if p['name'] == name:
            return p

search("Pam")
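
A slightly more generic variant of the same loop might look like the sketch below. The parameter names (records, key, value, default) are assumptions chosen for illustration; unlike the function above, it takes the list as an argument and skips dictionaries that lack the key instead of raising KeyError.

```python
def search(records, key, value, default=None):
    # Return the first record whose `key` equals `value`.
    for record in records:
        if key in record and record[key] == value:
            return record
    # Nothing matched: fall back to the default.
    return default


people = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7},
]

print(search(people, "name", "Pam"))
print(search(people, "name", "Sam"))
```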
Excelsior answered 28/12, 2011 at 8:30 Comment(2)
It will return the first dictionary in the list with the given name.Mellen
Just to make this very useful routine a little more generic: def search(list, key, value): for item in list: if item[key] == value: return itemSomme
A
13

Have you ever tried out the pandas package? It's perfect for this kind of search task and optimized too.

import pandas as pd

listOfDicts = [
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5},
{"name": "Pam", "age": 7}
]

# Create a data frame, keys are used as column headers.
# Dict items with the same key are entered into the same respective column.
df = pd.DataFrame(listOfDicts)

# The pandas dataframe allows you to pick out specific values like so:

df2 = df[ (df['name'] == 'Pam') & (df['age'] == 7) ]

# Alternate syntax, same thing

df2 = df[ (df.name == 'Pam') & (df.age == 7) ]

I've added a little bit of benchmarking below to illustrate pandas' faster runtimes on a larger scale i.e. 100k+ entries:

setup_large = 'dicts = [];\
[dicts.extend(({ "name": "Tom", "age": 10 },{ "name": "Mark", "age": 5 },\
{ "name": "Pam", "age": 7 },{ "name": "Dick", "age": 12 })) for _ in range(25000)];\
from operator import itemgetter;import pandas as pd;\
df = pd.DataFrame(dicts);'

setup_small = 'dicts = [];\
dicts.extend(({ "name": "Tom", "age": 10 },{ "name": "Mark", "age": 5 },\
{ "name": "Pam", "age": 7 },{ "name": "Dick", "age": 12 }));\
from operator import itemgetter;import pandas as pd;\
df = pd.DataFrame(dicts);'

method1 = '[item for item in dicts if item["name"] == "Pam"]'
method2 = 'df[df["name"] == "Pam"]'

import timeit
t = timeit.Timer(method1, setup_small)
print('Small Method LC: ' + str(t.timeit(100)))
t = timeit.Timer(method2, setup_small)
print('Small Method Pandas: ' + str(t.timeit(100)))

t = timeit.Timer(method1, setup_large)
print('Large Method LC: ' + str(t.timeit(100)))
t = timeit.Timer(method2, setup_large)
print('Large Method Pandas: ' + str(t.timeit(100)))

#Small Method LC: 0.000191926956177
#Small Method Pandas: 0.044392824173
#Large Method LC: 1.98827004433
#Large Method Pandas: 0.324505090714
Afield answered 1/9, 2016 at 21:12 Comment(1)
and method3 = """df.query("name == 'Pam'")""", while slightly slower than method 2 for small datasets (still 2 orders of magnitude faster than LC), is twice as fast on my machine for the larger datasetJorry
A
12

To add just a tiny bit to @FrédéricHamidi.

In case you are not sure a key is in the list of dicts, something like this would help:

next((item for item in dicts if item.get("name") and item["name"] == "Pam"), None)
Adar answered 9/12, 2015 at 23:18 Comment(1)
or simply item.get("name") == "Pam"Kidskin
D
12

One simple way is to use a list comprehension. If l is the list

l = [
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5},
{"name": "Pam", "age": 7}
]

then

[d['age'] for d in l if d['name']=='Tom']
Diapositive answered 30/1, 2020 at 10:45 Comment(0)
H
11
def dsearch(lod, **kw):
    return filter(lambda i: all((i[k] == v for (k, v) in kw.items())), lod)

lod=[{'a':33, 'b':'test2', 'c':'a.ing333'},
     {'a':22, 'b':'ihaha', 'c':'fbgval'},
     {'a':33, 'b':'TEst1', 'c':'s.ing123'},
     {'a':22, 'b':'ihaha', 'c':'dfdvbfjkv'}]



list(dsearch(lod, a=22))

[{'a': 22, 'b': 'ihaha', 'c': 'fbgval'},
 {'a': 22, 'b': 'ihaha', 'c': 'dfdvbfjkv'}]



list(dsearch(lod, a=22, b='ihaha'))

[{'a': 22, 'b': 'ihaha', 'c': 'fbgval'},
 {'a': 22, 'b': 'ihaha', 'c': 'dfdvbfjkv'}]


list(dsearch(lod, a=22, c='fbgval'))

[{'a': 22, 'b': 'ihaha', 'c': 'fbgval'}]
Hyaline answered 20/9, 2020 at 23:48 Comment(0)
N
10

You can achieve this with the built-in filter and next functions in Python.

filter filters the given sequence and returns an iterator. next accepts an iterator and returns its next element.

So you can find the element by,

my_dict = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7}
]

next(filter(lambda obj: obj.get('name') == 'Pam', my_dict), None)

and the output is,

{'name': 'Pam', 'age': 7}

Note: The above code will return None if the name we are searching for is not found.

Nernst answered 17/12, 2019 at 6:8 Comment(1)
This is a lot slower than list comprehensions.Exalt
N
9

Simply using list comprehension:

[i for i in dct if i['name'] == 'Pam'][0]

Sample code:

dct = [
    {'name': 'Tom', 'age': 10},
    {'name': 'Mark', 'age': 5},
    {'name': 'Pam', 'age': 7}
]

print([i for i in dct if i['name'] == 'Pam'][0])

> {'age': 7, 'name': 'Pam'}
Nuss answered 13/8, 2018 at 14:8 Comment(2)
This would crash if Pam isn't in the list.Mariquilla
@Roberto yep, that's true, but you can counter this by saving the result of list comprehension to a variable and check list size before taking the element 0. Or add "try except" clause on top of this line to catch IndexErrorNuss
E
9

Put the accepted answer in a function for easy reuse

def get_item(collection, key, target):
    return next((item for item in collection if item[key] == target), None)

Or also as a lambda

   get_item_lambda = lambda collection, key, target : next((item for item in collection if item[key] == target), None)

Result

    key = "name"
    target = "Pam"
    print(get_item(target_list, key, target))
    print(get_item_lambda(target_list, key, target))

    #{'name': 'Pam', 'age': 7}
    #{'name': 'Pam', 'age': 7}

In case the key may not be in the target dictionary use dict.get and avoid KeyError

def get_item(collection, key, target):
    return next((item for item in collection if item.get(key, None) == target), None)

get_item_lambda = lambda collection, key, target : next((item for item in collection if item.get(key, None) == target), None)
Edmonds answered 17/11, 2021 at 6:15 Comment(0)
K
8

This is a general way of searching a value in a list of dictionaries:

def search_dictionaries(key, value, list_of_dictionaries):
    return [element for element in list_of_dictionaries if element[key] == value]
Kindless answered 19/7, 2014 at 21:36 Comment(0)
C
7
names = [{'name':'Tom', 'age': 10}, {'name': 'Mark', 'age': 5}, {'name': 'Pam', 'age': 7}]
resultlist = [d    for d in names     if d.get('name', '') == 'Pam']
first_result = resultlist[0]

This is one way...

Cradling answered 28/12, 2011 at 8:34 Comment(1)
I might suggest [d for d in names if d.get('name', '') == 'Pam'] ... to gracefully handle any entries in "names" which did not have a "name" key.Mateya
O
7
dicts=[
{"name": "Tom", "age": 10},
{"name": "Mark", "age": 5},
{"name": "Pam", "age": 7}
]

# Index the dictionaries by name; a plain dict is enough when names are unique.
dicts_by_name = {}
for d in dicts:
    dicts_by_name[d['name']] = d

print(dicts_by_name['Tom'])

#output
#>>>
#{'age': 10, 'name': 'Tom'}
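
If names can repeat, one option is to group the dictionaries per name with a defaultdict(list), so that no entry is overwritten. This is a sketch with a made-up second "Pam" entry, not part of the original data.

```python
from collections import defaultdict

dicts = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7},
    {"name": "Pam", "age": 31},  # hypothetical duplicate name
]

# Each name maps to a list of every dictionary carrying that name.
dicts_by_name = defaultdict(list)
for d in dicts:
    dicts_by_name[d["name"]].append(d)

print(dicts_by_name["Pam"])

# .get() avoids creating an empty entry for a name that is absent.
print(dicts_by_name.get("Sam", []))
```

Building the index costs one pass over the list; after that each lookup by name is a single dictionary access.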
Obannon answered 28/12, 2011 at 9:15 Comment(0)
S
6

You can try this:

''' lst: list of dictionaries '''
lst = [{"name": "Tom", "age": 10}, {"name": "Mark", "age": 5}, {"name": "Pam", "age": 7}]

search = input("What name: ")  # Name to search for (say 'Pam')

print([d for d in lst if d["name"] == search][0])
# Output: {'name': 'Pam', 'age': 7}
Santoyo answered 3/12, 2018 at 4:41 Comment(0)
M
5

My first thought would be that you might want to consider creating a dictionary of these dictionaries ... if, for example, you were going to be searching it more than a small number of times.

However that might be a premature optimization. What would be wrong with:

def get_records(key, store=()):
    '''Return a list of all records containing name==key from our store
    '''
    assert key is not None
    return [d for d in store if d['name']==key]
Mateya answered 28/12, 2011 at 8:32 Comment(2)
Actually you can have a dictionary with a name=None item in it; but that wouldn't really work with this list comprehension and it's probably not sane to allow it in your data store.Mateya
asserts may be skipped if debug mode is off.Rustic
M
5

Most (if not all) implementations proposed here have two flaws:

  • They assume only one key is passed for searching, while it may be useful to search on several keys in complex dicts.
  • They assume all keys passed for searching exist in the dicts, so they don't handle the KeyError raised when a key is missing.

An updated proposition:

def find_first_in_list(objects, **kwargs):
    return next((obj for obj in objects if
                 len(set(obj.keys()).intersection(kwargs.keys())) > 0 and
                 all([obj[k] == v for k, v in kwargs.items() if k in obj.keys()])),
                None)

Maybe not the most pythonic, but at least a bit more failsafe.

Usage:

>>> obj1 = find_first_in_list(list_of_dict, name='Pam', age=7)
>>> obj2 = find_first_in_list(list_of_dict, name='Pam', age=27)
>>> obj3 = find_first_in_list(list_of_dict, name='Pam', address='nowhere')
>>> 
>>> print(obj1, obj2, obj3)
{'name': 'Pam', 'age': 7} None {'name': 'Pam', 'age': 7}

The gist.

Magistery answered 29/4, 2020 at 7:35 Comment(0)
P
2

Here is a comparison of iterating through the list, using filter + lambda, and refactoring your code (if needed or valid in your case) to a dict of dicts rather than a list of dicts

import time

# Build list of dicts
list_of_dicts = list()
for i in range(100000):
    list_of_dicts.append({'id': i, 'name': 'Tom'})

# Build dict of dicts
dict_of_dicts = dict()
for i in range(100000):
    dict_of_dicts[i] = {'name': 'Tom'}


# Find the one with ID of 99999

# 1. iterate through the list
lod_ts = time.time()
for elem in list_of_dicts:
    if elem['id'] == 99999:
        break
lod_tf = time.time()
lod_td = lod_tf - lod_ts

# 2. Use filter
f_ts = time.time()
x = filter(lambda k: k['id'] == 99999, list_of_dicts)
f_tf = time.time()
f_td = f_tf- f_ts

# 3. find it in dict of dicts
dod_ts = time.time()
x = dict_of_dicts[99999]
dod_tf = time.time()
dod_td = dod_tf - dod_ts


print('List of Dictionaries took: %s' % lod_td)
print('Using filter took: %s' % f_td)
print('Dict of Dicts took: %s' % dod_td)

And the output is this:

List of Dictionaries took: 0.0099310874939
Using filter took: 0.0121960639954
Dict of Dicts took: 4.05311584473e-06

Conclusion: Clearly, a dictionary of dicts is the most efficient structure to search when you know that you will be searching by ID only. Interestingly, using filter is the slowest solution.
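
If this comparison is rerun on Python 3, note that filter() there returns a lazy iterator, so timing the filter(...) call alone would measure only the creation of the iterator. The sketch below is one assumed way to redo the measurement with timeit, forcing the filter to run via next(); the function names and the repeat count of 20 are illustrative, and no timings are shown because they depend on the machine.

```python
from timeit import timeit

N = 100_000
list_of_dicts = [{"id": i, "name": "Tom"} for i in range(N)]
dict_of_dicts = {i: {"name": "Tom"} for i in range(N)}
target = N - 1  # worst case for a linear scan: the last element


def scan_list():
    # 1. iterate through the list
    for elem in list_of_dicts:
        if elem["id"] == target:
            return elem
    return None


def use_filter():
    # 2. next() forces the lazy filter object to actually search
    return next(filter(lambda k: k["id"] == target, list_of_dicts), None)


def lookup_dict():
    # 3. direct lookup in the dict of dicts
    return dict_of_dicts.get(target)


for func in (scan_list, use_filter, lookup_dict):
    print(f"{func.__name__}: {timeit(func, number=20):.6f} s")
```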

Patellate answered 16/1, 2016 at 13:1 Comment(0)
F
2

I would create a dict of dicts like so:

names = ["Tom", "Mark", "Pam"]
ages = [10, 5, 7]
my_d = {}

for i, j in zip(names, ages):
    my_d[i] = {"name": i, "age": j}

or, using exactly the same info as in the posted question:

info_list = [{"name": "Tom", "age": 10}, {"name": "Mark", "age": 5}, {"name": "Pam", "age": 7}]
my_d = {}

for d in info_list:
    my_d[d["name"]] = d

Then you could do my_d["Pam"] and get {"name": "Pam", "age": 7}
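
The same index can be built in one line with a dict comprehension, as in the sketch below. It assumes names are unique; if a name occurs twice, the later entry replaces the earlier one.

```python
info_list = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7},
]

# Map each name to its dictionary (the same objects, not copies).
my_d = {d["name"]: d for d in info_list}

print(my_d["Pam"])

# .get() returns None instead of raising KeyError for an unknown name.
print(my_d.get("Sam"))
```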

Franke answered 29/1, 2021 at 11:46 Comment(0)
C
2

Ducks will be a lot faster than a list comprehension or filter. It builds an index on your objects so lookups don't need to scan every item.

pip install ducks

from ducks import Dex

dicts = [
  {"name": "Tom", "age": 10},
  {"name": "Mark", "age": 5},
  {"name": "Pam", "age": 7}
]

# Build the index
dex = Dex(dicts, {'name': str, 'age': int})

# Find matching objects
dex[{'name': 'Pam', 'age': 7}]

Result: [{'name': 'Pam', 'age': 7}]

Conclude answered 22/6, 2022 at 20:29 Comment(1)
ducks not supported on Python 3.12Adjectival
T
2

A short way to search for several names at once:

selected_items=[item for item in items if item['name'] in ['Mark','Pam']]
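
A runnable sketch of the same idea, with the sample data from the question filled in as an assumption. It uses a set for the wanted names, which keeps each membership test cheap when the list of names grows, and .get() so that entries without a "name" key are skipped.

```python
items = [
    {"name": "Tom", "age": 10},
    {"name": "Mark", "age": 5},
    {"name": "Pam", "age": 7},
]

# Names to look for; a set gives constant-time membership tests on average.
wanted = {"Mark", "Pam"}

selected_items = [item for item in items if item.get("name") in wanted]

print(selected_items)
```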
Tranquil answered 12/5, 2023 at 13:35 Comment(0)
F
1

You have to go through all elements of the list. There is not a shortcut!

Unless somewhere else you keep a dictionary of the names pointing to the items of the list, but then you have to take care of the consequences of popping an element from your list.

Fewell answered 28/12, 2011 at 8:44 Comment(3)
In the case of an unsorted list and a missing key this statement is correct, but not in general. If the list is known to be sorted, all elements do not need to be iterated over. Also, if a single record is hit and you know the keys are unique or only require one element, then the iteration may be halted with the single item returned.Telescopium
see the answer of @ThamMortise
@MelihYıldız' maybe I was not clear in my statement. By using a list comprehension user334856 in answer https://mcmap.net/q/53884/-search-a-list-of-dictionaries-in-python is going through the whole list. This confirms my statement. The answer you refer is another way to say what I wrote.Fewell
C
1

I found this thread when I was searching for an answer to the same question. While I realize that it's a late answer, I thought I'd contribute it in case it's useful to anyone else:

def find_dict_in_list(dicts, default=None, **kwargs):
    """Find first matching :obj:`dict` in :obj:`list`.

    :param list dicts: List of dictionaries.
    :param dict default: Optional. Default dictionary to return.
        Defaults to `None`.
    :param **kwargs: `key=value` pairs to match in :obj:`dict`.

    :returns: First matching :obj:`dict` from `dicts`.
    :rtype: dict

    """

    rval = default
    for d in dicts:
        is_found = False

        # Search for keys in dict.
        for k, v in kwargs.items():
            if d.get(k, None) == v:
                is_found = True

            else:
                is_found = False
                break

        if is_found:
            rval = d
            break

    return rval


if __name__ == '__main__':
    # Tests
    dicts = []
    keys = 'spam eggs shrubbery knight'.split()

    start = 0
    for _ in range(4):
        dct = {k: v for k, v in zip(keys, range(start, start+4))}
        dicts.append(dct)
        start += 4

    # Find each dict based on 'spam' key only.  
    for x in range(len(dicts)):
        spam = x*4
        assert find_dict_in_list(dicts, spam=spam) == dicts[x]

    # Find each dict based on 'spam' and 'shrubbery' keys.
    for x in range(len(dicts)):
        spam = x*4
        assert find_dict_in_list(dicts, spam=spam, shrubbery=spam+2) == dicts[x]

    # Search for one correct key, one incorrect key:
    for x in range(len(dicts)):
        spam = x*4
        assert find_dict_in_list(dicts, spam=spam, shrubbery=spam+1) is None

    # Search for non-existent dict.
    for x in range(len(dicts)):
        spam = x+100
        assert find_dict_in_list(dicts, spam=spam) is None
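
One edge case worth noting: d.get(k, None) == v treats a missing key the same as a key whose value is None, so searching for spam=None would also match a dict that has no "spam" key. A possible variant, sketched here with a hypothetical sentinel object and a hypothetical function name, distinguishes the two cases.

```python
_MISSING = object()  # hypothetical sentinel: never equal to a stored value


def find_dict_strict(dicts, default=None, **kwargs):
    """Return the first dict in which every key=value pair is present."""
    if not kwargs:
        # Mirror the function above: no criteria means no match.
        return default
    for d in dicts:
        # A missing key yields the sentinel, which cannot equal v.
        if all(d.get(k, _MISSING) == v for k, v in kwargs.items()):
            return d
    return default


sample = [{"spam": 1}, {"spam": None, "eggs": 2}]
print(find_dict_strict(sample, spam=None))
print(find_dict_strict(sample, eggs=None))
```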
Catapult answered 18/1, 2018 at 15:12 Comment(0)
N
0
data = [
        {"name": "Tom", "age": 10},
        {"name": "Mark", "age": 5},
        {"name": "Pam", "age": 7}
]

target_name = "Pam"

for person in data:
  if person["name"] == target_name:
    print(person)  # This will print the dictionary for Pam
    break  # You can add a break statement to stop after finding the first match

"""Alternatively, to store the result in a variable:"""


pam_data = None
for person in data:
  if person["name"] == target_name:
    pam_data = person
    break

if pam_data:
      print(pam_data)  # This will print the dictionary for Pam (if found)
Nutpick answered 30/3 at 20:16 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.