Remove dictionary from list

Asked 5/8, 2009 at 20:41 Answered 26/3 at 5:37

If I have a list of dictionaries, say:

[{'id': 1, 'name': 'paul'},
 {'id': 2, 'name': 'john'}]

and I would like to remove the dictionary with id of 2 (or name 'john'), what is the most efficient way to go about this programmatically (that is to say, I don't know the index of the entry in the list so it can't simply be popped).

Reba answered 5/8, 2009 at 20:41 Comment(0)

154

thelist[:] = [d for d in thelist if d.get('id') != 2]

Edit: as some doubts have been expressed in a comment about the performance of this code (some based on misunderstanding Python's performance characteristics, some on assuming beyond the given specs that there is exactly one dict in the list with a value of 2 for key 'id'), I wish to offer reassurance on this point.

On an old Linux box, measuring this code:

$ python -mtimeit -s"lod=[{'id':i, 'name':'nam%s'%i} for i in range(99)]; import random" "thelist=list(lod); random.shuffle(thelist); thelist[:] = [d for d in thelist if d.get('id') != 2]"
10000 loops, best of 3: 82.3 usec per loop

of which about 57 microseconds for the random.shuffle (needed to ensure that the element to remove is not ALWAYS at the same spot;-) and 0.65 microseconds for the initial copy (whoever worries about performance impact of shallow copies of Python lists is most obviously out to lunch;-), needed to avoid altering the original list in the loop (so each leg of the loop does have something to delete;-).

When it is known that there is exactly one item to remove, it's possible to locate and remove it even more expeditiously:

$ python -mtimeit -s"lod=[{'id':i, 'name':'nam%s'%i} for i in range(99)]; import random" "thelist=list(lod); random.shuffle(thelist); where=(i for i,d in enumerate(thelist) if d.get('id')==2).next(); del thelist[where]"
10000 loops, best of 3: 72.8 usec per loop

(use the next builtin rather than the .next method if you're on Python 2.6 or better, of course) -- but this code breaks down if the number of dicts that satisfy the removal condition is not exactly one. Generalizing this, we have:

$ python -mtimeit -s"lod=[{'id':i, 'name':'nam%s'%i} for i in range(33)]*3; import random" "thelist=list(lod); where=[i for i,d in enumerate(thelist) if d.get('id')==2]; where.reverse()" "for i in where: del thelist[i]"
10000 loops, best of 3: 23.7 usec per loop

where the shuffling can be removed because there are already three equispaced dicts to remove, as we know. And the listcomp, unchanged, fares well:

$ python -mtimeit -s"lod=[{'id':i, 'name':'nam%s'%i} for i in range(33)]*3; import random" "thelist=list(lod); thelist[:] = [d for d in thelist if d.get('id') != 2]"
10000 loops, best of 3: 23.8 usec per loop

totally neck and neck, with even just 3 elements of 99 to be removed. With longer lists and more repetitions, this holds even more of course:

$ python -mtimeit -s"lod=[{'id':i, 'name':'nam%s'%i} for i in range(33)]*133; import random" "thelist=list(lod); where=[i for i,d in enumerate(thelist) if d.get('id')==2]; where.reverse()" "for i in where: del thelist[i]"
1000 loops, best of 3: 1.11 msec per loop
$ python -mtimeit -s"lod=[{'id':i, 'name':'nam%s'%i} for i in range(33)]*133; import random" "thelist=list(lod); thelist[:] = [d for d in thelist if d.get('id') != 2]"
1000 loops, best of 3: 998 usec per loop

All in all, it's obviously not worth deploying the subtlety of making and reversing the list of indices to remove, vs the perfectly simple and obvious list comprehension, to possibly gain 100 nanoseconds in one small case -- and lose 113 microseconds in a larger one;-). Avoiding or criticizing simple, straightforward, and perfectly performance-adequate solutions (like list comprehensions for this general class of "remove some items from a list" problems) is a particularly nasty example of Knuth's and Hoare's well-known thesis that "premature optimization is the root of all evil in programming"!-)
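For readers on Python 3, the "exactly one item" variant timed above can be sketched with the next builtin. This is only an illustration: the helper name remove_first_match is hypothetical, and the None default is an addition so that a missing match does not raise StopIteration.

```python
def remove_first_match(dicts, key, value):
    """Delete the first dict whose `key` equals `value`; return True if one was removed."""
    # next() with a default avoids StopIteration when nothing matches.
    where = next((i for i, d in enumerate(dicts) if d.get(key) == value), None)
    if where is None:
        return False
    del dicts[where]
    return True


thelist = [{'id': 1, 'name': 'paul'}, {'id': 2, 'name': 'john'}]
removed = remove_first_match(thelist, 'id', 2)
print(removed, thelist)
```

Like the timed version, this removes only the first match, so the list comprehension remains the safer choice when duplicates are possible.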

Shoveler answered 5/8, 2009 at 20:43 Comment(10)

Two reasons why this is bad: it copies the entire list, and it traverses the entire list even if the dictionary containing id 2 is the very first element. – Lewanna 5/8, 2009 at 20:52

@imagist, it's nevertheless fastest -- MEASURE it, for goodness sake, don't just ASSUME you know what you're talking about, esp. when you obviously don't;-), ESPECIALLY when the item to remove is the first (it avoids moving every other item). And there's no indication in the original question that every dict in the list MUST always have a different value corresponding to 'id'. – Shoveler 5/8, 2009 at 20:55

Hmmmm. Not bad. There are two approaches: make a new list with some elements filtered out or modify the existing list to remove some elements. This is just the former approach. And as far as that goes, there is nothing to say that a dictionary with id=2 won't appear more than once in the list. It's a list -- there is no guarantee of uniqueness. And the OP did not suggest this limitation. – Inconsonant 5/8, 2009 at 20:58

@Alex, exactly - I love list comprehensions for their sheer blinding speed. And, since as of Python 3 filter will return an iterator, this should be standard practice. – Abdul 5/8, 2009 at 21:11

@Inconsonant and @Meredith, I've now added copious measurements and variants showing performance behavior in various cases -- hope it's as interesting to read as it was to code and measure;-) – Shoveler 5/8, 2009 at 21:30

What is the colon for in theList[:]? – Mulley 5/8, 2009 at 22:6

@kzh: theList[:] is equivalent to theList[0:len(theList)]. In this context, it means "change theList in-place". – Sufflate 5/8, 2009 at 22:42

What is the difference between theList[:] = .. and theList = ..? – Bellybutton 9/9, 2009 at 15:30
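A minimal sketch of that difference, assuming a second name (here called alias, a hypothetical variable) refers to the same list: slice assignment changes the existing list object, while plain assignment only rebinds the name.

```python
thelist = [{'id': 1, 'name': 'paul'}, {'id': 2, 'name': 'john'}]
alias = thelist  # a second name for the same list object

# Slice assignment replaces the contents of the existing list object,
# so the change is visible through every reference to it.
thelist[:] = [d for d in thelist if d.get('id') != 2]
in_place_visible = (alias == [{'id': 1, 'name': 'paul'}])

# Plain assignment rebinds the name to a brand-new list; alias is untouched.
thelist = [d for d in thelist if d.get('id') != 1]
rebinding_visible = (alias == [])

print(in_place_visible, rebinding_visible)
```

This matters mostly when the list was passed into a function or is stored elsewhere, since only the slice form updates what those other references see.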

A very clever response! never thought of doing something like this but have run into the problem before. Thanks! – Truscott 28/2, 2020 at 21:4

Is there a way to handle the case where the value a user supplies for the != comparison isn't found, and print a message? For example, if a user wrote thelist[:] = [d for d in thelist if d.get('id') != 4] and of course this doesn't exist, how would one make a message be printed to the user that no id in a dictionary has this value? – Pyrolysis 20/9, 2022 at 0:28
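One possible approach, sketched here with a hypothetical helper named remove_by_id, is to compare the list length before and after filtering and report when nothing was removed:

```python
def remove_by_id(dicts, wanted):
    """Remove every dict whose 'id' equals `wanted`; return how many were removed."""
    before = len(dicts)
    dicts[:] = [d for d in dicts if d.get('id') != wanted]
    return before - len(dicts)


thelist = [{'id': 1, 'name': 'paul'}, {'id': 2, 'name': 'john'}]
removed = remove_by_id(thelist, 4)
if removed == 0:
    print("No dictionary has id 4")
```

Raising an exception instead of printing would work the same way; the length comparison is the only part that matters.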

Here's a way to do it with a list comprehension (assuming you name your list 'foo'):

[x for x in foo if not (2 == x.get('id'))]

Substitute 'john' == x.get('name') or whatever as appropriate.

filter also works:

filter(lambda x: x.get('id')!=2, foo)

And if you want a generator you can use itertools:

itertools.ifilter(lambda x: x.get('id')!=2, foo)

However, as of Python 3, filter will return an iterator anyway, so the list comprehension is really the best choice, as Alex suggested.
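As a small illustration of that last point, assuming Python 3: the builtin filter yields items lazily, so it has to be wrapped in list() to get the same result as the comprehension.

```python
foo = [{'id': 1, 'name': 'paul'}, {'id': 2, 'name': 'john'}]

# In Python 3, filter returns a lazy iterator rather than a list.
lazy = filter(lambda x: x.get('id') != 2, foo)
kept_by_filter = list(lazy)

# The list comprehension builds the list directly.
kept_by_listcomp = [x for x in foo if x.get('id') != 2]

print(kept_by_filter == kept_by_listcomp)
```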

Abdul answered 5/8, 2009 at 20:46 Comment(1)

also, .get is better than [] here, as it doesn't break if some dict in the list does NOT have an entry for key 'id'. – Shoveler 5/8, 2009 at 20:59

# assume ls contains your list
for i in range(len(ls)):
    if ls[i]['id'] == 2:
        del ls[i]
        break

Will probably be faster than the list comprehension methods on average because it doesn't traverse the whole list if it finds the item in question early on.

Lewanna answered 5/8, 2009 at 20:58 Comment(2)

will raise KeyError if dict has no id. and that's not what OP asked for. – Bain 5/8, 2009 at 21:14

@Lewanna +1 This was exactly what I was looking for. Note to @SilentGhost: You could just use a different key, other than id, if you wanted to target another value, ie: if ls[i]['name'] == 'john': would match and remove that dictionary. – Definite 27/4, 2017 at 2:38

This is not properly an answer (as I think you already have some quite good ones), but... have you considered having a dictionary of <id>:<name> instead of a list of dictionaries?
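A rough sketch of what that could look like, assuming every id is unique and the remaining fields can be reduced to the name (the variable names people and by_id are illustrative only):

```python
people = [{'id': 1, 'name': 'paul'}, {'id': 2, 'name': 'john'}]

# Build the <id>:<name> mapping once; duplicate ids would overwrite each other.
by_id = {d['id']: d['name'] for d in people}

# Removal by id no longer needs a scan; the default avoids KeyError if the id is absent.
by_id.pop(2, None)

print(by_id)
```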

Laevorotation answered 5/8, 2009 at 20:52 Comment(3)

+1: "If it's hard, you're doing it wrong." If you want to remove things by an attribute, use a dictionary, keyed by the attribute. Much simpler. – Campanile 5/8, 2009 at 21:6

...as long as you don't care at all about preserving the order of items, never want to remove things by a different attribute, are happy with never allowing any duplicates regarding that one attribute, etc, etc -- far too many restrictions above and beyond any specs expressed by the OP, to make this suggestion reasonable;-). – Shoveler 5/8, 2009 at 21:31

If I'd had to take all those specs for granted, I would have said "use a database" xD – Laevorotation 5/8, 2009 at 22:54

You can try the following:

a = [{'id': 1, 'name': 'paul'},
     {'id': 2, 'name': 'john'}]

for e in range(len(a) - 1, -1, -1):
    if a[e]['id'] == 2:
        a.pop(e)

If you can't pop from the beginning, pop from the end; it won't ruin the for loop.

Limekiln answered 5/8, 2009 at 20:46 Comment(2)

You mean "range(len(a) - 1, -1, -1)", not "range(len(a) - 1, 0, -1)". This does not include the first element of the list. I've heard word that reversed() is preferred nowadays. See my code below. – Inconsonant 5/8, 2009 at 20:52

Here's what I was getting at: >>> a = list(range(5)) >>> a [0, 1, 2, 3, 4] >>> range(len(a) - 1, -1, -1) [4, 3, 2, 1, 0] >>> range(len(a) - 1, 0, -1) [4, 3, 2, 1] Just wait for the comment-mangling... – Inconsonant 5/8, 2009 at 20:53

Assuming your Python version is 3.6 or greater, and that you don't need the deleted item, this would be less expensive...

If the dictionaries in the list are unique :

for i in range(len(dicts)):
    if dicts[i].get('id') == 2:
        del dicts[i]
        break

If you want to remove all matched items :

for i in range(len(dicts)):
    if dicts[i].get('id') == 2:
        del dicts[i]

You can also do this to be sure that getting the id key won't raise a KeyError, regardless of the Python version:

if dicts[i].get('id', None) == 2

Vandervelde answered 9/1, 2019 at 19:31 Comment(1)

The code to remove all matched items won't work. Deleting from the list will cause the index to change, which will cause this code to skip an item. – Venter 10/1, 2019 at 9:33
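One way to avoid the skipping problem described in the comment above is to walk the indices from the end, so deleting an item never shifts the positions still to be visited. This is a sketch, with sample data invented for illustration:

```python
dicts = [{'id': 2, 'name': 'a'},
         {'id': 2, 'name': 'b'},
         {'id': 1, 'name': 'paul'},
         {'id': 2, 'name': 'john'}]

# Iterate over indices from last to first; deletions only affect
# positions that have already been visited.
for i in reversed(range(len(dicts))):
    if dicts[i].get('id') == 2:
        del dicts[i]

print(dicts)
```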

You could try something along the following lines:

def destructively_remove_if(predicate, list):
    for k in xrange(len(list)):
        if predicate(list[k]):
            del list[k]
            break
    return list

list = [
    { 'id': 1, 'name': 'John' },
    { 'id': 2, 'name': 'Karl' },
    { 'id': 3, 'name': 'Desdemona' }
]

print "Before:", list
destructively_remove_if(lambda p: p["id"] == 2, list)
print "After:", list

Unless you build something akin to an index over your data, I don't think that you can do better than doing a brute-force "table scan" over the entire list. If your data is sorted by the key you are using, you might be able to employ the bisect module to find the object you are looking for somewhat faster.
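The bisect idea could be sketched as follows in Python 3, under the assumption that the records are sorted by id and that a parallel sorted list of ids is kept in step with them. The names ids and remove_by_sorted_id are hypothetical.

```python
from bisect import bisect_left

records = [{'id': 1, 'name': 'John'},
           {'id': 2, 'name': 'Karl'},
           {'id': 3, 'name': 'Desdemona'}]
# Parallel list of keys, built once and kept in the same order as records.
ids = [r['id'] for r in records]


def remove_by_sorted_id(records, ids, wanted):
    """Binary-search the sorted ids and delete the matching record, if any."""
    pos = bisect_left(ids, wanted)
    if pos < len(ids) and ids[pos] == wanted:
        del records[pos]
        del ids[pos]
        return True
    return False


found = remove_by_sorted_id(records, ids, 2)
print(found, records)
```

Only the lookup becomes logarithmic here; deleting from the middle of a list still shifts the later elements.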

Highway answered 5/8, 2009 at 20:51 Comment(2)

what is xrange ? @Highway – Saleswoman 10/6, 2018 at 11:16

xrange is/was in Python 2, what's nowadays called range in Python 3. The example as written is still Python 2 code (look at the date, observe the use of print as statement instead of as function). – Highway 10/6, 2018 at 17:55

Since PEP 448 (additional unpacking generalizations, Python 3.5 and onwards), while iterating over a list of dicts with a temporary variable, say row, you can unpack the dict of the current iteration with **row, merge new keys in, or use a boolean condition to filter dicts out of your list of dicts.

Keep in mind that {**row} creates a new dictionary.

For example your starting list of dicts :

data = [{'id': 1, 'name': 'paul'},{'id': 2, 'name': 'john'}]

if we want to filter out id 2 :

data = [{**row} for row in data if row['id']!=2]

if you want to filter out john :

data = [{**row} for row in data if row['name']!='john']

not directly related to the question, but if you want to add a new key :

data = [{**row, 'id_name':str(row['id'])+'_'+row['name']} for row in data]

It's also a tiny bit faster than the accepted solution.
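The note above about {**row} producing a new dictionary can be checked with a short sketch; the names copied and shared are illustrative only. Copies are independent of the original rows, whereas a plain comprehension keeps references to the same dict objects.

```python
data = [{'id': 1, 'name': 'paul'}, {'id': 2, 'name': 'john'}]

copied = [{**row} for row in data if row['id'] != 2]  # shallow copies of each row
shared = [row for row in data if row['id'] != 2]      # same dict objects as in data

# Mutating a copy leaves the original row alone.
copied[0]['name'] = 'changed'

print(data[0]['name'], copied[0] is data[0], shared[0] is data[0])
```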

Muffin answered 1/12, 2022 at 19:57 Comment(1)

it should be if row['id']!=2 – Basaltware 27/12, 2022 at 18:15

-1

Try this: for example, remove 'john' from the list

for id,elements in enumerate(dictionary):
    if elements['name']=='john':
        del dictionary[id]

Chenay answered 11/1, 2023 at 4:34 Comment(0)

-1

Why not just delete the items starting from the end of the list? That way the index is constant for the rest of the items as you're going through the loop.

In this example, notice there are 2 of the same id of 0:


myList = [{'id': 0, 'name': 'paul'},
          {'id': 2, 'name': 'john'},
          {'id': 0, 'name': 'raul'},
          {'id': 4, 'name': 'oscar'}]

for item in myList[::-1]:
    if item['id'] == 0:
        myList.remove(item)

Utilizing the remove() method, you can delete the unwanted items effectively when you don't have the index.

Unesco answered 26/3 at 5:37 Comment(1)

Welcome to StackOverflow! zeroDivision's answer from 14 years ago already mentions looping backwards to avoid problems modifying while iterating. Invoking list.remove will start its own search, which is pointless when the item to be removed has already been located. – Subir 26/3 at 21:17
