Find count of characters within the string in Python, using dictionary comprehension (not Counter)

5

12

I am trying to create a dictionary that maps each character to the number of times it occurs in a string. Suppose the string is:

str1 = "aabbaba"

I want to create a dictionary like this

word_count = {'a':4,'b':3}

I am trying to use dictionary comprehension to do this. I did

dic = {x:dic[x]+1 if x in dic.keys() else x:1 for x in str}

This ends up giving an error saying

  File "<stdin>", line 1
    dic = {x:dic[x]+1 if x in dic.keys() else x:1 for x in str}
                                               ^
SyntaxError: invalid syntax

Can anybody tell me what's wrong with the syntax? Also, how can I create such a dictionary using dictionary comprehension?

Dexter answered 3/12, 2016 at 18:8 Comment(5)
Have you looked at a Counter? – Furfural
Remove the second x: the first x: is the key for both branches, and the if clause is parsed as part of the value. – Grumpy
@Furfural I know Counter. I don't want to use Counter. I want to do this using a dictionary comprehension if possible. – Dexter
dict/list/set comprehensions are cool and everything, but that doesn't mean they are the optimal solution for everything; this is one of those cases. – Fourdrinier
Please have a look at this answer as well. – Furfural
16

As others have said, this is best done with a Counter.

You can also do:

>>> {e:str1.count(e) for e in set(str1)}
{'a': 4, 'b': 3}

But that traverses the string 1+n times, where n is the number of unique characters: once to create the set, and once per unique character to count how many times it appears. In the worst case, when most characters are distinct, this has quadratic runtime complexity. That is a bad result if you have a lot of unique characters in a long string. A Counter traverses the string only once.

If you want a no-import version that is more efficient than using .count, you can use .setdefault to build the counts:

>>> count={}
>>> for c in str1:
...    count[c]=count.setdefault(c, 0)+1
... 
>>> count
{'a': 4, 'b': 3}

That traverses the string only once, no matter how long it is or how many unique characters it contains.
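
As a rough sketch of the comparison above, the two approaches can be wrapped in functions and checked against each other. The function names count_with_comprehension and count_single_pass are made up for illustration and are not part of any library:

```python
def count_with_comprehension(s):
    # One pass to build the set, then one str.count pass per unique character.
    return {ch: s.count(ch) for ch in set(s)}


def count_single_pass(s):
    # A single pass over the string; setdefault supplies 0 for unseen characters.
    counts = {}
    for ch in s:
        counts[ch] = counts.setdefault(ch, 0) + 1
    return counts


str1 = "aabbaba"
by_comprehension = count_with_comprehension(str1)
by_single_pass = count_single_pass(str1)

print(by_comprehension == by_single_pass)  # True
print(by_single_pass)  # {'a': 4, 'b': 3}
```

Both produce the same mapping; they differ only in how many times the string is scanned.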


You can also use defaultdict if you prefer:

>>> from collections import defaultdict
>>> count=defaultdict(int)
>>> for c in str1:
...    count[c]+=1
... 
>>> count
defaultdict(<type 'int'>, {'a': 4, 'b': 3})
>>> dict(count)
{'a': 4, 'b': 3}

But if you are going to import collections -- Use a Counter!

Furfural answered 3/12, 2016 at 18:13 Comment(2)
Why not just count = collections.defaultdict(int) once instead of setdefault on every iteration? – Nygaard
@MichaelKohl To avoid imports? Otherwise Counter is the better option. – Fourdrinier
9

The ideal way to do this is with collections.Counter:

>>> from collections import Counter
>>> str1 = "aabbaba"
>>> Counter(str1)
Counter({'a': 4, 'b': 3})

You cannot achieve this with a simple dict comprehension, because computing each count requires a reference to the previous count for that element, and the dictionary being built is not accessible inside the comprehension. As mentioned in Dawg's answer, as a workaround you may use str.count(e) inside the dict comprehension to count each element of the set of characters in the string. But the time complexity will be O(n*m), where n is the length of the string and m is the number of unique elements, because the complete string is traversed once for each unique element; with Counter it is O(n).
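
To illustrate why the running count is out of reach, here is a small sketch. The name seen is a hypothetical, pre-existing empty dict; the comprehension can read it, but it never sees the values it is producing itself:

```python
str1 = "aabbaba"

# A dict that exists before the comprehension runs; it stays empty throughout.
seen = {}

# Every lookup hits the empty dict, so each character gets 0 + 1.
result = {x: seen.get(x, 0) + 1 for x in str1}

print(result)  # {'a': 1, 'b': 1}
print(seen)    # {}
```

Each key is simply overwritten with 1 every time its character recurs, which is why the counts never grow.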

Sucy answered 3/12, 2016 at 18:10 Comment(0)
5

This is a nice case for collections.Counter:

>>> from collections import Counter
>>> Counter(str1)
Counter({'a': 4, 'b': 3})

It's a dict subclass, so you can work with the object much like a standard dictionary:

>>> c = Counter(str1)
>>> c['a']
4

You can also do this without the Counter class. Simple and efficient Python code for this would be:

>>> d = {}
>>> for x in str1:
...     d[x] = d.get(x, 0) + 1
... 
>>> d
{'a': 4, 'b': 3}
Cabinda answered 3/12, 2016 at 18:10 Comment(0)
3

Note that this is not the correct way to count characters: the comprehension reads the dic that existed before it ran, not the one it is building, so repeated characters are not counted more than once (and any keys of the original dic that are not in the string are lost). It does, however, answer the original question of whether if-else is possible in a comprehension, and it demonstrates how that can be done.

To answer your question: yes, it's possible, but the syntax is like this (dic must already be defined, for example as dic = {}):

dic = {x: (dic[x] + 1 if x in dic else 1) for x in str1}

The condition applies only to the value, not to the whole key:value pair.

The above can be made clearer using dict.get:

dic = {x: dic.get(x, 0) + 1 for x in str1}

0 is returned if x is not in dic.

Demo:

In [78]: s = "abcde"

In [79]: dic = {}

In [80]: dic = {x: (dic[x] + 1 if x in dic else 1) for x in s}

In [81]: dic 
Out[81]: {'a': 1, 'b': 1, 'c': 1, 'd': 1, 'e': 1}

In [82]: s = "abfg"

In [83]: dic = {x: dic.get(x, 0) + 1 for x in s}

In [84]: dic
Out[84]: {'a': 2, 'b': 2, 'f': 1, 'g': 1}
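
As a standalone illustration of the syntax point only (not of counting), the sketch below checks both forms. The keys and values are arbitrary, and compile is used just to test whether the malformed source parses:

```python
# The malformed shape from the question: a second "key:" after else.
bad_source = "{x: 1 if x == 'a' else x: 0 for x in 'ab'}"

try:
    compile(bad_source, "<example>", "eval")
    is_valid = True
except SyntaxError:
    is_valid = False

print(is_valid)  # False

# The valid shape: one key, and a conditional expression as the value.
good = {x: (1 if x == 'a' else 0) for x in "ab"}
print(good)  # {'a': 1, 'b': 0}
```
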
Smiley answered 3/12, 2016 at 18:23 Comment(7)
You mean just dic = {x: dic.get(x, 1) for x in str}? Did you define dic anywhere outside? Because within the dict comprehension it will have no reference. – Sucy
This will fail with a NameError: name 'dic' is not defined (once you fix the str to str1). – Fourdrinier
@MoinuddinQuadri Yep, just noticed it; it would've been more helpful if you had pointed it out. Fixed now. – Smiley
Now you need to reset dic if you want to start over. – Fourdrinier
@Fourdrinier This is not the right way to do it, but it answers the OP's question about if-else in comprehensions. – Smiley
@Fourdrinier That's already noted in my answer. Again, this is not the correct way to achieve the desired result, but the original question is about using if-else inside dict comprehensions, which is what I demonstrate in my answer. It's just for the sake of knowledge. – Smiley
My bad, I did not notice your note. – Fourdrinier
0

Below is a simple solution. If the key is not yet in the dictionary, it is created with a value of 1; otherwise 1 is added to its current value.

text = 'ABCDEEEEFED'

d = {}

for x in text:
    if x in d:
        d[x] = d[x] + 1
    else:
        d[x] = 1
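
For reference, here is the same loop wrapped in a function and run on the question's input. The function name count_chars is made up for this sketch:

```python
def count_chars(text):
    # Same logic as the loop above: create the key at 1, otherwise add 1.
    d = {}
    for x in text:
        if x in d:
            d[x] = d[x] + 1
        else:
            d[x] = 1
    return d


print(count_chars("aabbaba"))      # {'a': 4, 'b': 3}
print(count_chars("ABCDEEEEFED"))  # {'A': 1, 'B': 1, 'C': 1, 'D': 2, 'E': 5, 'F': 1}
```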

Hyaloid answered 27/10, 2023 at 12:20 Comment(1)
Welcome to SO. Please validate your answer by displaying the output obtained using the OP's provided input data. – Mouthy
