How do I calculate the number of times a word occurs in a sentence?

Asked 25/11, 2011 at 17:20 Answered 8/6, 2024 at 9:8

So I've been learning Python for some months now and was wondering how I would go about writing a function that will count the number of times a word occurs in a sentence. I would appreciate if someone could please give me a step-by-step method for doing this.

Audiophile answered 25/11, 2011 at 17:20 Comment(2)

http://stackoverflow.com/search?q=[python]+count+words – Sardonic 25/11, 2011 at 17:22

Define "sentence" and "word". Besides, if you've been learning for a few months, you ought to be able to start (not necessarily finish, but give it a try) writing a function on your own... – Angelika 25/11, 2011 at 17:23

Quick answer:

def count_occurrences(word, sentence):
    return sentence.lower().split().count(word)

'some string'.split() will split the string on whitespace (spaces, tabs and linefeeds) into a list of word-ish things. Then ['some', 'string'].count(item) returns the number of times item occurs in the list.

That doesn't handle removing punctuation. You could do that using string.maketrans and str.translate. (The code below is Python 2; string.lowercase and string.maketrans do not exist in Python 3.)

# Make collection of chars to keep (don't translate them)
import string
keep = string.lowercase + string.digits + string.whitespace
table = string.maketrans(keep, keep)
delete = ''.join(set(string.printable) - set(keep))

def count_occurrences(word, sentence):
    return sentence.lower().translate(table, delete).split().count(word)

The key here is that we've constructed the string delete so that it contains all the printable ASCII characters except lowercase letters, digits and whitespace. In this case str.translate takes a translation table that leaves the string unchanged, plus a string of characters to strip out.
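For Python 3, a rough equivalent might look like the sketch below. It assumes ASCII input and relies on the three-argument form of str.maketrans, whose third argument lists the characters to delete; str.translate then takes only the table.

```python
import string

# Characters to keep: lowercase letters, digits and whitespace.
keep = string.ascii_lowercase + string.digits + string.whitespace
# Every other printable ASCII character gets deleted.
delete = ''.join(set(string.printable) - set(keep))
# Third argument of str.maketrans: characters that are removed.
table = str.maketrans('', '', delete)

def count_occurrences(word, sentence):
    return sentence.lower().translate(table).split().count(word)

print(count_occurrences('sentence', 'This sentence is a simple sentence.'))  # 2
```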

Naturalist answered 25/11, 2011 at 17:30 Comment(2)

string.translate is technically in the deprecated section of the documentation, so I would be wary of using that function as a habit. – Cida 25/11, 2011 at 17:51

You're right - I changed the text to refer to str.translate, which is the blessed way of doing this. – Naturalist 25/11, 2011 at 17:53

wilberforce has the quick, correct answer, and I'll give the long-winded 'how to get to that conclusion' answer.

First, here are some tools to get you started, and some questions you need to ask yourself.

You need to read the section on Sequence Types in the Python docs, because it is your best friend for solving this problem. Seriously, read it. Once you have read that, you should have some ideas. For example, you can take a long string and break it up using the split() method. To be explicit:

mystring = "This sentence is a simple sentence."
result = mystring.split()
print(result)
print("The total number of words is: " + str(len(result)))
print("The word 'sentence' occurs: " + str(result.count("sentence")))

This takes the input string, splits it on any whitespace, and will give you:

['This', 'sentence', 'is', 'a', 'simple', 'sentence.']
The total number of words is: 6
The word 'sentence' occurs: 1

Now note here that you do have the period still at the end of the second 'sentence'. This is a problem because 'sentence' is not the same as 'sentence.'. If you are going to go over your list and count words, you need to make sure that the strings are identical. You may need to find and remove some punctuation.

A naive approach to this might be:

no_period_string = mystring.replace(".", " ")
print no_period_string

This gives a period-less sentence:

"This sentence is a simple sentence"

You also need to decide if your input is going to be just a single sentence, or maybe a paragraph of text. If you have many sentences in your input, you might want to find a way to break them up into individual sentences by finding the periods (or question marks, or exclamation marks, or other punctuation that ends a sentence). Once you find out where in the string the 'sentence terminator' is, you could split up the string at that point.
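As a rough sketch of that idea, the re module can split text at sentence terminators. The terminator set (., ! and ?) and the function name split_sentences are assumptions made for illustration; real text with abbreviations or decimal numbers needs more care.

```python
import re

def split_sentences(text):
    # Split wherever one or more of . ! ? occurs, then drop empty pieces.
    parts = re.split(r'[.!?]+', text)
    return [p.strip() for p in parts if p.strip()]

text = "Is this a sentence? Yes. This sentence is a simple sentence!"
for s in split_sentences(text):
    # Count the word 'sentence' within each individual sentence.
    print(s, '->', s.lower().split().count('sentence'))
```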

You should give this a try yourself - hopefully I've peppered in enough hints to get you to look at some specific functions in the documentation.

Cida answered 25/11, 2011 at 17:43 Comment(2)

That answers 'How many words are in this sentence?', but not 'How many times does this word occur in this sentence?'. :) – Naturalist 25/11, 2011 at 18:0

Oh dang. Reading fail. Fixing. – Cida 25/11, 2011 at 18:6

Simplest way:

def count_occurrences(word, sentence):
    return sentence.count(word)
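
Be aware that str.count counts substring matches, so 'the' is also counted inside 'other' or 'there'. If whole words are wanted, one option is a regular expression with \b word boundaries. The function name below is hypothetical, and the case-insensitive matching is an added assumption.

```python
import re

def count_whole_word(word, sentence):
    # \b matches at word boundaries; re.escape guards against regex
    # metacharacters in the search word.
    pattern = r'\b' + re.escape(word) + r'\b'
    return len(re.findall(pattern, sentence, flags=re.IGNORECASE))

sentence = "The other day the cat sat there."
print(sentence.count("the"))              # 3 (substring matches)
print(count_whole_word("the", sentence))  # 2 (whole words only)
```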

Bisayas answered 4/12, 2018 at 6:46 Comment(0)

text=input("Enter your sentence:")
print("'the' appears", text.count("the"),"times")

This is the simplest way to do it.

Incretion answered 30/9, 2021 at 9:3 Comment(0)

The problem with the count() method is that it does not always give the expected number of occurrences when matches overlap. For example:

print('banana'.count('ana'))

Output:

1

but 'ana' occurs twice in 'banana'.

To solve this issue, I used:

def total_occurrence(string, word):
    count = 0
    temp_string = string
    while word in temp_string:
        count += 1
        # Restart the search one character past the start of the last match
        temp_string = temp_string[temp_string.index(word)+1:]
    return count
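
The same overlapping count can be sketched without re-slicing the string, by passing a start index to str.find. The function name is hypothetical, and an empty search word is rejected up front because it would otherwise match at every position.

```python
def count_overlapping(text, word):
    if not word:
        raise ValueError("word must be non-empty")
    count = 0
    start = text.find(word)
    while start != -1:
        count += 1
        # Advance by one character so overlapping matches are found.
        start = text.find(word, start + 1)
    return count

print(count_overlapping('banana', 'ana'))  # 2
```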

Contented answered 24/11, 2021 at 14:20 Comment(0)

I converted the sentence to lowercase to make the search case-insensitive. The lower method has to come first, because it works on a string and would fail on the list that split returns. The split method converts the string to a list of individual words. The count method then counts the number of times 'boy' occurs in that list.

def countBoy(word):
    return word.lower().split().count("boy")
    
result = countBoy("That boy gave the little boy a pen, and the older boy a pencil.")
print(f"Number of times of occurrence of 'boy' is {result}")
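
Splitting on whitespace leaves punctuation attached to words, so 'boy,' is not counted as 'boy'. A possible variant, sketched here with a hypothetical function name, strips leading and trailing punctuation from each token using string.punctuation before counting.

```python
import string

def count_word(word, sentence):
    # Lowercase, split on whitespace, then strip punctuation from both
    # ends of each token before comparing.
    tokens = [t.strip(string.punctuation) for t in sentence.lower().split()]
    return tokens.count(word.lower())

text = "That boy gave a pen to the little boy, and the older boy a pencil."
print(text.lower().split().count("boy"))  # 2, because 'boy,' is missed
print(count_word("boy", text))            # 3
```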

Transverse answered 8/6, 2024 at 9:8 Comment(1)

The answer is useful but it's not robust to punctuation. For example, this does not provide the correct results: result = countBoy("That boy gave a pen to the little boy, and the older boy a pencil.") – Newborn 8/6, 2024 at 17:55

You can do it like this:

def countWord(word):
    # Counts occurrences of the literal substring 'word' in the argument
    numWord = 0
    for i in range(1, len(word)-1):
        if word[i-1:i+3] == 'word':
            numWord += 1
    print('Number of times "word" occurs is: {}'.format(numWord))

Then calling the function on a string:

countWord('wordetcetcetcetcetcetcetcword')

will print: Number of times "word" occurs is: 2

Nihilism answered 22/5, 2016 at 4:12 Comment(0)

def check_Search_WordCount(mySearchStr, mySentence):
    len_mySentence = len(mySentence)
    len_Sentence_without_Find_Word = len(mySentence.replace(mySearchStr,""))
    len_Remaining_Sentence = len_mySentence - len_Sentence_without_Find_Word
    count = len_Remaining_Sentence/len(mySearchStr)
    return int(count)

Abrahamabrahams answered 9/8, 2018 at 8:33 Comment(2)

Only posting the code without explanation is not very helpful for the person who asked. – Gayn 9/8, 2018 at 8:37

Hi, this is simple...Step1) Take total length of string - "len_mySentence" Step2) Take string length without Search Word - "len_Sentence_without_Find_Word" Step3) Subtraction of both lengths - "len_Remaining_Sentence" Step4) Finally divide len_Remaining_Sentence with Search Word length........ – Abrahamabrahams 9/8, 2018 at 8:47

I assume that you only know about Python strings and for loops.

def count_occurences(s,word):

    count = 0
    for i in range(len(s)): 
        if s[i:i+len(word)] == word:
            count += 1    
    return count

mystring = "This sentence is a simple sentence."
myword = "sentence"
print(count_occurences(mystring,myword))

Explanation:

s[i:i+len(word)]: slices the string s to extract a substring with the same length as word (the argument).
count += 1: increments the counter whenever the slice matches word.

Bimestrial answered 12/1, 2019 at 18:17 Comment(1)

This is needlessly complicated though. – Collective 4/11, 2020 at 13:42
