Count word frequency in a text? [duplicate]

Possible Duplicate:
php: sort and count instances of words in a given string

I am looking to write a PHP function which takes a string as input, splits it into words, and then returns an array of words sorted by the frequency of occurrence of each word.

What's the most algorithmically efficient way of accomplishing this?

Casias answered 12/1, 2011 at 15:19 Comment(3)
Extroversion: I expect it would depend on the size of the text. In any event, there are piles of such parsers out there, and the most efficient way of programming is to reuse rather than write your own. Just google 'word frequency counter php'.
Flabellum: It depends on what you mean by 'word' too, though. Does "'s" count as a word when it's a possessive marker? What about when it is a contraction for "is"? How about other contractions? If you're just interested in splitting up by whitespace or hyphens (like T9 on your phone does) then you're probably best off using the built-in functions like Gordon suggested below.
Mavilia: Two previous questions from Stack Overflow on the same topic that should be useful: "Count how often the word occurs in the text in PHP" (#2123736) and "php: sort and count instances of words in a given string" (#2985286).

Your best bet is to combine str_word_count() and array_count_values():

Example

$words = 'A string with certain words occuring more often than other words.';
print_r( array_count_values(str_word_count($words, 1)) );

Output

Array
(
    [A] => 1
    [string] => 1
    [with] => 1
    [certain] => 1
    [words] => 2
    [occuring] => 1
    [more] => 1
    [often] => 1
    [than] => 1
    [other] => 1
)
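
The question also asks for the words sorted by frequency, which the example above does not do. A minimal sketch of that extra step, assuming case-insensitive counting is wanted; the function name wordFrequencies is hypothetical and not part of PHP:

```php
<?php
// Hypothetical helper (not from the original answer): counts words
// case-insensitively and orders them from most to least frequent.
function wordFrequencies(string $text): array
{
    // With format 1, str_word_count() returns the words as a list.
    $words = str_word_count(strtolower($text), 1);

    // Map each distinct word to its number of occurrences.
    $counts = array_count_values($words);

    // Sort by count, highest first, keeping the word => count association.
    arsort($counts);

    return $counts;
}

print_r(wordFrequencies('A string with certain words occuring more often than other words.'));
```

With the sample string, "words" comes first with a count of 2. Counting is a single pass over the text, so the sort over the distinct words is likely to be the dominant cost for large inputs.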

Marking this as community wiki because the question is a duplicate of at least two other questions that contain the same answer.

Eastman answered 12/1, 2011 at 15:20 Comment(0)
