How do I only get the first 10 words from a string?
implode(' ', array_slice(explode(' ', $sentence), 0, 10));
To add support for other word breaks like commas and dashes, preg_match
gives a quick way and doesn't require splitting the string:
function get_words($sentence, $count = 10) {
    preg_match("/(?:\w+(?:\W+|$)){0,$count}/", $sentence, $matches);
    return $matches[0];
}
As Pebbl mentions, PHP doesn't handle UTF-8 or Unicode all that well, so if that is a concern you can replace \w with [^\s,\.;\?\!] and \W with [\s,\.;\?\!].
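As a minimal sketch of that substitution: the helper below (get_words_explicit is a hypothetical name, not from the original answer) swaps \w and \W for the explicit classes and adds the u modifier, on the assumption that the input is valid UTF-8. It also strips the separator that the match leaves after the last word, which the original function does not do.

```php
<?php
// Hypothetical variant of get_words() that spells out what separates words.
// Assumes $sentence is valid UTF-8 (required by the "u" modifier) and $count >= 0.
function get_words_explicit(string $sentence, int $count = 10): string
{
    $gap  = '[\s,.;?!]';   // characters treated as gaps between words
    $word = '[^\s,.;?!]';  // anything else is part of a word

    $pattern = '/(?:' . $word . '+(?:' . $gap . '+|$)){0,' . $count . '}/u';
    if (preg_match($pattern, $sentence, $matches) !== 1) {
        return ''; // e.g. the input was not valid UTF-8
    }

    // The match ends with the separator after the last word; remove it.
    return preg_replace('/' . $gap . '+$/u', '', $matches[0]);
}

echo get_words_explicit('Это не те дроиды, которые вы ищете', 5), "\n";
// prints: Это не те дроиды, которые
```

Like the original, this returns an empty string when the sentence starts with a separator, because the match is anchored on a word at position 0.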
This does not work with a string that contains HTML such as <p>. – Knell
strip_tags. – Words
Simply splitting on spaces will give incorrect results if there is an unexpected character in place of a space in the sentence, or if the sentence contains several consecutive spaces.
The following version will work no matter what kind of "space" you use between words and can be easily extended to handle other characters... it currently supports any white space character plus , . ; ? !
function get_snippet( $str, $wordCount = 10 ) {
    return implode(
        '',
        array_slice(
            preg_split(
                '/([\s,\.;\?\!]+)/',
                $str,
                $wordCount*2+1,
                PREG_SPLIT_DELIM_CAPTURE
            ),
            0,
            $wordCount*2-1
        )
    );
}
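To see why the slice length is $wordCount*2-1, it may help to look at what preg_split returns with PREG_SPLIT_DELIM_CAPTURE: words and the gaps between them alternate, so N words occupy 2N-1 elements. A small illustration with a made-up input string:

```php
<?php
// Illustration only: the input string is invented for this example.
$wordCount = 2;
$parts = preg_split(
    '/([\s,\.;\?\!]+)/',
    'one, two three',
    $wordCount * 2 + 1,
    PREG_SPLIT_DELIM_CAPTURE
);
// $parts is ['one', ', ', 'two', ' ', 'three']: word, gap, word, gap, word.

// Two words plus the one gap between them is 2 * 2 - 1 = 3 elements.
$snippet = implode('', array_slice($parts, 0, $wordCount * 2 - 1));
echo $snippet, "\n"; // prints: one, two
```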
Regular expressions are perfect for this issue, because you can easily make the code as flexible or strict as you like. You do have to be careful however. I specifically approached the above targeting the gaps between words — rather than the words themselves — because it is rather difficult to state unequivocally what will define a word.
Take the \w word-character class, or its inverse \W. I rarely rely on these, mainly because, depending on the software you are using (like certain versions of PHP), they don't always include UTF-8 or Unicode characters.
In regular expressions it is better to be specific at all times, so that your expressions can handle things like the following, no matter where they are run:
echo get_snippet('Это не те дроиды, которые вы ищете', 5);
/// outputs: Это не те дроиды, которые
Avoiding splitting could be worthwhile, however, in terms of performance. So you could use Kelly's updated approach but switch \w for [^\s,\.;\?\!]+ and \W for [\s,\.;\?\!]+. Personally, though, I like the simplicity of the splitting expression used above; it is easier to read and therefore to modify. The stack of PHP functions, however, is a bit ugly :)
Apply trim() to your $str before you process it. This way you eliminate any whitespace at either end. This helps if you want to append an ellipsis only when the resulting string is a subset of the original. – Packthread
http://snipplr.com/view/8480/a-php-function-to-return-the-first-n-words-from-a-string/
function shorten_string($string, $wordsreturned)
{
    $retval = $string; // Just in case of a problem
    $array = explode(" ", $string);
    /* Already short enough, return the whole thing */
    if (count($array) <= $wordsreturned)
    {
        $retval = $string;
    }
    /* Need to chop off some words */
    else
    {
        array_splice($array, $wordsreturned);
        $retval = implode(" ", $array)." ...";
    }
    return $retval;
}
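Because explode(" ", ...) splits on single spaces only, leading, trailing, or repeated spaces show up as empty "words" in the count. A minimal sketch of a variant that normalizes whitespace first and appends the ellipsis only when something was actually cut (shorten_string_normalized is a hypothetical name, and treating any run of whitespace as one separator is an assumption):

```php
<?php
// Hypothetical variant of shorten_string(); assumes $wordsreturned >= 0.
function shorten_string_normalized(string $string, int $wordsreturned): string
{
    // Split on runs of whitespace and drop the empty pieces at the edges.
    $words = preg_split('/\s+/', $string, -1, PREG_SPLIT_NO_EMPTY);

    if (count($words) <= $wordsreturned) {
        return implode(' ', $words); // already short enough
    }

    return implode(' ', array_slice($words, 0, $wordsreturned)) . ' ...';
}

echo shorten_string_normalized('  one  two three ', 2), "\n"; // prints: one two ...
```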
I suggest using str_word_count:
<?php
$str = "Lorem ipsum dolor sit amet,
consectetur adipiscing elit";
print_r(str_word_count($str, 1));
?>
The above example will output:
Array
(
[0] => Lorem
[1] => ipsum
[2] => dolor
[3] => sit
[4] => amet
[5] => consectetur
[6] => adipiscing
[7] => elit
)
Then use a loop to get the words you want.
Source: http://php.net/str_word_count
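The loop mentioned above might look like the following sketch (first_words_via_loop is a hypothetical name). Note that str_word_count returns the words without the punctuation between them, as the output above shows for "amet", so the comma does not survive.

```php
<?php
// Hypothetical helper: collect at most $limit words from str_word_count().
function first_words_via_loop(string $str, int $limit = 10): string
{
    $picked = [];
    foreach (str_word_count($str, 1) as $word) {
        if (count($picked) >= $limit) {
            break; // enough words collected
        }
        $picked[] = $word;
    }
    return implode(' ', $picked);
}

$str = "Lorem ipsum dolor sit amet,
consectetur adipiscing elit";
echo first_words_via_loop($str, 6), "\n"; // prints: Lorem ipsum dolor sit amet consectetur
```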
To select the first 10 words of a given text, you can use the following function:
function first_words($text, $count=10)
{
    $words = explode(' ', $text);
    $result = '';
    for ($i = 0; $i < $count && isset($words[$i]); $i++) {
        // Put the space back between words; without it they run together
        $result .= ($i > 0 ? ' ' : '') . $words[$i];
    }
    return $result;
}
This can easily be done using str_word_count():
$first10words = implode(' ', array_slice(str_word_count($sentence,1), 0, 10));
This might help you. A function to return the first N words:
function getNWordsFromString($text, $numberOfWords = 6)
{
    if ($text != null)
    {
        $textArray = explode(" ", $text);
        if (count($textArray) > $numberOfWords)
        {
            // Longer than the limit: keep the first N words and mark the cut
            return implode(" ", array_slice($textArray, 0, $numberOfWords))."...";
        }
        return $text;
    }
    return "";
}
Try this:
$str = 'Lorem ipsum dolor sit amet,consectetur adipiscing elit. Mauris ornare luctus diam sit amet mollis.';
$arr = explode(" ", str_replace(",", ", ", $str));
// Stop at the end of the array if the string has fewer than 10 words
for ($index = 0; $index < min(10, count($arr)); $index++) {
    echo $arr[$index] . " ";
}
I know this is a late answer, but newcomers can choose whichever answer suits them.
function get_first_num_of_words($string, $num_of_words)
{
    $string = preg_replace('/\s+/', ' ', trim($string));
    $words = explode(" ", $string); // an array
    // if number of words you want to get is greater than number of words in the string
    if ($num_of_words > count($words)) {
        // then use number of words in the string
        $num_of_words = count($words);
    }
    $new_string = "";
    for ($i = 0; $i < $num_of_words; $i++) {
        $new_string .= $words[$i] . " ";
    }
    return trim($new_string);
}
Use it like this:
echo get_first_num_of_words("Lorem ipsum dolor sit amet consectetur adipisicing elit. Aliquid, illo?", 5);
Output: Lorem ipsum dolor sit amet
This function also works very well with Unicode text such as Arabic.
echo get_first_num_of_words("نموذج لنص عربي الغرض منه توضيح كيف يمكن استخلاص أول عدد معين من الكلمات الموجودة فى نص معين.", 100);
Output: نموذج لنص عربي الغرض منه توضيح كيف يمكن استخلاص أول عدد معين من الكلمات الموجودة فى نص معين.
This is exactly what we were searching for. Just cut and paste it into your program and run it.
function shorten_string($string, $wordsreturned)
/* Returns the first $wordsreturned out of $string. If string
   contains fewer words than $wordsreturned, the entire string
   is returned.
*/
{
    $retval = $string; // Just in case of a problem
    $array = explode(" ", $string);
    if (count($array) <= $wordsreturned)
    /* Already short enough, return the whole thing
    */
    {
        $retval = $string;
    }
    else
    /* Need to chop off some words
    */
    {
        array_splice($array, $wordsreturned);
        $retval = implode(" ", $array)." ...";
    }
    return $retval;
}
Then call the function in your code like this:
$data_itr = shorten_string($Itinerary,25);
I do it this way:
function trim_by_words($string, $word_count = 10) {
    $string = explode(' ', $string);
    if (empty($string) == false) {
        $string = array_chunk($string, $word_count);
        $string = $string[0];
    }
    $string = implode(' ', $string);
    return $string;
}
It's UTF-8 compatible...
This might help you. A function to return the first 10 words:
function num_of_word($text, $numb) {
    $wordsArray = explode(" ", $text);
    $parts = array_chunk($wordsArray, $numb);
    $final = implode(" ", $parts[0]);
    // A second chunk means the text was cut, so mark it with an ellipsis
    if (isset($parts[1])) {
        $final = $final." ...";
    }
    return $final;
}
echo num_of_word($text, 10);
Instead of generating an array of N words, then truncating the array, then re-imploding the words, just truncate the input string after the Nth word.
echo preg_replace('/(?:\s*\S+){10}\K.*/', '', $string);
The pattern matches N sequences of zero or more whitespace characters followed by one or more non-whitespace characters; then \K restarts the full-string match (effectively "releasing" the characters matched so far), and .* matches the rest of the string. Whatever is matched is replaced with an empty string.
This solution will ensure that the output string does not have more than N words. It is possible that the string has fewer words than N, so be aware that no mutation will take place and that if that string has a trailing whitespace -- that whitespace will not be removed.
To ensure that leading and trailing whitespace is removed, adjust the pattern to capture zero to N words delimited by whitespace.
$string = ' I would like to know ';
var_export(
    preg_replace('/\s*(\S*(?:\s+\S+){0,9}).*/', '$1', $string)
);
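Both patterns hard-code the word limit (10 in the first, 0 to 9 in the second). As a sketch, they could be wrapped in functions that take N as a parameter; the function names here are hypothetical, and N is assumed to be at least 1.

```php
<?php
// Cut everything after the Nth word; text before \K (including leading whitespace) is kept.
function truncate_after_words(string $string, int $n): string
{
    return preg_replace('/(?:\s*\S+){' . $n . '}\K.*/', '', $string);
}

// Same limit, but leading and trailing whitespace is dropped as well.
function first_words_trimmed(string $string, int $n): string
{
    return preg_replace('/\s*(\S*(?:\s+\S+){0,' . ($n - 1) . '}).*/', '$1', $string);
}

var_export(truncate_after_words('one two three four', 2)); // 'one two'
echo "\n";
var_export(first_words_trimmed(' I would like to know ', 10)); // 'I would like to know'
echo "\n";
```

Since . does not match newlines by default, both sketches assume single-line input, as the original patterns do.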
s($str)->words(10) may be helpful, as found in this standalone library. – Pessa