Truncate string on whole words in .NET C#
T

10

67

I am trying to truncate some long text in C#, but I don't want my string to be cut off part way through a word. Does anyone have a function that I can use to truncate my string at the end of a word?

E.g:

"This was a long string..."

Not:

"This was a long st..."
Triplicate answered 23/10, 2009 at 14:37 Comment(3)
Could you give your current solution for truncating?Heroine
@Cloud Just .Substring(0, <number of characters>)Triplicate
Well if <number of characters> is higher than the actual string, substring will throw an exception, requiring an extra check.Socialminded
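The exception mentioned in the last comment occurs because Substring(0, count) throws an ArgumentOutOfRangeException when count exceeds the string's length. A minimal sketch of the extra check, assuming a hypothetical helper named SafePrefix (the name is made up for illustration and is not part of .NET):

```csharp
using System;

public static class SubstringExample
{
    // Hypothetical helper: returns at most maxLength leading characters.
    public static string SafePrefix(string input, int maxLength)
    {
        if (input == null)
            return null;
        if (maxLength < 0)
            throw new ArgumentOutOfRangeException(nameof(maxLength));

        // Math.Min keeps the count within the string, so Substring cannot throw.
        return input.Substring(0, Math.Min(maxLength, input.Length));
    }
}
```

This still cuts mid-word; it only removes the exception for short inputs.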
T
88

Thanks for your answer Dave. I've tweaked the function a bit and this is what I'm using ... unless there are any more comments ;)

public static string TruncateAtWord(this string input, int length)
{
    if (input == null || input.Length <= length)
        return input;
    int iNextSpace = input.LastIndexOf(" ", length, StringComparison.Ordinal);
    return string.Format("{0}…", input.Substring(0, (iNextSpace > 0) ? iNextSpace : length).Trim());
}
Triplicate answered 23/10, 2009 at 15:7 Comment(7)
Further to this, I am also now calling another string utility function from within this one, which strips out any HTML tags (using RegEx). This minimises the risk of broken HTML as a result of truncation, as all string will be in plain text.Triplicate
Note that this method looks for the first space AFTER the specified length value, almost always causing the resulting string to be longer than the value. To find the last space prior to length, simply substitute input.LastIndexOf(" ", length) when calculating iNextSpace.Piselli
+100 for CBono's comment - this needs to be before! In the eventofareallylongwordlikethisoneyouwillhaveaverylongstringthatisfarbeyondyourdesiredlength!Peppergrass
Note also that the ellipsis (three periods) appended to the end of the truncated string will push the string over the maximum length in certain cases.Infidel
There is an ellipsis character that you could add instead: '…'Archaeornis
In line 5 I would suggest to use: int iNextSpace = input.LastIndexOf(" ", length, System.StringComparison.Ordinal); for language specific charactersCarthage
As @GoranŽuri recommended (and ReSharper also gave me the suggestion/warning), I added StringComparison.Ordinal in line 5 and tested it to make sure it works as expected.Toddy
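Pulling the comments above together (search backwards for the space, and count the ellipsis against the limit), one possible variant might look like the sketch below. The name TruncateAtWordBounded and the ellipsis parameter are assumptions made for illustration, not part of the original answer.

```csharp
using System;

public static class TruncateExample
{
    // Hypothetical helper: the result, ellipsis included, never exceeds maxLength.
    public static string TruncateAtWordBounded(string input, int maxLength, string ellipsis = "…")
    {
        if (input == null || input.Length <= maxLength)
            return input;

        ellipsis = ellipsis ?? string.Empty;

        // Characters left for text once the ellipsis is accounted for.
        int budget = maxLength - ellipsis.Length;
        if (budget <= 0)
            return ellipsis.Substring(0, Math.Max(0, maxLength));

        // Search backwards from index 'budget'; a space found exactly there
        // means a word ends right at the budget.
        int cut = input.LastIndexOf(' ', budget);

        // No usable space (one very long word): fall back to a hard cut.
        if (cut <= 0)
            cut = budget;

        return input.Substring(0, cut).TrimEnd() + ellipsis;
    }
}
```

With the question's example, TruncateAtWordBounded("This was a long string that goes on", 20) gives "This was a long…", which is 16 characters. A single word longer than the limit is cut hard rather than returned whole, which is a design choice you may want to change.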
R
99

Try the following. It is pretty rudimentary. Just finds the first space starting at the desired length.

public static string TruncateAtWord(this string value, int length) {
    if (value == null || value.Length < length || value.IndexOf(" ", length) == -1)
        return value;

    return value.Substring(0, value.IndexOf(" ", length));
}
Redress answered 23/10, 2009 at 14:40 Comment(7)
Perfect! And not a regex in sight :)Triplicate
It might make sense to find the first space BEFORE the desired length? Otherwise, you have to guess at what the desired length?Mezzanine
Also should be -1 for not using regex ;)Photocomposition
@mlseeves Good point but I'm not too worried in this instance, as this is just a vanity function, so there is no fixed length cut off.Triplicate
The @string usage is uncalled-for: it's unnecessary and confusing in this instance. The parameter could as easily have been named str. If not for that, I would have upvoted this answer.Abattoir
I don't agree on the use of the abbreviation "str" - I think abbreviations always make code less readable. +1 for the comment to use "value" instead.Haletky
This answer is better than other answers, just needs one change tho, instead of IndexOf you should use LastIndexOf, otherwise the method's output could be longer than the lengthRecreant
C
5

My contribution:

public static string TruncateAtWord(string text, int maxCharacters, string trailingStringIfTextCut = "&hellip;")
{
    if (text == null || (text = text.Trim()).Length <= maxCharacters) 
      return text;

    int trailLength = trailingStringIfTextCut.StartsWith("&") ? 1 
                                                              : trailingStringIfTextCut.Length; 
    maxCharacters = maxCharacters - trailLength >= 0 ? maxCharacters - trailLength 
                                                     : 0;
    int pos = text.LastIndexOf(" ", maxCharacters);
    if (pos >= 0)
        return text.Substring(0, pos) + trailingStringIfTextCut;

    return string.Empty;
}

This is what I use in my projects, with optional trailing. Text will never exceed the maxCharacters + trailing text length.

Cracking answered 12/9, 2012 at 16:21 Comment(0)
M
4

If you are using Windows Forms, the Graphics.DrawString method accepts a StringFormat whose Trimming option specifies how the string is truncated if it does not fit into the specified area. This handles adding the ellipsis as necessary.

http://msdn.microsoft.com/en-us/library/system.drawing.stringtrimming.aspx

Mezzanine answered 23/10, 2009 at 14:49 Comment(1)
This is for an ASP.NET page, but I do some WinForms stuff too, so good to know!Triplicate
S
3

I took your approach a little further:

public string TruncateAtWord(string value, int length)
{
    if (value == null || value.Trim().Length <= length)
        return value;

    int index = value.Trim().LastIndexOf(" ");

    while ((index + 3) > length)
        index = value.Substring(0, index).Trim().LastIndexOf(" ");

    if (index > 0)
        return value.Substring(0, index) + "...";

    return value.Substring(0, length - 3) + "...";
}

I'm using this to truncate tweets.

Sharla answered 24/2, 2012 at 17:37 Comment(1)
I would consider extracting the "..." into a constant, because if you decide to change it you have to update it in 4 places now (if you include the number 3)Celtic
Z
3

This solution works too (takes first 10 words from myString):

String.Join(" ", myString.Split(' ').Take(10))
Zigzag answered 14/3, 2016 at 11:5 Comment(1)
This is actually pretty neat. There are some scenarios it doesn't cater for (word. for example), but generally a nice readable approach.Physicality
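A slightly more defensive take on the word-count idea, as a sketch only: it splits on any whitespace, ignores repeated separators, and appends an ellipsis only when words were actually dropped. The name TakeWords and the ellipsis parameter are hypothetical. It still treats punctuation as part of the word, so the "word." case from the comment above is not addressed.

```csharp
using System;
using System.Linq;

public static class WordLimitExample
{
    // Hypothetical helper: keeps the first maxWords whitespace-separated words.
    public static string TakeWords(string input, int maxWords, string ellipsis = "...")
    {
        if (string.IsNullOrWhiteSpace(input) || maxWords <= 0)
            return string.Empty;

        // A null separator array splits on any whitespace; empty entries are dropped.
        string[] words = input.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);

        if (words.Length <= maxWords)
            return string.Join(" ", words);

        return string.Join(" ", words.Take(maxWords)) + ellipsis;
    }
}
```

Note that runs of whitespace in the input are collapsed to single spaces in the output.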
H
2

Taking into account more than just a blank space separator (e.g. words can be separated by periods followed by newlines, followed by tabs, etc.), and several other edge cases, here is an appropriate extension method:

    public static string GetMaxWords(this string input, int maxWords, string truncateWith = "...", string additionalSeparators = ",-_:")
    {
        int words = 1;
        bool IsSeparator(char c) => Char.IsSeparator(c) || additionalSeparators.Contains(c);

        IEnumerable<char> IterateChars()
        {
            yield return input[0];

            for (int i = 1; i < input.Length; i++)
            {
                if (IsSeparator(input[i]) && !IsSeparator(input[i - 1]))
                    if (words == maxWords)
                    {
                        foreach (char c in truncateWith)
                            yield return c;

                        break;
                    }
                    else
                        words++;

                yield return input[i];
            }
        }

        return !string.IsNullOrEmpty(input)
            ? new String(IterateChars().ToArray())
            : String.Empty;
    }
Hippodrome answered 25/5, 2020 at 11:0 Comment(1)
This is the only version that solved it for me, handling quotes... eg.. "User Updated 'Appointment Date' to 2020-10-28T12:00:00.000Z from 2020-10-28T04:00:00.000Z"Titre
N
1

Simplified, added a truncation-characters option, and made it an extension method.

    public static string TruncateAtWord(this string value, int maxLength)
    {
        if (value == null || value.Trim().Length <= maxLength)
            return value;

        string ellipse = "...";
        char[] truncateChars = new char[] { ' ', ',' };
        int index = value.Trim().LastIndexOfAny(truncateChars);

        while ((index + ellipse.Length) > maxLength)
            index = value.Substring(0, index).Trim().LastIndexOfAny(truncateChars);

        if (index > 0)
            return value.Substring(0, index) + ellipse;

        return value.Substring(0, maxLength - ellipse.Length) + ellipse;
    }
Neptunian answered 24/7, 2014 at 10:21 Comment(1)
this simply does not work as intended. Please at least do a sanity check before posting anything here.Ballance
A
1

Here's what I came up with. It also returns the rest of the sentence, split into chunks.

public static List<string> SplitTheSentenceAtWord(this string originalString, int length)
    {
        try
        {
            List<string> truncatedStrings = new List<string>();
            if (originalString == null || originalString.Trim().Length <= length)
            {
                truncatedStrings.Add(originalString);
                return truncatedStrings;
            }
            int index = originalString.Trim().LastIndexOf(" ");

            while ((index + 3) > length)
                index = originalString.Substring(0, index).Trim().LastIndexOf(" ");

            if (index > 0)
            {
                string retValue = originalString.Substring(0, index) + "...";
                truncatedStrings.Add(retValue);

                string shortWord2 = originalString;
                if (retValue.EndsWith("..."))
                {
                    shortWord2 = retValue.Replace("...", "");
                }
                shortWord2 = originalString.Substring(shortWord2.Length);

                if (shortWord2.Length > length) //truncate it further
                {
                    List<string> retValues = SplitTheSentenceAtWord(shortWord2.TrimStart(), length);
                    truncatedStrings.AddRange(retValues);
                }
                else
                {
                    truncatedStrings.Add(shortWord2.TrimStart());
                }
                return truncatedStrings;
            }
            var retVal_Last = originalString.Substring(0, length - 3);
            truncatedStrings.Add(retVal_Last + "...");
            if (originalString.Length > length)//truncate it further
            {
                string shortWord3 = originalString;
                if (originalString.EndsWith("..."))
                {
                    shortWord3 = originalString.Replace("...", "");
                }
                shortWord3 = originalString.Substring(retVal_Last.Length);
                List<string> retValues = SplitTheSentenceAtWord(shortWord3.TrimStart(), length);

                truncatedStrings.AddRange(retValues);
            }
            else
            {
                truncatedStrings.Add(retVal_Last + "...");
            }
            return truncatedStrings;
        }
        catch
        {
            return new List<string> { originalString };
        }
    }
Amoebaean answered 25/7, 2015 at 22:52 Comment(0)
B
-1

I use this

public string Truncate(string content, int length)
    {
        try
        {
            return content.Substring(0,content.IndexOf(" ",length)) + "...";
        }
        catch
        {
            return content;
        }
    }
Backandforth answered 10/8, 2014 at 6:50 Comment(0)
