C# Regex to match the word with dot

3

39

"The quick brown fox jumps over the lazy dog" is an English-language pangram, alphabet! that is, a phrase that contains all of the letters of the alphabet. It has been used to test typewriters alphabet. and computer keyboards, and in other applications involving all of the letters in the English alphabet.

I need to match the word "alphabet." (including the trailing dot) with a regex. The text above contains 3 instances. The match should not include "alphabet!". I tried this regex:

 MatchCollection match = Regex.Matches(entireText, "alphabet."); 

but this returns 4 matches, including "alphabet!". How can I exclude that one and match only "alphabet."?

Bolivar answered 17/4, 2011 at 22:34 Comment(0)
54

. is a special character in regex that matches any single character (by default, anything except a newline). Try escaping it:

 MatchCollection match = Regex.Matches(entireText, @"alphabet\.");
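To make the difference concrete, here is a minimal sketch that counts matches for the unescaped and the escaped pattern. It assumes the sample text from the question (without the quotation mark); the class name DotDemo and the Count helper are made up for illustration. Regex.Escape is shown as one way to build the escaped pattern from a literal string.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DotDemo
{
    // Sample text from the question.
    public const string Text = "The quick brown fox jumps over the lazy dog is an English-language pangram, alphabet! that is, a phrase that contains all of the letters of the alphabet. It has been used to test typewriters alphabet. and computer keyboards, and in other applications involving all of the letters in the English alphabet.";

    // Hypothetical helper: number of matches of the pattern in the sample text.
    public static int Count(string pattern) => Regex.Matches(Text, pattern).Count;

    public static void Main()
    {
        // Unescaped dot matches any character, so "alphabet!" is counted too.
        Console.WriteLine(Count("alphabet."));                 // 4

        // Escaped dot matches only a literal period.
        Console.WriteLine(Count(@"alphabet\."));               // 3

        // Regex.Escape turns the literal text into an equivalent escaped pattern.
        Console.WriteLine(Count(Regex.Escape("alphabet.")));   // 3
    }
}
```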
Insalivate answered 17/4, 2011 at 22:37 Comment(7)
Same answers within seconds of each other :) Trident
Hi Harpyon, no results are returned for this expression. If I use just "alphabet" there are 4 matches. Is there any syntax specific to C#? Bolivar
Are you sure it doesn't work? I'm unable to test the C# portion of it, but the regex seems to be working when I test it on RegexHero. Curdle
Hi Harpyon, C# needed this option and then it worked. Thanks. RegexOptions myRegexOptions = RegexOptions.None; Regex myRegex = new Regex(strRegex, myRegexOptions); Bolivar
Hi Harpyon, if I want to get that alphabet only if it proceeds with " " or "\n", how should I amend that, please? Bolivar
If you want to match alphabet only when it is followed by a space or newline, you can use a lookahead: alphabet(?= |\n). Curdle
It wasn't obvious to me, so I'll underline it: to make the expression work, one needs to add the «@» sign before the string. Abiogenetic
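As a rough sketch of the lookahead suggested in the comments, the snippet below combines it with the escaped dot and runs it on the question's sample text. The class and member names are hypothetical. One thing the sketch brings out: the final "alphabet." sits at the end of the input, so it has no following space or newline; adding $ to the lookahead is one possible way to include it.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadDemo
{
    // Sample text from the question.
    public const string Text = "The quick brown fox jumps over the lazy dog is an English-language pangram, alphabet! that is, a phrase that contains all of the letters of the alphabet. It has been used to test typewriters alphabet. and computer keyboards, and in other applications involving all of the letters in the English alphabet.";

    // Literal "alphabet." only when the next character is a space or a newline.
    public const string FollowedBySpaceOrNewline = @"alphabet\.(?= |\n)";

    // Same condition, but the end of the input is accepted as well.
    public const string OrEndOfInput = @"alphabet\.(?= |\n|$)";

    // Hypothetical helpers for counting matches and reading the first match.
    public static int Count(string pattern) => Regex.Matches(Text, pattern).Count;
    public static string FirstMatch(string pattern) => Regex.Match(Text, pattern).Value;

    public static void Main()
    {
        Console.WriteLine(Count(FollowedBySpaceOrNewline));      // 2
        Console.WriteLine(Count(OrEndOfInput));                  // 3

        // A lookahead checks the next character without consuming it,
        // so the matched value does not include the space.
        Console.WriteLine(FirstMatch(FollowedBySpaceOrNewline)); // alphabet.
    }
}
```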
23

. is a special character in regular expressions. You need to escape it with a backslash first:

Regex.Matches(entireText, "alphabet\\.")

The backslash is doubled because \ inside a regular (non-verbatim) C# string literal must itself be escaped with another backslash.
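A small sketch of why both spellings work: the regular literal with a doubled backslash and the verbatim literal with the @ prefix produce the same ten-character string, so the regex engine sees the same pattern either way. The class and field names are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LiteralDemo
{
    // Regular literal: the compiler turns \\ into a single backslash.
    public static readonly string Regular = "alphabet\\.";

    // Verbatim literal: the backslash is taken as-is.
    public static readonly string Verbatim = @"alphabet\.";

    // Note: "alphabet\." (one backslash, no @) does not compile,
    // because \. is not a recognized C# escape sequence.

    public static void Main()
    {
        Console.WriteLine(Regular == Verbatim);   // True
        Console.WriteLine(Regular.Length);        // 10

        // Both patterns behave identically when handed to Regex.
        Console.WriteLine(Regex.IsMatch("alphabet.", Regular));   // True
        Console.WriteLine(Regex.IsMatch("alphabet!", Verbatim));  // False
    }
}
```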

Indicate answered 17/4, 2011 at 22:38 Comment(6)
Regular expression strings are usually better off being verbatim. Trident
@manojlds: I hope you agree that this is a matter of preference. Indicate
Yes, but already complex regular expressions would have \\ strewn all over them. Trident
Thanks guys, but no results are returned for the expression. Is there a C#-specific expression like "^ $"? Bolivar
@user712307 - OK, your comments make it a little clearer. How are you initializing entireText? Trident
@Trident: RegexOptions myRegexOptions = RegexOptions.None; Regex myRegex = new Regex(strRegex, myRegexOptions); I did that and it worked. Thanks for your comments. Bolivar
12

"." has a special meaning in regular expressions. Escape it to match a literal period:

MatchCollection match = Regex.Matches(entireText, @"alphabet\.");

Edit:

Full code, giving expected result:

        string entireText = @"The quick brown fox jumps over the lazy dog is an English-language pangram, alphabet! that is, a phrase that contains all of the letters of the alphabet. It has been used to test typewriters alphabet. and computer keyboards, and in other applications involving all of the letters in the English alphabet.";
        MatchCollection matches = Regex.Matches(entireText, @"alphabet\.");
        foreach (Match match in matches)
        {
            foreach (Group group in match.Groups)
            {
                Console.WriteLine(group);
            }
        }
Trident answered 17/4, 2011 at 22:38 Comment(5)
Hi manojlds, no results are returned for this expression. If I use just "alphabet" there are 4 matches. Is there any syntax specific to C#? Bolivar
Just verified in C#. It gives three matches of "alphabet.". Please verify your code. See my edit. Trident
Hi manojlds, thanks for your code. It works. I added this portion: RegexOptions myRegexOptions = RegexOptions.None; Regex myRegex = new Regex(strRegex, myRegexOptions); Bolivar
If I want to get that alphabet only if it proceeds with " " or "\n", how should I amend that, please? Bolivar
Use something like \salphabet\. for the regex. \s matches any whitespace character (spaces, tabs, line breaks). Trident
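A minimal sketch of the \s suggestion on the question's sample text. Note that \s is part of the match, so the matched value includes the leading whitespace character. The lookbehind variant is not from the thread; it is shown only as one possible alternative for when the whitespace should be required but left out of the match. Class and helper names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespacePrefixDemo
{
    // Sample text from the question.
    public const string Text = "The quick brown fox jumps over the lazy dog is an English-language pangram, alphabet! that is, a phrase that contains all of the letters of the alphabet. It has been used to test typewriters alphabet. and computer keyboards, and in other applications involving all of the letters in the English alphabet.";

    // Hypothetical helpers for counting matches and reading the first match.
    public static int Count(string pattern) => Regex.Matches(Text, pattern).Count;
    public static string FirstMatch(string pattern) => Regex.Match(Text, pattern).Value;

    public static void Main()
    {
        // \s consumes the whitespace, so it appears in the matched value.
        Console.WriteLine(Count(@"\salphabet\."));                          // 3
        Console.WriteLine("[" + FirstMatch(@"\salphabet\.") + "]");         // [ alphabet.]

        // Assumption, not from the thread: a lookbehind requires the
        // whitespace but keeps it out of the matched value.
        Console.WriteLine("[" + FirstMatch(@"(?<=\s)alphabet\.") + "]");    // [alphabet.]
    }
}
```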
