Match words that don't start with a certain letter using regex

Asked 16/5, 2018 at 15:13 Answered 16/5, 2018 at 15:34

Solved python regex regex-negation regex-lookarounds

I am learning regex but have not been able to find the right regex in Python for selecting words that do not start with a particular letter.

Example below:

import re

text = 'this is a test'
match = re.findall('(?!t)\w*', text)

# match returns
['his', '', 'is', '', 'a', '', 'est', '']

match=re.findall('[^t]\w+',text)

# match
['his', ' is', ' a', ' test']

Expected: ['is', 'a']

Toxic answered 16/5, 2018 at 15:13 Comment(2)

Try: regex101.com/r/OzUEO9/1 – Chuipek 16/5, 2018 at 15:15

[i for i in text.split() if i[0] != 't'] – Anthony 16/5, 2018 at 15:17

With regex

Use the negated character class [^\Wt] to match any word character (letter, digit, or underscore) that is not t. To avoid matching substrings of words, add the word boundary metacharacter, \b, at the beginning of your pattern.

Also, remember to use raw strings for regex patterns: in a normal string literal, '\b' is the backspace character rather than a word boundary.

import re

text = 'this is a test'
match = re.findall(r'\b[^\Wt]\w*', text)

print(match) # prints: ['is', 'a']
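
To illustrate the raw-string point, here is a minimal sketch comparing the same pattern written as a normal string and as a raw string. The variable names non_raw and raw are only for illustration; the doubled backslashes in the first pattern are there to avoid invalid-escape warnings for \W and \w.

```python
import re

text = 'this is a test'

# In a normal string literal, '\b' is a single backspace character,
# so the regex engine never receives a word-boundary token.
print(len('\b'), len(r'\b'))  # 1 2

# Pattern starts with a literal backspace, which the text does not contain.
non_raw = re.findall('\b[^\\Wt]\\w*', text)

# Raw string: the engine sees \b and treats it as a word boundary.
raw = re.findall(r'\b[^\Wt]\w*', text)

print(non_raw)  # []
print(raw)      # ['is', 'a']
```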


Without regex

Note that this is also achievable without regex.

text = 'this is a test'
match = [word for word in text.split() if not word.startswith('t')]

print(match) # prints: ['is', 'a']
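
The two approaches are not always interchangeable. As a rough sketch, assuming a sample sentence with punctuation (the sentence and variable names are made up for illustration), text.split() keeps punctuation attached to the words, while the regex only returns word characters:

```python
import re

text = 'this is a test, is it not?'

# split() breaks on whitespace only, so punctuation stays attached.
by_split = [word for word in text.split() if not word.startswith('t')]

# The regex matches runs of word characters, so punctuation is dropped.
by_regex = re.findall(r'\b[^\Wt]\w*', text)

print(by_split)  # ['is', 'a', 'is', 'it', 'not?']
print(by_regex)  # ['is', 'a', 'is', 'it', 'not']
```

Which one is preferable depends on whether the punctuation should be kept.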

Gynaecomastia answered 16/5, 2018 at 15:22 Comment(0)

You are almost on the right track. You just forgot the \b (word boundary) token:

\b(?!t)\w+
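
As a sketch of how this pattern could be used from Python, the helper below builds it for an arbitrary letter. The function name words_not_starting_with and its flags parameter are hypothetical, not part of the re module.

```python
import re

def words_not_starting_with(letter, text, flags=0):
    # \b anchors at a word boundary, (?!...) rejects the unwanted first
    # letter, and \w+ then consumes the whole word.
    pattern = r'\b(?!' + re.escape(letter) + r')\w+'
    return re.findall(pattern, text, flags)

print(words_not_starting_with('t', 'this is a test'))
# ['is', 'a']

# With re.IGNORECASE, the lookahead also rejects an uppercase 'T'.
print(words_not_starting_with('t', 'This is a Test', flags=re.IGNORECASE))
# ['is', 'a']
```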


Mungo answered 16/5, 2018 at 15:34 Comment(1)

Thanks. Actually, match=re.findall(r'\b(?!t)\w+',text) worked. It needed a raw string. – Toxic 16/5, 2018 at 17:17
