I'm looking for a regex to match hyphenated words in Python.
The closest I've managed to get is: '\w+-\w+[-\w+]*'
import re
text = "one-hundered-and-three- some text foo-bar some--text"
hyphenated = re.findall(r'\w+-\w+[-\w+]*', text)
which returns the list ['one-hundered-and-three-', 'foo-bar'].
This is almost perfect except for the trailing hyphen after 'three'. I only want the additional hyphen if it is followed by a 'word', i.e. instead of the '[-\w+]*' I need something like '(-\w+)*', which I thought would work but doesn't (it returns ['-three', '']).
In other words, I need something that matches |word followed by hyphen followed by word followed by hyphen_word zero or more times|.
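
For reference, here is a minimal sketch of the pattern described above. It assumes the unexpected ['-three', ''] result comes from re.findall returning the contents of the capturing group rather than the whole match, which is what findall does when a pattern contains exactly one group. Swapping in a non-capturing group '(?:...)' is one possible way around that, not necessarily the only one; the variable names below are just for illustration.

```python
import re

text = "one-hundered-and-three- some text foo-bar some--text"

# With a capturing group, findall returns only what the group captured
# (the last repetition, or '' if the group never matched).
with_capture = re.findall(r'\w+-\w+(-\w+)*', text)
print(with_capture)  # ['-three', '']

# (?:...) groups without capturing, so findall returns the whole match.
# word, then one or more occurrences of hyphen + word.
hyphenated = re.findall(r'\w+(?:-\w+)+', text)
print(hyphenated)  # ['one-hundered-and-three', 'foo-bar']
```

Note that \w also matches digits and underscores, so tokens like 'foo_1-bar' would count as hyphenated words under this sketch.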