Python regex partial extract

1

7

I want to find all text enclosed in [[ ]] brackets.

[[aaaaa]] -> aaaaa

My Python code (using the re library) was:

la = re.findall(r'\[\[(.*?)\]\]', fa.read())

What if I want to extract only 'a' from [[a|b]]?

Is there a concise regular expression for this task (extracting the text before |)?

Or should I use an additional if statement?

Hazen answered 28/9, 2015 at 3:14 Comment(0)
3

You can try:

r'\[\[([^\]|]*)(?=.*\]\])'

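([^\]|]*) captures everything up to the first | or ]. (?=.*\]\]) is a lookahead that checks that ]] appears to the right of the match. Note that the lookahead only requires ]] somewhere later on the same line; it does not check that the ]] closes this particular [[.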

Testing:

>>> re.search( r'\[\[([^\]|]*)(?=.*\]\])', '[[aaa|bbb]]' ).group(1)
'aaa'
>>> re.search( r'\[\[([^\]|]*)(?=.*\]\])', '[[aaabbb]]' ).group(1)
'aaabbb'
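
Since the question uses re.findall, here is a minimal sketch of the same pattern applied to a whole string. The sample strings are made up for illustration (in the question the text comes from fa.read()). The second pattern is a stricter variant that is not part of the answer above: it assumes each closing ]] should belong to the same bracket pair and consumes it instead of using a lookahead.

```python
import re

# Hypothetical sample text; in the question this would be fa.read().
text = "x [[aaaaa]] y [[a|b]] z [[c|d|e]]"

# Pattern from the answer: capture up to the first '|' or ']',
# then only check that ']]' occurs somewhere later on the line.
loose_pattern = r'\[\[([^\]|]*)(?=.*\]\])'

# Stricter variant (an assumption, not from the answer): optionally
# consume '|...' and require the closing ']]' right after it.
strict_pattern = r'\[\[([^\]|]*)(?:\|[^\]]*)?\]\]'

loose = re.findall(loose_pattern, text)
strict = re.findall(strict_pattern, text)
print(loose)   # ['aaaaa', 'a', 'c']
print(strict)  # ['aaaaa', 'a', 'c']

# The two differ on malformed input, where '[[a]' is never closed.
broken = "[[a] and later [[b]]"
loose_broken = re.findall(loose_pattern, broken)
strict_broken = re.findall(strict_pattern, broken)
print(loose_broken)   # ['a', 'b']
print(strict_broken)  # ['b']
```

On well-formed input both patterns give the same result, so the choice mainly matters if the text may contain unbalanced brackets.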
Borscht answered 28/9, 2015 at 3:21 Comment(2)
Thank you for providing a concise answer and explanation. I should read more articles about regular expressions. - Hazen
One thing to note: this won't handle nested brackets (and in general a regex can't do that anyway without the help of a counter). - Network
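
To illustrate the counter idea from the comment above, here is a rough sketch that scans the string by hand instead of using a regex. The function name first_fields is hypothetical, and so is the interpretation of nesting: it assumes that only the outermost [[ ]] pairs count, and that the wanted field ends at the first | that is not inside an inner pair.

```python
def first_fields(text):
    """Return the text before the first top-level '|' of each outermost [[...]]."""
    results = []
    depth = 0     # how many '[[' are currently open
    start = None  # index just after the outermost '[['
    end = None    # index of the first '|' seen at depth 1, if any
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair == '[[':
            depth += 1
            if depth == 1:
                start, end = i + 2, None
            i += 2
        elif pair == ']]' and depth > 0:
            if depth == 1:
                # Closing the outermost pair: cut at the first top-level '|'.
                results.append(text[start:i if end is None else end])
            depth -= 1
            i += 2
        else:
            if text[i] == '|' and depth == 1 and end is None:
                end = i
            i += 1
    return results


print(first_fields("[[a|b]] [[x[[y|z]]|w]]"))  # ['a', 'x[[y|z]]']
```

Unclosed [[ pairs are silently ignored in this sketch; whether that is the right behaviour depends on the data.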
