For Python 2.5, 2.6, should I be using string.replace or re.sub for basic text replacements?
In PHP, this was explicitly stated but I can't find a similar note for Python.
As long as you can make do with str.replace(), you should use it. It avoids all the pitfalls of regular expressions (like escaping), and is generally faster.
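To make the escaping pitfall concrete, here is a small sketch (written for Python 3; the sample string is made up for illustration). It shows how a pattern that looks like a plain string can behave very differently under re.sub unless it is passed through re.escape():

```python
import re

text = "Version 2.5 or 2.6?"

# Literal replacement: only the actual dots change.
literal = text.replace(".", "_")

# In a regex, "." matches any character except a newline,
# so every character of the string gets replaced.
unescaped = re.sub(".", "_", text)

# re.escape() turns the pattern back into a literal dot,
# which matches what str.replace() did.
escaped = re.sub(re.escape("."), "_", text)

print(literal)
print(unescaped)
print(escaped)
```

With str.replace() there is nothing to escape, so this whole class of mistake cannot happen.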
str.replace() should be used whenever possible. It's more explicit, simpler, and faster.
In [1]: import re
In [2]: text = """For python 2.5, 2.6, should I be using string.replace or re.sub for basic text replacements.
In PHP, this was explicitly stated but I can't find a similar note for python.
"""
In [3]: timeit text.replace('e', 'X')
1000000 loops, best of 3: 735 ns per loop
In [4]: timeit re.sub('e', 'X', text)
100000 loops, best of 3: 5.52 us per loop
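The `timeit` lines above are typed at an IPython prompt. Outside IPython, roughly the same comparison can be sketched with the standard-library timeit module. The function names and the loop count below are my own choices, and the numbers will differ from machine to machine:

```python
import re
import timeit

text = (
    "For python 2.5, 2.6, should I be using string.replace or re.sub "
    "for basic text replacements.\n"
    "In PHP, this was explicitly stated but I can't find a similar note for python.\n"
)

def with_replace():
    return text.replace("e", "X")

def with_re_sub():
    return re.sub("e", "X", text)

# Both approaches should produce the same string before timing means anything.
assert with_replace() == with_re_sub()

n = 20000  # arbitrary loop count, chosen to keep the run short
t_replace = timeit.timeit(with_replace, number=n)
t_re_sub = timeit.timeit(with_re_sub, number=n)

# Report the average time per call in microseconds.
print("str.replace: %.3f us per call" % (t_replace / n * 1e6))
print("re.sub:      %.3f us per call" % (t_re_sub / n * 1e6))
```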
timeit in your example output? Is that something special to IPython allowing you to use that syntax? (Oh, and +1!) – Hero
replace vs a single regex. At some point a single regex replace should be faster than having N chained replace's on a string, no? – Conversational
us vs ns) text.replace took 735 nanoseconds; re.sub took 5,520 nanoseconds, which is 7.5 times slower. – Barayon
String manipulation is usually preferable to regex when you can figure out how to adapt it. Regex is incredibly powerful, but it's usually slower, and usually harder to write, debug, and maintain.
That being said, notice the amount of "usually" in the above paragraph! It's possible (and I've seen it done) to write a zillion lines of string manipulation for something you could've done with a 20-character regex. It's also possible to waste valuable time using "efficient" string functions on tasks a good regex engine could do almost as fast. Then there's maintainability: Regex can be horribly complex, but sometimes a regex will be simpler and easier to read than a giant block of procedural code.
Regex is fantastic for its intended purpose: searching for highly-variable needles in highly-variable haystacks. Think of it as a precision torque wrench: It's the perfect tool for a specific set of jobs, but it makes a lousy hammer.
- Is the pattern you're looking for highly static? For example, do you want to split a string on every comma, pipe, or tab?
- Is resource efficiency more important than developer time? What are your priorities? Remember: Hardware is cheap, programmers are expensive.
- Are you working with HTML, XML, or other context-free grammars? Don't forget that regex has limitations.
- And my #1 rule of thumb: If you work on the problem for 5 minutes, can you rough out an idea for a non-regex approach?
If the answer to any of these questions is "yes", you probably want string manipulation. Otherwise, consider regex.
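As a rough illustration of the first question (splitting on every comma, pipe, or tab), here is how the two approaches might look side by side. The sample line is invented for the example:

```python
import re

line = "a,b|c\td"

# String-method approach: normalise the other delimiters to commas, then split.
fields_str = line.replace("|", ",").replace("\t", ",").split(",")

# Regex approach: one character class covers all three delimiters.
fields_re = re.split(r"[,|\t]", line)

print(fields_str)
print(fields_re)
```

Both give the same list here. With only three fixed delimiters the string version is still easy to follow; the regex starts to pay off once the set of delimiters grows or becomes variable.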
Another thing to consider: if you're doing rather complex replacements that map individual characters to other characters, str.translate() might be what you're looking for.
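A minimal sketch of what that could look like, assuming Python 3 (where the table is built with str.maketrans(); on Python 2 it would come from string.maketrans() instead). It swaps two characters in one pass, which chained replace() calls get wrong:

```python
text = "abba"

# Build a translation table mapping "a" -> "b" and "b" -> "a".
swap = str.maketrans("ab", "ba")

# translate() looks each character up once, so the swap works.
translated = text.translate(swap)

# Chained replace() calls overwrite each other: the first call turns
# every "a" into "b", then the second turns every "b" into "a".
chained = text.replace("a", "b").replace("b", "a")

print(translated)
print(chained)
```

The chained version ends up with only one kind of character, because the second replace() cannot tell original characters from ones the first call just produced.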
split(), replace(), find() et al) without needing multiple status variables, complicated slicing, etc., you should. If it starts getting complex, then you move to alternate tools such as regular expressions. – Lavation