My program looks something like this:
import re
# Escape the string, in case it happens to have re metacharacters
my_str = "The quick brown fox jumped"
escaped_str = re.escape(my_str)
# "The\\ quick\\ brown\\ fox\\ jumped"
# Replace escaped space patterns with a generic white space pattern
spaced_pattern = re.sub(r"\\\s+", r"\s+", escaped_str)
# Raises error
The error is this:
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/home/swfarnsworth/programs/pycharm-2019.2/helpers/pydev/_pydev_bundle/pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
  File "/home/swfarnsworth/programs/pycharm-2019.2/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/swfarnsworth/projects/medaCy/medacy/tools/converters/con_to_brat.py", line 255, in <module>
    content = convert_con_to_brat(full_file_path)
  File "/home/swfarnsworth/projects/my_file.py", line 191, in convert_con_to_brat
    start_ind = get_absolute_index(text_lines, d["start_ind"], d["data_item"])
  File "/home/swfarnsworth/projects/my_file.py", line 122, in get_absolute_index
    entity_pattern_spaced = re.sub(r"\\\s+", r"\s+", entity_pattern_escaped)
  File "/usr/local/lib/python3.7/re.py", line 192, in sub
    return _compile(pattern, flags).sub(repl, string, count)
  File "/usr/local/lib/python3.7/re.py", line 309, in _subx
    template = _compile_repl(template, pattern)
  File "/usr/local/lib/python3.7/re.py", line 300, in _compile_repl
    return sre_parse.parse_template(repl, pattern)
  File "/usr/local/lib/python3.7/sre_parse.py", line 1024, in parse_template
    raise s.error('bad escape %s' % this, len(this))
re.error: bad escape \s at position 0
I get this error even if I remove the two backslashes before the '\s+' or if I make the raw string (r"\\\s+") into a regular string. I checked the Python 3.7 documentation, and it appears that \s is still the escape sequence for white space.
With entity_pattern_escaped changed to escaped_str, print(spaced_pattern) produces The\s+quick\s+brown\s+fox\s+jumped, which looks like the desired result. – Odisodium
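A possible explanation, offered as a sketch rather than a definitive diagnosis: the traceback ends in parse_template, which processes the replacement argument of re.sub, not the pattern. Since Python 3.7, an unknown escape in the replacement made of a backslash and an ASCII letter is an error, and \s only has meaning inside a pattern. Assuming that is the cause here, either doubling the backslash in the replacement or passing a function (whose return value is inserted without escape processing) should give the intended pattern. The names spaced_a and spaced_b below are illustrative only.

```python
import re

my_str = "The quick brown fox jumped"
escaped_str = re.escape(my_str)  # each space becomes a backslash followed by a space

# Option 1: double the backslash in the replacement template so that
# the template produces a literal backslash followed by "s+".
spaced_a = re.sub(r"\\\s+", r"\\s+", escaped_str)

# Option 2: pass a function; its return value is used as-is,
# so no escape processing is applied to the replacement.
spaced_b = re.sub(r"\\\s+", lambda match: r"\s+", escaped_str)

print(spaced_a)
print(spaced_b)
```

Both variants should yield the same pattern string as the one shown in the comment above, and that pattern can then be handed to re.search or re.fullmatch to match the phrase with arbitrary whitespace between the words.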