Ripgrep Missing Character Class + Repetitions

Why do these match:

echo 'CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAGTTC' | grep -E 'CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAG[ATCG]{2}C'
echo 'CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAGTTC' | rg 'CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAG[ATCG]{1,2}C'
echo 'CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAGTTC' | rg 'CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAG[ATCG]{2,}C'
echo 'CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAGTTC' | awk '$0 ~ /CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAG[ATCG]{2}C/'

But this does not:

echo 'CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAGTTC' | rg 'CCAGCTACTCGGGAGGCTGAGGCTGGAGGATCGCTTGAGTCCAGGAG[ATCG]{2}C'

I was under the impression that ripgrep uses Rust's regex engine, which should be able to handle a character class followed by a repetition. Why does the exact count {2} fail when {1,2} and {2,} both match?

Affirmative answered 5/7, 2019 at 16:46 Comment(5)
Looks like a bug in rg. Did you report it? - Bulbous
Yup, this is a bug. I filed an issue for you: github.com/BurntSushi/ripgrep/issues/1319 - Ky
Thanks guys! This was literally the first thing I tried with ripgrep, so I just assumed I was wrong and rg was right. - Affirmative
It looks like this was fixed on Feb 17. - Kamkama
I’m voting to close this question because it identified a good bug in an implementation that has now been fixed. Others are not likely to run into this issue again. - Haplology

This was caused by a bug in ripgrep (issue 1319), which was fixed in version 12.0.0. The pattern itself is valid, and it should match.
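If you are not sure whether your installed copy has the fix, a rough check along these lines may help. It is only a sketch: `version_at_least` is a hypothetical helper name, the comparison relies on GNU `sort -V`, and it assumes the first line of `rg --version` looks like `ripgrep X.Y.Z ...`.

```shell
# Hypothetical helper: succeeds if version $1 is at least version $2.
# Relies on GNU `sort -V` (version sort); the smaller version sorts first.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Only run the check if rg is actually on the PATH.
if command -v rg >/dev/null 2>&1; then
    # Assumption: the first line of `rg --version` is "ripgrep X.Y.Z ...".
    rg_version=$(rg --version | head -n 1 | awk '{print $2}')
    if version_at_least "$rg_version" "12.0.0"; then
        echo "ripgrep $rg_version should include the fix for issue 1319"
    else
        echo "ripgrep $rg_version predates 12.0.0; upgrade to get the fix"
    fi
fi
```

After upgrading, re-running the failing `rg '...[ATCG]{2}C'` command from the question should print the input line, the same as `grep -E` and `awk` do.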

Action answered 5/7, 2019 at 16:46 Comment(0)
