This could reference a different character encoding where t is larger than x80 and x80 can't be addressed normally.
Take EBCDIC Scan codes for example (see here for a reference).
(But I too have no clue why somebody would want to write it that way)
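To make the ordering argument concrete, here is a minimal sketch that compares the code of t in ASCII and in one EBCDIC variant. The choice of cp037 is my assumption (it is simply an EBCDIC code page that ships with Python); other EBCDIC variants may place letters differently.

```python
# Compare the single-byte code of 't' in ASCII and in one EBCDIC variant.
# Assumption: cp037 (EBCDIC US/Canada) stands in for "EBCDIC" here.

def code_of(ch: str, encoding: str) -> int:
    """Return the single-byte code of ch in the given encoding."""
    encoded = ch.encode(encoding)
    if len(encoded) != 1:
        raise ValueError(f"{ch!r} is not a single byte in {encoding}")
    return encoded[0]

ascii_t = code_of("t", "ascii")
ebcdic_t = code_of("t", "cp037")

for name, value in (("ASCII", ascii_t), ("EBCDIC cp037", ebcdic_t)):
    order = "ascending" if value >= 0x80 else "reversed"
    print(f"{name}: 't' = 0x{value:02X}, so the range \\x80-t is {order}")
```

In ASCII the letter sits below 0x80, so the range runs backwards; in this EBCDIC code page it sits above 0x80, so the range would at least be well-formed.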
For ASCII I have a wild guess: if -t means "until the next token minus 1" or, if placed last, "until the end of the allowed characters", the second query would state this:
To:(not a newline, one or more characters)(not a newline)
So basically the expression [\x01-t\x0B\x0C\x0E-t\x80-t] would mean [^\r\n].
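One way to sanity-check this guess is to expand the ranges by hand. The sketch below does not reflect how PCRE actually parses a class; the `expand` helper and the 0xFF upper bound are assumptions made only to spell out the guessed "-t" semantics.

```python
# Expand the guessed "-t" semantics by hand. This is NOT PCRE behaviour:
# "X-t" is read as "from X up to (next token - 1)", or up to the last
# allowed character when it is the final item in the class.

LAST_CHAR = 0xFF  # assumption: single-byte characters

def expand(tokens):
    """tokens: list of (start, is_open_range) pairs in class order."""
    members = set()
    for i, (start, is_open_range) in enumerate(tokens):
        if not is_open_range:
            members.add(start)
            continue
        end = tokens[i + 1][0] - 1 if i + 1 < len(tokens) else LAST_CHAR
        members.update(range(start, end + 1))
    return members

# [\x01-t\x0B\x0C\x0E-t\x80-t]
guessed = expand([(0x01, True), (0x0B, False), (0x0C, False),
                  (0x0E, True), (0x80, True)])
excluded = sorted(set(range(LAST_CHAR + 1)) - guessed)
print([hex(b) for b in excluded])
```

Under these assumptions the class leaves out only NUL and \r, so it comes close to [^\r\n] without being identical: \x01-t would run up to \x0A, which still admits \n.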
If one applies that to (.Ç-t]|...[Ç-t]) it would address any character beyond 7-bit ASCII, which could also cover all of Unicode (besides the first 128 characters).
(That being said, I still have no clue why somebody would write it like this, but at least that's a coherent explanation besides "It's a bug".)
Maybe helpful: what do the regexes you posted mean if one writes out the \xYY escapes?
ASCII:
/=\NULL\DEVICE_CONTROL_2\NULL\.{10}\(.Ç-t]|...[Ç-t])/smiR
/^To\:[^\r\n]+[\START_OF_HEADING-t\VERTICALTAB\FORMFEED\SHIFTOUT\Ç-t]/smi
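The Ç in the written-out form is presumably just how the byte \x80 renders in one particular code page. As a rough illustration (the code pages below are my own picks, not something taken from the rule), the same byte decodes quite differently depending on the table used:

```python
# Decode the single byte 0x80 under a few single-byte code pages.
# The selection of code pages is illustrative only.
raw = bytes([0x80])

decoded = {}
for encoding in ("cp437", "cp1252", "latin-1"):
    decoded[encoding] = raw.decode(encoding)

for encoding, char in decoded.items():
    print(f"{encoding}: U+{ord(char):04X} {char!r}")

# Plain 7-bit ASCII has no character for this byte at all.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print("valid ASCII:", ascii_ok)
```

So whatever tool wrote out the rule, the glyph shown for \x80 says more about the viewer's code page than about the pattern itself.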
Looking for the \x12, aka Device Control 2, could help, because that won't show up in text but might show up in network traffic.
-t\b but there was no match. Which means there's nothing special about -t in pcre. Now there are a few possibilities: 1) The range is just an error from the author 2) 0x80 is 128 in decimal; if you try &#128; in a browser you get the euro symbol €. So maybe the program is using some kind of other encoding/character table? – Restless
t and not τ or other letters that look close to t) – Fluorometer