Remove ASCII color codes

So, I'm having an issue. I'm capturing some output from a Logger, and it looks something like this:

11:41:19 [INFO] ←[35;1m[Server] hi←[m

I need to know how to remove those pesky ASCII color codes (or to parse them).

Dusen answered 1/2, 2013 at 18:15 Comment(0)

If they're intact, they should consist of ESC (U+001B) plus [ plus a semicolon-separated list of numbers, plus m. (See https://mcmap.net/q/665716/-colour-output-of-program-run-under-bash-closed.) In that case, you can remove them by writing:

final String msgWithoutColorCodes =
    msgWithColorCodes.replaceAll("\u001B\\[[;\\d]*m", "");

. . . or you can take advantage of them by using less -r when examining your logs. :-)
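As a quick check of the `replaceAll` call above, here is a minimal sketch that applies it to the log line from the question. It assumes the `←` shown in the pasted output is how the console rendered the ESC character (U+001B); the class and method names are made up for illustration.

```java
class StripColorCodes {
    // ESC, then '[', then any run of digits and semicolons, then 'm'.
    private static final String COLOR_CODE_REGEX = "\u001B\\[[;\\d]*m";

    // Hypothetical helper: returns the message with SGR (color) sequences removed.
    static String stripColorCodes(String msgWithColorCodes) {
        return msgWithColorCodes.replaceAll(COLOR_CODE_REGEX, "");
    }

    public static void main(String[] args) {
        // The question's sample line, with the arrow replaced by a real ESC character.
        String raw = "11:41:19 [INFO] \u001B[35;1m[Server] hi\u001B[m";
        System.out.println(stripColorCodes(raw)); // prints: 11:41:19 [INFO] [Server] hi
    }
}
```

Text without any escape sequences passes through unchanged, so the helper should be safe to apply to every log line.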

(Note: this is specific to color codes. If you also find other ANSI escape sequences, you'll want to generalize that a bit. I think a fairly general regex would be \u001B\\[[;\\d]*[ -/]*[@-~]. You may find http://en.wikipedia.org/wiki/ANSI_escape_code to be helpful.)
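The more general regex from the note can be sketched the same way. This is only an illustration under the note's own assumptions (sequences that start with ESC `[`, carry digit and semicolon parameters, and end in a byte from `@` to `~`); the class name and sample string are invented.

```java
import java.util.regex.Pattern;

class StripAnsiSequences {
    // ESC '[', parameters (digits and ';'), optional intermediate bytes (' ' to '/'),
    // then one final byte in the range '@' to '~'.
    private static final Pattern CSI_SEQUENCE =
        Pattern.compile("\u001B\\[[;\\d]*[ -/]*[@-~]");

    // Hypothetical helper: removes every sequence matched by the pattern above.
    static String stripAnsi(String s) {
        return CSI_SEQUENCE.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        // Clear screen, move cursor, set red, reset: none of the first two end in 'm'.
        String raw = "\u001B[2J\u001B[1;1H\u001B[31mred\u001B[0m done";
        System.out.println(stripAnsi(raw)); // prints: red done
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex for every log line, which `String.replaceAll` would otherwise do.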

If the sequences are not intact — that is, if they've been mangled in some way — then you'll have to investigate and figure out exactly what mangling has happened.
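One way to start that investigation is to dump the code points of a captured line and check whether a real ESC (U+001B) is still there. The sketch below is an assumption about what the mangling might look like, not a diagnosis: it compares an intact sequence with one where ESC has been replaced by a literal `←` (U+2190). The class and method names are hypothetical.

```java
class CodePointDump {
    // Hypothetical helper: renders each code point of s as U+XXXX, space-separated.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("\u001B[m")); // prints: U+001B U+005B U+006D
        System.out.println(describe("\u2190[m")); // prints: U+2190 U+005B U+006D
    }
}
```

If the dump shows something other than U+001B in front of the `[`, the regex needs to match that character instead.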

Whitmore answered 1/2, 2013 at 18:28 Comment(5)
I feel like the question and this answer are highly underrated. - Blatman
It works! But it is unclear to me how the second one is more general than the first one: where is the terminal m captured in the second regexp? - Surreptitious
@OlivierCailloux: The m is matched by the [@-~]. - Whitmore
Indeed, this is what I observe. But where is this syntax documented? Do you have some reference documentation about it, or is this an undocumented feature of regexps in Java? Is it a range from @ to ~? What does that mean? I can't find a precise definition of a range covering that case in the API Javadoc. - Surreptitious
@OlivierCailloux: Yes, it's a range, matching any character from U+0040 (@) to U+007E (~). It's true that the Javadoc doesn't really explain ranges. I guess it's leaning on its last paragraph: "For a more precise description of the behavior of regular expression constructs, please see Mastering Regular Expressions, 3rd Edition, Jeffrey E. F. Friedl, O'Reilly and Associates, 2006." - Whitmore
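To make the range discussed in these comments concrete, here is a small sketch that tests a few characters against `[@-~]`. The class and method names are invented for illustration; the only claim is the one from the comment above, that the class covers U+0040 through U+007E.

```java
import java.util.regex.Pattern;

class FinalByteRange {
    // A character class containing every character from '@' (U+0040) to '~' (U+007E).
    private static final Pattern FINAL_BYTE = Pattern.compile("[@-~]");

    // Hypothetical helper: true if c falls inside the range.
    static boolean isFinalByte(char c) {
        return FINAL_BYTE.matcher(String.valueOf(c)).matches();
    }

    public static void main(String[] args) {
        // 'm', 'H', '@' and '~' are inside the range; '?' (U+003F) and '5' are not.
        for (char c : new char[] {'m', 'H', '@', '~', '?', '5'}) {
            System.out.println("'" + c + "' -> " + isFinalByte(c));
        }
    }
}
```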

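How about this regex:

replaceAll("\u001B\\[(\\d{1,2}(;\\d{1,2})?)?m", "");

It is anchored to the leading ESC [ and the trailing m so that it does not also strip ordinary digits, such as the timestamp.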

Based on the format found here: http://bluesock.org/~willg/dev/ansi.html

Cesium answered 1/2, 2013 at 18:28 Comment(0)
