Is there a way to highlight all the special accented characters in Sublime Text or any other text editor?

I am using the HTML Encode Special Characters plugin in Sublime Text to convert all the special characters into their HTML codes. I have a lot of accented characters in different parts of the file, so it would be great if I could select all the special characters and then use the plugin to convert them all at once!

Is there a regex that helps select all special characters only?

Kirov answered 20/12, 2012 at 16:3 Comment(0)

Yes.

Sublime Text supports regular expressions, so you can select all non-ASCII characters (code point > 127). This regex should be enough for you:

[^\x00-\x7F]

Just search and replace.
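
Outside the editor, the same character class can drive the conversion directly. The following is a minimal Python sketch, not the plugin's actual implementation: the function name `encode_non_ascii` is made up for illustration, and it emits numeric references (such as `&#233;`) rather than named entities (such as `&eacute;`).

```python
import re

# Same character class as in the answer: anything outside the ASCII range.
NON_ASCII = re.compile(r"[^\x00-\x7F]")


def encode_non_ascii(text: str) -> str:
    """Replace every non-ASCII character with a numeric HTML reference.

    Hypothetical helper for illustration; ASCII characters, including
    HTML markup such as < and &, are left untouched.
    """
    return NON_ASCII.sub(lambda m: f"&#{ord(m.group())};", text)


if __name__ == "__main__":
    # "café déjà vu", written with escapes so the source stays pure ASCII.
    print(encode_non_ascii("caf\u00e9 d\u00e9j\u00e0 vu"))
```

Python's built-in `"xmlcharrefreplace"` error handler should give the same result without a regex: `text.encode("ascii", "xmlcharrefreplace").decode("ascii")`.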

But if you are HTML-encoding characters by hand in the first place, you are doing it wrong. Save your files with UTF-8 encoding (the Sublime Text 2 default) and make sure your web server also serves those files as UTF-8. Then no conversion or entity encoding is needed.

Conventionalism answered 21/12, 2012 at 18:15 Comment(4)
However, when coding an HTML email, using UTF-8 usually isn't an option because it's not supported in all email clients. In these cases, manual HTML encoding is necessary. - Pogey
@mtnorthrop: Can you please tell when UTF-8 causes issues? Namely I am sending out tons of HTML emails and I'd like to know which kind of problems I can run into. - Conventionalism
Can't thank you enough for this... I have been looking at a non-UTF-8 data file for hours trying to figure this out. - Worm
Great! This regex solution is not limited to the Sublime editor; it also works in any other editor that supports regex search. - Impressment

Just as a further reference (or as a complement):

The Sublime Text 2/3 package named Highlighter can (as its name says) highlight characters matched by a regex...

"You can also add a custom regex for characters to highlight."

So, with this package, plus @Mikko Ohtamaa's answer, we can edit the file...

highlighter.sublime-settings - User

...and include the proposed regex (expressed here as [^\\x00-\\x7F], with the backslashes doubled because the settings file is JSON) to end up with something like this:

{  
    "highlighter_regex": "(\t+ +)|( +\t+)|[^\\x00-\\x7F]|[\u2026\u2018\u2019\u201c\u201d\u2013\u2014]|[\t ]+$"  
}

The result is automatic highlighting of any non-ASCII characters (code point > 127) in our file.
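
To see why the backslashes are doubled in the settings file, here is a small Python sketch that parses a shortened version of the setting as JSON and inspects the pattern that comes out. It assumes the settings file is read as ordinary JSON and uses Python's `re` module only as a stand-in; the regex engine Highlighter actually uses inside Sublime Text may differ in details.

```python
import json
import re

# Shortened, hypothetical copy of the settings file content. The raw string
# keeps the backslashes exactly as they would appear on disk.
settings_text = r'{"highlighter_regex": "[^\\x00-\\x7F]|[\u2026\u2018\u2019]"}'

# JSON decoding turns \\x00 into the four characters \x00 (left for the
# regex engine to interpret) and turns \u2026 into the actual character.
pattern = json.loads(settings_text)["highlighter_regex"]

if __name__ == "__main__":
    print(pattern)
    # Stand-in check with Python's re module, not Sublime's own engine.
    print(re.findall(pattern, "a\u00e9b\u2026c"))
```

A single backslash before `x` would not work here, because `\x` is not a valid JSON escape sequence, while `\uXXXX` is.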

Note that the package will not select those characters; it only highlights them so you can easily see whether you have any.
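
If you want a list of locations rather than a visual highlight, the same regex can report where the characters are. This is a minimal Python sketch that runs outside Sublime Text; the function name `find_non_ascii` is hypothetical and not part of the Highlighter package.

```python
import re

NON_ASCII = re.compile(r"[^\x00-\x7F]")


def find_non_ascii(text: str):
    """Return (line, column, character) for every non-ASCII character.

    Lines and columns are 1-based. Lines are split on "\n" only, so
    non-ASCII line separators such as U+2028 are reported, not swallowed.
    """
    hits = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for match in NON_ASCII.finditer(line):
            hits.append((line_no, match.start() + 1, match.group()))
    return hits


if __name__ == "__main__":
    sample = "abc\nna\u00efve \u2026"
    for line_no, col, char in find_non_ascii(sample):
        print(f"line {line_no}, column {col}: U+{ord(char):04X}")
```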

Trine answered 30/4, 2014 at 10:45 Comment(0)

Another plugin option

I recently wrote a plugin dedicated to highlighting non-ASCII characters: https://github.com/TuureKaunisto/highlight-dodgy-chars

Exactly the same functionality can be achieved with Highlighter, but with the less generic Highlight Dodgy Chars plugin you don't need to write a regular expression; you can just list the non-ASCII characters you don't wish to highlight in the settings. The European special characters are whitelisted by default.

Denomination answered 19/12, 2015 at 21:49 Comment(0)
