character-properties Questions
1
Solved
Suppose I want to match a lowercase letter followed by an uppercase letter. I could do something like:
re.compile(r"[a-z][A-Z]")
Now I want to do the same thing for Unicode strings, i.e. match so...
Yesterday asked 13/9, 2011 at 6:50
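One possible approach to the question above, as a minimal sketch: Python's built-in `re` module has no `\p{Ll}` or `\p{Lu}` classes, so this example checks Unicode general categories with `unicodedata` instead. The function name `lower_then_upper` is hypothetical, and the truncated part of the question may ask for something different.

```python
import unicodedata

def lower_then_upper(text):
    """Return (index, pair) for each lowercase letter directly followed by an uppercase letter."""
    hits = []
    for i in range(len(text) - 1):
        a, b = text[i], text[i + 1]
        # "Ll" is the lowercase-letter category, "Lu" the uppercase-letter category.
        if unicodedata.category(a) == "Ll" and unicodedata.category(b) == "Lu":
            hits.append((i, a + b))
    return hits
```

This works on single code points, so text that uses combining accents would need normalizing (for example with `unicodedata.normalize("NFC", text)`) before the check.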
2
Solved
The various levels of Unicode Regular Expression support are described in UTS#18.
Is there a way to have a few tests for every requirement, so it is possible to port the tests to the language i...
Graft asked 19/8, 2011 at 18:00
3
Solved
I would like to use a regular expression like this in Java: [[=a=][=e=][=i=]].
But Java doesn't support the POSIX classes [=a=], [=e=] etc.
How can I do this? More precisely, is there a way to n...
Lois asked 7/7, 2011 at 15:12
2
Solved
I need to write a regular expression so I can replace the invalid characters in a user's input before sending it further. I think I need to use string.replaceAll("regex", "replacement") to do that....
Associative asked 27/6, 2011 at 13:54
2
Solved
I have some documents that went through OCR conversion from PDF into HTML. Because of that, they wound up having lots of random Unicode punctuation where the converter messed up (i.e. ellipses, etc....
Catchpenny asked 14/5, 2011 at 23:32
2
I would like to use this regular expression new RegExp("\b"+pat+"\b") in Greek text, but the "\b" metacharacter supports only ASCII characters.
I tried the XRegExp library but I didn't manage to solve t...
Sharma asked 13/4, 2011 at 13:33
2
Solved
I came across some regular expressions that contain [^\\p{L}]. I understand that this is using some form of a Unicode category, but when I checked the documentation, I found only the following "L" ...
Overscrupulous asked 11/5, 2011 at 19:20
2
Solved
In JavaScript we can match individual Unicode codepoints or codepoint ranges by using the Unicode escape sequences, e.g.:
"A".match(/\u0041/) // => ["A"]
"B".match(/[\u0041-\u007A]/) // => [...
Epsomite asked 6/4, 2011 at 18:18
4
Solved
I have a script that parses the filenames of TV episodes (show.name.s01e02.avi for example), grabs the episode name (from the www.thetvdb.com API) and automatically renames them into something nice...
Massif asked 18/8, 2008 at 9:41
3
Solved
There are many questions and answers here on Stack Overflow that assume a "letter" can be matched in a regexp by [a-zA-Z]. However with Unicode there are many more characters that most people would ...
Bea asked 15/3, 2011 at 17:10
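To illustrate the gap between [a-zA-Z] and a Unicode notion of "letter", here is a minimal sketch that treats any code point whose general category starts with "L" as a letter. The helper names `is_letter` and `letters_only` are hypothetical, and this is only one reasonable definition of a letter.

```python
import unicodedata

def is_letter(ch):
    # Categories Lu, Ll, Lt, Lm and Lo all start with "L".
    return unicodedata.category(ch).startswith("L")

def letters_only(text):
    # An empty string is treated as "not letters only" in this sketch.
    return bool(text) and all(is_letter(ch) for ch in text)
```

Combining marks have category "Mn", not "L", so decomposed text such as "e" followed by U+0301 would fail this check unless it is normalized first.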
4
I need the list of ranges of Unicode characters with the property Alphabetic as defined in http://www.unicode.org/Public/5.1.0/ucd/UCD.html#Alphabetic. However, I cannot find them in the Unicode Ch...
Urdu asked 30/1, 2011 at 14:13
2
Solved
Recently I discovered, to my surprise, that JavaScript has no built-in support for Unicode regular expressions.
So how can I test a string for letters only, Unicode or ASCII?
Floats asked 10/12, 2010 at 7:44
2
Solved
Is there a regex which accepts any symbol?
EDIT: To clarify what I'm looking for.. I want to build a regex which will accept ANY number of whitespaces and then it must contain at least 1 symbol (e.g...
Interrogate asked 3/12, 2010 at 12:48
3
Solved
Many modern regex implementations interpret the \w character class shorthand as "any letter, digit, or connecting punctuation" (usually: underscore). That way, a regex like \w+ matches words like h...
Phineas asked 29/11, 2010 at 15:00
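As one concrete data point for the behaviour described above, the following sketch compares Unicode-aware and ASCII-only `\w` in Python 3's `re` module. The sample text is made up for illustration, and other regex engines may behave differently.

```python
import re

# Sample text: "héllo wörld_1", written with escapes to keep the source ASCII.
text = "h\u00e9llo w\u00f6rld_1"

# For str patterns, \w is Unicode-aware by default in Python 3.
unicode_words = re.findall(r"\w+", text)

# re.ASCII restricts \w to [a-zA-Z0-9_], so accented letters split the words.
ascii_words = re.findall(r"\w+", text, flags=re.ASCII)
```

With the default flags the two words come back whole, while under `re.ASCII` each accented letter acts as a word boundary.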
2
Solved
Is there any way in Java so that I can obtain all the Unicode characters of a particular language (for example Bengali or Arabic)?
Happily asked 21/11, 2010 at 10:59
1
Solved
I need to replace all special control characters in a string in Java.
I want to query the Google Maps API v3, and Google doesn't seem to like these characters.
Example: http://www.google.com/maps/a...
Round asked 9/8, 2010 at 9:48
1
Solved
I'm trying to craft a Java regular expression to split strings of the general format "foo - bar" into "foo" and "bar" using Pattern.split(). The "-" character may be one of several dashes: the ASCI...
Crippen asked 15/6, 2010 at 13:22
2
Solved
How can I determine whether a character is a Chinese character using Ruby?
Barbuto asked 28/4, 2010 at 8:22
1
Solved
#coding: utf-8
str2 = "asdfМикимаус"
p str2.encoding #<Encoding:UTF-8>
p str2.scan /\p{Cyrillic}/ # finds all Cyrillic characters
str2.gsub!(/\w/u,'') # removes only Latin characters
puts str2...
Photoemission asked 27/4, 2010 at 14:06
5
Solved
I'm looking for a way to match only fully composed characters in a Unicode string.
Is [:print:] dependent upon locale in any regular expression implementation that incorporates this character clas...
Electrograph asked 15/10, 2008 at 3:10
4
How do I convert the regular expression
\w+
to give me whole words in Unicode, not just ASCII?
I use .NET.
Acclaim asked 25/11, 2009 at 12:22
4
Solved
Without looping over the entire range of Unicode characters, how can I get a list of characters that have a given property? In particular I want a list of all characters that are digits (i.e. those...
Insinuating asked 25/7, 2009 at 16:18
2
Solved
I need to delete some Unicode symbols from the string 'بِسْمِ اللَّهِ الرَّحْمَٰنِ الرَّحِيمِ'
I know they exist here for sure. I tried:
re.sub('([\u064B-\u0652\u06D4\u0670\u0674\u06D5-\u06ED]+)'...
Hebdomadal asked 26/12, 2008 at 14:40
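A minimal sketch of the substitution the question describes, assuming Python 3, where ordinary string literals are Unicode and the `\u` escapes resolve to the intended characters. The character class is copied from the excerpt; the names `MARKS` and `strip_marks` are hypothetical, and the rest of the original call is elided in the question, so this may not match what the asker wrote.

```python
import re

# Character class copied from the excerpt in the question.
MARKS = re.compile("[\u064B-\u0652\u06D4\u0670\u0674\u06D5-\u06ED]+")

def strip_marks(text):
    # Remove every run of characters that falls inside the class above.
    return MARKS.sub("", text)
```

In Python 2 the same pattern would need a `u'...'` literal, since `\u` escapes are not interpreted in byte strings there.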