Regex that matches Camel and Pascal Case
P

7

11

I'm about to write a parser for a language that's supposed to have strict syntactic rules about naming of types, variables and such. For example all classes must be PascalCase, and all variables/parameter names and other identifiers must be camelCase.

For example HTMLParser is not allowed and must be named HtmlParser. Any ideas for a regexp that can match something that is PascalCase, but does not have two capital letters in it?

Propertied answered 20/1, 2010 at 17:47 Comment(4)
I believe that last sentence should be "...but does not have two consecutive capital letters in it?"Lavolta
Suppose I want to write a C preprocessor in that language. Must I name my class Cpreprocessor? Are underscores (C_Preprocessor) allowed?Formulaic
Would H be a valid class name?Octosyllabic
@Chris yeah, it should not have 2 consecutive capital letters in it. C_preprocessor is not allowed, it'd have to be PreprocessorForC or something similar.Propertied
Y
23

camelCase:

^[a-z]+(?:[A-Z][a-z]+)*$

PascalCase:

^[A-Z][a-z]+(?:[A-Z][a-z]+)*$
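As a rough sketch of how these two patterns might be used in a checker, the Python below applies them with `re.fullmatch`. The function names `is_camel_case` and `is_pascal_case` are illustrative assumptions, not something the answer defines.

```python
import re

# Patterns copied verbatim from the answer above.
CAMEL_CASE = re.compile(r"^[a-z]+(?:[A-Z][a-z]+)*$")
PASCAL_CASE = re.compile(r"^[A-Z][a-z]+(?:[A-Z][a-z]+)*$")


def is_camel_case(name: str) -> bool:
    """Return True if the whole string matches the strict camelCase pattern."""
    return CAMEL_CASE.fullmatch(name) is not None


def is_pascal_case(name: str) -> bool:
    """Return True if the whole string matches the strict PascalCase pattern."""
    return PASCAL_CASE.fullmatch(name) is not None


if __name__ == "__main__":
    for name in ["htmlParser", "HtmlParser", "HTMLParser", "H", "ModeA"]:
        print(name, is_camel_case(name), is_pascal_case(name))
```

Because every capital must be followed by at least one lowercase letter, `HtmlParser` passes the PascalCase check while `HTMLParser`, `H` and `ModeA` do not.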
Ylla answered 21/1, 2010 at 1:57 Comment(2)
Neither of the above worked for me for some reason. The following did, however: (?:[a-z]+|[A-Z]+|^)([a-z]|\d)* (remove the |\d if you don't want numbers included).Glassman
This doesn't capture capitals at the end, e.g. ModeA. It also doesn't allow 2 capital letters in a row (which is generally accepted, e.g. CreateAMode, CreateBMode)Elisha
O
4

^[A-Z][a-z]*([A-Z][a-z]*)

This should work for:

  1. MadeEasy
  2. WonderFul
  3. AndMe

these types of patterns.
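A small sketch of how the pattern behaves in practice, assuming Python's `re` module. The pattern has no trailing `$` and no repetition on the group, so as written it checks only that a string starts with two capitalised words. The helper names below are hypothetical.

```python
import re

# Pattern copied from the answer; note the missing "$" at the end.
TWO_WORD_PREFIX = re.compile(r"^[A-Z][a-z]*([A-Z][a-z]*)")


def starts_with_two_words(name: str) -> bool:
    """True if the string begins with two capital-led words (prefix match)."""
    return TWO_WORD_PREFIX.match(name) is not None


def is_exactly_two_words(name: str) -> bool:
    """True only if the whole string consists of exactly two capital-led words."""
    return TWO_WORD_PREFIX.fullmatch(name) is not None


if __name__ == "__main__":
    for name in ["MadeEasy", "WonderFul", "AndMe", "Made", "HTMLParser"]:
        print(name, starts_with_two_words(name), is_exactly_two_words(name))
```

With a prefix match, `HTMLParser` is accepted because `HT` already satisfies the pattern, and a single word such as `Made` is rejected. Matching the whole string rejects `HTMLParser` but also anything with three or more words.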

Orfinger answered 8/4, 2019 at 7:25 Comment(0)
E
3
/([A-Z][a-z]+)*[A-Z][a-z]*/

But I have to say your naming choice stinks: HTMLParser should be allowed and preferred.
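To compare this regex with the simpler variant `(?:[A-Z][a-z]+)+` suggested in the comments, here is a minimal sketch. It assumes whole-string matching via Python's `re.fullmatch`, which neither pattern specifies on its own; the `accepts` helper is an illustrative name.

```python
import re

# Pattern from the answer, without the surrounding slashes.
ORIGINAL = re.compile(r"([A-Z][a-z]+)*[A-Z][a-z]*")
# Simplified variant proposed in the comments.
SIMPLIFIED = re.compile(r"(?:[A-Z][a-z]+)+")


def accepts(pattern, name: str) -> bool:
    """True if the pattern matches the entire string (an added assumption)."""
    return pattern.fullmatch(name) is not None


if __name__ == "__main__":
    for name in ["HtmlParser", "AaA", "H", "HTMLParser"]:
        print(name, accepts(ORIGINAL, name), accepts(SIMPLIFIED, name))
```

Under that assumption both reject `HTMLParser`, and they differ only on names that end in a bare capital, such as `AaA` or `H`, which the original accepts and the simplified form does not.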

Eu answered 20/1, 2010 at 17:51 Comment(3)
+1 for a regex and a comment on the naming convention that both look suspiciously similar to what I was going to post, though I would simplify the regex to /(?:[A-Z][a-z]+)+/ (I don't think the OP is concerned with allowing AaA as a class name).Lavolta
Yeah, I considered that, but figured AaA doesn't have two consecutive uppercase letters. A bigger problem not yet addressed by this scheme is numbers: do they count as upper, lower, neither, or both?Eu
It's missing some details, like numbers; other than that it seems to work.Propertied
E
3

I don't believe these identifiers can start with a digit (I thought I read that somewhere, so take it with a grain of salt), so in my opinion the best option would be something like Roger Pate's regex with a few minor modifications:

/^([A-Z][a-z0-9]+)*[A-Z][a-z0-9]*$/

It reads roughly as: look for a capital letter followed by at least one lowercase letter or digit, repeated zero or more times, then a final capital letter followed by optional lowercase letters or digits. It also accepts a single capital letter on its own, since only the capital is required and the characters after it are optional.
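The `^` and `$` anchors carry a lot of weight here. The sketch below, assuming Python's `re.search`, compares the pattern with and without them; the helper names are hypothetical.

```python
import re

# Pattern from the answer, with anchors.
ANCHORED = re.compile(r"^([A-Z][a-z0-9]+)*[A-Z][a-z0-9]*$")
# Same pattern with the anchors removed, for comparison.
UNANCHORED = re.compile(r"([A-Z][a-z0-9]+)*[A-Z][a-z0-9]*")


def matches_anchored(name: str) -> bool:
    """True if the anchored pattern matches the string."""
    return ANCHORED.search(name) is not None


def matches_unanchored(name: str) -> bool:
    """True if the unanchored pattern matches anywhere inside the string."""
    return UNANCHORED.search(name) is not None


if __name__ == "__main__":
    for name in ["HtmlParser", "Html2Parser", "H", "HELLO", "2Html"]:
        print(name, matches_anchored(name), matches_unanchored(name))
```

Without anchors, a single capital anywhere in the input is enough for a match, so `HELLO` and `2Html` are both accepted; with anchors they are rejected.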

Good luck

Egide answered 21/1, 2010 at 2:10 Comment(1)
/([A-Z][a-z0-9]+)*[A-Z][a-z0-9]*/.test("HELLO") is true, it requires ^ and $Straightedge
E
1
^[A-Z]{1,2}([a-z]+[A-Z]{0,2})*$

This allows 2 consecutive capital characters (which is generally accepted; unfortunately, PascalCase is not a formal spec).
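A quick way to probe where this pattern draws the line is to run it over a few names. This is only a sketch in Python, and `allows_two_capitals` is a made-up helper name.

```python
import re

# Pattern copied from the answer above.
TWO_CAPS = re.compile(r"^[A-Z]{1,2}([a-z]+[A-Z]{0,2})*$")


def allows_two_capitals(name: str) -> bool:
    """True if the whole string matches the two-capitals-allowed pattern."""
    return TWO_CAPS.fullmatch(name) is not None


if __name__ == "__main__":
    samples = ["CreateAMode", "ModeA", "HtmlParser", "H", "HTMLParser", "HELLO"]
    for name in samples:
        print(name, allows_two_capitals(name))
```

Runs of up to two capitals pass (`CreateAMode`, `ModeA`), while three or more in a row, as in `HTMLParser` or `HELLO`, are rejected.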

Elisha answered 8/4, 2022 at 9:47 Comment(0)
D
0

Although the original post specifically excluded two consecutive capital (uppercase) letters, I'd like to post a regex for PascalCase that addresses many of the points raised in the comments:

  • Allowing two consecutive capital letters
  • Allowing digits (but not as the leading character in the string)
  • Allowing a string ending with a capital letter or a digit

The regex is ^[A-Z][a-z0-9]*(?:[A-Z][a-z0-9]*)*(?:[A-Z]?)$

When tested against all strings raised in all comments, the following match as PascalCase:

PascalCase
Pascal2Case
PascalCaseA
Pascal2CaseA
ModeA
Mode2A
Mode2A2
Mode2A2A
CreateAMode
CreateBMode
MadeEasy
WonderFul
AndMe
Context
HTMLParser
HtmlParser
H
AaA
HELLO

The following do not match as PascalCase:

camelCase
2PascalCase
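The two lists above can be re-checked mechanically. The following is a minimal sketch in Python; the names `is_relaxed_pascal`, `check_samples` and `SIMPLER` are assumptions introduced here. `SIMPLER` is a shorter pattern that appears to accept the same strings, since the regex permits any mix of letters and digits after a leading capital.

```python
import re

# Pattern copied from the answer above.
PASCAL_RELAXED = re.compile(r"^[A-Z][a-z0-9]*(?:[A-Z][a-z0-9]*)*(?:[A-Z]?)$")
# Assumption: a shorter pattern expected to behave the same on these samples.
SIMPLER = re.compile(r"^[A-Z][A-Za-z0-9]*$")

SHOULD_MATCH = [
    "PascalCase", "Pascal2Case", "PascalCaseA", "Pascal2CaseA", "ModeA",
    "Mode2A", "Mode2A2", "Mode2A2A", "CreateAMode", "CreateBMode", "MadeEasy",
    "WonderFul", "AndMe", "Context", "HTMLParser", "HtmlParser", "H", "AaA",
    "HELLO",
]
SHOULD_NOT_MATCH = ["camelCase", "2PascalCase"]


def is_relaxed_pascal(name: str) -> bool:
    """True if the whole string matches the relaxed PascalCase pattern."""
    return PASCAL_RELAXED.fullmatch(name) is not None


def check_samples() -> bool:
    """True if every sample lands in the list the answer puts it in."""
    matched = all(is_relaxed_pascal(s) for s in SHOULD_MATCH)
    rejected = not any(is_relaxed_pascal(s) for s in SHOULD_NOT_MATCH)
    return matched and rejected


if __name__ == "__main__":
    print(check_samples())
```

Note that this pattern places no limit on consecutive capitals, which is why `HTMLParser` and `HELLO` both match.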
Dachau answered 5/7, 2022 at 10:8 Comment(0)
P
0

Lower Camel Case - no digits allowed


    ^[a-z][a-z]*(([A-Z][a-z]+)*[A-Z]?|([a-z]+[A-Z])*|[A-Z])$
    

Test Cases: https://regex101.com/library/4h7A1I

Lower Camel Case - digits allowed


    ^[a-z][a-z0-9]*(([A-Z][a-z0-9]+)*[A-Z]?|([a-z0-9]+[A-Z])*|[A-Z])$

Test Cases: https://regex101.com/library/8nQras

Pascal Case - no digits allowed


    ^[A-Z](([a-z]+[A-Z]?)*)$

Test Cases: https://regex101.com/library/sF2jRZ

Pascal Case - digits allowed


    ^[A-Z](([a-z0-9]+[A-Z]?)*)$

Test Cases: https://regex101.com/library/csrkQw

For more details on camel case and pascal case check out this repo.
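To see how the four patterns relate to one another, the sketch below runs a name through all of them and reports which ones accept it. It assumes Python's `re` module; the dictionary keys and the `classify` function are illustrative names only.

```python
import re
from typing import List

# The four patterns from the answer, keyed by hypothetical labels.
PATTERNS = {
    "camel": re.compile(
        r"^[a-z][a-z]*(([A-Z][a-z]+)*[A-Z]?|([a-z]+[A-Z])*|[A-Z])$"),
    "camel_digits": re.compile(
        r"^[a-z][a-z0-9]*(([A-Z][a-z0-9]+)*[A-Z]?|([a-z0-9]+[A-Z])*|[A-Z])$"),
    "pascal": re.compile(r"^[A-Z](([a-z]+[A-Z]?)*)$"),
    "pascal_digits": re.compile(r"^[A-Z](([a-z0-9]+[A-Z]?)*)$"),
}


def classify(name: str) -> List[str]:
    """Return the labels of every pattern that matches the whole string."""
    return [label for label, pattern in PATTERNS.items()
            if pattern.fullmatch(name) is not None]


if __name__ == "__main__":
    for name in ["camelCase", "camel2Case", "HtmlParser", "Pascal2Case",
                 "HTMLParser", "2PascalCase"]:
        print(name, classify(name))
```

One caveat: groups such as `([a-z]+[A-Z]?)*` nest one repetition inside another, a shape that can backtrack heavily on long non-matching input in backtracking regex engines. It may be worth bounding identifier length before matching.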

Pericarditis answered 12/8, 2022 at 22:3 Comment(0)
