What is the ultimate postal code and zip regex?
Asked Answered
I

20

253

I'm looking for the ultimate postal code and zip code regex. I'm looking for something that will cover most (hopefully all) of the world.

Inestimable answered 23/2, 2009 at 16:58 Comment(4)
One single regex for all postal codes would be useless for most cases, not to mention requiring a lot of unicode encoding. Much better is to check regex on a country-by-country basis so that you don't validate things like "New York, NY AF23Q" as correct.Poona
You have a problem. You write a regex for it. Now you have two problems.Nowt
regexlib.com/Search.aspx?k=decimal&c=3&m=-1&ps=100 for validating a field go hereSeigler
The one that handles all possible future values.Costin
R
149

There is none.

Postal/zip codes around the world don't follow a common pattern. In some countries they are made up of digits; in others they can be combinations of digits and letters. Some can contain spaces, others dots, and the number of characters can vary from two to at least six...

What you could do (theoretically) is create a separate regex for every country in the world, which I wouldn't recommend. But you would still be missing the validation part: ZIP code 12345 may exist while 12346 does not, and maybe 12344 doesn't exist either. How do you check for that with a regex?

You can't.
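
To make the distinction concrete, here is a minimal Python sketch of the two separate questions: does the input have the shape of a code (a regex can answer that), and does the code actually exist (only a lookup can). The `KNOWN_US_ZIPS` set is a hypothetical stand-in for a real postal database, and its values are purely illustrative.

```python
import re

# Format check only: five digits, optionally followed by a 4-digit extension.
US_ZIP_FORMAT = re.compile(r"\d{5}(?:-\d{4})?")

# Hypothetical stand-in for a real postal database (illustrative values only).
KNOWN_US_ZIPS = {"12345", "90210"}


def looks_like_zip(value: str) -> bool:
    """True if the string has the shape of a US ZIP code."""
    return US_ZIP_FORMAT.fullmatch(value) is not None


def zip_exists(value: str) -> bool:
    """True if the 5-digit part is present in the lookup data."""
    return looks_like_zip(value) and value[:5] in KNOWN_US_ZIPS


print(looks_like_zip("12346"), zip_exists("12346"))  # prints: True False
```

The regex happily accepts 12346; only the lookup can say it is not in the data.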

Relentless answered 23/2, 2009 at 17:10 Comment(7)
I suspect that a regex could be compiled, but that a task like this would be much better suited to a database. The regex would look something like 10000|10001|10002|10003|.......Witham
for validating a field go here regexlib.com/Search.aspx?k=decimal&c=3&m=-1&ps=100Seigler
You can use a regexp first that matches your country (see en.wikipedia.org/wiki/List_of_postal_codes) and do a real check by an external service like geonames.org/export/ws-overview.htmlRegional
My two cents: in Brazil it is actually 8 digits, 5 followed by a dash and 3 moreJarita
^\d{5}(?:[-\s]\d{4})?$Sudatorium
It would be useful for many types of software to have a per-country regex. One example: if you are taking GUI input and want to validate in real time if it's a valid zip code, you wait until your text entry field matches the regex, then you validate against a more fine-grained database.Vibrator
dude this is the best answer without a solution I have ever seenArgile
T
341

The Unicode CLDR contains a postal code regex for each country (158 regexes in total!).

Google also has a web service with per-country address formatting information, including postal codes, here - http://i18napis.appspot.com/address (I found that link via http://unicode.org/review/pri180/ )
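
As a rough illustration of how such a per-country table might be applied, here is a minimal Python sketch. The four patterns are copied from the list further down in this answer; the names `POSTAL_PATTERNS` and `matches_format` are made up for the example. The CLDR patterns carry no ^ or $ anchors, so the sketch uses `re.fullmatch` to require that the whole string matches.

```python
import re

# A few entries copied from the CLDR list in this answer.
POSTAL_PATTERNS = {
    "US": r"\d{5}([ \-]\d{4})?",
    "AR": r"([A-HJ-NP-Z])?\d{4}([A-Z]{3})?",
    "NL": r"\d{4}[ ]?[A-Z]{2}",
    "IT": r"\d{5}",
}


def matches_format(country: str, code: str) -> bool:
    """Check a code against its country's pattern, matching the whole string."""
    # fullmatch behaves as if the pattern were wrapped in ^(...)$.
    return re.fullmatch(POSTAL_PATTERNS[country], code) is not None
```

With this, `matches_format("IT", "00100")` passes while `"00100 Roma"` does not, which an unanchored search would have let through.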

Edit

Here is a copy of the regexes from postalCodeData.xml:

"GB", "GIR[ ]?0AA|((AB|AL|B|BA|BB|BD|BH|BL|BN|BR|BS|BT|CA|CB|CF|CH|CM|CO|CR|CT|CV|CW|DA|DD|DE|DG|DH|DL|DN|DT|DY|E|EC|EH|EN|EX|FK|FY|G|GL|GY|GU|HA|HD|HG|HP|HR|HS|HU|HX|IG|IM|IP|IV|JE|KA|KT|KW|KY|L|LA|LD|LE|LL|LN|LS|LU|M|ME|MK|ML|N|NE|NG|NN|NP|NR|NW|OL|OX|PA|PE|PH|PL|PO|PR|RG|RH|RM|S|SA|SE|SG|SK|SL|SM|SN|SO|SP|SR|SS|ST|SW|SY|TA|TD|TF|TN|TQ|TR|TS|TW|UB|W|WA|WC|WD|WF|WN|WR|WS|WV|YO|ZE)(\d[\dA-Z]?[ ]?\d[ABD-HJLN-UW-Z]{2}))|BFPO[ ]?\d{1,4}"
"JE", "JE\d[\dA-Z]?[ ]?\d[ABD-HJLN-UW-Z]{2}"
"GG", "GY\d[\dA-Z]?[ ]?\d[ABD-HJLN-UW-Z]{2}"
"IM", "IM\d[\dA-Z]?[ ]?\d[ABD-HJLN-UW-Z]{2}"
"US", "\d{5}([ \-]\d{4})?"
"CA", "[ABCEGHJKLMNPRSTVXY]\d[ABCEGHJ-NPRSTV-Z][ ]?\d[ABCEGHJ-NPRSTV-Z]\d"
"DE", "\d{5}"
"JP", "\d{3}-\d{4}"
"FR", "\d{2}[ ]?\d{3}"
"AU", "\d{4}"
"IT", "\d{5}"
"CH", "\d{4}"
"AT", "\d{4}"
"ES", "\d{5}"
"NL", "\d{4}[ ]?[A-Z]{2}"
"BE", "\d{4}"
"DK", "\d{4}"
"SE", "\d{3}[ ]?\d{2}"
"NO", "\d{4}"
"BR", "\d{5}[\-]?\d{3}"
"PT", "\d{4}([\-]\d{3})?"
"FI", "\d{5}"
"AX", "22\d{3}"
"KR", "\d{3}[\-]\d{3}"
"CN", "\d{6}"
"TW", "\d{3}(\d{2})?"
"SG", "\d{6}"
"DZ", "\d{5}"
"AD", "AD\d{3}"
"AR", "([A-HJ-NP-Z])?\d{4}([A-Z]{3})?"
"AM", "(37)?\d{4}"
"AZ", "\d{4}"
"BH", "((1[0-2]|[2-9])\d{2})?"
"BD", "\d{4}"
"BB", "(BB\d{5})?"
"BY", "\d{6}"
"BM", "[A-Z]{2}[ ]?[A-Z0-9]{2}"
"BA", "\d{5}"
"IO", "BBND 1ZZ"
"BN", "[A-Z]{2}[ ]?\d{4}"
"BG", "\d{4}"
"KH", "\d{5}"
"CV", "\d{4}"
"CL", "\d{7}"
"CR", "\d{4,5}|\d{3}-\d{4}"
"HR", "\d{5}"
"CY", "\d{4}"
"CZ", "\d{3}[ ]?\d{2}"
"DO", "\d{5}"
"EC", "([A-Z]\d{4}[A-Z]|(?:[A-Z]{2})?\d{6})?"
"EG", "\d{5}"
"EE", "\d{5}"
"FO", "\d{3}"
"GE", "\d{4}"
"GR", "\d{3}[ ]?\d{2}"
"GL", "39\d{2}"
"GT", "\d{5}"
"HT", "\d{4}"
"HN", "(?:\d{5})?"
"HU", "\d{4}"
"IS", "\d{3}"
"IN", "\d{6}"
"ID", "\d{5}"
"IL", "\d{5}"
"JO", "\d{5}"
"KZ", "\d{6}"
"KE", "\d{5}"
"KW", "\d{5}"
"LA", "\d{5}"
"LV", "\d{4}"
"LB", "(\d{4}([ ]?\d{4})?)?"
"LI", "(948[5-9])|(949[0-7])"
"LT", "\d{5}"
"LU", "\d{4}"
"MK", "\d{4}"
"MY", "\d{5}"
"MV", "\d{5}"
"MT", "[A-Z]{3}[ ]?\d{2,4}"
"MU", "(\d{3}[A-Z]{2}\d{3})?"
"MX", "\d{5}"
"MD", "\d{4}"
"MC", "980\d{2}"
"MA", "\d{5}"
"NP", "\d{5}"
"NZ", "\d{4}"
"NI", "((\d{4}-)?\d{3}-\d{3}(-\d{1})?)?"
"NG", "(\d{6})?"
"OM", "(PC )?\d{3}"
"PK", "\d{5}"
"PY", "\d{4}"
"PH", "\d{4}"
"PL", "\d{2}-\d{3}"
"PR", "00[679]\d{2}([ \-]\d{4})?"
"RO", "\d{6}"
"RU", "\d{6}"
"SM", "4789\d"
"SA", "\d{5}"
"SN", "\d{5}"
"SK", "\d{3}[ ]?\d{2}"
"SI", "\d{4}"
"ZA", "\d{4}"
"LK", "\d{5}"
"TJ", "\d{6}"
"TH", "\d{5}"
"TN", "\d{4}"
"TR", "\d{5}"
"TM", "\d{6}"
"UA", "\d{5}"
"UY", "\d{5}"
"UZ", "\d{6}"
"VA", "00120"
"VE", "\d{4}"
"ZM", "\d{5}"
"AS", "96799"
"CC", "6799"
"CK", "\d{4}"
"RS", "\d{6}"
"ME", "8\d{4}"
"CS", "\d{5}"
"YU", "\d{5}"
"CX", "6798"
"ET", "\d{4}"
"FK", "FIQQ 1ZZ"
"NF", "2899"
"FM", "(9694[1-4])([ \-]\d{4})?"
"GF", "9[78]3\d{2}"
"GN", "\d{3}"
"GP", "9[78][01]\d{2}"
"GS", "SIQQ 1ZZ"
"GU", "969[123]\d([ \-]\d{4})?"
"GW", "\d{4}"
"HM", "\d{4}"
"IQ", "\d{5}"
"KG", "\d{6}"
"LR", "\d{4}"
"LS", "\d{3}"
"MG", "\d{3}"
"MH", "969[67]\d([ \-]\d{4})?"
"MN", "\d{6}"
"MP", "9695[012]([ \-]\d{4})?"
"MQ", "9[78]2\d{2}"
"NC", "988\d{2}"
"NE", "\d{4}"
"VI", "008(([0-4]\d)|(5[01]))([ \-]\d{4})?"
"PF", "987\d{2}"
"PG", "\d{3}"
"PM", "9[78]5\d{2}"
"PN", "PCRN 1ZZ"
"PW", "96940"
"RE", "9[78]4\d{2}"
"SH", "(ASCN|STHL) 1ZZ"
"SJ", "\d{4}"
"SO", "\d{5}"
"SZ", "[HLMS]\d{3}"
"TC", "TKCA 1ZZ"
"WF", "986\d{2}"
"XK", "\d{5}"
"YT", "976\d{2}"
Tried answered 25/8, 2011 at 4:43 Comment(8)
Just with a quick scan of the AU postcode-regex... this regex is very simple and will allow lots of false-positives through, so it's not exhaustive.Extent
The latest version of unicode CLDR containing the postal code regex is version 26.0.1. In later versions it has been removed because the data was not maintained and no other reliable sources could be found.Propositus
Same, very basic for french Zip code regex. Use this one "^((0[1-9])|([1-8][0-9])|(9[0-8])|(2A)|(2B))[0-9]{3}$" -> developpez.net/forums/d518232/webmasters-developpement-web/…Calamondin
I'm using i18napis.appspot.com/address/data/GB now; are there any problems with this service?Bailly
Small correction to @kiko-software's comment: the latest version containing postal code data is 27.0.3.Shamble
I wondered whether someone else has the same problem as me. To make these regexes work correctly in my JS array, I had to change, for example, Italy from "IT", "\d{5}" to "IT": /^\d{5}$/ (a regex literal instead of a quoted string, anchored with ^ and $). This works as expected, but it doesn't with patterns like ([A-HJ-NP-Z])?\d{4}([A-Z]{3})? that end in a question mark. Has anyone managed this? Anyway, thanks for them!Convulsant
Nicaragua (NI) appears to be incorrect. Should just be 5 digits.Jurel
@Convulsant you are correct. We (I) need to match the exact string; hence added ^(REGEX)$ and it works well for me.Perplexed
G
93

Use these regexes:

// Patterns are stored without delimiters; wrap them before use,
// e.g. preg_match('/' . $ZIPREG['US'] . '/', $zip)
$ZIPREG=array(
    "US"=>"^\d{5}([\-]?\d{4})?$",
    "UK"=>"^(GIR|[A-Z]\d[A-Z\d]??|[A-Z]{2}\d[A-Z\d]??)[ ]??(\d[A-Z]{2})$",
    "DE"=>"\b((?:0[1-46-9]\d{3})|(?:[1-357-9]\d{4})|(?:[4][0-24-9]\d{3})|(?:[6][013-9]\d{3}))\b",
    "CA"=>"^([ABCEGHJKLMNPRSTVXY]\d[ABCEGHJKLMNPRSTVWXYZ])\ {0,1}(\d[ABCEGHJKLMNPRSTVWXYZ]\d)$",
    // 2A and 2B are the Corsican prefixes
    "FR"=>"^(F-)?((2[AB])|[0-9]{2})[0-9]{3}$",
    "IT"=>"^(V-|I-)?[0-9]{5}$",
    // the outer group makes ^ and $ apply to every alternative
    "AU"=>"^((0[289][0-9]{2})|([1345689][0-9]{3})|(2[0-8][0-9]{2})|(290[0-9])|(291[0-4])|(7[0-4][0-9]{2})|(7[8-9][0-9]{2}))$",
    "NL"=>"^[1-9][0-9]{3}\s?([a-zA-Z]{2})?$",
    "ES"=>"^([1-9]{2}|[0-9][1-9]|[1-9][0-9])[0-9]{3}$",
    "DK"=>"^([Dd][Kk]( |-))?[1-9]{1}[0-9]{3}$",
    "SE"=>"^(s-|S-){0,1}[0-9]{3}\s?[0-9]{2}$",
    "BE"=>"^[1-9]{1}[0-9]{3}$",
    "IN"=>"^\d{6}$"
);
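
One thing to watch for when adapting patterns built from several alternatives: ^ and $ bind more tightly than |, so they only constrain the first and last branch unless the alternatives are grouped. A small Python sketch of the pitfall (the two toy patterns are invented for the demonstration and are not taken from the array above):

```python
import re

# Without a group this reads as (^\d{4}) OR ([A-Z]{2}$).
loose = re.compile(r"^\d{4}|[A-Z]{2}$")
# Grouping the alternatives makes both anchors apply to every branch.
strict = re.compile(r"^(?:\d{4}|[A-Z]{2})$")

print(bool(loose.search("2000 and more")))   # True: first branch is anchored only at the start
print(bool(strict.search("2000 and more")))  # False: the whole string must match one branch
```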
Gallop answered 10/5, 2012 at 7:15 Comment(8)
One of the better attempts I've seen to actually answer the OP. Gets slower as you add more, but a clean and clear approach.Mince
It does not get slower as you add more as Rob suggests as you would choose one of the regexes from the country code.Fur
I see you posted this in 2012. Got any more since?Valtin
@Valtin check Chi answer.Lazy
Would the US version accept 00000 with your code, though?Shebat
@Shebat it's a matter of how far you want to go. Validating a US ZIP code as simply "any five digits" is good enough in some cases. On the other hand, for some applications maybe you need to check against a list of all valid ZIP codes, or even check that the ZIP matches other address info.Sodomy
@ddunn801, there's a (whomping big) difference between validating the pattern and authenticating the postal code. Authenticating the codes is whole orders of magnitude more difficult since (at least in the U.S.) postal codes are added and dropped regularly. In an ideal world, you would perform a quick-check to validate the pattern before submitting to a service (e.g., USPS) to validate the entire mailing address (services like this are paid, you'd hate to waste the value with bad data). Alas, the world is far from ideal.Ortegal
Just for future reference, the Dutch (NL) regex is wrong. The letter part is made optional, but it's not; it should be required. Correct would be ^[1-9][0-9]{3}\s?([a-zA-Z]{2})$, so without trailing ? quantifier.Ethylene
S
90
  1. Every postal code system uses only A-Z and/or 0-9 and sometimes space/dash

  2. Not every country uses postal codes (ex. Ireland outside of Dublin), but we'll ignore that here.

  3. The shortest postal code format is Sierra Leone with NN

  4. The longest is American Samoa with NNNNN-NNNNNN

  5. You should allow one space or dash.

  6. Should not begin or end with space or dash

This should cover the above:

(?i)^[a-z0-9][a-z0-9\- ]{0,10}[a-z0-9]$
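
For anyone wanting to try this outside a regex tester, here is a minimal Python sketch of the same sanity check. The function name is invented for the example; `re.IGNORECASE` stands in for the inline (?i) flag, and `fullmatch` stands in for the ^ and $ anchors.

```python
import re

# The pattern from above, without the inline flag and anchors.
SANITY = re.compile(r"[a-z0-9][a-z0-9\- ]{0,10}[a-z0-9]", re.IGNORECASE)


def plausible_postal_code(value: str) -> bool:
    """Loose shape check: 2 to 12 characters, no leading or trailing space/dash."""
    return SANITY.fullmatch(value) is not None
```

It accepts things like "9743 PS", "H0H 0H0" and "12345-6789" and rejects "-12345" or a single character. It says nothing about whether the code exists.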
Sonority answered 7/11, 2013 at 19:0 Comment(10)
This seems to be the only answer that provides a sanity check (which is probably what the OP wanted) rather than a full validation of every possibly combination. Exactly what I wanted thxMadisonmadlen
Still it's important to remember that it's a bad practice, if you want to validate better see @Tried answer.Lazy
@GiulioCaccin H0H0H0 is a valid Canadian Postal Code (which children use to get letters from Canada Post pretending to be Santa Claus), but that doesn't mean it's a valid customer postal code :)Sonority
Not bad I liked it but for example: 9743 PS is not working (thats groningen, NL)Galloon
FYI, American Samoa is small enough to only has one postcode and it's 96799Normalcy
In my opinion this is the only good answer. It can universally be used as pre-validation in HTML pattern attribute for instance.Buckhound
I think this is a good answer for the situation where one just wants to have a sanity check and not validate precisely per country. Just to have a little cleaner data without much effort -- in cases where full safety is needed, a third party plugin/service might be needed as others pointed out.Falco
@NeilMcGuigan This regular expression is not working in Safari BrowserWichita
For JavaScript, remove the "(?i)", as it does not conform to ECMAScript. You can use this with the i flag: ^[a-z0-9][a-z0-9\- ]{0,10}[a-z0-9]$Brochette
@EmmanuelAliji this will match 99999 even it doesn't exist. The highest real zip code is 99950. Check this regex demo.Tanga
T
17

Trying to cover the whole world with one regular expression is not completely possible, and certainly not feasible or recommended.

Not to toot my own horn, but I've written some pretty thorough regular expressions which you may find helpful.

  • Canadian postal codes

    Basic validation:
    ^[ABCEGHJ-NPRSTVXY]{1}[0-9]{1}[ABCEGHJ-NPRSTV-Z]{1}[ ]?[0-9]{1}[ABCEGHJ-NPRSTV-Z]{1}[0-9]{1}$
    
    Extended validation:
    ^(A(0[ABCEGHJ-NPR]|1[ABCEGHK-NSV-Y]|2[ABHNV]|5[A]|8[A])|B(0[CEHJ-NPRSTVW]|1[ABCEGHJ-NPRSTV-Y]|2[ABCEGHJNRSTV-Z]|3[ABEGHJ-NPRSTVZ]|4[ABCEGHNPRV]|5[A]|6[L]|9[A])|C(0[AB]|1[ABCEN])|E(1[ABCEGHJNVWX]|2[AEGHJ-NPRSV]|3[ABCELNVYZ]|4[ABCEGHJ-NPRSTV-Z]|5[ABCEGHJ-NPRSTV]|6[ABCEGHJKL]|7[ABCEGHJ-NP]|8[ABCEGJ-NPRST]|9[ABCEGH])|G(0[ACEGHJ-NPRSTV-Z]|1[ABCEGHJ-NPRSTV-Y]|2[ABCEGJ-N]|3[ABCEGHJ-NZ]|4[ARSTVWXZ]|5[ABCHJLMNRTVXYZ]|6[ABCEGHJKLPRSTVWXZ]|7[ABGHJKNPSTXYZ]|8[ABCEGHJ-NPTVWYZ]|9[ABCHNPRTX])|H(0[HM]|1[ABCEGHJ-NPRSTV-Z]|2[ABCEGHJ-NPRSTV-Z]|3[ABCEGHJ-NPRSTV-Z]|4[ABCEGHJ-NPRSTV-Z]|5[AB]|7[ABCEGHJ-NPRSTV-Y]|8[NPRSTYZ]|9[ABCEGHJKPRSWX])|J(0[ABCEGHJ-NPRSTV-Z]|1[ACEGHJ-NRSTXZ]|2[ABCEGHJ-NRSTWXY]|3[ABEGHLMNPRTVXYZ]|4[BGHJ-NPRSTV-Z]|5[ABCJ-MRTV-Z]|6[AEJKNRSTVWYXZ]|7[ABCEGHJ-NPRTV-Z]|8[ABCEGHLMNPRTVXYZ]|9[ABEHJLNTVXYZ])|K(0[ABCEGHJ-M]|1[ABCEGHJ-NPRSTV-Z]|2[ABCEGHJ-MPRSTVW]|4[ABCKMPR]|6[AHJKTV]|7[ACGHK-NPRSV]|8[ABHNPRV]|9[AHJKLV])|L(0[ABCEGHJ-NPRS]|1[ABCEGHJ-NPRSTV-Z]|2[AEGHJMNPRSTVW]|3[BCKMPRSTVXYZ]|4[ABCEGHJ-NPRSTV-Z]|5[ABCEGHJ-NPRSTVW]|6[ABCEGHJ-MPRSTV-Z]|7[ABCEGJ-NPRST]|8[EGHJ-NPRSTVW]|9[ABCGHK-NPRSTVWYZ])|M(1[BCEGHJ-NPRSTVWX]|2[HJ-NPR]|3[ABCHJ-N]|4[ABCEGHJ-NPRSTV-Y]|5[ABCEGHJ-NPRSTVWX]|6[ABCEGHJ-NPRS]|7[AY]|8[V-Z]|9[ABCLMNPRVW])|N(0[ABCEGHJ-NPR]|1[ACEGHKLMPRST]|2[ABCEGHJ-NPRTVZ]|3[ABCEHLPRSTVWY]|4[BGKLNSTVWXZ]|5[ACHLPRV-Z]|6[ABCEGHJ-NP]|7[AGLMSTVWX]|8[AHMNPRSTV-Y]|9[ABCEGHJKVY])|P(0[ABCEGHJ-NPRSTV-Y]|1[ABCHLP]|2[ABN]|3[ABCEGLNPY]|4[NPR]|5[AEN]|6[ABC]|7[ABCEGJKL]|8[NT]|9[AN])|R(0[ABCEGHJ-M]|1[ABN]|2[CEGHJ-NPRV-Y]|3[ABCEGHJ-NPRSTV-Y]|4[AHJKL]|5[AGH]|6[MW]|7[ABCN]|8[AN]|9[A])|S(0[ACEGHJ-NP]|2[V]|3[N]|4[AHLNPRSTV-Z]|6[HJKVWX]|7[HJ-NPRSTVW]|9[AHVX])|T(0[ABCEGHJ-MPV]|1[ABCGHJ-MPRSV-Y]|2[ABCEGHJ-NPRSTV-Z]|3[ABCEGHJ-NPRZ]|4[ABCEGHJLNPRSTVX]|5[ABCEGHJ-NPRSTV-Z]|6[ABCEGHJ-NPRSTVWX]|7[AENPSVXYZ]|8[ABCEGHLNRSVWX]|9[ACEGHJKMNSVWX])|V(0[ABCEGHJ-NPRSTVWX]|1[ABCEGHJ-NPRSTV-Z]|2[ABCEGHJ-NPRSTV-Z]|3[ABCEGHJ-NRSTV-Y]|4[ABCEGK-NPRSTVWXZ]|5[ABCEGHJ-NPRSTV-Z]|6[ABCEGHJ-NPRSTV-Z]|7[ABCEGHJ-NPRSTV-Y]|8[ABCGJ-NPRSTV-Z]|9[ABCEGHJ-NPRSTV-Z])|X(0[ABCGX]|1[A])|Y(0[AB]|1[A]))[ ]?[0-9]{1}[ABCEGHJ-NPRSTV-Z]{1}[0-9]{1}$
    
  • US ZIP codes

    ^[0-9]{5}(-[0-9]{4})?$
    
  • UK post codes

    ^([A-PR-UWYZ]([0-9]{1,2}|([A-HK-Y][0-9]|[A-HK-Y][0-9]([0-9]|[ABEHMNPRV-Y]))|[0-9][A-HJKS-UW])\ [0-9][ABD-HJLNP-UW-Z]{2}|(GIR\ 0AA)|(SAN\ TA1)|(BFPO\ (C\/O\ )?[0-9]{1,4})|((ASCN|BBND|[BFS]IQQ|PCRN|STHL|TDCU|TKCA)\ 1ZZ))$
    

It is not possible to guarantee accuracy without actually mailing something to an address and having the person let you know when they receive it, but we can narrow things down by eliminating cases that we know are bad.
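
If it helps, here is a small Python sketch of how the basic Canadian pattern might be used to both check and normalize user input. The function name and the choice to return the canonical "A1A 1A1" form are my own assumptions, not part of the regexes above; the character classes are the ones from the basic validation.

```python
import re
from typing import Optional

# Basic Canadian pattern from above, split into two capture groups.
CA_BASIC = re.compile(
    r"([ABCEGHJ-NPRSTVXY][0-9][ABCEGHJ-NPRSTV-Z])[ ]?([0-9][ABCEGHJ-NPRSTV-Z][0-9])"
)


def normalize_ca_postal_code(value: str) -> Optional[str]:
    """Return the code as 'A1A 1A1' if it has a valid shape, else None."""
    m = CA_BASIC.fullmatch(value.strip().upper())
    return f"{m.group(1)} {m.group(2)}" if m else None
```

So "k1a0b1" becomes "K1A 0B1", while "D1A 1A1" is rejected because D is not a valid first letter.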

Trilbi answered 23/2, 2009 at 18:2 Comment(2)
The extended version for Canadian Postal Codes might have something wrong or missing, as it says that the following postal code is invalid: E3G 0A1, although it is a valid one.Semiautomatic
I have validated against all 845,495 postal codes in Canada and this regex string has some fixes on the Extended validation to support all of these postal codes. Here is the new regex string for the extended validation on Canadian Postal Codes: pastebin.com/vazqFKy4Semiautomatic
N
8

We use the following:

Canada

([A-Z]{1}[0-9]{1}){3}   //We raise to upper first

America

[0-9]{5}                //-or-
[0-9]{5}-[0-9]{4}       //ZIP+4: 9 digits plus a hyphen

Other

Accept as is
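
A minimal Python sketch of that policy, assuming a country code is available alongside the postal code. The `PATTERNS` table and the `accept` function are illustrative names; the US entry folds both ZIP styles into one pattern with an optional group.

```python
import re

# Patterns follow the ones above; anything not listed is accepted as is.
PATTERNS = {
    "CA": re.compile(r"(?:[A-Z][0-9]){3}"),
    "US": re.compile(r"[0-9]{5}(?:-[0-9]{4})?"),
}


def accept(country: str, code: str) -> bool:
    """Validate known countries, let everything else through."""
    pattern = PATTERNS.get(country.upper())
    if pattern is None:
        return True  # "Other": accept as is
    # Raise to upper first, as in the answer.
    return pattern.fullmatch(code.upper()) is not None
```

Note that the Canadian pattern as written has no room for the usual space, so "K1A 0B1" would need the space stripped before the check.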

Nicolina answered 23/2, 2009 at 17:40 Comment(3)
I'd suggest adding an optional -[0-9]{4} to the US one. Some people do use their ZIP+4.Violinist
/[0-9]{5}(?:-[0-9]{4})?/ lets you validate both styles from the US at the same time.Speechless
@Chas.Owens adding ^ and $ ensure they can't type anything else before or after, like "12345aaa" ... /^[0-9]{5}(?:-[0-9]{4})?$/Siltstone
R
8

If someone is still interested in how to validate zip codes I've found a solution:

Using the Google Geocoding API, we can check the validity of a ZIP code given both the country code and the ZIP code itself.

For example I live in Ukraine so I can check like this: https://maps.googleapis.com/maps/api/geocode/json?components=postal_code:80380|country:UA

Or using JS API: https://developers.google.com/maps/documentation/javascript/geocoding#ComponentFiltering

Here 80380 is a valid ZIP code for Ukraine (any five digits, #####, fit the format).

Google returns a ZERO_RESULTS status if nothing is found, or OK and a result if both are correct.
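
A rough Python sketch of how the request and the status check could look. It only builds the URL and interprets a response body, and it leaves the actual HTTP call to whatever client you use. The function names are made up, and the `key` parameter is an assumption on my part (the URL above has none, but an API key may be required for your account).

```python
import json
from urllib.parse import urlencode

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(postal_code: str, country: str, api_key: str) -> str:
    """Build a component-filtered geocoding URL for one postal code."""
    components = f"postal_code:{postal_code}|country:{country}"
    # urlencode percent-escapes the ":" and "|" separators.
    query = urlencode({"components": components, "key": api_key})
    return f"{GEOCODE_URL}?{query}"


def postal_code_found(response_body: str) -> bool:
    """OK with at least one result means found; ZERO_RESULTS means not found."""
    data = json.loads(response_body)
    return data.get("status") == "OK" and bool(data.get("results"))
```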

Hope this will be helpful.

Rufescent answered 23/10, 2015 at 14:36 Comment(1)
The only issue would be the limit on the number of queries, which, depending on the site/size, could be an issue.Inestimable
G
7

Depending on your application, you might want to implement regex matching for the countries where most of your visitors originate and no validation for the rest (accept anything).

Gownsman answered 23/2, 2009 at 17:12 Comment(0)
Z
7

Please note that this is quite a hard problem, as stated by the accepted answer. I guess it didn't deter the folks at geonames.org though. They have a country info file, which doesn't fit in full into this answer (the limit is apparently 30,000 characters). There are regexes for about 150 countries.

I extracted the bits relevant to this question here :

AD ^(?:AD)*(\d{3})$
AM ^(\d{6})$
AR ^([A-Z]\d{4}[A-Z]{3})$
AT ^(\d{4})$
AU ^(\d{4})$
AX ^(?:FI)*(\d{5})$
AZ ^(?:AZ)*(\d{4})$
BA ^(\d{5})$
BB ^(?:BB)*(\d{5})$
BD ^(\d{4})$
BE ^(\d{4})$
BG ^(\d{4})$
BH ^(\d{3}\d?)$
BM ^([A-Z]{2}\d{2})$
BN ^([A-Z]{2}\d{4})$
BR ^(\d{8})$
BY ^(\d{6})$
CA ^([ABCEGHJKLMNPRSTVXY]\d[ABCEGHJKLMNPRSTVWXYZ]) ?(\d[ABCEGHJKLMNPRSTVWXYZ]\d)$
CH ^(\d{4})$
CL ^(\d{7})$
CN ^(\d{6})$
CR ^(\d{4})$
CU ^(?:CP)*(\d{5})$
CV ^(\d{4})$
CX ^(\d{4})$
CY ^(\d{4})$
CZ ^(\d{5})$
DE ^(\d{5})$
DK ^(\d{4})$
DO ^(\d{5})$
DZ ^(\d{5})$
EC ^([a-zA-Z]\d{4}[a-zA-Z])$
EE ^(\d{5})$
EG ^(\d{5})$
ES ^(\d{5})$
ET ^(\d{4})$
FI ^(?:FI)*(\d{5})$
FM ^(\d{5})$
FO ^(?:FO)*(\d{3})$
FR ^(\d{5})$
GB ^(([A-Z]\d{2}[A-Z]{2})|([A-Z]\d{3}[A-Z]{2})|([A-Z]{2}\d{2}[A-Z]{2})|([A-Z]{2}\d{3}[A-Z]{2})|([A-Z]\d[A-Z]\d[A-Z]{2})|([A-Z]{2}\d[A-Z]\d[A-Z]{2})|(GIR0AA))$
GE ^(\d{4})$
GF ^((97|98)3\d{2})$
GG ^(([A-Z]\d{2}[A-Z]{2})|([A-Z]\d{3}[A-Z]{2})|([A-Z]{2}\d{2}[A-Z]{2})|([A-Z]{2}\d{3}[A-Z]{2})|([A-Z]\d[A-Z]\d[A-Z]{2})|([A-Z]{2}\d[A-Z]\d[A-Z]{2})|(GIR0AA))$
GL ^(\d{4})$
GP ^((97|98)\d{3})$
GR ^(\d{5})$
GT ^(\d{5})$
GU ^(969\d{2})$
GW ^(\d{4})$
HN ^([A-Z]{2}\d{4})$
HR ^(?:HR)*(\d{5})$
HT ^(?:HT)*(\d{4})$
HU ^(\d{4})$
ID ^(\d{5})$
IL ^(\d{5})$
IM ^(([A-Z]\d{2}[A-Z]{2})|([A-Z]\d{3}[A-Z]{2})|([A-Z]{2}\d{2}[A-Z]{2})|([A-Z]{2}\d{3}[A-Z]{2})|([A-Z]\d[A-Z]\d[A-Z]{2})|([A-Z]{2}\d[A-Z]\d[A-Z]{2})|(GIR0AA))$
IN ^(\d{6})$
IQ ^(\d{5})$
IR ^(\d{10})$
IS ^(\d{3})$
IT ^(\d{5})$
JE ^(([A-Z]\d{2}[A-Z]{2})|([A-Z]\d{3}[A-Z]{2})|([A-Z]{2}\d{2}[A-Z]{2})|([A-Z]{2}\d{3}[A-Z]{2})|([A-Z]\d[A-Z]\d[A-Z]{2})|([A-Z]{2}\d[A-Z]\d[A-Z]{2})|(GIR0AA))$
JO ^(\d{5})$
JP ^(\d{7})$
KE ^(\d{5})$
KG ^(\d{6})$
KH ^(\d{5})$
KP ^(\d{6})$
KR ^(?:SEOUL)*(\d{6})$
KW ^(\d{5})$
KZ ^(\d{6})$
LA ^(\d{5})$
LB ^(\d{4}(\d{4})?)$
LI ^(\d{4})$
LK ^(\d{5})$
LR ^(\d{4})$
LS ^(\d{3})$
LT ^(?:LT)*(\d{5})$
LU ^(\d{4})$
LV ^(?:LV)*(\d{4})$
MA ^(\d{5})$
MC ^(\d{5})$
MD ^(?:MD)*(\d{4})$
ME ^(\d{5})$
MG ^(\d{3})$
MK ^(\d{4})$
MM ^(\d{5})$
MN ^(\d{6})$
MQ ^(\d{5})$
MT ^([A-Z]{3}\d{2}\d?)$
MV ^(\d{5})$
MX ^(\d{5})$
MY ^(\d{5})$
MZ ^(\d{4})$
NC ^(\d{5})$
NE ^(\d{4})$
NF ^(\d{4})$
NG ^(\d{6})$
NI ^(\d{7})$
NL ^(\d{4}[A-Z]{2})$
NO ^(\d{4})$
NP ^(\d{5})$
NZ ^(\d{4})$
OM ^(\d{3})$
PF ^((97|98)7\d{2})$
PG ^(\d{3})$
PH ^(\d{4})$
PK ^(\d{5})$
PL ^(\d{5})$
PM ^(97500)$
PR ^(\d{9})$
PT ^(\d{7})$
PW ^(96940)$
PY ^(\d{4})$
RE ^((97|98)(4|7|8)\d{2})$
RO ^(\d{6})$
RS ^(\d{6})$
RU ^(\d{6})$
SA ^(\d{5})$
SD ^(\d{5})$
SE ^(?:SE)*(\d{5})$
SG ^(\d{6})$
SH ^(STHL1ZZ)$
SI ^(?:SI)*(\d{4})$
SK ^(\d{5})$
SM ^(4789\d)$
SN ^(\d{5})$
SO ^([A-Z]{2}\d{5})$
SV ^(?:CP)*(\d{4})$
SZ ^([A-Z]\d{3})$
TC ^(TKCA 1ZZ)$
TH ^(\d{5})$
TJ ^(\d{6})$
TM ^(\d{6})$
TN ^(\d{4})$
TR ^(\d{5})$
TW ^(\d{5})$
UA ^(\d{5})$
US ^\d{5}(-\d{4})?$
UY ^(\d{5})$
UZ ^(\d{6})$
VA ^(\d{5})$
VE ^(\d{4})$
VI ^\d{5}(-\d{4})?$
VN ^(\d{6})$
WF ^(986\d{2})$
YT ^(\d{5})$
ZA ^(\d{4})$
ZM ^(\d{5})$
CS ^(\d{5})$

Hopefully I didn't make any mistake, my regex-fu is pretty weak.
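
Since most of these patterns describe the code without spaces or dashes, one way to use them is to strip separators from the input first. A minimal Python sketch, with three entries copied from the list above; the function name is invented, and a few entries (TC, for example) contain a literal space, so they would need to be compacted too or handled separately.

```python
import re

# Entries copied from the list above.
GEONAMES_PATTERNS = {
    "FR": r"^(\d{5})$",
    "NL": r"^(\d{4}[A-Z]{2})$",
    "PL": r"^(\d{5})$",
}


def matches_compact(country: str, code: str) -> bool:
    """Drop spaces and dashes, upper-case, then apply the country's pattern."""
    compact = re.sub(r"[\s\-]", "", code).upper()
    return re.search(GEONAMES_PATTERNS[country], compact) is not None
```

This way "75 001", "9743 ps" and "00-950" all pass for FR, NL and PL respectively.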

Zahara answered 25/7, 2015 at 16:45 Comment(3)
I would like to point out that the regex for France and Great Britain do not take into account possible spaces; In France, postal codes can be input with a space between the second and third digits (i.e. 75 001 instead of 75001). British post codes are quite often written with a space (i.e. SW1 1AA instead of SW11AA).Hellcat
@Hellcat Thanks for the input, I did not notice that (even though I am French). Looks like Chi's answer is better in this regard.Zahara
because str_replace a space with no space is super taxing right? :pCelebrant
Z
5
.* 

Big Jump forgot about line breaks, blanks and control characters.

International postal codes are a kind of halting problem.

Zingg answered 11/5, 2012 at 22:47 Comment(0)
G
4

As others have pointed out, one regex to rule them all is unlikely. However, you can craft regular expressions for as many countries as you need using the address formatting info from the Universal Postal Union -- a little-known UN agency.

For example, here are the address formatting rules, including postal code, for a handful of countries (PDF format):

Gallery answered 7/3, 2017 at 23:4 Comment(0)
S
2

Given that there are so many edge cases for each country (eg. London addresses may use a slightly different format to the rest of the UK) I don't think that there is an ultimate regex other than maybe:

[0-9a-zA-Z]+

You're best off going with a fairly broad pattern (well, not quite as broad as the above), or treating each country/region with a specific pattern of its own!

UPDATE: However, it may be possible to dynamically construct a regex based upon lots of smaller, region specific rules - not sure about performance though!
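
A minimal Python sketch of what such a dynamically constructed regex might look like, wrapping each region's rule in a named group and joining them with |. The three rules are an illustrative subset using patterns from other answers on this page; the names are made up.

```python
import re

# Illustrative subset of region-specific rules.
RULES = {
    "US": r"\d{5}(?:-\d{4})?",
    "NL": r"\d{4} ?[A-Z]{2}",
    "PL": r"\d{2}-\d{3}",
}

# One named group per country, joined into a single alternation.
COMBINED = re.compile("|".join(f"(?P<{cc}>{pat})" for cc, pat in RULES.items()))


def matching_country(code: str):
    """Return the first country whose rule matches the whole code, else None."""
    m = COMBINED.fullmatch(code)
    return m.lastgroup if m else None
```

Keep in mind that many countries share a shape such as five digits, so this only reports the first rule that fits and cannot tell which country a code really belongs to.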

Lots of country specific patterns can be found on the RegExLib site.

Snowslide answered 23/2, 2009 at 17:6 Comment(0)
H
2

The problem is going to be that you probably have no good means of keeping up with the changing postal code requirements of countries on the other side of the globe, with which you share no common language. Unless you have a large enough budget to track this, you are almost certainly better off giving the responsibility of validating addresses to Google or Yahoo.

Both companies provide address lookup facilities through a programmable API.

Handshaker answered 23/2, 2009 at 19:8 Comment(0)
I
1

Why are you doing this and why do you care? As Tom Ritter pointed out, it doesn't matter whether you even have a ZIP/postal code at all, much less whether it's valid or not, until and unless you are actually going to be sending something to that address. Even if you expect that you will be sending them something someday, that doesn't mean you need a postal code today.

Infeudation answered 23/2, 2009 at 17:18 Comment(2)
Yeah but if they're going to be entering one, might as well make sure it's correct at that point. However, I agree with one of the other answers that basically says, make it validate for the countries that you think will be the majority of your customers.Buckbuckaroo
Some credit clearing houses will not accept a bill unless the zip is correct. I would rather validate the zip on input, rather than submit the charge and have it rejected.Marcheshvan
H
1

As noted elsewhere, the variation around the world is huge. And even if something matches the pattern, that does not mean it exists.

Then, of course, there are many places where postcodes are not used (e.g. much of Ireland).

Hydroquinone answered 23/2, 2009 at 17:20 Comment(1)
Actually, probably all of Ireland, as I don't think D1, D2, etc. are considered proper post codes as you can't identify an address using just this code and a street number.Accordant
B
1

There are reasons beyond shipping for having an accurate postal code. Travel agencies doing tours that cross borders (Eurozone excepted of course) need this information ahead of time to give to the authorities. Often this information is entered by an agent that may or may not be familiar with such things. ANY method that can cut down on mistakes is a Good Idea™

However, writing a regex that would cover all postal codes in the world would be insane.

Beaulahbeaulieu answered 11/5, 2009 at 20:35 Comment(1)
It is only a good idea until the code starts rejecting valid zipcodes either because it is buggy or the zipcodes have changed. Validation is something that must either be right or not there at all. At the very least there should be an override option.Speechless
C
1

Somebody was asking about list of formatting mailing addresses, and I think this is what he was looking for...

Frank's Compulsive Guide to Postal Addresses: http://www.columbia.edu/~fdc/postal/ Doesn't help much with street-level issues, however.

My work uses a couple of tools to assist with this:

  • Lexis-Nexis services, including NCOA lookups (you'll get address standardization for "free")

  • "Melissa Data" http://www.melissadata.com

Counsel answered 11/5, 2012 at 22:13 Comment(0)
S
1

This is a very simple RegEx for validating US Zipcode (not ZipCode Plus Four):

(?!([089])\1{4})\d{5}

Seems all five digit numeric are valid zipcodes except 00000, 88888 & 99999.
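
For reference, a minimal Python sketch that applies this pattern to the whole string; the function name is invented for the example. The negative lookahead rejects a code whose first digit is 0, 8 or 9 and is then repeated four more times.

```python
import re

# (?!([089])\1{4}) fails the match if the same digit 0, 8 or 9 fills all five places.
US_ZIP5 = re.compile(r"(?!([089])\1{4})\d{5}")


def is_zip5(value: str) -> bool:
    """True for any five digits except 00000, 88888 and 99999."""
    return US_ZIP5.fullmatch(value) is not None
```

Using `fullmatch` also means trailing junk such as "92122-1" is rejected rather than matched on its first five digits.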

I have tested this RegEx with http://regexpal.com/

SP

Sural answered 13/11, 2012 at 15:38 Comment(1)
This RegEx does not enforce four digits for the zip+4 portion. E.g. it considers "92122-1" a valid zip code.Lenette
V
0

If the ZIP code may contain both letters and digits (alphanumeric), the regex below can be used. It matches 5 or 9 alphanumeric characters, or 5 alphanumeric characters followed by one hyphen (-) and 4 more:

^([0-9A-Za-z]{5}|[0-9A-Za-z]{9}|(([0-9a-zA-Z]{5}-){1}[0-9a-zA-Z]{4}))$
Villegas answered 7/6, 2019 at 9:5 Comment(0)
I
0

I know this is an old question, but I stumbled across the same problem. I have invoices from over 100 countries and am trying to identify the correct creditor via the ZIP code (if every other check fails). So I wrote a short Python script that creates a pattern from a string:

class RegexPatternBuilder:
    r"""
    Builds a regex pattern out of a given string (e.g. HM452 AX2155 --> [A-Z]{2}\d{3}\s{1}[A-Z]{2}\d{4})
    """
    __is_alpha_count = 0
    __is_numeric_count = 0
    __is_whitespace_count = 0
    __pattern = ""

    # Count: which character of the string we're looking at right now
    __count = 0

    # Countries like Andorra start their ZIP with the country abbreviation: AD500
    # So first check whether the ZIP starts with the abbreviation and, if so, add it to the pattern and increase the count.
    def __init__(self, zip_string, country):
        self.__zip_string = zip_string
        self.__country = country
        if self.__zip_string.startswith(country):
            self.__pattern = f'({self.__country})'
            self.__count += len(self.__country)

    def build_regex(self):
        # Last step ;
        # Add the current alpha_numeric pattern with count
        if len(self.__zip_string) == self.__count:
            if self.__is_alpha_count:
                self.__pattern += f"[A-Z]{{{self.__is_alpha_count}}}"
            if self.__is_numeric_count:
                self.__pattern += f"\\d{{{self.__is_numeric_count}}}"
            return f'{self.__pattern}\\b'

        # Case: Whitespace
        # Check if there is a transition from letters / digits to whitespace,
        # if so --> add the pending letter or digit regex to the whole pattern with the
        # count as the number of occurrences.
        # Since there is at most 1 whitespace in a ZIP, add the whitespace regex immediately.
        # Every other case is similar to that.
        if self.__zip_string[self.__count].isspace():
            if self.__is_numeric_count:
                self.__pattern += f"\\d{{{self.__is_numeric_count}}}"
            if self.__is_alpha_count:
                self.__pattern += f"[A-Z]{{{self.__is_alpha_count}}}"
            self.__pattern += r"\s{1}"
            self.__is_whitespace_count += 1
            self.__is_alpha_count = 0
            self.__is_numeric_count = 0

        # Case: Is Alphabetic
        if self.__zip_string[self.__count].isalpha():
            if self.__is_numeric_count:
                self.__pattern += f"[0-9]{{{self.__is_numeric_count}}}"
            self.__is_whitespace_count = 0
            self.__is_alpha_count += 1
            self.__is_numeric_count = 0

        # Case: Is Numeric
        if self.__zip_string[self.__count].isnumeric():
            if self.__is_alpha_count:
                self.__pattern += f"[A-Z]{{{self.__is_alpha_count}}}"
            self.__is_whitespace_count = 0
            self.__is_alpha_count = 0
            self.__is_numeric_count += 1

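        # Case: Special Character (i.e. - )
        # Flush any pending letter/digit run first, then add the character itself,
        # escaped with a backslash so that regex metacharacters are matched literally.
        if not self.__zip_string[self.__count].isalpha() \
                and not self.__zip_string[self.__count].isnumeric() \
                and not self.__zip_string[self.__count].isspace():
            if self.__is_alpha_count:
                self.__pattern += f"[A-Z]{{{self.__is_alpha_count}}}"
            if self.__is_numeric_count:
                self.__pattern += f"\\d{{{self.__is_numeric_count}}}"
            self.__pattern += f'\\{self.__zip_string[self.__count]}{{1}}'
            self.__is_whitespace_count = 0
            self.__is_alpha_count = 0
            self.__is_numeric_count = 0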
        self.__count += 1
        return self.build_regex()

With that, I created all the possible regexes for all the ZIP codes (by country) in our historical data and wrote them back into a DB table (i.e. something like this in the end: COUNTRY:RE PATTERN:(\d{5})\b [whatever country this might be ;D])
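
For comparison, the same run-length idea can be sketched more compactly with `itertools.groupby`. This is not the code I run in production, just a minimal illustration; the function names are invented, it leaves out the country-prefix step, and it assumes the sample has already been upper-cased.

```python
import re
from itertools import groupby


def char_class(ch: str) -> str:
    """Map one character to the regex fragment that stands for its kind."""
    if ch.isalpha():
        return "[A-Z]"
    if ch.isdigit():
        return r"\d"
    if ch.isspace():
        return r"\s"
    return re.escape(ch)  # punctuation such as "-" is kept as a literal


def build_pattern(sample: str) -> str:
    """Collapse runs of the same kind of character into fragment{count}."""
    parts = []
    for fragment, run in groupby(sample, key=char_class):
        parts.append(f"{fragment}{{{len(list(run))}}}")
    return "".join(parts) + r"\b"


print(build_pattern("HM452 AX2155"))  # prints: [A-Z]{2}\d{3}\s{1}[A-Z]{2}\d{4}\b
```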

Maybe it helps someone.

Iey answered 28/1, 2022 at 7:7 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.