Column type for ZipCode in PostgreSQL database?
Asked Answered
L

3

9

What is the correct column type for holding ZipCode values in PostgreSQL database?

Lur answered 24/3, 2011 at 14:46 Comment(0)
G
8

It is something like xxxxx-xxxx, so varchar(10) is recommended.

If you want to check the syntax of the values in the database, you could create a domain type for zip codes.

CREATE DOMAIN zipcode varchar(10) 
    CONSTRAINT valid_zipcode 
    CHECK (VALUE ~ '[A-Z0-9-]+'); -- or a better regular expression

You could have a look at this site, which proposes this regex:

(^\d{5}(-\d{4})?$)|(^[ABCEGHJKLMNPRSTVXY]{1}\d{1}[A-Z]{1} *\d{1}[A-Z]{1}\d{1}$)

But you should check it works for the PostgreSQL regex syntax.

Groth answered 24/3, 2011 at 14:51 Comment(5)
Keep in mind that zip coes for different countries look different. If there the OP's schema has non US codes, than you can't use such a pattern.Mirabella
i'm n00b in PostgreSQL, and CREATE DOMAIN it's same with CREATE TABLE?,Lur
No. You define a domain, and afterwards you can use "zipcode" instead of varchar(10). You essentially define a new datatype, that brings its own validation.Groth
Dear downvoter, would you like to explain what's wrong with the answer?Groth
Oh, and just in case: The OP asked for ZIP-Codes and these have a definable pattern. Other countries just do not have ZIP codes (but for example Postleitzahlen in Germany). To get an idea how to really store an address in a country independant manner try to have a look at the address type in the Google Maps API...Groth
W
22

I strongly disagree with the advice presented here.

  1. The accepted answer accepts things that aren't digits.
  2. The question is about Zip Codes, not postal codes.
  3. If we assume the post is wrong and means international postal codes, there are characters that appear in international postal codes that don't appear in that list, and many international - and also the US domestic - postal codes can be over ten characters
  4. If we actually answer the question they asked, about zip codes, then there should be no accomodation for anything but digits (and arguably the hyphen)
  5. US zip codes can be up to 11 digits long (13 characters counting the two dashes) - there is a zip, a zip+4, and a zip+6 (which programmers would call zip+4+2) notation; the last is used by skyscrapers, universities, et cetera
  6. US zip codes are always non-negative integers, and therefore should not be stored as text data, which is subject to non-canon representation problems (ask anyone who's done a system about that time they found out that their zip 00203 didn't match the zip 203 that they accidentally got when constantly unnecessarily parsing string representations)
  7. If you pretend you're actually tracking international post codes, the short character sequence limited text fields here don't even begin to do the job. The word "China" comes to mind.

My opinon:

  1. Decide whether you're actually handling US postal codes or international
  2. If you're handling US postal codes, track them as unsigned integers, and left-pad them with zeros when text representing them. (Think unix timestamps and local TZ representations if you need to understand why this will be simpler in the long run.)
  3. If you're handling international post codes, store them in an unbounded unicode string, tie them to the country they represent, and validate country by country with check constraints. This problem is far more difficult than it sounds up front. International addresses are some of the least standardized things on Earth. Wait'll you find out how Japanese house numbers work, or why the British postal 6-code has the gaps it has.
Washwoman answered 9/9, 2014 at 23:8 Comment(1)
+1 This should be the accepted answer. It adds essential details that should be used in a finished implementation.Lutyens
G
8

It is something like xxxxx-xxxx, so varchar(10) is recommended.

If you want to check the syntax of the values in the database, you could create a domain type for zip codes.

CREATE DOMAIN zipcode varchar(10) 
    CONSTRAINT valid_zipcode 
    CHECK (VALUE ~ '[A-Z0-9-]+'); -- or a better regular expression

You could have a look at this site, which proposes this regex:

(^\d{5}(-\d{4})?$)|(^[ABCEGHJKLMNPRSTVXY]{1}\d{1}[A-Z]{1} *\d{1}[A-Z]{1}\d{1}$)

But you should check it works for the PostgreSQL regex syntax.

Groth answered 24/3, 2011 at 14:51 Comment(5)
Keep in mind that zip coes for different countries look different. If there the OP's schema has non US codes, than you can't use such a pattern.Mirabella
i'm n00b in PostgreSQL, and CREATE DOMAIN it's same with CREATE TABLE?,Lur
No. You define a domain, and afterwards you can use "zipcode" instead of varchar(10). You essentially define a new datatype, that brings its own validation.Groth
Dear downvoter, would you like to explain what's wrong with the answer?Groth
Oh, and just in case: The OP asked for ZIP-Codes and these have a definable pattern. Other countries just do not have ZIP codes (but for example Postleitzahlen in Germany). To get an idea how to really store an address in a country independant manner try to have a look at the address type in the Google Maps API...Groth
B
-2

it depends on what kind of zip you want. if you're sure you will only need to store the standard 5 digit then use an int will be the most space saving.

however if you need to do the 5+4 extended then a 10 digit character field is best. I personally suggest that as it does make it easier in the future if you end up needing to store international postal codes 10 digits covers just about every possible postal code format i've come across.

Brumley answered 24/3, 2011 at 15:11 Comment(2)
Zip codes can have leading zeros so using an int for a zip code is a bad idea. Zip codes are not numbers, they're strings that just happen to be made up of numbers (and sometimes a hyphen).Aba
int(5) with ZEROFILL property to make 5 into 00005.Rostock

© 2022 - 2024 — McMap. All rights reserved.