Mysql Datatype for US Zip (Postal Codes)
Asked Answered
S

3

17

I am writing a web application, that is US specific, so the format that other countries use for postal codes are not important. I have a list of us zip codes that i am trying to load into a database table that includes the

  • 5 digit us zip code
  • latitude
  • longitude
  • usps classification code
  • state code
  • city

the zip code is the primary key as it is what i will be querying against. i started using a medium int 5 but that truncates the zip codes that have leading zeros.

i considered using a char5 but am concerned about the performance hit of indexing against a char variable.

so my question is what is the best mysql datatype to store zip codes as?

Note: i have seen it in several other questions related to zip codes. I am only interested in US 5 digit zip codes. So there is no need to take other countries postal code formats into consideration.

Souther answered 29/9, 2009 at 17:26 Comment(9)
What's the value add in keeping the leading zeros? So long as you display the leading zeroes (.PadLeft), I would keep your database optimized as far as possible.Armory
The zeroes are important data in a zip code. Zip codes are not integers, even if they're made up entirely of numbers.Anticlastic
thats a fair point. i did a few test cases querying for 00210 etc and it returned the proper record. i guess it is more of a gut feeling about concerns for data integrity.Souther
What about the newer Zip+4 codes? Ex: 91210-3889Clomb
@ceejayoz: So long as he displays zip codes with 5 digits, when will this ever pose a problem? In case the US expands the zip code system and integrates characters?Armory
It poses a problem in having to remember to zero-pad it every time. It presents a problem in sorting. Ultimately, there are much better data types like char(5) to store this data with.Anticlastic
@Clomb as far as i know the zip+4 format only provides a more specific location fix within that zip code, and being that i dont have that precise of dataset, and the fact that the precision of the current zip meets my requirements i am not worried about it.Souther
@ceejayoz, sorting a numeric always works the same as a zero padded number stored in a char, or am I missing something?Pome
Does anyone know if there's a MySQL zipcode radius database that my client can purchase in order to do radial/locus searches?Prochronism
D
30

char(5) is the correct way to go. String indexing is quite fast, particularly when it is such a small data set.

You are correct in that you should never use an integer for a zip code, since it isn't truly numeric data.

Edit to add: Check out this for good reasons why you don't use numbers for non-numerically important data: Is it a good idea to use an integer column for storing US ZIP codes in a database?

Dated answered 29/9, 2009 at 17:30 Comment(6)
Erich: Why not an integer? I would think that storing as an integer would help with type checking, you can add leading zeroes in the client, a character can be a non-digit ... Just wondering what point I'm missing.Fatherly
The data set could be millions. Most likely there will be other addresses in the system where a zip is stored, and they should all be the same typePome
Zip Codes are not numeric data. Numeric data is data in which it makes sense to do mathematical operations on, which you would never do with a Zip Code. Otherwise, you are unfairly limiting your dataset. As far as the data set size, if the Zip code is truly the PK, there are only 99,999 values, a very small dataset relatively.Dated
if you have 10 million customers with zip codes that is a big datasetPome
KM, 50 megabytes is not a big data set. An index with 10 million keys in it is relatively small in today's data management.Wording
John: Integer should not be used, because the zipcode "00501" will be stored as "501" which may create problems.Epilate
P
2

go with your medium INT(5) ZEROFILL, it should add the leading zeros for you. No need to impact the index and performance on a formatting issue.

Pome answered 29/9, 2009 at 17:32 Comment(1)
as a char 5 it took .0007 seconds as medint5 zerofill it took .0006 seconds. i think both are valid solutions, but i think i am going to go with char5 and take the slight performance hit for a little more peace of mind on the data integrity front.Souther
C
1

If he makes it Char(6), then he can handle Canadian postal codes as well.

When you consider that there is a maximum of 100,000 5-digit Zip Code and how little space it would take up even if you made the entire table memory-resident, there's no reason not to.

Companionate answered 29/9, 2009 at 17:33 Comment(1)
I saw that from the original post. I just figured I'd mention it in case anyone else looked at this question in the future looking for advice but had a situation where foreign postal codes MIGHT make a difference. My main point was that, in the age of gigabyte memory sticks, a zip code table is pretty small. (I dealt with these when memory was measured in KILObytes)Companionate

© 2022 - 2024 — McMap. All rights reserved.