Extracting a zip code from an address string
8

I have some full addresses, for example:

$addr1 = "5285 KEYES DR  KALAMAZOO MI 49004 2613";
$addr2 = "PO BOX 35  COLFAX LA 71417 35";
$addr3 = "64938 MAGNOLIA LN APT B PINEVILLE LA 71360-9781";

I need to get the 5-digit zip code out of the string. How can I do that? Perhaps with RegEx?

An acceptable answer may assume that an address can contain several 5-digit numbers, but the ZIP code will always be the last run of exactly 5 consecutive digits.

My idea was to use explode then loop through and check each index. Anyone got a better idea?

Any help is greatly appreciated.

Exploiter answered 12/1, 2014 at 12:51 Comment(2)
What have you tried so far? This is a reasonably simple task and a good opportunity to learn regex. - Schedule
My idea was to use explode then loop through and check each index. - Exploiter
16

US ZIP codes are normally preceded by a two-letter state code, so you can anchor on that to find the ZIP code with the following regex:

/\b[A-Z]{2}\s+\d{5}(-\d{4})?\b/

Explanation:

\b         # word boundary
[A-Z]{2}   # two letter state code
\s+        # whitespace
\d{5}      # five digit zip
(-\d{4})?  # optional zip extension
\b         # word boundary
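The same explanation can live inside the pattern itself by using PCRE's extended mode (the `x` modifier), which ignores whitespace in the pattern and treats `#` as the start of a comment. This is only a sketch: the `$pattern` variable and the capture group around the five digits are additions for illustration, not part of the regex above.

```php
<?php
// Extended mode (x): whitespace in the pattern is ignored and "#" starts a comment.
$pattern = '/
    \b            # word boundary
    [A-Z]{2}      # two letter state code
    \s+           # whitespace
    (\d{5})       # five digit zip, captured as group 1 (illustrative addition)
    (?:-\d{4})?   # optional zip extension
    \b            # word boundary
/x';

$addr = "64938 MAGNOLIA LN APT B PINEVILLE LA 71360-9781";
if (preg_match($pattern, $addr, $matches)) {
    echo $matches[0], "\n"; // LA 71360-9781
    echo $matches[1], "\n"; // 71360
}
```

Capturing the digits separately is one way to get the bare 5-digit ZIP without stripping the state code afterwards.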

Online Example

Using it in your PHP:

$addr1 = "5285 KEYES DR  KALAMAZOO MI 49004 2613";
$addr2 = "PO BOX 35  COLFAX LA 71417 35";
$addr3 = "64938 MAGNOLIA LN APT B PINEVILLE LA 71360-9781";

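function extract_zipcode($address) {
    // preg_match() returns 1 on a match and fills $matches; 0 means no "STATE ZIP" pair was found.
    if (!preg_match("/\b[A-Z]{2}\s+\d{5}(-\d{4})?\b/", $address, $matches)) {
        return null;
    }
    return $matches[0];
}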

echo extract_zipcode($addr1); // MI 49004
echo extract_zipcode($addr2); // LA 71417
echo extract_zipcode($addr3); // LA 71360-9781

Online Example

EDIT 1:

To make the function more flexible, you can add a parameter that controls whether the state code is kept:

function extract_zipcode($address, $remove_statecode = false) {
    // Group 1 captures the ZIP (plus optional extension) without the state code.
    if (!preg_match("/\b[A-Z]{2}\s+(\d{5}(?:-\d{4})?)\b/", $address, $matches)) {
        return null; // no "STATE ZIP" pair found
    }
    return $remove_statecode ? $matches[1] : $matches[0];
}

echo extract_zipcode($addr1, true); // 49004 (without state code)
echo extract_zipcode($addr2);       // LA 71417 (with state code)

Online Example

Concentrated answered 12/1, 2014 at 13:1 Comment(1)
Failed for 10024, New York - Skyward
1
$addr = "U Square, The Park,  On NH-39,  Village- Kupa, Taluka- Bhiwandi,  District Thane 421101, test test, 454564";

// preg_match() stops at the first match, so this finds the first standalone 6-digit number, i.e. the Indian PIN code.
if (preg_match("/\b\d{6}\b/", $addr, $matches)) {
    print_r($matches[0]); // 421101
}
Desultory answered 31/10, 2018 at 7:18 Comment(1)
You should never post only code as an answer. An explanation is always more than welcome. Moreover, what does your answer add that others haven't already pointed out? Keep all this in mind the next time you answer someone. That said, I encourage you to continue helping others! Cheers. - Rameau
0

Well, the problem here is that an address does not have to have a 5-digit zip code; there are addresses with only 4 digits. Assuming all of your addresses have 5-digit zip codes, you could of course use a RegEx.

Have a look here, maybe this will help you:

Regex Expression to Find 5-Digit Code

Gorgias answered 12/1, 2014 at 12:54 Comment(0)
0

If the last one is always the zip code and they all have 5 digit numbers, you can use something like this:

function getZipCode($address) {
    // Note: preg_match() stops at the first 5-digit run it finds.
    $ok = preg_match("/(\d\d\d\d\d)/", $address, $matches);
    if (!$ok) {
        // This address doesn't have a ZIP code
        return null;
    }
    return $matches[count($matches) - 1];
}
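Because `preg_match()` reports only the first match, taking the last 5-digit number needs `preg_match_all()`. Here is a minimal sketch of that variant, assuming the ZIP code is the last standalone run of exactly five digits. The function name `getLastZipCode` is hypothetical.

```php
<?php
// Hypothetical helper: returns the last standalone 5-digit number, or null.
function getLastZipCode($address) {
    // \b keeps 4-digit extensions and longer digit runs from matching.
    if (!preg_match_all('/\b\d{5}\b/', $address, $matches)) {
        return null; // no 5-digit number found
    }
    // $matches[0] lists every full match in order; take the final one.
    return end($matches[0]);
}

echo getLastZipCode("64938 MAGNOLIA LN APT B PINEVILLE LA 71360-9781"), "\n"; // 71360
echo getLastZipCode("5285 KEYES DR  KALAMAZOO MI 49004 2613"), "\n";          // 49004
```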
Roar answered 12/1, 2014 at 12:55 Comment(1)
This will cause a false positive with $addr3, and your regex in its current state can be optimised slightly. This regex might work better: [A-Z]{2} (\d{5}) (although I don't know the US address system very well). - Schedule
0

I would look for all numbers with 4 or 5 digits and take the last match.

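// preg_match_all() collects every match; the pattern comes first, then the subject.
preg_match_all('/\d{4,5}/', $addr, $matches);
// Last match, or null if there is none. Note that a trailing 4-digit ZIP+4 extension also matches.
$result = $matches[0] ? end($matches[0]) : null;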
Margartmargate answered 12/1, 2014 at 12:57 Comment(0)
0

Well, this regex will return the last run of exactly five consecutive digits. It uses a negative look-ahead to ensure that no other 5-digit number follows the one being returned:

\b\d{5}\b(?!.*\b\d{5}\b)

so, perhaps:

if (preg_match('/\b\d{5}\b(?!.*\b\d{5}\b)/', $subject, $regs)) {
    $result = $regs[0];
} else {
    $result = "";
}
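As a quick check, the look-ahead pattern can be run over the three addresses from the question. This is only a sketch; the `$addresses` and `$results` arrays are illustrative names.

```php
<?php
$addresses = [
    "5285 KEYES DR  KALAMAZOO MI 49004 2613",
    "PO BOX 35  COLFAX LA 71417 35",
    "64938 MAGNOLIA LN APT B PINEVILLE LA 71360-9781",
];

$results = [];
foreach ($addresses as $subject) {
    // The look-ahead rejects any 5-digit number that has another one after it.
    if (preg_match('/\b\d{5}\b(?!.*\b\d{5}\b)/', $subject, $regs)) {
        $results[] = $regs[0];
    } else {
        $results[] = "";
    }
}

print_r($results); // 49004, 71417, 71360
```

In the third address the leading house number 64938 is skipped because 71360 follows it, while the 4-digit extension never matches `\d{5}` between word boundaries.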
Splice answered 12/1, 2014 at 13:47 Comment(0)
0

Careful, parsing addresses is hard. A lot of these answers make precarious assumptions: mainly, that addresses are a regular language. They are not.

Unless your (US) addresses are guaranteed to be in a particular, standardized format (in which case, a regex might work, just for the ZIP code), you might want to try an API like LiveAddress (I work at SmartyStreets). APIs like this will parse the address for you, returning the components, and also verify it. (By the way, it appears that a few of the addresses you provided are invalid, as in, the USPS does not recognize them.)

Emersonemery answered 12/1, 2014 at 14:14 Comment(0)
0
var zipCode = vm.propertyAddress.match(/\d{5}(-\d{4})?\b/g);

Address: 8585 Summerdale rd Apt-175 SanDiego 92126 CA
Result: 92126

This also works if only the ZIP code is provided.

Shedd answered 18/3, 2016 at 17:34 Comment(0)
