Split address and numbers

I'm trying to split street name, house number, and box number from a String.
Let's say the string is "SomeStreet 59A"
For this case I already have a solution with regex. I'm using this function:

address.split(/([0-9]+)/) // output: ["SomeStreet ", "59", "A"] (the street keeps a trailing space)

The problem I'm having now is that some addresses have unusual formats, so the above method does not work for strings like:

"Somestreet 59-65" // output: ["Somestreet ", "59", "-", "65", ""] Not good

My question for this case is, how to group the numbers to get this desired output:

["Somestreet", "59-65"]

Another weird example is:

"6' SomeStreet 59" // here "6' SomeStreet" is the exact street name.

Expected output: ["6' SomeStreet", "59"]

"6' Somestreet 324/326 A/1" // Example with box number

Expected output: ["6' Somestreet", "324/326", "A/1"]

Bear in mind that this has to be in one executable function to loop through all of the addresses that I have.

Laudatory answered 30/3, 2021 at 8:17 Comment(2)
There are so many different forms of street addresses that trying to come up with a simple function to parse them is futile. - Jolee
Try .split(/\s*(\d+(?!['’\d])(?:-\d+)?)/) (see demo) if all acceptable formats are those you listed in the question. - Technical

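To support all string formats listed in the question, you can use the lazy variant, which takes the first whitespace-preceded number as the house number:

.match(/^(.*?)\s+(\d+(?:[-.\/]\d+)?)(?:\s*(\S.*))?$/)

or the greedy variant, which takes the last such number:

.match(/^(.*)\s+(\d+(?:[-.\/]\d+)?)(?:\s*(\S.*))?$/)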

See the regex demo.

Details:

  • ^ - start of string
  • (.*?) - Group 1: any zero or more chars other than line break chars, as few as possible (if you need the last number in the string to be matched as Group 2, the Number, use the greedy variant, .*)
  • \s+ - one or more whitespaces
  • (\d+(?:[-.\/]\d+)?) - Group 2: one or more digits, optionally followed by -, . or / and one or more digits
  • (?:\s*(\S.*))? - an optional occurrence of zero or more whitespaces followed by Group 3: a non-whitespace char and the rest of the string
  • $ - end of string.
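
The lazy and greedy variants only differ when an address contains more than one candidate number. A minimal sketch of that difference, using a made-up address (`Route 66 12 B` is an assumption for illustration and does not come from the question):

```javascript
// Lazy Group 1 stops at the first whitespace-preceded number;
// greedy Group 1 backtracks from the end and stops at the last one.
const lazyRx = /^(.*?)\s+(\d+(?:[-.\/]\d+)?)(?:\s*(\S.*))?$/;
const greedyRx = /^(.*)\s+(\d+(?:[-.\/]\d+)?)(?:\s*(\S.*))?$/;

// Hypothetical address with two standalone numbers.
const sample = 'Route 66 12 B';

// slice(1) drops the full match and keeps [street, number, box].
const lazyParts = sample.match(lazyRx).slice(1);
const greedyParts = sample.match(greedyRx).slice(1);

console.log(lazyParts);   // [ 'Route', '66', '12 B' ]
console.log(greedyParts); // [ 'Route 66', '12', 'B' ]
```

Which variant fits depends on whether numbers can appear inside street names in your data.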

See a JavaScript demo:

const texts = ['SomeStreet 59A','Somestreet 59-65',"6' SomeStreet 59", 'Somestreet 1.1', 'Somestreet 65 A/1', "6' Somestreet 324/326 A/1"];
const rx = /^(.*?)\s+(\d+(?:[-.\/]\d+)?)(?:\s*(\S.*))?$/;
for (const text of texts) {
  const [_, street, number, box] = text.match(rx);
  console.log(text, '=>', {"Street":street, "Number":number, "Box":box});
}
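
The question asks for a single function that can be looped over all addresses and returns arrays. One possible sketch is below; `splitAddress` is a hypothetical helper name, and returning the unsplit input when nothing matches is an assumed fallback, not something the question specifies:

```javascript
// Assumption: the lazy pattern from this answer; swap in the greedy one if needed.
const addressRx = /^(.*?)\s+(\d+(?:[-.\/]\d+)?)(?:\s*(\S.*))?$/;

// Hypothetical helper: returns [street, number] or [street, number, box].
// If the pattern does not match, the input is returned unsplit (assumed fallback).
function splitAddress(address) {
  const m = address.match(addressRx);
  if (m === null) return [address];
  // Drop the full match (index 0) and any group that did not participate.
  return m.slice(1).filter((part) => part !== undefined);
}

const addresses = ['SomeStreet 59A', 'Somestreet 59-65', "6' Somestreet 324/326 A/1", 'NoNumber'];
for (const address of addresses) {
  console.log(address, '=>', splitAddress(address));
}
```

The null check matters because `match` returns `null` for strings the pattern does not cover, and destructuring `null` directly would throw.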
Dunford answered 30/3, 2021 at 8:24 Comment(9)
Thanks! One more problem I'm having is with "Somestreet 1.1". I've tried doing the following with your code: \s*(\d+(?!['\d])(?:-\d+)?)(?:.\d)? but the output of the number is 1 instead of 1.1. - Nettle
@Laudatory No, it should be /\s*(\d+(?!['’\d])(?:[-.]\d+)?)/; where [-.] stands for a hyphen or dot. - Technical
Okay, so one last time I'm bothering you ^^. I have a case like this: "Somestreet 65 A/1" // output --> ["Somestreet", "65", "A/"] - Nettle
@Laudatory This is no longer clear. Do you want to remove the final number? Or a number after / at the end of string? Try /\s*(?:(?<=\/)\d+$|(\d+(?!['’\d])(?:[-.]\d+)?))/, see demo. - Technical
I'd like to keep the "1" after the "A/" because in this case it is part of the box number. - Nettle
@Laudatory It is not clear from your comment. Try /\s*(?<!\/)(\d+(?!['’\d])(?:[-.]\d+)?)/, see this demo. - Technical
Let us continue this discussion in chat. - Nettle
Why are there 2 regexes? Which one should we use? - Mallet
@JimmyKane It depends on what kind of input you have, cf. the lazy version with the greedy one. - Technical

If you don't mind a bit of string trimming afterwards, here's a solution:

.split(/(?= \d|\D+$)/)

or, to also account for cases like "65 A/1" or "324/326 A/1":

.split(/(?= \d|\D+$|(?<!\D) )/)

Regex101.com demo

[
  "Some Street 59A",
  "Some Street 59-69",
  "Some Street 1.1",
  "6' Street 45b",
  "6' Some street 324/326 A/1",
  "Some Street 65 A/1",
  "42th Stack ave. 59-69",
].forEach(str => console.log( str.split(/(?= \d|\D+$|(?<!\D) )/) ));
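
Because the split points are lookarounds, the space before each number or box stays attached to the following part. A minimal sketch of the trimming step mentioned above (`splitAndTrim` is a hypothetical helper name; the lookbehind assumes an engine with ES2018 support):

```javascript
// Split with the lookaround pattern, then trim the leading space
// that each lookahead leaves at the start of the following part.
const splitRx = /(?= \d|\D+$|(?<!\D) )/;

function splitAndTrim(str) {
  return str.split(splitRx).map((part) => part.trim());
}

console.log(splitAndTrim('Some Street 65 A/1'));         // [ 'Some Street', '65', 'A/1' ]
console.log(splitAndTrim("6' Some street 324/326 A/1")); // [ "6' Some street", '324/326', 'A/1' ]
```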

If you want to keep a number with a letter suffix, e.g. 59A, as a whole, here's another simple solution:

.split(/(?= \d| [\w\d/]+$)/);

Regex101.com demo

[
  "Some Street 59A",
  "Some Street 59-69",
  "Some Street 1.1",
  "6' Street 45b",
  "6' Some street 324/326 A/1",
  "Some Street 65 A/1",
  "42th Stack ave. 59-69",
].forEach(str => console.log( str.split(/(?= \d| [\w\d/]+$)/) ));
Equivoque answered 30/3, 2021 at 8:49 Comment(0)
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <!-- Template for matching the root element -->
        <xsl:template match="/">
    <!-- Call the split-address template with the full address -->
            <xsl:call-template name="split-address">
                <xsl:with-param name="address" select="/root/address" />
            </xsl:call-template>
        </xsl:template>

    <!-- Template to split the address into lines -->
        <xsl:template name="split-address">
                <xsl:param name="address" />
                <xsl:choose>
    <!-- If the length of the address is less than or equal to 35 characters, output it directly -->
                    <xsl:when test="string-length($address) &lt;= 35">
                        <xsl:value-of select="$address" />
                    </xsl:when>
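      <!-- Otherwise, split at the last space within the first 36 characters -->
                    <xsl:otherwise>
        <!-- 1-based position of that last space, or 0 if there is none -->
                        <xsl:variable name="last">
                            <xsl:call-template name="last-space">
                                <xsl:with-param name="text" select="substring($address, 1, 36)" />
                            </xsl:call-template>
                        </xsl:variable>
        <!-- Line length: up to the space, or a hard break at 35 when no space exists -->
                        <xsl:variable name="split-pos">
                            <xsl:choose>
                                <xsl:when test="$last &gt; 0"><xsl:value-of select="$last - 1" /></xsl:when>
                                <xsl:otherwise>35</xsl:otherwise>
                            </xsl:choose>
                        </xsl:variable>
        <!-- Output the first part of the address -->
                        <xsl:value-of select="substring($address, 1, $split-pos)" />
        <!-- Add a line break -->
                        <xsl:text>&#10;</xsl:text>
        <!-- Recurse on the remainder, skipping the space that was split on -->
                        <xsl:call-template name="split-address">
                            <xsl:with-param name="address" select="substring($address, $split-pos + 1 + number($last &gt; 0))" />
                        </xsl:call-template>
                    </xsl:otherwise>
                </xsl:choose>
        </xsl:template>

    <!-- Helper: return the 1-based position of the last space in $text, or 0 if none -->
        <xsl:template name="last-space">
                <xsl:param name="text" />
                <xsl:param name="pos" select="0" />
                <xsl:choose>
                    <xsl:when test="contains($text, ' ')">
                        <xsl:call-template name="last-space">
                            <xsl:with-param name="text" select="substring-after($text, ' ')" />
                            <xsl:with-param name="pos" select="$pos + string-length(substring-before($text, ' ')) + 1" />
                        </xsl:call-template>
                    </xsl:when>
                    <xsl:otherwise>
                        <xsl:value-of select="$pos" />
                    </xsl:otherwise>
                </xsl:choose>
        </xsl:template>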
</xsl:stylesheet>

Piggish answered 27/9 at 16:54 Comment(1)
Your answer could be improved with additional supporting information. Please edit to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers in the help center. - Lagena
