How to match PHP's explode(';',$s,3) to s.split(';',3) in JavaScript?

If you run an explode in PHP with the resulting array length limited, it will append the remainder of the string to the last element. This is how exploding a string should behave, since nowhere in the split am I saying that I want to discard my data, just split it. This is how it works in PHP:

# Name;Date;Quote
$s = 'Mark Twain;1879-11-14;"We haven\'t all had the good fortune to be ladies; we haven\'t all been generals, or poets, or statesmen; but when the toast works down to the babies, we stand on common ground."';
$a = explode(';',$s,3);
var_dump($a);

array(3) {
  [0]=>
  string(10) "Mark Twain"
  [1]=>
  string(10) "1879-11-14"
  [2]=>
  string(177) ""We haven't all had the good fortune to be ladies; we haven't all been generals, or poets, or statesmen; but when the toast works down to the babies, we stand on common ground.""
}

However, if you run the same code in JavaScript:

> var s = 'Mark Twain;1879-11-14;"We haven\'t all had the good fortune to be ladies; we haven\'t all been generals, or poets, or statesmen; but when the toast works down to the babies, we stand on common ground."'
undefined
> var a = s.split(';',3);
undefined
> a
[ 'Mark Twain',
  '1879-11-14',
  '"We haven\'t all had the good fortune to be ladies' ]

This makes absolutely no sense, because the whole point of splitting a string is to treat the final portion of the string as a literal, instead of delimited. JavaScript's split with a limit is the exact same as:

# In PHP
$a = array_slice(explode(';',$s), 0, 3);
// Or in JavaScript
var a = s.split(';').slice(0, 3);

If a JavaScript user only wanted the first two elements of this array, it wouldn't matter how the remainder of the string is handled: the first two elements have the same value either way. The only element that changes is the last element of the split array.
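
The equivalence claimed above is easy to check. A minimal sketch, using a made-up sample string rather than the real data:

```javascript
// Illustrative only: the sample string is invented for this check.
var sample = 'a;b;c;d;e';

// split with a limit stops collecting after 3 pieces and drops the rest...
var limited = sample.split(';', 3);

// ...which matches splitting everything and slicing off the first 3.
var sliced = sample.split(';').slice(0, 3);

console.log(limited); // [ 'a', 'b', 'c' ]
console.log(sliced);  // [ 'a', 'b', 'c' ]
```

In both cases the trailing 'd;e' is lost, which is the behavior the question objects to.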

If the native split with limit method in JavaScript can be replicated using a slice, then what value does it provide?

But I digress. What is the most efficient way to replicate PHP's explode functionality in JavaScript? Removing each element as a substring until the last element is reached; splitting the entire string and then concatenating the remaining elements; finding the location of the (n - 1)th delimiter and taking a substring from there; or some other solution I haven't thought of?

Constitute answered 31/10, 2018 at 15:0 Comment(3)
"This makes absolute no sense, because the whole point of splitting a string is to treat the final portion of the string as a literal, instead of delimited" I could equally say that returning the rest of the string makes no sense: if I get CSV data and want only the first three columns, why would I care about the rest? Why would I need to care whether there are others and how to filter them? The .slice would still require generating the entire array and then discarding part of it. Processing a CSV file would then mean a lot of useless GC runs.Murillo
This seems like you ought to be using JSON...Bligh
@vlaz I don't see what the point of quoting me was since it would be clear without the quote what you were talking about. I see the point with only fetching a few elements from a CSV, but probabilistically-speaking I believe it's less likely that you'd have to process a denormalized CSV and that the efficiency in excluding those values would be negligible.Constitute

Alright, I created 4 alternative versions of the PHP split string algorithm, along with the two provided by @hanshenrik, and did a basic benchmark on them:

function explode1(delimiter, str, limit) {
    if (limit == null) {
        return str.split(delimiter);
    }
    var a = [];
    var lastIndex = -1;
    var index = 0;
    for (var i = 0; i < limit; i++) {
        index = str.indexOf(delimiter, lastIndex + 1);
        if (i == limit - 1) {
            a.push(str.substring(lastIndex + 1));
        } else {
            a.push(str.substring(lastIndex + 1, index));
        }
        lastIndex = index;
    }
    return a;
}

function explode2(delimiter, str, limit) {
    if (limit == null) {
        return str.split(delimiter);
    }
    var a = str.split(delimiter);
    var ret = a.slice(0, limit - 1);
    ret.push(a.slice(limit - 1).join(delimiter));
    return ret;
}

function explode3(delimiter, str, limit) {
    if (limit == null) {
        return str.split(delimiter);
    }
    var a = str.split(delimiter, limit - 1);
    var index = -1; // start before the string so a leading delimiter is found
    for (var i = 0; i < limit - 1; i++) {
        index = str.indexOf(delimiter, index + 1);
    }
    a.push(str.substring(index + 1));
    return a;
}

function explode4(delimiter, str, limit) {
    if (limit == null) {
        return str.split(delimiter);
    }
    var a = str.split(delimiter, limit - 1);
    a.push(str.substring(a.join(delimiter).length + 1));
    return a;
}

function explode5(delimiter, string, limit) {
    //  discuss at: http://locutus.io/php/explode/
    // original by: Kevin van Zonneveld (http://kvz.io)
    //   example 1: explode(' ', 'Kevin van Zonneveld')
    //   returns 1: [ 'Kevin', 'van', 'Zonneveld' ]

    if (arguments.length < 2 ||
        typeof delimiter === 'undefined' ||
        typeof string === 'undefined') {
        return null
    }
    if (delimiter === '' ||
        delimiter === false ||
        delimiter === null) {
        return false
    }
    if (typeof delimiter === 'function' ||
        typeof delimiter === 'object' ||
        typeof string === 'function' ||
        typeof string === 'object') {
        return {
            0: ''
        }
    }
    if (delimiter === true) {
        delimiter = '1'
    }

    // Here we go...
    delimiter += ''
    string += ''

    var s = string.split(delimiter)

    if (typeof limit === 'undefined') return s

    // Support for limit
    if (limit === 0) limit = 1

    // Positive limit
    if (limit > 0) {
        if (limit >= s.length) {
            return s
        }
        return s
            .slice(0, limit - 1)
            .concat([s.slice(limit - 1)
                .join(delimiter)
            ])
    }

    // Negative limit
    if (-limit >= s.length) {
        return []
    }

    s.splice(s.length + limit)
    return s
}

function explode6(delimiter, string, limit) {
        var spl = string.split(delimiter);
        if (spl.length <= limit) {
                return spl;
        }
        var ret = [],i=0;
        for (; i < limit; ++i) {
                ret.push(spl[i]);
        }
        for (; i < spl.length; ++i) {
                ret[limit - 1] += delimiter+spl[i];
        }
        return ret;
}

var s = 'Mark Twain,1879-11-14,"We haven\'t all had the good fortune to be ladies; we haven\'t all been generals, or poets, or statesmen; but when the toast works down to the babies, we stand on common ground."'
console.log(s);

console.time('explode1');
var a1 = explode1(',', s, 3);
//console.log(a1);
console.timeEnd('explode1');

console.time('explode2');
var a2 = explode2(',', s, 3);
//console.log(a2);
console.timeEnd('explode2');

console.time('explode3');
var a3 = explode3(',', s, 3);
//console.log(a3);
console.timeEnd('explode3');

console.time('explode4');
var a4 = explode4(',', s, 3);
//console.log(a4);
console.timeEnd('explode4');

console.time('explode5');
var a5 = explode5(',', s, 3);
//console.log(a5);
console.timeEnd('explode5');

console.time('explode6');
var a6 = explode6(',', s, 3);
//console.log(a6);
console.timeEnd('explode6');

The two best-performing algorithms were explode4, principally, and explode3, a close second across multiple iterations of the benchmark:

$ node explode1.js && node explode2.js && node explode3.js && node explode4.js && node explode5.js && node explode6.js
explode1: 0.200ms
explode2: 0.194ms
explode3: 0.147ms
explode4: 0.183ms
explode5: 0.341ms
explode6: 0.162ms

You can run your own benchmarks, but in my initial tests, splitting the string into n - 1 parts and then deriving the remainder's index from the length of the rejoined parts (explode4) was the fastest algorithm matching explode in PHP.
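
The "split n - 1 parts, then take the remainder" idea can be sketched a bit more defensively. The helper below is hypothetical (explodeLimit is not one of the benchmarked functions). It assumes a non-empty string delimiter and a positive limit, and it does not reproduce PHP's negative-limit behavior:

```javascript
// Hypothetical helper, not one of the benchmarked functions.
// Assumes a non-empty string delimiter; negative limits are not handled like PHP.
function explodeLimit(delimiter, str, limit) {
    if (limit <= 1) {
        // PHP treats a limit of 0 as 1: the whole string is the only element.
        return [str];
    }
    var head = str.split(delimiter, limit - 1);
    // Characters covered by the head pieces plus the delimiters between them.
    var consumed = head.join(delimiter).length;
    if (consumed === str.length) {
        // Nothing left over: the string had fewer than `limit` pieces.
        return head;
    }
    // Skip the delimiter that follows the head, whatever its length.
    head.push(str.substring(consumed + delimiter.length));
    return head;
}

console.log(explodeLimit(';', 'a;b;c;d', 3));     // [ 'a', 'b', 'c;d' ]
console.log(explodeLimit('::', 'a::b::c::d', 3)); // [ 'a', 'b', 'c::d' ]
console.log(explodeLimit(';', 'a;b', 3));         // [ 'a', 'b' ]
```

Compared with explode4, this skips delimiter.length characters instead of 1 and avoids pushing an empty trailing element when the input has fewer pieces than the limit.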

EDIT: It turns out that the garbage collector biased how each successive function was measured, so I split them off into their own individual files and re-ran the benchmarking a few times. It seems explode3 is the best performing, not explode4, but I won't make a decision that I'm not completely sure of.
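
A single console.time measurement per function times one cold call, so the numbers above are noisy. A rough sketch of a steadier measurement, assuming Node.js (process.hrtime.bigint is a Node API); the bench helper and the sample workload are hypothetical:

```javascript
// Hypothetical helper: returns the mean time per call in nanoseconds.
function bench(fn, iterations) {
    // Warm-up calls so the JIT can optimize before timing starts.
    for (var w = 0; w < 1000; w++) {
        fn();
    }
    var start = process.hrtime.bigint();
    for (var i = 0; i < iterations; i++) {
        fn();
    }
    var elapsed = process.hrtime.bigint() - start;
    return Number(elapsed) / iterations;
}

// Sample workload for illustration; swap in any of the explode variants.
var sample = 'Mark Twain,1879-11-14,"a, b, c"';
var nsPerCall = bench(function () {
    return sample.split(',').slice(0, 3);
}, 100000);

console.log('split+slice: ' + nsPerCall.toFixed(1) + ' ns/call');
```

Averaging over many iterations reduces the effect of a stray garbage-collection pause, although the engine may still optimize away work whose result is unused, so treat the output as indicative only.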

Constitute answered 5/11, 2018 at 20:51 Comment(0)

Locutus.io has you covered: they ported PHP's explode, and a great number of other PHP functions, to JavaScript.

usage:

$s = 'Mark Twain;1879-11-14;"We haven\'t all had the good fortune to be ladies; we haven\'t all been generals, or poets, or statesmen; but when the toast works down to the babies, we stand on common ground."';
"Mark Twain;1879-11-14;"We haven't all had the good fortune to be ladies; we haven't all been generals, or poets, or statesmen; but when the toast works down to the babies, we stand on common ground.""
$a = explode(';',$s,3);

content of $a as reported by Chrome's javascript console:

0: "Mark Twain"
1: "1879-11-14"
2: ""We haven't all had the good fortune to be ladies; we haven't all been generals, or poets, or statesmen; but when the toast works down to the babies, we stand on common ground.""
length: 3

Source: http://locutus.io/php/strings/explode/

function explode (delimiter, string, limit) {
  //  discuss at: http://locutus.io/php/explode/
  // original by: Kevin van Zonneveld (http://kvz.io)
  //   example 1: explode(' ', 'Kevin van Zonneveld')
  //   returns 1: [ 'Kevin', 'van', 'Zonneveld' ]

  if (arguments.length < 2 ||
    typeof delimiter === 'undefined' ||
    typeof string === 'undefined') {
    return null
  }
  if (delimiter === '' ||
    delimiter === false ||
    delimiter === null) {
    return false
  }
  if (typeof delimiter === 'function' ||
    typeof delimiter === 'object' ||
    typeof string === 'function' ||
    typeof string === 'object') {
    return {
      0: ''
    }
  }
  if (delimiter === true) {
    delimiter = '1'
  }

  // Here we go...
  delimiter += ''
  string += ''

  var s = string.split(delimiter)

  if (typeof limit === 'undefined') return s

  // Support for limit
  if (limit === 0) limit = 1

  // Positive limit
  if (limit > 0) {
    if (limit >= s.length) {
      return s
    }
    return s
      .slice(0, limit - 1)
      .concat([s.slice(limit - 1)
        .join(delimiter)
      ])
  }

  // Negative limit
  if (-limit >= s.length) {
    return []
  }

  s.splice(s.length + limit)
  return s
}

Edit: if for some reason you need or want a smaller implementation, here's one I made in response to the comments:

function explode(delimiter, string, limit) {
    var spl = string.split(delimiter);
    if (spl.length <= limit) {
        return spl;
    }
    var ret = [],i=0;
    for (; i < limit; ++i) {
        ret.push(spl[i]);
    }
    for (; i < spl.length; ++i) {
        ret[limit - 1] += delimiter+spl[i];
    }
    return ret;
}
Cowskin answered 31/10, 2018 at 15:27 Comment(2)
Thanks for the answer! I'll consider this. This code is more of an exception than rule for me. In the entire codebase, I believe this is the only time a string is being split by limit. I'll probably go with the simplest solution that requires the least dependencies.Constitute
@Constitute the code posted has exactly 0 dependencies! if you for some reason need/want a smaller function tho, maybe try: function explode(delimiter, string, limit) { var spl = string.split(delimiter); if (spl.length <= limit) { return spl; } var ret = []; for (var i = 0; i < limit; ++i) { ret.push(spl[i]); } for (; i < spl.length; ++i) { ret[limit - 1] += spl[i]; } return ret; } - less robust, and almost certainly slower than the code above, but a simpler implementation.Cowskin

According to the documentation, the split function accepts two arguments:

string.split(separator, limit)

However, this still does not give the result you want, because:

The second parameter is an integer that specifies the number of splits, items after the split limit will not be included in the array

However, I noticed that each ';' inside the quoted text is followed by a space, so you could use a regex.

var s = 'Mark Twain;1879-11-14;"We haven\'t all had the good fortune to be ladies; we haven\'t all been generals, or poets, or statesmen; but when the toast works down to the babies, we stand on common ground."'
var a = s.split(/;(?! )/,3)
console.log(a);

The regex /;(?! )/ splits on every ';' except those followed by a space.
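
As a rough illustration of where the negative lookahead helps and where it does not, here is a sketch with two shortened, made-up inputs:

```javascript
// Shortened sample: semicolons inside the quote are each followed by a space.
var spaced = 'Mark Twain;1879-11-14;"ladies; generals; poets"';
var spacedParts = spaced.split(/;(?! )/, 3);
console.log(spacedParts);
// [ 'Mark Twain', '1879-11-14', '"ladies; generals; poets"' ]

// Made-up counterexample: no spaces after the semicolons, so the lookahead
// never excludes anything and the limit drops the remainder as before.
var compact = 'a;b;c;d';
var compactParts = compact.split(/;(?! )/, 3);
console.log(compactParts); // [ 'a', 'b', 'c' ]
```

So the trick depends entirely on the data: it works only when every delimiter that should be preserved is followed by a space.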

Hope this helps!

Toscano answered 31/10, 2018 at 15:50 Comment(1)
I love witty use of regex, but sadly the Mark Twain quote is just lorem ipsum. My actual information is private data so I can't give an example of it on StackOverflow.Constitute
