array_unique vs array_flip
Asked Answered
H

4

32

If I had an array of signed integers e.g:

Array
(
    [0] => -3
    [1] => 1
    [2] => 2
    [3] => 3
    [4] => 3
)

To get unique values I would instinctively use array_unique but after consideration I could perform array_flip twice which would have the same effect, and I think it would be quicker?

array_unique O(n log n) because of the sort operation it uses

array_flip O(n)

Am I correct in my assumptions?

UPDATE / EXAMPLE:

$intArray1 = array(-4,1,2,3);
print_r($intArray1);
$intArray1 = array_flip($intArray1);
print_r($intArray1);
$intArray1 = array_flip($intArray1);
print_r($intArray1);

Array
(
    [0] => -3
    [1] => 1
    [2] => 2
    [3] => 3
    [4] => 3
)
Array
(
    [-3] => 0
    [1] => 1
    [2] => 2
    [3] => 4
)
Array
(
    [0] => -3
    [1] => 1
    [2] => 2
    [4] => 3
)
Hearse answered 30/11, 2011 at 5:40 Comment(0)
B
37

I benchmarked it for you: CodePad

Your intuition on this was correct!

$test=array();
for($run=0; $run<1000; $run++)
$test[]=rand(0,100);

$time=microtime(true);

for($run=0; $run<100; $run++)
$out=array_unique($test);

$time=microtime(true)-$time;
echo 'Array Unique: '.$time."\n";

$time=microtime(true);

for($run=0; $run<100; $run++)
$out=array_keys(array_flip($test));

$time=microtime(true)-$time;
echo 'Keys Flip: '.$time."\n";

$time=microtime(true);

for($run=0; $run<100; $run++)
$out=array_flip(array_flip($test));

$time=microtime(true)-$time;
echo 'Flip Flip: '.$time."\n";

Output:

Array Unique: 1.1829199790955
Keys Flip: 0.0084578990936279
Flip Flip: 0.0083951950073242

Note that array_keys(array_flip($array)) will give a new key values in order, which in many cases may be what you want (identical except much faster to array_values(array_unique($array))), whereas array_flip(array_flip($array)) is identical (except much faster) to array_unique($array) where the keys remain the same.

Butterflies answered 30/11, 2011 at 5:52 Comment(9)
There's also array_keys(array_count_values($array)) - see CodePadMelba
@Leith, array_keys(array_count_values( can never be faster than array_keys(array_flip( but the result will always be the same.Butterflies
@Alasdair, why do you say that? The modified CodePad benchmark I linked in my previous comment indicates a significant improvement.Melba
@Leith, yes you are right. Your benchmark does indeed say its faster. That's quite interesting.Butterflies
Is array_flip O(n) or something better?Denesedengue
I just would like to precise, that if you try to flip empty values, like null or false, PHP will throw an error. So don't forget to array_filter() your array before giving it to array_flip; Here is the codepadNoach
Is there a deeper reason why array_unique() has to be slow()? Or could it happen that in a future version of PHP it is suddenly much faster?Cristycriswell
@Cristycriswell it's slow because it has to work for all types, but you can only have integers and strings as array keys, so if you know that's your data set you can flip flip it. But if it might contain arrays, closures, other objects, etc. then you'd be left with only array_uniqueOrthohydrogen
A small update for PHP 7.2: Array Unique: 0.0026860237121582; Keys Flip: 0.00074195861816406; Flip Flip: 0.00077009201049805. Seems like nowadays the Keys Flip solution is the fastest of the three for large arrays. Because the array_unique() became much faster, and it is the best solution for small arrays. Here are results for an array of 10-elements and time for 1M iterations on each case:Array Unique: 0.58485817909241; Keys Flip: 0.84798097610474; Flip Flip: 0.88532090187073.International
D
5

Caution: this technique is NOT a drop-in replacement for array_unique(). It only works for arrays with values that are are valid keys. (eg: string, integer, things can can be cast to int). And certainly does not work for arrays of objects.

$input = [true, false, 1, 0, 1.2, "1", "two", "0"];
var_export(array_unique($input));
array (
  0 => true,
  1 => false,
  3 => 0,
  4 => 1.2,
  6 => 'two',
)

vs:

var_export(array_keys(array_flip($input)));

PHP Warning:  array_flip(): Can only flip STRING and INTEGER values! 
in php shell code on line 1

array (
  0 => 1,
  1 => 0,
  2 => 'two',
)
Dextrogyrate answered 21/9, 2015 at 20:33 Comment(1)
this isn't an answer really, but saved me some time so thanks!Tribble
B
3

Nothing is better than running your own benchmark.

➜  8321620  cat first.php
<?php

$arr = array(-3, 1, 2, 3, 3);

for($i = 0; $i <= 1000000; $i++) {
    array_unique($arr);
}
➜  8321620  time php first.php
php first.php  3.24s user 0.01s system 99% cpu 3.251 total
➜  8321620  cat second.php
<?php

$arr = array(-3, 1, 2, 3, 3);

for($i = 0; $i <= 1000000; $i++) {
    array_flip(array_flip($arr));
}
➜  8321620  time php second.php
php second.php  1.50s user 0.01s system 99% cpu 1.514 total

Update: Array with 1000 elements.

➜  8321620  cat first.php
<?php

$arr = array();
for($i = 0; $i <= 1000; $i++) {
    $arr[] = rand(0, 1000);
}

for($i = 0; $i <= 10000; $i++) {
    array_unique($arr);
}
➜  8321620  time php first.php
php first.php  27.50s user 0.03s system 99% cpu 27.534 total
➜  8321620  cat second.php
<?php

$arr = array();
for($i = 0; $i <= 1000; $i++) {
    $arr[] = rand(0, 1000);
}

for($i = 0; $i <= 10000; $i++) {
    array_flip(array_flip($arr));
}
➜  8321620  time php second.php 
php second.php  1.59s user 0.01s system 99% cpu 1.604 total

So yes, your assumption was correct.

Broadbent answered 30/11, 2011 at 5:51 Comment(0)
O
-2

you would have to use

array_keys( array_flip( $array ) );

which would take more time

I would go for array_unique. It has the added benefit of explaining whats happening.

Obelisk answered 30/11, 2011 at 5:43 Comment(7)
Isn't array_keys also O(n)?Broadbent
My point is you have to use 2 functions, not 1. It will take more time and obfuscate the code as well.Obelisk
I have given an example of my array_flip code, array_keys isn't requiredHearse
Well, now you're using array_flip twiceObelisk
yes, that was my question in the first place, array_flip(array_flip($array)) and array_keys(array_flip($array)) both appear to be quicker that array_uniqueHearse
@Galen, Twice O(n) is much faster than once O(n log n).Broadbent
Im not saying its not faster. I'm saying is it worth the speed to obfuscate the code a little? If the speed benefits are that great then go for it.Obelisk

© 2022 - 2024 — McMap. All rights reserved.