PHP: use array_unique on an array of arrays? [duplicate]

I have an array

Array(
[0] => Array
    (
        [0] => 33
        [user_id] => 33
        [1] => 3
        [frame_id] => 3
    )

[1] => Array
    (
        [0] => 33
        [user_id] => 33
        [1] => 3
        [frame_id] => 3
    )

[2] => Array
    (
        [0] => 33
        [user_id] => 33
        [1] => 8
        [frame_id] => 8
    )

[3] => Array
    (
        [0] => 33
        [user_id] => 33
        [1] => 3
        [frame_id] => 3
    )

[4] => Array
    (
        [0] => 33
        [user_id] => 33
        [1] => 3
        [frame_id] => 3
    )

)

As you can see, the element at key 0 is identical to those at keys 1, 3 and 4, and the element at key 2 differs from all of them.

When I run array_unique on this array, the only element left is:

Array (
[0] => Array
    (
        [0] => 33
        [user_id] => 33
        [1] => 3
        [frame_id] => 3
    )
)

Any ideas why array_unique isn't working as expected?

Osteotomy answered 1/4, 2010 at 14:50 Comment(0)
Z
88

It's because array_unique compares items using a string comparison. From the docs:

Note: Two elements are considered equal if and only if (string) $elem1 === (string) $elem2. In words: when the string representation is the same. The first element will be used.

The string representation of an array is simply the word Array, no matter what its contents are.

You can do what you want to do by using the following:

$arr = array(
    array('user_id' => 33, 'frame_id' => 3),
    array('user_id' => 33, 'frame_id' => 3),
    array('user_id' => 33, 'frame_id' => 8)
);

$arr = array_intersect_key($arr, array_unique(array_map('serialize', $arr)));

//result:
array
  0 => 
    array
      'user_id' => int 33
      'frame_id' => int 3
  2 => 
    array
      'user_id' => int 33
      'frame_id' => int 8

Here's how it works:

  1. Each array item is serialized. This will be unique based on the array's contents.

  2. The results of this are run through array_unique, so only arrays with unique signatures are left.

  3. array_intersect_key will take the keys of the unique items from the map/unique function (since the source array's keys are preserved) and pull them out of your original source array.
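
To make the three steps concrete, here is a minimal sketch that runs each one separately on the sample array from this answer. The intermediate variable names ($signatures, $uniqueSignatures, $deduped, $reindexed) are illustrative and not part of the original one-liner, and the final array_values() call is optional: it only reindexes the result.

```php
<?php

$arr = array(
    array('user_id' => 33, 'frame_id' => 3),
    array('user_id' => 33, 'frame_id' => 3),
    array('user_id' => 33, 'frame_id' => 8),
);

// Step 1: one string signature per row; identical rows yield identical strings.
$signatures = array_map('serialize', $arr);

// Step 2: drop repeated signatures. array_unique() keeps the first occurrence
// and preserves its original key, so key 1 disappears here.
$uniqueSignatures = array_unique($signatures);

// Step 3: keep only the rows of $arr whose keys survived step 2.
$deduped = array_intersect_key($arr, $uniqueSignatures);

// Optional: reindex to 0..n-1 so index-based loops see no gaps.
$reindexed = array_values($deduped);

var_dump(array_keys($deduped));   // keys 0 and 2
var_dump(array_keys($reindexed)); // keys 0 and 1
```

Note that serialize() is sensitive to key order and value types, so two rows holding the same data in a different key order count as different rows with this approach.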

Zepeda answered 1/4, 2010 at 14:53 Comment(5)
Could you quickly elaborate on how complex this approach is? For example, if I have n items in the array "arr" and each item has m attributes. I would just like to know if it scales for my application, where I have about 10 to 50 items with about 5 to 15 properties each.Diapophysis
@Zepeda I think this is going to make all of my hopes and dreams come true!Lawanda
@PascalKlein - that's a good question and I'm disappointed it isn't answered. This solution works but serializing every member of the array is going to scale poorly for large arrays / arrays with large sub-arrays. Depending on your specific situation, though, this may be the only viable solution. In my case I was able to simplify by making some assumptions (if $a['id'] === $b['id'] then assume $a === $b) but otherwise mimicking this logic. That is, I replaced 'serialize' with my own callback that just returns $arg['id'], which is much faster than serialize would've been.Gilder
Note that sometimes json_encode might be faster, so check this too.Bisulfate
I really like this one, but found that the removed entries still show up as NULL if you loop over the result. I solved this by wrapping the code in array_values().Riobard
T
7

Here's an improved version of @ryeguy's answer:

<?php

$arr = array(
    array('user_id' => 33, 'tmp_id' => 3),
    array('user_id' => 33, 'tmp_id' => 4),
    array('user_id' => 33, 'tmp_id' => 5)
);


# $arr = array_intersect_key($arr, array_unique(array_map('serialize', $arr)));
$arr = array_intersect_key($arr, array_unique(array_map(function ($el) {
    return $el['user_id'];
}, $arr)));

//result:
array
  0 => 
    array
      'user_id' => int 33
      'tmp_id' => int 3

First, it avoids unneeded serialization. Second, sometimes other attributes differ even though the id is the same.

The trick here is that array_unique() preserves the keys:

$ php -r 'var_dump(array_unique([1, 2, 2, 3]));'
array(3) {
  [0]=>
  int(1)
  [1]=>
  int(2)
  [3]=>
  int(3)
}

This lets array_intersect_key() keep the desired elements.
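
The same idea can be wrapped in a small helper. This is only a sketch: the function name unique_by_field and its parameters are hypothetical, the sample data is made up, and it assumes every row contains the given field with a scalar value.

```php
<?php

/**
 * Keep the first row for each distinct value of $field.
 * Hypothetical helper; assumes $row[$field] exists and is a scalar.
 */
function unique_by_field(array $rows, $field)
{
    // array_map() with a single array preserves keys, so the extracted
    // values stay aligned with the rows they came from.
    $values = array_map(function ($row) use ($field) {
        return $row[$field];
    }, $rows);

    // array_unique() keeps the first occurrence of each value and its key;
    // array_intersect_key() then selects the matching rows.
    return array_intersect_key($rows, array_unique($values));
}

// Made-up sample rows: the same id appears under two different types.
$places = array(
    array('id' => 'a', 'type' => 'cafe'),
    array('id' => 'b', 'type' => 'bar'),
    array('id' => 'a', 'type' => 'restaurant'),
);

var_dump(unique_by_field($places, 'id')); // rows at keys 0 and 1 remain
```

Using array_map() rather than array_column() here is deliberate: array_column() reindexes its result, which would break the key alignment if $rows does not have sequential keys.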

I ran into this with the Google Places API. I was combining the results of several requests for different types of objects (think tags), but I got duplicates, since an object may belong to several categories (types). The serialize method didn't work because some attributes differed between copies, namely photo_reference and reference. These are probably something like temporary ids.

Thole answered 7/12, 2016 at 15:2 Comment(5)
Didn't think you could improve that but you DID!Glottology
@x-yuri: can you explain why array_intersect_key works this way? intersecting a simple array Array ( [0] => 33 ) in combination with array of array, i.e. $arr = array( array('user_id' => 33, 'tmp_id' => 3), ... );? somehow it will match those 33 even when there is a type mismatch (and other level)Countdown
@Countdown var_dump(array_intersect([33], [['user_id' => 33, 'tmp_id' => 3]]));? It gives me an empty array. But this way, var_dump(array_intersect(['Array'], [[]]));, there is a match, because... because it's php :) Well, I didn't mean it seriously. Joking aside, because of the way it compares the elements. If you give an example where there is a match, I would probably be able to tell the reason. You might want to specify your version. Also, try raising the error reporting level, if not at max. That might help.Thole
@x-yuri, it is your example in steps: array_map(...) yields (using print_r): Array ( [0] => 33 [1] => 33 [2] => 33 ), array_unique(..): Array ( [0] => 33 ) and print_r($arr): Array ( [0] => Array ( [user_id] => 33 [tmp_id] => 3 ) [1] => Array ( [user_id] => 33 [tmp_id] => 4 ) [2] => Array ( [user_id] => 33 [tmp_id] => 5 ) ) then the final will be Array ( [0] => Array ( [user_id] => 33 [tmp_id] => 3 ) ). What kind of magic is happening here? It is somehow deciding to go a level deeper to compare (I think string compare would go awry as the second starts with Array [vs 33] as first element)?Countdown
@Countdown I think I now see what confuses you. I've updated the answer, check it out. Do note that the keys of the array returned by array_unique() are not 0, 1, 2. They are 0, 1, 3. This lets array_intersect_key() return the desired elements.Thole
E
3

array_unique() can only compare nested arrays in PHP 5.2.9 and higher, where the optional sort flags argument (for example SORT_REGULAR) was introduced.

Instead, you can create a hash of each sub-array and use it to check for uniqueness.

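$hashes = array();

foreach ($array as $val) {
    // Identical sub-arrays serialize to the same string, so they share one key.
    $hashes[md5(serialize($val))] = $val;
}

// The keys are already unique; reindex instead of calling array_unique().
$unique = array_values($hashes);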
Evadnee answered 1/4, 2010 at 14:56 Comment(4)
the serialized string should be enough for comparing the strings. when using md5 with long serialized strings, you risk collision. Also, I'd use array_map with un/serialize before and after unique.Authors
@Authors collisions are so improbable that it isn't even worth worrying about.Zepeda
@Zepeda that depends on your application. and also doesn't change that it's unnecessary in the first place.Authors
There is no need to call array_unique using this concept, as when building $hashes this will be already unique, better use array_values to get integer based array.Bisulfate
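
Following the suggestions in the comments above, here is a minimal sketch that uses the serialized string itself as the array key (no md5) and finishes with array_values() rather than array_unique(). The variable names $rows, $seen and $unique are illustrative.

```php
<?php

$rows = array(
    array('user_id' => 33, 'frame_id' => 3),
    array('user_id' => 33, 'frame_id' => 3),
    array('user_id' => 33, 'frame_id' => 8),
);

$seen = array();
foreach ($rows as $row) {
    $signature = serialize($row);
    // Only store the first row seen for each signature.
    if (!isset($seen[$signature])) {
        $seen[$signature] = $row;
    }
}

// Array keys are unique by definition, so no array_unique() call is needed;
// array_values() just swaps the signature keys for 0..n-1.
$unique = array_values($seen);

var_dump($unique); // two rows: frame_id 3 and frame_id 8
```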
B
2

array_unique doesn't work recursively, so it just thinks "these are all Arrays, let's kill all but one... here we go!"

Bournemouth answered 1/4, 2010 at 14:54 Comment(0)
S
1

Quick Answer (TL;DR)

  • Distinct values may be extracted from PHP Array of AssociativeArrays using foreach
  • This is a simplistic approach

Detailed Answer

Context

  • PHP 5.3
  • PHP Array of AssociativeArrays (tabular composite data variable)
  • Alternate name for this composite variable is ArrayOfDictionary (AOD)

Problem

  • Scenario: DeveloperMarsher has a PHP tabular composite variable
    • DeveloperMarsher wishes to extract distinct values on a specific name-value pair
    • In the example below, DeveloperMarsher wishes to get rows for each distinct fname name-value pair

Solution

  • example01 ;; DeveloperMarsher starts with a tabular data variable that looks like this

    $aodtable = json_decode('[
    {
      "fname": "homer"
      ,"lname": "simpson"
    },
    {
      "fname": "homer"
      ,"lname": "jackson"
    },
    {
      "fname": "homer"
      ,"lname": "johnson"
    },
    {
      "fname": "bart"
      ,"lname": "johnson"
    },
    {
      "fname": "bart"
      ,"lname": "jackson"
    },
    {
      "fname": "bart"
      ,"lname": "simpson"
    },
    {
      "fname": "fred"
      ,"lname": "flintstone"
    }
    ]',true);
    
  • example01 ;; DeveloperMarsher can extract distinct values with a foreach loop that tracks seen values

    $sgfield  =   'fname';
    $bgnocase =   true;
    
    // Walk the rows, keeping the first row seen for each distinct value of $targfield
    $targfield  =   $sgfield;
    $ddseen     =   Array();
    $vout       =   Array();
    foreach ($aodtable as $datarow) {
      // Optionally normalise the field to lower case before comparing
      if( (boolean) $bgnocase == true ){ @$datarow[$targfield] = @strtolower($datarow[$targfield]); }
      // Only emit the row if this value has not been seen yet
      if( (string) @$ddseen[ $datarow[$targfield] ] == '' ){
        $rowout   = array_intersect_key($datarow, array_flip(array_keys($datarow)));
        $ddseen[ $datarow[$targfield] ] = $datarow[$targfield];
        $vout[] = Array( $rowout );
      }
    }
    //;;
    
    print var_export( $vout, true );
    

Output result

array (
  0 =>
  array (
    0 =>
    array (
      'fname' => 'homer',
      'lname' => 'simpson',
    ),
  ),
  1 =>
  array (
    0 =>
    array (
      'fname' => 'bart',
      'lname' => 'johnson',
    ),
  ),
  2 =>
  array (
    0 =>
    array (
      'fname' => 'fred',
      'lname' => 'flintstone',
    ),
  ),
)

Pitfalls

  • This solution does not aggregate on fields that are not part of the DISTINCT operation
  • Arbitrary name-value pairs are returned from arbitrarily chosen distinct rows
  • Arbitrary sort order of output
  • Arbitrary handling of letter-case (is capital A distinct from lower-case a ?)
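
For comparison, here is a more compact sketch of the same seen-set loop. It is not the author's code: the function name distinct_rows_by is hypothetical, rows are returned unwrapped (without the extra one-element array), and each row keeps its original letter case while the comparison itself is optionally case-insensitive.

```php
<?php

/**
 * Return the first row for each distinct value of $field.
 * Hypothetical helper; rows that lack $field are skipped.
 */
function distinct_rows_by(array $rows, $field, $ignoreCase = true)
{
    $seen = array();
    $out  = array();

    foreach ($rows as $row) {
        if (!isset($row[$field])) {
            continue;
        }
        // Normalise only the lookup key, not the row itself.
        $key = $ignoreCase ? strtolower($row[$field]) : $row[$field];
        if (!isset($seen[$key])) {
            $seen[$key] = true;
            $out[] = $row;
        }
    }

    return $out;
}

// Made-up sample rows with mixed letter case.
$table = array(
    array('fname' => 'Homer', 'lname' => 'simpson'),
    array('fname' => 'homer', 'lname' => 'jackson'),
    array('fname' => 'bart',  'lname' => 'johnson'),
);

var_export(distinct_rows_by($table, 'fname'));
```

Passing false as the third argument makes 'Homer' and 'homer' count as distinct values, which makes the letter-case pitfall an explicit choice.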

See also

  • php array_intersect_key
  • php array_flip
Systemize answered 27/2, 2017 at 21:7 Comment(0)
R
1
function array_unique_recursive($array)
{
    $array = array_unique($array, SORT_REGULAR);

    foreach ($array as $key => $elem) {
        if (is_array($elem)) {
            $array[$key] = array_unique_recursive($elem);
        }
    }

    return $array;
}

Doesn't that do the trick?

Redolent answered 8/11, 2018 at 12:30 Comment(0)
A
0
    $arr = array(
        array('user_id' => 33, 'tmp_id' => 3),
        array('user_id' => 33, 'tmp_id' => 4),
        array('user_id' => 33, 'tmp_id' => 3),
        array('user_id' => 33, 'tmp_id' => 4),
    );
    $arr1 = array_unique($arr, SORT_REGULAR);
    echo "<pre>";
    print_r($arr1);
    echo "</pre>";

    //result:
    Array(
        [0] => Array(
            [user_id] => 33
            [tmp_id] => 3
        )
        [1] => Array(
            [user_id] => 33
            [tmp_id] => 4
        )
    )
Adulterate answered 4/6, 2021 at 11:59 Comment(1)
Please post some explanation along with your code.Lelialelith
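
As a brief explanation of why this works: with the SORT_REGULAR flag, array_unique() compares elements using PHP's normal comparison rules instead of casting them to strings, so nested arrays are compared by their contents. A minimal sketch using rows shaped like those in the question (the variable names are illustrative):

```php
<?php

$rows = array(
    array(0 => 33, 'user_id' => 33, 1 => 3, 'frame_id' => 3),
    array(0 => 33, 'user_id' => 33, 1 => 3, 'frame_id' => 3),
    array(0 => 33, 'user_id' => 33, 1 => 8, 'frame_id' => 8),
    array(0 => 33, 'user_id' => 33, 1 => 3, 'frame_id' => 3),
);

// SORT_REGULAR compares the nested arrays by content rather than by
// their string representation.
$unique = array_unique($rows, SORT_REGULAR);

var_dump(array_keys($unique)); // the first copy of each distinct row: keys 0 and 2
```

Because this comparison is loose, values such as 3 and '3' are treated as equal, unlike with the serialize() approach.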
