array_diff() with "one-for-one" element removal when there are duplicate array values
I have two arrays containing repeating values:

$test1 = [
    "blah1",
    "blah1",
    "blah1",
    "blah1",
    "blah2"
];

$test2 = [
    "blah1",
    "blah1",
    "blah1",
    "blah2"
];

I am trying to get the array difference:

$result = array_diff($test1,$test2);

echo "<pre>";
print_r($result);

I need it to return an array with the single value blah1, yet it returns an empty array instead.

I suspect it has something to do with the fact that there are duplicate values in both arrays, but I am not sure how to fix it.

Rebirth answered 9/2, 2016 at 2:21 Comment(2)
Your solution is good, but it fails if you have $array1 = [ 'a', 'b', 'c' ] and $array2 = [ 'd' ]. The output should be the same as $array1, but it will be [ 'b', 'c' ]: array_search() returns false when looking for d, and unset() then drops the first key of $array1 because false == 0. An if check should help here (gist, run).Marten
@Rebirth I think you should add your own solution as an answer, because I haven't found a better way to do it.Puberulent

array_diff compares the first array to the other array(s) passed as parameter(s) and returns an array containing all the elements of the first array that are not present in any of the other arrays. It checks only whether a value occurs, not how many times it occurs. Since $test1 and $test2 both contain "blah1" and "blah2" and no other values, the behavior you experienced is the expected one: array_diff returns an empty array because there is no element in $test1 that is not present in $test2.
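To make that concrete, here is a small sketch using the question's arrays. It relies only on the built-in array_diff() and array_unique(); the variable names $result and $sameResult are chosen for illustration.

```php
<?php
// Arrays from the question: $test1 holds one more "blah1" than $test2.
$test1 = ["blah1", "blah1", "blah1", "blah1", "blah2"];
$test2 = ["blah1", "blah1", "blah1", "blah2"];

// For each element of $test1, array_diff() only asks whether that value
// occurs anywhere in $test2. It never counts occurrences.
$result = array_diff($test1, $test2);
var_export($result); // empty array

// Deduplicating both inputs first gives the same answer, which shows that
// the duplicates play no role in the comparison.
$sameResult = array_diff(array_unique($test1), array_unique($test2));
var_export($sameResult); // empty array
```

Getting a "one-for-one" removal therefore needs a custom function, as the other answers show.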

Further reading. Also, read some theory to understand what you are working with.

Logography answered 9/2, 2016 at 2:37 Comment(5)
I understand now, but what should I use instead to get the desired effect?Rebirth
I guess I can loop over test1 and remove each matching value from both arrays, but I was looking for a more elegant solution.Rebirth
@Acidon, what should be the result if you have 5 pieces of "blah1" in the first array without "blah2" and the second array is left unchanged?Logography
It should be "blah1", "blah1". The function in the EDIT gets that result.Rebirth
@Acidon, I see. Now I understand the problem you were asking about :)Logography

Spotted a problem with Acidon's own solution. The problem comes from the fact that unset($array[false]) will actually unset $array[0], so there needs to be an explicit check for false (as David Rodrigues pointed out as well.)
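A minimal sketch of the pitfall, using the arrays from the comment under the question. The unchecked function below is hypothetical (the original solution is not shown in this thread); it only illustrates what happens when the result of array_search() is passed to unset() without a check.

```php
<?php
// Hypothetical unchecked variant, for illustration only.
function subtract_array_unchecked(array $array1, array $array2): array
{
    foreach ($array2 as $item) {
        // array_search() returns false when $item is not found...
        $key = array_search($item, $array1);
        // ...and false is cast to the integer key 0 when used as an array offset,
        // so this removes the first element even though nothing matched.
        unset($array1[$key]);
    }
    return array_values($array1);
}

$buggy = subtract_array_unchecked(['a', 'b', 'c'], ['d']);
var_export($buggy); // 'a' is gone although 'd' never matched it
```

The version with the explicit `$key !== false` check avoids this.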

function subtract_array($array1,$array2){
    foreach ($array2 as $item) {
        $key = array_search($item, $array1);
        if ( $key !== false ) {
            unset($array1[$key]);
        }
    }
    return array_values($array1);
}

Some examples

subtract_array([1,1,1,2,3],[1,2]);            // [1,1,3]
subtract_array([1,2,3],[4,5,6]);              // [1,2,3]
subtract_array([1,2,1],[1,1,2]);              // []
subtract_array([1,2,3],[]);                   // [1,2,3]
subtract_array([],[1,1]);                     // []
subtract_array(['hi','bye'], ['bye', 'bye']); // ['hi']
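
One thing to be aware of: array_search() compares loosely by default, so the string '1' matches the integer 1. If that matters for your data, the third parameter of array_search() enables strict comparison. The sketch below is a hypothetical variant of subtract_array() (the name subtract_array_strict and the $strict flag are my own additions, not part of the function above).

```php
<?php
// Hypothetical variant of subtract_array() with a strict-comparison flag.
function subtract_array_strict(array $array1, array $array2, bool $strict = true): array
{
    foreach ($array2 as $item) {
        // The third argument makes array_search() compare with === instead of ==.
        $key = array_search($item, $array1, $strict);
        if ($key !== false) {
            unset($array1[$key]);
        }
    }
    return array_values($array1);
}

$loose  = subtract_array_strict([1, 2, 3], ['1'], false); // '1' == 1, so 1 is removed
$strict = subtract_array_strict([1, 2, 3], ['1'], true);  // '1' !== 1, nothing is removed
```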
Puberulent answered 13/12, 2020 at 6:49 Comment(0)

Depending on the scope of your task, it may be necessary to only remove elements from the first array which are "one-for-one" represented in the second array. In other cases, it may be appropriate to cross-check the differences in a "one-for-one" manner for both arrays and combine the remaining elements.

Consider this altered sample data set:

$test1 = [
    "blah1",
    "blah1",
    "blah2",
    "blah4",
    "blah5"
];

$test2 = [
    "blah1", // under-represented
    "blah2", // equally found
    "blah3", // not found
    "blah4", // over-represented
    "blah4", //       "
];

Below are four different functions (with indicative names) to offer varied utility.

Code: (Demo)

  • unilateral difference (iterated array searches):

    function removeBValuesFromA(array $a, array $b): array
    {
        foreach ($b as $bVal) {
            $k = array_search($bVal, $a);
            if ($k !== false) {
                unset($a[$k]);
            }
        }
        return array_values($a);
    }
    
  • bilateral difference (iterated array searches):

    function bidirectionalDiff(array $a, array $b): array
    {
        foreach ($b as $bKey => $bVal) {
            $aKey = array_search($bVal, $a);
            if ($aKey !== false) {
                unset($a[$aKey], $b[$bKey]);
            }
        }
        return array_merge($a, $b);
    }
    
  • unilateral difference (condense-compare-expand):

    function removeBValuesFromAViaCounts(array $a, array $b): array
    {
        $toRemove = array_count_values($b);
    
        $result = [];
        foreach (array_count_values($a) as $k => $count) {
            array_push(
                $result,
                ...array_fill(
                    0,
                    max(0, $count - ($toRemove[$k] ?? 0)),
                    $k
                )
            );
        }
        return $result;
    }
    

    or

    function removeBValuesFromAViaCounts(array $a, array $b): array
    {
        $toRemove = array_count_values($b);
    
        $result = [];
        foreach (array_count_values($a) as $k => $count) {
            for ($i = 0; $i < $count - ($toRemove[$k] ?? 0); ++$i) {
                $result[] = $k;
            }
        }
        return $result;
    }
    
  • bilateral difference (condense-compare-expand):

    function bidirectionalDiffViaCounts(array $a, array $b): array
    {
        $bCounts = array_count_values($b);
    
        $result = [];
        foreach (array_count_values($a) as $k => $count) {
            array_push(
                $result,
                ...array_fill(
                    0,
                    abs($count - ($bCounts[$k] ?? 0)),
                    $k
                )
            );
            unset($bCounts[$k]);
        }
        foreach ($bCounts as $k => $count) {
            array_push(
                $result,
                ...array_fill(0, $count, $k)
            );
        }
        return $result;
    }
    

Execution:

var_export([
    'removeBValuesFromA' => removeBValuesFromA($test1, $test2),
    'bidirectionalDiff' => bidirectionalDiff($test1, $test2),
    'removeBValuesFromAViaCounts' => removeBValuesFromAViaCounts($test1, $test2),
    'bidirectionalDiffViaCounts' => bidirectionalDiffViaCounts($test1, $test2),
]);

Outputs:

array (
  'removeBValuesFromA' => 
  array (
    0 => 'blah1',
    1 => 'blah5',
  ),
  'bidirectionalDiff' => 
  array (
    0 => 'blah1',
    1 => 'blah5',
    2 => 'blah3',
    3 => 'blah4',
  ),
  'removeBValuesFromAViaCounts' => 
  array (
    0 => 'blah1',
    1 => 'blah5',
  ),
  'bidirectionalDiffViaCounts' => 
  array (
    0 => 'blah1',
    1 => 'blah4',
    2 => 'blah5',
    3 => 'blah3',
  ),
)
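
One caveat for the condense-compare-expand variants, shown as a small sketch: array_count_values() stores the counted values as array keys, and PHP converts numeric string keys such as "1" to integers. The rebuilt result can therefore contain the integer 1 where the input held the string "1". array_count_values() also only counts string and integer values. The variable names below are for illustration only.

```php
<?php
// Values become array keys inside array_count_values().
$counts = array_count_values(['1', '1', 'x']);

// The numeric string '1' comes back as the integer key 1;
// the non-numeric string 'x' stays a string.
$keys = array_keys($counts);
var_export($keys);
```

The iterated-search variants do not have this effect because they unset elements and return the original values untouched.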
Infeudation answered 2/9, 2023 at 0:59 Comment(0)

To achieve your goal, you can use a custom function:

function array_diff_countable($array1, $array2) {
    $diff = [];

    $array1Counts = array_count_values($array1);
    $array2Counts = array_count_values($array2);

    foreach ($array1Counts as $key => $count) {
        $diffCount = $count - ($array2Counts[$key] ?? 0);
        if ($diffCount > 0) {
            for ($i = 0; $i < $diffCount; $i++) {
                $diff[] = $key;
            }
        }
    }

    return $diff;
}
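
As a rough trace of how a counting approach like this handles the question's original arrays, here is a sketch of the intermediate values. The logic is inlined rather than calling the function above so that the snippet runs on its own; $surplus is a name introduced for illustration.

```php
<?php
$test1 = ["blah1", "blah1", "blah1", "blah1", "blah2"];
$test2 = ["blah1", "blah1", "blah1", "blah2"];

$counts1 = array_count_values($test1); // ['blah1' => 4, 'blah2' => 1]
$counts2 = array_count_values($test2); // ['blah1' => 3, 'blah2' => 1]

$diff = [];
foreach ($counts1 as $value => $count) {
    // Occurrences in $test1 that have no partner in $test2; never negative.
    $surplus = max(0, $count - ($counts2[$value] ?? 0));
    for ($i = 0; $i < $surplus; $i++) {
        $diff[] = $value;
    }
}
var_export($diff); // one 'blah1' remains
```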
Swede answered 12/3, 2024 at 15:34 Comment(2)
"use this custom function" is not an overly generous explanation to future readers. Please edit your answer to explain how your script works and maybe why you would compel them to use your answer over other competing answers. This answer certainly resembles the approach that I've demonstrated in two snippets in my answer; perhaps explain the advantage of your script over my earlier posted scripts.Infeudation
To be clear, this answer is performing a "unilateral difference" 3v4l.org/lgfYr (which doesn't make it incorrect for the asked question). It may be important for future readers to understand that it will not generate a result with the extra elements from the second array. As a matter of editing, this answer can safely remove if ($diffCount > 0) { because that logic is enforced by the for loop itself.Infeudation
