How to implement multiple custom sorting rules based on hyphen-delimited substrings?

I have an array of filenames of this form:

"A - 1.2 - Floor Plan.PDF"

I need to sort the array first by the category at the beginning, in the following order:

1. Category: A
2. Category: ESC
3. Category: C
4. Category: M
5. Category: E
6. Category: P

Then I need to sort the array by the numbers following the category.

Here's an example of the array to be sorted:

$arr[0] = "A - 1.0 - Title Page.PDF";
$arr[1] = "A - 2.2 - Enlarged Floor Plans";
$arr[2] = "A - 2.1.0 - Structural Details.PDF";
$arr[3] = "E - 1.0 - Electrical Title Page.PDF";
$arr[4] = "A - 1.2 - Floor Plan.PDF";
$arr[5] = "P - 1.0 - Plumbing Title Page.PDF";
$arr[6] = "A - 2.1.1 - Structural Details.PDF";
$arr[7] = "C - 1.0 - Civil Title Page.PDF";
$arr[8] = "M - 1.0 - Mechanical Title Page.PDF";
$arr[9] = "ESC - 1.0 - Erosion Control Plan.PDF";

Ideally, this array would then become

$arr[0] = "A - 1.0 - Title Page.PDF";
$arr[1] = "A - 1.2 - Floor Plan.PDF";
$arr[2] = "A - 2.1.0 - Structural Details.PDF";
$arr[3] = "A - 2.1.1 - Structural Details.PDF";
$arr[4] = "A - 2.2 - Enlarged Floor Plans";
$arr[5] = "ESC - 1.0 - Erosion Control Plan.PDF";
$arr[6] = "C - 1.0 - Civil Title Page.PDF";
$arr[7] = "M - 1.0 - Mechanical Title Page.PDF";
$arr[8] = "E - 1.0 - Electrical Title Page.PDF";
$arr[9] = "P - 1.0 - Plumbing Title Page.PDF";

I have the following regular expression for grouping the file names appropriately:

^([A-Z]+?) ?- ?([0-9]+)\.([0-9]+)(\.([0-9]+))?.*$

I want the array sorted by group 1, then by group 2, then by group 3. If Group 5 exists, then sort last by group 5. Ignore group 4.
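
For reference, a minimal sketch of what those groups look like when the pattern is run through preg_match(). The `/` delimiters and the sample filename are the only additions; the variable names are illustrative.

```php
<?php
// Pattern from the question, wrapped in "/" delimiters for preg_match().
$pattern = '/^([A-Z]+?) ?- ?([0-9]+)\.([0-9]+)(\.([0-9]+))?.*$/';

$matches = [];
preg_match($pattern, 'A - 2.1.1 - Structural Details.PDF', $matches);

// $matches[1] is the category, [2] and [3] are the first two numbers,
// [4] is the optional ".N" suffix (ignored), [5] is the optional third number.
print_r(array_slice($matches, 1));
```

For a name such as "A - 1.0 - Title Page.PDF" the optional groups do not participate in the match, so indexes 4 and 5 are absent from `$matches` and have to be checked before use.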

It may be easier to sort the categories lexicographically. If so, that would be alright; though it would be preferable if they were sorted in the order mentioned above.

Is there any way to do this with PHP?

Euchromosome answered 9/12, 2011 at 14:54 Comment(5)
Shouldn't the order after sorting be: A - 1.0, A - 1.2, A - 2.1.0, A - 2.1.1, A - 2.2, C - 1.0, E - 1.0, ESC - 1.0, M - 1.0, P - 1.0? - Homans
@Lolo, is it not? I compared your sorted version with mine and can't find a difference. - Euchromosome
You have: [...], ESC, C, M, E, [...]. I just wonder why not: [...], C, E, ESC, M, [...] - Homans
@Lolo The categories follow a custom order. These are construction documents, and it's conventional to include the erosion control plan with the civil engineering, since they are typically created by the same firm. - Euchromosome
Fair enough. I've updated my answer: it now contains a complete comparison function that orders the elements the way you want. - Homans
H
5

PHP's usort() function takes a comparison callback as an argument. You can use it like this, for example:

$order = array('A', 'ESC', 'C', 'M', 'E', 'P'); // order of categories
$order = array_flip($order); // flip order array, it'll look like: ('A'=>0, 'ESC'=>1, ...)

function cmp($a, $b)
{
    global $order;

    $ma = array();
    $mb = array();
    preg_match('/^([A-Z]+?) ?- ?([0-9]+)\.([0-9]+)(?:\.([0-9]+))?.*$/', $a, $ma);
    preg_match('/^([A-Z]+?) ?- ?([0-9]+)\.([0-9]+)(?:\.([0-9]+))?.*$/', $b, $mb);

    if ($ma[1] != $mb[1]) {
        return ($order[$ma[1]] < $order[$mb[1]]) ? -1 : 1;
    }
    if ($ma[2] != $mb[2]) {
        return $ma[2] < $mb[2] ? -1 : 1;
    }
    if ($ma[3] != $mb[3]) {
        return $ma[3] < $mb[3] ? -1 : 1;
    }
    // The regex was changed slightly (non-capturing group), so the last number is now group 4
    if (@$ma[4] != @$mb[4]) { 
        return @$ma[4] < @$mb[4] ? -1 : 1;
    }
    return 0;
}
usort($arr, "cmp");
Homans answered 9/12, 2011 at 15:0 Comment(0)
P
1

How about:

$arr[0] = "A - 1.0 - Title Page.PDF";
$arr[1] = "A - 2.2 - Enlarged Floor Plans";
$arr[2] = "A - 2.1.0 - Structural Details.PDF";
$arr[3] = "E - 1.0 - Electrical Title Page.PDF";
$arr[4] = "A - 1.2 - Floor Plan.PDF";
$arr[5] = "P - 1.0 - Plumbing Title Page.PDF";
$arr[6] = "A - 2.1.1 - Structural Details.PDF";
$arr[7] = "C - 1.0 - Civil Title Page.PDF";
$arr[8] = "M - 1.0 - Mechanical Title Page.PDF";
$arr[9] = "ESC - 1.0 - Erosion Control Plan.PDF";


function cmp($a,$b) {
    // explode() replaces split(), which was deprecated in PHP 5.3 and removed in PHP 7
    $arr_a = explode(' - ', $a);
    $arr_b = explode(' - ', $b);
    if ($arr_a[0] == $arr_b[0])
        return strcmp($arr_a[1], $arr_b[1]);
    return strcmp($arr_a[0], $arr_b[0]);
}

usort($arr, "cmp");
print_r($arr);

output:

Array
(
    [0] => A - 1.0 - Title Page.PDF
    [1] => A - 1.2 - Floor Plan.PDF
    [2] => A - 2.1.0 - Structural Details.PDF
    [3] => A - 2.1.1 - Structural Details.PDF
    [4] => A - 2.2 - Enlarged Floor Plans
    [5] => C - 1.0 - Civil Title Page.PDF
    [6] => E - 1.0 - Electrical Title Page.PDF
    [7] => ESC - 1.0 - Erosion Control Plan.PDF
    [8] => M - 1.0 - Mechanical Title Page.PDF
    [9] => P - 1.0 - Plumbing Title Page.PDF
)
Pampa answered 9/12, 2011 at 15:29 Comment(0)
I
1

After breaking your strings into their meaningful parts, I find a chain of short ternary (?:) expressions a bit tidier than if blocks for falling through to each subsequent tie-breaking condition.

Also, using version_compare() is VERY appropriate for your middle substring -- this will ensure that when your major/minor/micro versions move into double-digit territory, the natural sorting will still be in effect.
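
To illustrate the point, here is a small sketch comparing plain string ordering with version_compare() ordering. The version strings are hypothetical and were picked only because they include double-digit parts.

```php
<?php
// Hypothetical version strings, chosen to include double-digit parts.
$versions = ['2.2', '10.0', '2.10', '2.1'];

$byString = $versions;
usort($byString, 'strcmp');           // byte-wise string comparison

$byVersion = $versions;
usort($byVersion, 'version_compare'); // compares each dot-separated part numerically

print_r($byString);  // 10.0, 2.1, 2.10, 2.2
print_r($byVersion); // 2.1, 2.2, 2.10, 10.0
```

With strcmp(), "10.0" lands before "2.1" because the character "1" sorts before "2"; version_compare() treats 10 as larger than 2.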

Pass your custom priorities array into the custom function's scope using a use() declaration.

Code:

$arr = [
    "A - 1.0 - Title Page.PDF",
    "A - 2.2 - Enlarged Floor Plans",
    "A - 2.1.0 - Structural Details.PDF",
    "E - 1.0 - Electrical Title Page.PDF",
    "A - 1.2 - Floor Plan.PDF",
    "P - 1.0 - Plumbing Title Page2.PDF",
    "A - 2.1.1 - Structural Details.PDF",
    "C - 1.0 - Civil Title Page.PDF",
    "M - 1.0 - Mechanical Title Page.PDF",
    "ESC - 1.0 - Erosion Control Plan.PDF",
    "P - 1.0 - Plumbing Title Page.PDF",
];

$priorities = array_flip(['A', 'ESC', 'C', 'M', 'E', 'P']);

usort($arr, function ($a, $b) use ($priorities) {
    [$categoryA, $versionA, $nameA] = explode(' - ', $a, 3);
    [$categoryB, $versionB, $nameB] = explode(' - ', $b, 3);

    return $priorities[$categoryA] <=> $priorities[$categoryB]  // priorities as first criterion
        ?: version_compare($versionB, $versionA)                // then descending versions as second criterion
        ?: $nameA <=> $nameB;                                   // then compare names ascending
});
var_export($arr);

Output:

array (
  0 => 'A - 2.2 - Enlarged Floor Plans',
  1 => 'A - 2.1.1 - Structural Details.PDF',
  2 => 'A - 2.1.0 - Structural Details.PDF',
  3 => 'A - 1.2 - Floor Plan.PDF',
  4 => 'A - 1.0 - Title Page.PDF',
  5 => 'ESC - 1.0 - Erosion Control Plan.PDF',
  6 => 'C - 1.0 - Civil Title Page.PDF',
  7 => 'M - 1.0 - Mechanical Title Page.PDF',
  8 => 'E - 1.0 - Electrical Title Page.PDF',
  9 => 'P - 1.0 - Plumbing Title Page.PDF',
  10 => 'P - 1.0 - Plumbing Title Page2.PDF',
)

Alternatively, you can use a single spaceship operator comparison on balanced arrays for the exact same effect:

usort($arr, function ($a, $b) use ($priorities) {
    [$categoryA, $versionA, $nameA] = explode(' - ', $a, 3);
    [$categoryB, $versionB, $nameB] = explode(' - ', $b, 3);

    return [$priorities[$categoryA], version_compare($versionB, $versionA), $nameA]
           <=> 
           [$priorities[$categoryB], version_compare($versionA, $versionB), $nameB];
});

I believe the benefit of the first snippet is that subsequent tie-breakers are not evaluated unless they are reached. The second snippet builds both arrays in full, calling version_compare() twice, whether or not those elements end up being compared. If this is incorrect, anyone is welcome to correct me with a comment.
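
Note that both snippets sort the version numbers in descending order, whereas the question asks for ascending. A sketch of the ascending variant follows; it assumes every filename contains two " - " separators and a category that is present in the priorities array. The shortened input list is only for illustration.

```php
<?php
// Shortened, illustrative subset of the filenames from the question.
$arr = [
    "A - 2.2 - Enlarged Floor Plans",
    "E - 1.0 - Electrical Title Page.PDF",
    "A - 1.2 - Floor Plan.PDF",
    "ESC - 1.0 - Erosion Control Plan.PDF",
    "A - 2.1.1 - Structural Details.PDF",
    "C - 1.0 - Civil Title Page.PDF",
];

$priorities = array_flip(['A', 'ESC', 'C', 'M', 'E', 'P']);

usort($arr, function ($a, $b) use ($priorities) {
    [$categoryA, $versionA, $nameA] = explode(' - ', $a, 3);
    [$categoryB, $versionB, $nameB] = explode(' - ', $b, 3);

    return $priorities[$categoryA] <=> $priorities[$categoryB]
        ?: version_compare($versionA, $versionB)   // A before B: ascending versions
        ?: $nameA <=> $nameB;
});

print_r($arr);
```

The only change from the first snippet is the argument order passed to version_compare().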

Infuscate answered 9/6, 2020 at 6:35 Comment(0)
