How do I determine whether two file paths (or file URLs) identify the same file or directory on macOS?

Imagine this simple example of two paths on macOS:

  • /etc/hosts
  • /private/etc/hosts

Both point to the same file. But how do you determine that?

Another example:

  • ~/Desktop
  • /Users/yourname/Desktop

Or what about upper / lower case mixes on a case-insensitive file system:

  • /Volumes/external/my file
  • /Volumes/External/My File

And even this:

  • /Applications/Über.app

Here, the "Ü" can be encoded in two Unicode normalization forms (NFD, NFC). For an example of how this can happen when you use the (NS)URL API, see this gist of mine.
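To make the normalization issue concrete, here is a minimal C sketch showing that the two encodings of "Ü" are different byte sequences, so a plain byte-wise comparison treats them as different names. The constants and the helper `bytes_equal` are made up for this illustration and are not part of any Apple API.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* "Ü" precomposed (NFC): U+00DC, UTF-8 bytes C3 9C. */
static const char *const U_UMLAUT_NFC = "\xC3\x9C";

/* "Ü" decomposed (NFD): "U" followed by U+0308 (combining diaeresis),
   UTF-8 bytes 55 CC 88. */
static const char *const U_UMLAUT_NFD = "U\xCC\x88";

/* Naive byte-wise comparison: true only if both strings consist of
   exactly the same bytes. (Hypothetical helper for illustration.) */
static bool bytes_equal(const char *a, const char *b) {
    assert(a != NULL && b != NULL);
    return strcmp(a, b) == 0;
}
```

Calling `bytes_equal(U_UMLAUT_NFC, U_UMLAUT_NFD)` returns false even though both strings render as the same character, which is why comparing raw path strings is not enough.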

Since macOS 10.15 (Catalina), there are also firmlinks, which link from one volume to another within a volume group. Paths for the same file system object could be written as:

  • /Applications/Find Any File.app
  • /System/Volumes/Data/Applications/Find Any File.app

I would like to document ways that reliably deal with all these intricacies, with the goal of being efficient (i.e. fast).

Subjunction answered 6/4, 2021 at 12:12 Comment(0)

There are two ways to check if two paths (or their file URLs) point to the same file system item:

  • Compare their paths. This requires that the paths get prepared first.
  • Compare their IDs (inodes). This is safer overall, as it avoids the complications caused by Unicode normalization and incorrect case.

Comparing file IDs

In ObjC this is fairly easy (note: according to a knowledgeable Apple developer, one should not rely on [NSURL fileReferenceURL], so this code uses a cleaner way):

NSString *p1 = @"/etc/hosts";
NSString *p2 = @"/private/etc/hosts";
NSURL *url1 = [NSURL fileURLWithPath:p1];
NSURL *url2 = [NSURL fileURLWithPath:p2];

id ref1 = nil, ref2 = nil;
[url1 getResourceValue:&ref1 forKey:NSURLFileResourceIdentifierKey error:nil];
[url2 getResourceValue:&ref2 forKey:NSURLFileResourceIdentifierKey error:nil];

BOOL equal = [ref1 isEqual:ref2];

The equivalent in Swift (note: do not use fileReferenceURL, see this bug report):

import Foundation

let p1 = "/etc/hosts"
let p2 = "/private/etc/hosts"
let url1 = URL(fileURLWithPath: p1)
let url2 = URL(fileURLWithPath: p2)

// The identifier is an opaque object, so compare it with isEqual(_:).
let ref1 = try url1.resourceValues(forKeys: [.fileResourceIdentifierKey])
                   .fileResourceIdentifier
let ref2 = try url2.resourceValues(forKeys: [.fileResourceIdentifierKey])
                   .fileResourceIdentifier
let equal = ref1?.isEqual(ref2) ?? false

Both solutions use the BSD function lstat under the hood, so you could also write this in plain C:

#include <stdbool.h>
#include <sys/stat.h>

// Returns true if both paths exist and share the same device and inode.
static bool paths_are_equal (const char *p1, const char *p2) {
  struct stat stat1, stat2;
  int res1 = lstat (p1, &stat1);
  int res2 = lstat (p2, &stat2);
  return (res1 == 0 && res2 == 0) &&
         (stat1.st_dev == stat2.st_dev) && (stat1.st_ino == stat2.st_ino);
}
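
One detail worth keeping in mind with the C version: lstat() does not follow a symlink in the last path component, so a symlink and its target compare as different items, whereas stat() would resolve the link first. Which behavior you want depends on your use case. The following is only a sketch to illustrate the difference; `same_item` and `demo_symlink_comparison` are hypothetical names, and the demo assumes a writable /tmp.

```c
#define _XOPEN_SOURCE 700  /* expose lstat, symlink, mkdtemp in strict -std= modes */
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Compare two paths by (device, inode). With follow_final = true a symlink
   in the last path component is resolved (stat); otherwise the symlink
   itself is examined (lstat). */
static bool same_item(const char *p1, const char *p2, bool follow_final) {
    assert(p1 != NULL && p2 != NULL);
    struct stat s1, s2;
    int r1 = follow_final ? stat(p1, &s1) : lstat(p1, &s1);
    int r2 = follow_final ? stat(p2, &s2) : lstat(p2, &s2);
    return r1 == 0 && r2 == 0 &&
           s1.st_dev == s2.st_dev && s1.st_ino == s2.st_ino;
}

/* Creates a file and a symlink to it in a temporary directory and checks
   how both variants judge the pair. Returns 0 if lstat sees two different
   items while stat sees the same file, 1 otherwise (or on setup failure). */
static int demo_symlink_comparison(void) {
    char dir[] = "/tmp/same_item_XXXXXX";
    if (mkdtemp(dir) == NULL) return 1;

    char target[64], alias[64];
    snprintf(target, sizeof target, "%s/target", dir);
    snprintf(alias, sizeof alias, "%s/alias", dir);

    FILE *f = fopen(target, "w");
    if (f == NULL) { rmdir(dir); return 1; }
    fclose(f);
    bool linked = symlink(target, alias) == 0;

    bool via_lstat = linked && same_item(target, alias, false);
    bool via_stat  = linked && same_item(target, alias, true);

    /* Clean up the temporary items again. */
    unlink(alias);
    unlink(target);
    rmdir(dir);

    return (linked && !via_lstat && via_stat) ? 0 : 1;
}
```

Symlinks in intermediate path components (such as /etc on macOS) are resolved by both functions, which is why the /etc/hosts example works with lstat().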

However, heed the warning about using this kind of file reference:

The value of this identifier is not persistent across system restarts.

This is mainly meant for the volume ID, but may also affect the file ID on file systems that do not support persistent file IDs.

Comparing paths

To compare the paths you must get their canonical path first.

If you do not do this, you cannot be sure that the case is correct, which in turn leads to very complex comparison code. (See using NSURLCanonicalPathKey for details.)

There are several ways the case can end up wrong:

  • The user may have entered the name manually, with the wrong case.
  • You have previously stored the path, but the user has changed the case of the file's name in the meantime. Your path will still identify the same file, but now the case is wrong, and a comparison for equal paths could fail depending on how you got the other path you compare with.

Only if you got the path from a file system operation where no part of the path could have been specified incorrectly (i.e. with the wrong case) can you skip getting the canonical path. In that case, just call standardizingPath on both paths and compare them for equality (no case-insensitive option necessary).

Otherwise, and to be on the safe side, get the canonical path from a URL like this:

import Foundation

let uncleanPath = "/applications"
let url = URL(fileURLWithPath: uncleanPath)

if let resourceValues = try? url.resourceValues(forKeys: [.canonicalPathKey]),
   let resolvedPath = resourceValues.canonicalPath {
    print(resolvedPath) // gives "/Applications"
}

If your path is stored in a String instead of a URL object, you could call stringByStandardizingPath (Apple Docs). But that would neither resolve incorrect case nor decompose the characters, which may cause problems as shown in the aforementioned gist.

Therefore, it's safer to create a file URL from the String and then use the above method to get the canonical path or, even better, use the lstat() solution to compare the file IDs as shown above.

There's also a BSD function to get the canonical path from a C string: realpath(). However, this is not safe because it does not resolve the different paths within a volume group (as shown in the question) to the same string. Therefore, this function should be avoided for this purpose.
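
For reference, this is roughly what a realpath()-based comparison would look like. It is shown only to illustrate the approach being advised against, not as a recommendation; `realpath_equal` is a hypothetical helper name, and passing NULL as the second argument relies on the POSIX.1-2008 behavior of realpath() allocating the result buffer.

```c
#define _XOPEN_SOURCE 700  /* expose realpath in strict -std= modes */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Returns true if realpath() resolves both paths to the same string.
   Returns false if either path cannot be resolved (e.g. does not exist). */
static bool realpath_equal(const char *p1, const char *p2) {
    assert(p1 != NULL && p2 != NULL);
    char *r1 = realpath(p1, NULL);  /* malloc'ed result, or NULL on error */
    char *r2 = realpath(p2, NULL);
    bool equal = r1 != NULL && r2 != NULL && strcmp(r1, r2) == 0;
    free(r1);  /* free(NULL) is a no-op */
    free(r2);
    return equal;
}
```

This handles symlinks as well as "." and ".." components, but because the result is still compared as a string, it inherits the volume group problem described above.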

Subjunction answered 6/4, 2021 at 12:12 Comment(3)
Yancey: Does the first method work with APFS firmlinks? My guess is that it wouldn't.
Subjunction: @Yancey I honestly don't know. Let us know if you've checked it.
Yancey: In the case of realpath(), both Rust and Zig call it, and Rust doesn't resolve firmlinks but Zig does. I can only guess that they link to different libcs. I plan to experiment further.
