Native path separator bug in C++17 std::filesystem::path?
G

3

20

I encountered a problem when upgrading from #include <experimental/filesystem> to #include <filesystem>. It seems that the std::filesystem::path::wstring method is not returning the same string as in experimental::filesystem. I wrote the following small test program with output result included.

#include <iostream>
#include <filesystem>
#include <experimental/filesystem>

namespace fs = std::filesystem;
namespace ex = std::experimental::filesystem;
using namespace std;

int main()
{
    fs::path p1{ L"C:\\temp/foo" };    
    wcout << "std::filesystem Native: " << p1.wstring() << "  Generic: " << p1.generic_wstring() << endl;

    ex::path p2{ L"C:\\temp/foo" };
    wcout << "std::experimental::filesystem Native: " << p2.wstring() << "  Generic: " << p2.generic_wstring() << endl;
}

/* Output:
std::filesystem Native: C:\temp/foo  Generic: C:/temp/foo
std::experimental::filesystem Native: C:\temp\foo  Generic: C:/temp/foo
*/

According to https://en.cppreference.com/w/cpp/filesystem/path/string:

Return value

The internal pathname in native pathname format, converted to specified string type.

The program ran on Windows 10 and was compiled with Visual Studio 2017 version 15.8.0. I would expect the native pathname to be C:\temp\foo.

Question: Is this a bug in std::filesystem::path?

Grizelda answered 16/8, 2018 at 22:22 Comment(9)
The C Runtime APIs on Windows are "flippy" in that they accept ` \ ` or ` / ` as separators. The Win32 APIs are a little less forgiving, especially if you use Long UNC, and require ` \ ` to be used.Konstanze
Is std::filesystem even supposed to support paths with forward slashes on Windows? Can you provide an example in which the actual paths only have backslashes?Bonita
In Visual Studio, <filesystem> is Microsoft specific as stated here.Ifni
Part of the point of std::experimental is that you're supposed to expect the final version to behave differently and be willing to live with the final differences.Pryor
Whether or not this is a "bug", you can call path::make_preferred() to convert any foreslashes to backslashes.Orff
It's not a bug in the standard -- the question is whether it's a bug in MSVC's implementationBaronetage
@Ifni The linked post is from a time where std::filesystem was not finalized (while there is no updated post yet, as far as I'm aware of). When compiling in VS15.8 with std=c++17, the included header states // filesystem standard header.Deliver
(To expand on what @Konstanze said, MS uses \ as a path separator and / as a command-line option prefix because most of the original MS-DOS utilities came from IBM, which used / for command-line switches, and MS-DOS 1.0 (as PC-DOS 1.0) didn't support directories. In MS-DOS 2.0 & later, they kept / for backwards compatibility, and introduced \ for paths because it was visually similar to the Unix /; I think 2.0 also accepted \ as a path separator right from the start, but I'm not sure. Either way, whenever it was added, it was mainly for consistency with Unix syntax IIRC.)Brackett
/ path separator is pretty normal for filesystem paths in Windows. At some point they even started supporting Tab-completion for /-separated paths in cmd.exe.Ascent
D
8

No, it is not a bug!

string() et al. and c_str()/native() return the internal pathname in native pathname format.

What native means

MS states that it uses ISO/IEC TS 18822:2015. The final draft defines the native pathname format in §4.11 as follows:

The operating system dependent pathname format accepted by the host operating system.

On Windows, native() returns the path as a std::wstring.

How to force the usage of backslashes as directory separator in Windows

The standard defines the term preferred-separator (see also §8.1 (pathname format grammar)):

An operating system dependent directory separator character.

A path can be converted (in place) to the preferred-separator with path::make_preferred. On Windows, it is declared noexcept.

Why you shouldn't worry

The MS documentation about paths states about the usage of / vs \

File I/O functions in the Windows API convert "/" to "\" as part of converting the name to an NT-style name, except when using the "\\?\" prefix as detailed in the following sections.

and in the documentation about C++ file navigation, the slash (known as fallback-separator in newer drafts) is even used directly after the root-name:

path pathToDisplay(L"C:/FileSystemTest/SubDir3/SubDirLevel2/File2.txt ");

Example for VS2017 15.8 with /std:c++17:

#include <filesystem>
#include <iostream>
namespace fs = std::filesystem;

void output(const std::string& type, fs::path& p)
{
    std::cout
        << type << ":\n"
        << "- native: " << p.string() << "\n"
        << "- generic: " << p.generic_string() << "\n"
        << "- preferred-separator" << p.make_preferred() << "\n";
}

int main()
{
    fs::path local_win_path("c:/dir/file.ext");
    fs::path unc_path("//your-remote/dir/file.ext");

    output("local absolute win path", local_win_path);
    output("unc path", unc_path);

    unc_path = "//your-remote/dir/file.ext"; // Overwrite make_preferred applied above.
    if (fs::is_regular_file(unc_path))
    {
        std::cout << "UNC path containing // was understood by Windows std filesystem";
    }
}

Possible output (when unc_path is an existing file on an existing remote):

local absolute win path:
- native: c:/dir/file.ext
- generic: c:/dir/file.ext
- preferred-separator"c:\\dir\\file.ext"
unc path:
- native: //your-remote/dir/file.ext
- generic: //your-remote/dir/file.ext
- preferred-separator"\\\\your-remote\\dir\\file.ext"
UNC path containing // was understood by Windows std filesystem

So explicit path transformations to the preferred-separator should only be necessary when working with libraries that enforce the usage of that separator for their file system interaction.

Deliver answered 4/9, 2018 at 10:47 Comment(0)
H
12

Roughly, a bug in a compiler happens when it exhibits behavior that is forbidden by the standard (either explicitly or implicitly), or behavior that diverges from the documentation of said compiler.

The standard imposes no restrictions on the format of native path strings, except that the format should be accepted by the underlying operating system (quote below). How could it impose such restrictions? The language has no say in how paths are handled by the host OS, and to do it confidently it would have to know every single target it may be compiled to, which is clearly not feasible.

[fs.class.path]

5   A pathname is a character string that represents the name of a path. Pathnames are formatted according to the generic pathname format grammar ([fs.path.generic]) or according to an operating system dependent native pathname format accepted by the host operating system.

(Emphasis mine)

The documentation of MSVC implies that the forward slash is perfectly acceptable as a separator:

Common to both systems is the structure imposed on a pathname once you get past the root name. For the pathname c:/abc/xyz/def.ext:

  • The root name is c:.
  • The root directory is /.
  • The root path is c:/.
  • The relative path is abc/xyz/def.ext.
  • The parent path is c:/abc/xyz.
  • The filename is def.ext.
  • The stem is def.
  • The extension is .ext.

It does mention a preferred separator, but this really only concerns the behavior of path::make_preferred, not the default path output:

A minor difference is the preferred separator, between the sequence of directories in a pathname. Both operating systems let you write a forward slash /, but in some contexts Windows prefers a backslash \.

The question of whether this is a bug, then, is easy: Since the standard imposes no restrictions on the behavior, and the compiler's documentation implies no mandatory need for a backward slash, there can be no bug.

What remains is the question of whether this is a quality-of-implementation issue. After all, compiler and library implementers are expected to know the quirks of their target and implement features accordingly.

It's up for debate which slash ('\' or '/') you should use on Windows, or whether it really matters at all, so there can be no authoritative answer. Any answer that advocates for one or the other must be careful not to be too opinion-based. Also, the mere existence of path::make_preferred indicates that the native path is not necessarily the preferred one. Consider the zero-overhead principle: Making the path always be the preferred one would incur an overhead on the people who don't need to be that pedantic when handling paths.

Finally, the std::experimental namespace is what it says on the box: You shouldn't expect the final standardized library to behave the same as its experimental version, or even expect that a final standardized library will exist at all. It's just the way it is, when dealing with experimental stuff.

Hospice answered 31/8, 2018 at 20:33 Comment(3)
There's a lot more ways to exhibit bugs than you imply. Misleading diagnostics, failure to optimise, and other such things are bugs despite not being standard conformance issues. Not following the path of least possible surprise is as buggy a bug as bugs get.Circumfuse
@n.m. All of those are QoI issues. Sure, they might be (rightly) reported in a bug tracker to be fixed, but, it is debatable whether they are actually bugs, since they don't really cause wrong behavior.Hospice
@n.m. Having said that, I improved the post by adding specific information about MSVC. I hope you find that it's better now.Hospice
M
5

Either one of those could be considered "native" on the platform, so either one of those options is equally valid. The Filesystem API makes no guarantees that the "native" version will be identical to the string you gave it, regardless of platform. Nor is there a guarantee that the "native" string will only use the native directory separator if the generic "/" character is equivalent to it.

Mathilda answered 17/8, 2018 at 2:13 Comment(2)
Many, but not all Windows APIs accept "/". For example, the shell path handling functions only accept "\".Orff
@zett42: Yes, but it's pretty fuzzy whether those are really part of the OS. Really, they come with a utility program (the explorer.exe shell) that ships with the OS, but if you change the shell, it doesn't cease to be Windows.Stopover
