Can filesystem::canonical be used to prevent filepath injection for filepaths passed to fstream?

I have a public folder pub with subfolders and files in it. A user gives me a relative filepath; I perform some mappings, read the file with fstream, and return it to the user.

The problem is that the user could give me a path such as ../fileXY.txt, or something more elaborate that uses path traversal or another form of filepath injection. fstream will simply accept it and read files outside of my public pub folder, or even worse, give the user a list of all files on my system.

Before reinventing the wheel, I searched the filesystem library and found the std::filesystem::canonical function, along with quite a bit of discussion about the normal form. My general question is: can this function, or the variant std::filesystem::weakly_canonical, be used to prevent these types of vulnerabilities? Basically, is it enough?

Further, my system's filesystem library is still in experimental mode, and std::filesystem::weakly_canonical is missing. I cannot use canonical because it requires that the file exists. In my case I have certain mappings, so the files don't exist in that sense. I would therefore need to mimic the weakly_canonical function, but how?

I have seen a related Stack Overflow question on realpath for nonexistent paths, where the suggestion was to apply canonical to the part of the path that exists and then append the nonexistent part to it, but that is again vulnerable to these types of injection. So do I have to roll my own weakly_canonical, or can I somehow mimic it by combining some std::experimental::filesystem functions?
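For illustration, here is a minimal sketch of the "canonicalize the existing prefix, then append the rest" idea mentioned above, with the extra step of applying "." and ".." in the nonexistent remainder by hand so they do not survive into the result. The name weakly_canonical_sketch is hypothetical, and the namespace alias assumes std::filesystem; on an older toolchain it would presumably be pointed at std::experimental::filesystem instead. It is not a drop-in replacement for the standard function, and on its own it does not check that the result stays inside a given folder.

```cpp
#include <cstddef>
#include <filesystem>
#include <vector>

// Assumption: swap this for std::experimental::filesystem on older toolchains.
namespace fs = std::filesystem;

// Hypothetical helper: resolve the longest existing prefix with canonical(),
// then apply the remaining (nonexistent) components lexically.
fs::path weakly_canonical_sketch(const fs::path& input) {
    const fs::path abs = fs::absolute(input);

    // Split into components so they can be consumed one at a time.
    const std::vector<fs::path> parts(abs.begin(), abs.end());

    // Extend the prefix for as long as it still names something on disk.
    fs::path existing;
    std::size_t i = 0;
    for (; i < parts.size(); ++i) {
        const fs::path candidate = existing / parts[i];
        if (!fs::exists(candidate)) break;  // throwing overload, kept for brevity
        existing = candidate;
    }

    // canonical() resolves symlinks, "." and ".." in the existing part.
    fs::path result = existing.empty() ? fs::path() : fs::canonical(existing);

    // The remainder does not exist, so "." and ".." are handled by hand.
    for (; i < parts.size(); ++i) {
        const std::string name = parts[i].string();
        if (name.empty() || name == ".") continue;
        if (name == "..") {
            result = result.parent_path();
            continue;
        }
        result /= parts[i];
    }
    return result;
}
```

With a temporary directory tmp that exists and a subfolder missing that does not, weakly_canonical_sketch(tmp / "missing" / ".." / "file.txt") comes out as tmp / "file.txt". The caller would still have to verify that the returned path lies under the public folder.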

Febrific answered 10/4, 2019 at 10:40 Comment(11)
Even if the user gives you such a path you should have no problems because the security should be per user and checked by the OS. If the user intentionally gives you such a path and has write access to that file, they could do it anyway without your app.Astaire
@MichaelChourdakis I am not sure I understand this correctly. Just to clarify: the user gives me a relative path, which is relative to my system (a server application), and he is reading files not on his own system but on the server's system. He should not be allowed to read files outside of the public folder.Febrific
for a server application, you should not allow the user to refer to a local file anyway. Why do you want to get such input from the user?Astaire
@MichaelChourdakis E.g. if you want to manage user-uploaded files, you have to give users (download) access to these uploaded files. I don't know how to manage a server application differently, considering that a simple HTTP GET request is requesting a certain local file, no?Febrific
You should implement it using a database. This database will store a primary key ID of the user's upload, and a user-specified filename which will be only there for display purposes. When you present the list to the user to download, you will present the filenames he has specified, but use the ID to read the data from your database, not by using the filename.Astaire
@MichaelChourdakis I have this mapping for user files, though I don't know why I didn't just do the same for this public folder. I should probably just do the same. On the other hand, I would avoid some SQL calls if I just checked whether the files exist and then returned them. Thank you for noting this. This is going to be at least an alternative.Febrific
No, using SQL is way more preferable otherwise you risk security issues too severe for a web app. Besides, the user might input a filename incompatible with your filesystemsAstaire
@MichaelChourdakis But that is actually the question here. So if std::filesystem::canonical would prevent these security issues.Febrific
@MichaelChourdakis I am sorry for being annoying, but can you give me a reference which confirms that database lookups are always the way to go, even if the nature of the problem is a real folder?Febrific
When designing a server program, you want as much abstraction as possible. The users should never be able to access real files on the disk directly. Take Google Drive, for example: it's an ID that you use to access a file, not the file's name, let alone the physical location of the data. There are various references, for example this, but ultimately you should do what all other applications do.Astaire
Thank you very much. Thank you for your time!Febrific

Short answer: no.

Long answer: this is modeled after POSIX realpath.

I understand the source of confusion. From realpath

The realpath() function shall derive, from the pathname pointed to by file_name, an absolute pathname that resolves to the same directory entry, whose resolution does not involve '.', '..

From the cppreference page on path you can also see that the double dot is removed. However, the path still points to the same file; only the redundant elements are removed.
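As a small illustration of that point, the sketch below normalizes a path purely lexically with std::filesystem::path::lexically_normal (C++17, so not available in the experimental library). The helper name normalized and the /srv/pub example paths are made up for this example.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper: purely lexical normalization, no filesystem access.
// ".." and "." are folded away, but nothing keeps the result inside a base folder.
std::string normalized(const std::string& raw) {
    return std::filesystem::path(raw).lexically_normal().generic_string();
}
```

Here normalized("/srv/pub/../etc/passwd") yields /srv/etc/passwd: the double dot is gone, yet the path now points outside /srv/pub. Normalization alone therefore only tidies the path; it does not decide whether the path is allowed.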

If you are processing values from a db/webapp/whatever where your program has different privileges than the user who supplied the path, you need to sanitize the filename first by escaping double dots. Single dots are fine.

Perhaps you can use a regex to escape double dots with a backslash thus rendering them ineffective.

#include <filesystem>
#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string bad = "../bad/../other";
    std::filesystem::path p(bad);

    // The double dots are resolved, so the result lies outside the current directory.
    std::cout << std::filesystem::weakly_canonical(p) << std::endl;

    // Prefix each dot of every ".." with a backslash so it is no longer a parent reference.
    std::regex r(R"(\.\.)");
    p = std::regex_replace(bad, r, "\\.\\.");
    std::cout << std::filesystem::weakly_canonical(p) << std::endl;
}

Output

"/tmp/other"

"/tmp/1554895428.8689194/\.\./bad/\.\./other"

Appolonia answered 10/4, 2019 at 11:25 Comment(5)
Thank you for the answer. Once these redundant elements are removed, is the resulting path guaranteed to be harmless against these attacks (not just the dot-dot attack; I am no expert in filepath injection, but I think there are many more, like slash-slash and others)? So, e.g., if I have a relative path and apply realpath or canonical, does the resulting path guarantee that there is no directory traversal? In other words, is the file guaranteed to be inside all the subdirectories, separated by slashes, that appear in the resulting path?Febrific
If you apply canonical first, the double dots get converted to the directory names and the dots get removed altogether. It depends on the operation you are performing and what path names you allow, such as ones with white space, which require quotes. I am pretty sure you can find something that does this with some googling.Appolonia
@czorp look into the execve family of functions, they do not use a shell and thus escaping doesn't matter, as if you just call main directlyAppolonia
Thanks, I'll have a look at them :DFebrific
@czorp well if this entry answers your question and you are satisfied by the information you gained, perhaps you might want to select an answerAppolonia

I can see how you could employ weakly_canonical() to prevent path traversal - similar to what is described here - by checking that the result is prefixed with your base path. E.g.

#include <iostream>
#include <filesystem>
#include <optional>

// Returns the canonical form of basepath/relpath if the canonical form
// is under basepath, otherwise returns std::nullopt.
// Note that one would probably require that basepath is sanitized, 
// safe for use in this context and absolute.
// Thanks to https://portswigger.net/web-security/file-path-traversal 
// for the basic idea.
std::optional<std::filesystem::path> abspath_no_traversal(
        const std::filesystem::path & basepath,
        const std::filesystem::path & relpath) {

    const auto abspath = std::filesystem::weakly_canonical(basepath / relpath);

    // thanks to https://mcmap.net/q/86046/-how-do-i-check-if-a-c-std-string-starts-with-a-certain-string-and-convert-a-substring-to-an-int
    const auto index = abspath.string().rfind(basepath.string(), 0);
    if (index != 0) {
        return std::nullopt;
    }
    return abspath;
}

Since I am no security expert, I welcome any corrections.
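One possible refinement, offered as a sketch and not as a vetted security control: a plain string prefix test also accepts sibling folders whose names merely start with the base name (with a base of /srv/pub, a result such as /srv/pub-private/x.txt would pass). Comparing whole path components avoids that. The function name resolve_under is hypothetical, and the sketch assumes the base folder exists so that it can be passed through canonical().

```cpp
#include <algorithm>
#include <filesystem>
#include <optional>

namespace fs = std::filesystem;

// Hypothetical variant of the check above: returns the resolved path only if
// every component of the canonical base is a leading component of the result.
std::optional<fs::path> resolve_under(const fs::path& basepath,
                                      const fs::path& relpath) {
    // Assumption: basepath exists; canonical() throws otherwise.
    const fs::path base = fs::canonical(basepath);
    const fs::path full = fs::weakly_canonical(base / relpath);

    // Walk both paths component by component and stop at the first difference.
    const auto diff = std::mismatch(base.begin(), base.end(),
                                    full.begin(), full.end());

    // If the base was not consumed completely, full is not underneath it.
    if (diff.first != base.end()) {
        return std::nullopt;
    }
    return full;
}
```

With folders pub and pub-private side by side, resolve_under(pub, "../pub-private/x.txt") returns std::nullopt, while resolve_under(pub, "a/../b.txt") returns the path to b.txt inside pub. The check is still only as good as the moment it runs: if the directory tree can change between the check and the later fstream open, the result may no longer hold.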

Gelignite answered 11/9, 2020 at 0:11 Comment(0)
