How can I calculate the complete buffer size for GetModuleFileName?
H

9

15

GetModuleFileName() takes a buffer and the size of that buffer as input; however, its return value only tells us how many characters it has copied, and whether the buffer was too small (ERROR_INSUFFICIENT_BUFFER).

How do I determine the real buffer size required to hold the entire file name for GetModuleFileName()?

Most people use MAX_PATH but I remember the path can exceed that (260 by default definition)...

(The trick of using zero as size of buffer does not work for this API - I've already tried before)

Histoplasmosis answered 30/4, 2009 at 7:45 Comment(1)
Looks like there are some gotchas with this function, have a look at the qt src: qt.gitorious.org/qt/qt/commit/… - Brana
G
11

Implement some reasonable strategy for growing the buffer: start with MAX_PATH, then make each successive size 1.5 times (or 2 times, for fewer iterations) bigger than the previous one. Iterate until the function succeeds.
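A minimal sketch of that growth loop, written against a stand-in callable so it can be tried without the Windows headers. The names `queryWithGrowingBuffer` and `QueryFn` are made up for illustration; the callable is assumed to follow the contract quoted in the comments on this answer (0 on failure, the buffer size on truncation, otherwise the number of characters copied).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// QueryFn is a hypothetical stand-in with the same contract as
// GetModuleFileNameW(NULL, buf, size): returns 0 on failure, `size` when the
// name was truncated, otherwise the length copied (terminator excluded).
template <typename QueryFn>
std::wstring queryWithGrowingBuffer(QueryFn query,
                                    std::size_t initialChars = 260,  // MAX_PATH
                                    std::size_t maxChars = 32768)
{
    assert(initialChars > 0 && initialChars <= maxChars);
    std::size_t size = initialChars;
    for (;;) {
        std::wstring buf(size, L'\0');
        const unsigned long copied =
            query(&buf[0], static_cast<unsigned long>(size));
        if (copied == 0) return std::wstring();       // real failure: give up
        if (copied < size) {                          // whole name fitted
            buf.resize(copied);
            return buf;
        }
        if (size >= maxChars) return std::wstring();  // truncated at the cap
        size = std::min(size * 2, maxChars);          // truncated: grow, retry
    }
}
```

On Windows the callable would presumably be something like `[](wchar_t* b, unsigned long n) { return GetModuleFileNameW(NULL, b, n); }`. The loop looks only at the return value, so it does not depend on what GetLastError() reports.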

Godden answered 30/4, 2009 at 9:48 Comment(4)
If the buffer is too small to hold the module name, the string is truncated to nSize characters including the terminating null character, the function returns nSize, and the function sets the last error to ERROR_INSUFFICIENT_BUFFER. Windows XP: If the buffer is too small to hold the module name, the function returns nSize. The last error code remains ERROR_SUCCESS. If nSize is zero, the return value is zero and the last error code is ERROR_SUCCESS. - Shirlyshiroma
Note that on XP, the buffer is not null-terminated if truncation occurs. But either way, the function returns nSize on all Windows versions if the buffer is too small, so you can usually ignore the error code and the current buffer content, just grow the buffer and increment nSize, and try again, repeating until it stops returning nSize. - Tensive
@Remy Lebeau: What if the function fails for some other reason than insufficient buffer length? He may end up with insufficient RAM amount. - Madiemadigan
@Madiemadigan if any error other than ERROR_INSUFFICIENT_BUFFER occurs, the return value will be 0 instead of nSize. I was suggesting a loop ONLY when the return value is nSize. - Tensive
A
14

The usual recipe is to call it setting the size to zero and it is guaranteed to fail and provide the size needed to allocate sufficient buffer. Allocate a buffer (don't forget room for nul-termination) and call it a second time.

In a lot of cases MAX_PATH is sufficient because many of the file systems restrict the total length of a path name. However, it is possible to construct legal and useful file names that exceed MAX_PATH, so it is probably good advice to query for the required buffer.

Don't forget to eventually return the buffer from the allocator that provided it.

Edit: Francis points out in a comment that the usual recipe doesn't work for GetModuleFileName(). Unfortunately, Francis is absolutely right on that point, and my only excuse is that I didn't go look it up to verify before providing a "usual" solution.

I don't know what the author of that API was thinking, except that it is possible that when it was introduced, MAX_PATH really was the largest possible path, making the correct recipe easy. Simply do all file name manipulation in a buffer of length no less than MAX_PATH characters.

Oh, yeah, don't forget that path names since 1995 or so allow Unicode characters. Because Unicode takes more room, any path name can be preceded by \\?\ to explicitly request that the MAX_PATH restriction on its byte length be dropped for that name. This complicates the question.

MSDN has this to say about path length in the article titled File Names, Paths, and Namespaces:

Maximum Path Length

In the Windows API (with some exceptions discussed in the following paragraphs), the maximum length for a path is MAX_PATH, which is defined as 260 characters. A local path is structured in the following order: drive letter, colon, backslash, components separated by backslashes, and a terminating null character. For example, the maximum path on drive D is "D:\<some 256 character path string><NUL>" where "<NUL>" represents the invisible terminating null character for the current system codepage. (The characters < > are used here for visual clarity and cannot be part of a valid path string.)

Note File I/O functions in the Windows API convert "/" to "\" as part of converting the name to an NT-style name, except when using the "\\?\" prefix as detailed in the following sections.

The Windows API has many functions that also have Unicode versions to permit an extended-length path for a maximum total path length of 32,767 characters. This type of path is composed of components separated by backslashes, each up to the value returned in the lpMaximumComponentLength parameter of the GetVolumeInformation function. To specify an extended-length path, use the "\\?\" prefix. For example, "\\?\D:\<very long path>". (The characters < > are used here for visual clarity and cannot be part of a valid path string.)

Note The maximum path of 32,767 characters is approximate, because the "\\?\" prefix may be expanded to a longer string by the system at run time, and this expansion applies to the total length.

The "\\?\" prefix can also be used with paths constructed according to the universal naming convention (UNC). To specify such a path using UNC, use the "\\?\UNC\" prefix. For example, "\\?\UNC\server\share", where "server" is the name of the machine and "share" is the name of the shared folder. These prefixes are not used as part of the path itself. They indicate that the path should be passed to the system with minimal modification, which means that you cannot use forward slashes to represent path separators, or a period to represent the current directory. Also, you cannot use the "\\?\" prefix with a relative path, therefore relative paths are limited to MAX_PATH characters as previously stated for paths not using the "\\?\" prefix.

When using an API to create a directory, the specified path cannot be so long that you cannot append an 8.3 file name (that is, the directory name cannot exceed MAX_PATH minus 12).

The shell and the file system have different requirements. It is possible to create a path with the Windows API that the shell user interface might not be able to handle.

So an easy answer would be to allocate a buffer of size MAX_PATH, retrieve the name and check for errors. If it fit, you are done. Otherwise, if it begins with "\\?\", get a buffer of size 64KB or so (the phrase "maximum path of 32,767 characters is approximate" above is a tad troubling here so I'm leaving some details for further study) and try again.
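A rough sketch of that two-step idea, again against a stand-in callable rather than the real API. `queryTwoStep` and `QueryFn` are invented names. The callable is assumed to behave like GetModuleFileNameW(NULL, buf, size), returning 0 on failure and the buffer size on truncation. The sketch retries whenever the first call is truncated and does not inspect the "\\?\" prefix.

```cpp
#include <cassert>
#include <string>
#include <vector>

// QueryFn is a hypothetical stand-in for GetModuleFileNameW(NULL, buf, size).
template <typename QueryFn>
std::wstring queryTwoStep(QueryFn query)
{
    const unsigned long kMaxPath  = 260;    // MAX_PATH
    const unsigned long kBigChars = 32768;  // 64 KB where wchar_t is 2 bytes

    // Step 1: the common case, a MAX_PATH buffer on the stack.
    wchar_t stackBuf[kMaxPath];
    unsigned long n = query(stackBuf, kMaxPath);
    assert(n <= kMaxPath);                  // contract: never above the size
    if (n == 0) return std::wstring();      // real failure
    if (n < kMaxPath) return std::wstring(stackBuf, n);

    // Step 2: truncated, so retry once with the large buffer.
    std::vector<wchar_t> big(kBigChars);
    n = query(big.data(), kBigChars);
    if (n == 0 || n >= kBigChars) return std::wstring();  // failed or too long
    return std::wstring(big.data(), n);
}
```

Compared with a doubling loop this makes at most two calls, at the cost of one large allocation whenever the short buffer is not enough.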

Overflowing MAX_PATH but not beginning with "\\?\" appears to be a "can't happen" case. Again, what to do then is a detail you'll have to deal with.

There may also be some confusion over what the path length limit is for a network name which begins "\\Server\Share\", not to mention names from the kernel object name space which begin with "\\.\". The above article does not say, and I'm not certain about whether this API could return such a path.

Alvaalvan answered 30/4, 2009 at 7:51 Comment(2)
Some APIs (like MultibyteToWideChar) allow zero-size input, but GetModuleFileName does not - its return value is "how many bytes copied". So a size of zero always returns zero. This does not work. - Histoplasmosis
Ouch. You're right about that, and I've edited my answer to reflect what I would likely try to do to work around the issue. Since a path can reach "approximately 32,000 characters" and may be Unicode, guessing the buffer size well is not trivial. - Alvaalvan
T
3

Using

extern char* _pgmptr

might work.

From the documentation of GetModuleFileName:

The global variable _pgmptr is automatically initialized to the full path of the executable file, and can be used to retrieve the full path name of an executable file.

But if I read about _pgmptr:

When a program is not run from the command line, _pgmptr might be initialized to the program name (the file's base name without the file name extension) or to a file name, relative path, or full path.

Does anyone know how _pgmptr is initialized? If SO had support for follow-up questions, I would have posted this as a follow-up.

Tenderhearted answered 27/7, 2013 at 14:59 Comment(0)
P
3

While the API is proof of bad design, the solution is actually very simple. Simple, yet sad that it has to be this way, since it is something of a performance hog: it might require multiple memory allocations. Here are some key points of the solution:

  • You can't really rely on the return value between different Windows-versions as it can have different semantics on different Windows-versions (XP for example).

  • If the supplied buffer is too small to hold the string, the return value is the amount of characters including the 0-terminator.

  • If the supplied buffer is large enough to hold the string, the return value is the amount of characters excluding the 0-terminator.

This means that if the returned value exactly equals the buffer size, you still don't know whether it succeeded or not. There might be more data. Or not. In the end you can only be certain of success if the buffer length is actually greater than required. Sadly...
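The return-value rules above can be played through with a toy model. This is not the real API, only a sketch of the post-XP contract as quoted elsewhere in this thread, and the names `modelGetName` and `fitted` are invented. It shows that a name of n characters yields the same return value n for a buffer of n and a buffer of n + 1, so only a comparison against the buffer size tells the two apart.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Toy model (NOT the real API): copy `name` into buf[0..size). On truncation
// return `size`; otherwise return the length without the terminator.
inline unsigned long modelGetName(const std::wstring& name,
                                  wchar_t* buf, unsigned long size)
{
    assert(buf != nullptr);
    if (size == 0) return 0;
    if (name.size() < size) {                  // room for text + terminator
        std::copy(name.begin(), name.end(), buf);
        buf[name.size()] = L'\0';
        return static_cast<unsigned long>(name.size());
    }
    std::copy(name.begin(), name.begin() + (size - 1), buf);  // truncate
    buf[size - 1] = L'\0';
    return size;
}

// The check used by the loop described in this answer: success only when the
// return value is non-zero and strictly smaller than the buffer size.
inline bool fitted(unsigned long returned, unsigned long size)
{
    return returned != 0 && returned < size;
}
```

For a 10-character name, `modelGetName` returns 10 for both a 10-character and an 11-character buffer, but `fitted(10, 10)` is false and `fitted(10, 11)` is true.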

So, the solution is to start off with a small buffer. We then call GetModuleFileName passing the exact buffer length (in TCHARs) and comparing the return result with it. If the return result is less than our buffer length, it succeeded. If the return result is greater than or equal to our buffer length, we have to try again with a larger buffer. Rinse and repeat until done. When done we make a string copy (strdup/wcsdup/tcsdup) of the buffer, clean up, and return the string copy. This string will have the right allocation size rather than the likely overhead from our temporary buffer. Note that the caller is responsible for freeing the returned string (strdup/wcsdup/tcsdup mallocs memory).

See below for an implementation and usage code example. I have been using this code for over a decade now, including in enterprise document management software where there can be a lot of quite long paths. The code can of course be optimized in various ways, for example by first loading the returned string into a local buffer (TCHAR buf[256]). If that buffer is too small, you can then start the dynamic allocation loop. Other optimizations are possible, but that's beyond the scope here.

Implementation and usage example:

/* Ensure Win32 API Unicode setting is in sync with CRT Unicode setting */
#if defined(_UNICODE) && !defined(UNICODE)
#   define UNICODE
#elif defined(UNICODE) && !defined(_UNICODE)
#   define _UNICODE
#endif

#include <stdio.h>  /* not needed for our function, just for printf */
#include <stdlib.h> /* malloc, free */
#include <tchar.h>
#include <windows.h>

LPCTSTR GetMainModulePath(void)
{
    TCHAR* buf    = NULL;
    DWORD  bufLen = 256;
    DWORD  retLen;

    while (32768 >= bufLen)
    {
        if (!(buf = (TCHAR*)malloc(sizeof(TCHAR) * (size_t)bufLen)))
        {
            /* Insufficient memory */
            return NULL;
        }

        if (!(retLen = GetModuleFileName(NULL, buf, bufLen)))
        {
            /* GetModuleFileName failed */
            free(buf);
            return NULL;
        }
        else if (bufLen > retLen)
        {
            /* Success */
            LPCTSTR result = _tcsdup(buf); /* Caller should free returned pointer */
            free(buf);
            return result;
        }

        free(buf);
        bufLen <<= 1;
    }

    /* Path too long */
    return NULL;
}

int main(int argc, char* argv[])
{
    LPCTSTR path;

    if (!(path = GetMainModulePath()))
    {
        /* Insufficient memory or path too long */
        return 0;
    }

    _tprintf(_T("%s\n"), path);

    free((void*)path); /* GetMainModulePath malloced memory using _tcsdup */

    return 0;
}

Having said all that, I'd like to point out that you need to be very aware of various other caveats with GetModuleFileName(Ex). There are varying issues between 32/64-bit/WOW64. Also, the output is not necessarily a full, long path; it could very well be a short file name or be subject to path aliasing. I expect that when you use such a function the goal is to provide the caller with a usable, reliable, full, long path. I therefore suggest making sure you return a usable, reliable, full, long absolute path in a way that is portable between the various Windows versions and architectures (again 32/64-bit/WOW64). How to do that efficiently is beyond the scope here.

While this is one of the worst Win32 APIs in existence, I wish you a lot of coding joy nonetheless.

Political answered 20/1, 2015 at 22:5 Comment(0)
C
2

My example is a concrete implementation of the "if at first you don't succeed, double the length of the buffer" approach. It retrieves the path of the executable that is running, using a string (actually a wstring, since I want to be able to handle Unicode) as the buffer. To determine when it has successfully retrieved the full path, it checks the value returned from GetModuleFileNameW against the value returned by wstring::length(), then uses that value to resize the final string in order to strip the extra null characters. If it fails, it returns an empty string.

#include <string>
#include <windows.h>

inline std::wstring getPathToExecutableW()
{
    static const size_t INITIAL_BUFFER_SIZE = MAX_PATH;
    static const size_t MAX_ITERATIONS = 7;
    std::wstring ret;
    DWORD bufferSize = INITIAL_BUFFER_SIZE;
    for (size_t iterations = 0; iterations < MAX_ITERATIONS; ++iterations)
    {
        ret.resize(bufferSize);
        DWORD charsReturned = GetModuleFileNameW(NULL, &ret[0], bufferSize);
        if (charsReturned < ret.length())
        {
            ret.resize(charsReturned);
            return ret;
        }
        else
        {
            bufferSize *= 2;
        }
    }
    return L"";
}
Cinematograph answered 20/10, 2017 at 21:45 Comment(0)
U
0

Here is another solution with std::wstring:

DWORD getCurrentProcessBinaryFile(std::wstring& outPath)
{
    // @see https://msdn.microsoft.com/en-us/magazine/mt238407.aspx
    DWORD dwError  = 0;
    DWORD dwResult = 0;
    DWORD dwSize   = MAX_PATH;

    SetLastError(0);
    while (dwSize <= 32768) {
        outPath.resize(dwSize);

        dwResult = GetModuleFileName(0, &outPath[0], dwSize);
        dwError  = GetLastError();

        /* if function has failed there is nothing we can do */
        if (0 == dwResult) {
            return dwError;
        }

        /* check if buffer was too small and string was truncated */
        if (ERROR_INSUFFICIENT_BUFFER == dwError) {
            dwSize *= 2;
            dwError = 0;

            continue;
        }

        /* finally we received the result string */
        outPath.resize(dwResult);

        return 0;
    }

    return ERROR_BUFFER_OVERFLOW;
}
Upheave answered 19/7, 2019 at 14:4 Comment(1)
Note that calling GetLastError() if GetModuleFileName() succeeds (when the return value is > 0 and < dwSize) is undefined behavior. A valid error code is returned only if GetModuleFileName() fails (when the return value is 0 or dwSize). Also, this code should be calling GetModuleFileNameW() explicitly, rather than relying on the project configuration making GetModuleFileName() map to GetModuleFileNameW() in the preprocessor stage. - Tensive
D
0

Here's an implementation in Free Pascal (FPC)/Delphi in case anyone needs it:

function GetExecutablePath(): UnicodeString;
const
  MAX_CHARS = 65536;
var
  NumChars, BufSize, CharsCopied: DWORD;
  pName: PWideChar;
begin
  // Poorly designed API...
  result := '';
  NumChars := 256;
  repeat
    BufSize := (NumChars * SizeOf(WideChar)) + SizeOf(WideChar);
    GetMem(pName, BufSize);
    CharsCopied := GetModuleFileNameW(0,  // HMODULE hModule
      pName,                              // LPWSTR  lpFilename
      NumChars);                          // DWORD   nSize
    if (CharsCopied < NumChars) and (CharsCopied <= MAX_CHARS) then
      result := UnicodeString(pName)
    else
      NumChars := NumChars * 2;
    FreeMem(pName, BufSize);
  until (CharsCopied >= MAX_CHARS) or (result <> '');
end;
Doubler answered 7/6, 2023 at 15:23 Comment(0)
Z
-1

Windows cannot properly handle paths longer than 260 characters, so just use MAX_PATH. You cannot run a program whose path is longer than MAX_PATH.

Zwiebel answered 8/4, 2015 at 6:16 Comment(7)
This is incorrect. From MSDN: The Windows API has many functions that also have Unicode versions to permit an extended-length path for a maximum total path length of 32,767 characters. - Chorography
Please tell me the way of launching my application from a directory with path longer than MAX_PATH. Windows Explorer fails to enter this directory, as well as cmd.exe. Total Commander enters this directory but fails to execute my program. "\\?\...." also fails. Also Windows + R dialog's edit box is limited to MAX_PATH characters. The question is about GetModuleFileName, not about other winapi functions. - Zwiebel
The question is about GetModuleFileName but you write about entire Windows: "Windows cannot handle properly paths longer than 260 characters". The statement is apparently incorrect. - Chorography
Also, on starting such application: How to get CreateProcess/CreateProcessW to execute a process in a path > MAX_PATH characters - Chorography
There is no solution. CreateProcess("\\?\...") also fails. - Zwiebel
The link above shows how to start these applications (it does work). Then GetModuleFileName is applicable not only to applications - it works with DLLs as well. LoadLibrary article on MSDN has a comment from someone that prefixed path worked for him (in x64 build, but still). - Chorography
BTW Your last comment on CreateProcess("\\?\...", ...) is also incorrect. It fails for 32-bit builds and requires a shortened path as a workaround. It does work well with 64-bit code. - Chorography
T
-3

My approach to this is to use argv, assuming you only want to get the filename of the running program. When you try to get the filename from a different module, the only secure way to do this without any other tricks has been described already; an implementation can be found here.

// assume argv is there and a char** array

int        nAllocCharCount = 1024;
int        nBufSize = argv[0][0] ? strlen((char *) argv[0]) : nAllocCharCount;
TCHAR *    pszCompleteFilePath = new TCHAR[nBufSize+1];

nBufSize = GetModuleFileName(NULL, (TCHAR*)pszCompleteFilePath, nBufSize);
if (!argv[0][0])
{
    // resize memory until enough is available
    while (GetLastError() == ERROR_INSUFFICIENT_BUFFER)
    {
        delete[] pszCompleteFilePath;
        nBufSize += nAllocCharCount;
        pszCompleteFilePath = new TCHAR[nBufSize+1];
        nBufSize = GetModuleFileName(NULL, (TCHAR*)pszCompleteFilePath, nBufSize);
    }

    TCHAR * pTmp = pszCompleteFilePath;
    pszCompleteFilePath = new TCHAR[nBufSize+1];
    memcpy_s((void*)pszCompleteFilePath, nBufSize*sizeof(TCHAR), pTmp, nBufSize*sizeof(TCHAR));

    delete[] pTmp;
    pTmp = NULL;
}
pszCompleteFilePath[nBufSize] = '\0';

// do work here
// variable 'pszCompleteFilePath' contains always the complete path now

// cleanup
delete[] pszCompleteFilePath;
pszCompleteFilePath = NULL;

I had no case where argv didn't contain the file path (Win32 and Win32-console application), yet. But just in case there is a fallback to a solution that has been described above. Seems a bit ugly to me, but still gets the job done.

Tequilater answered 13/1, 2015 at 0:5 Comment(9)
this code has tons of issues, including corrupting the stack. If your solution includes "you might get crash reports", you probably just shouldn't post it. - Idiot
I posted it so maybe one could figure out a solution to handle the problem. This was just a first idea. - Tequilater
I don't think you even understand what your code does. Instead of passing in a real buffer, you're passing in an arbitrary pointer in the stack. If you're comfortable trashing your stack (hint: you're not), then why bother to call it a 2nd time? You already have your data on the 1st call. - Idiot
Well, I do understand what I'm doing and it was not my final thoughts. When I first came up with the idea I didn't exactly know that the msdn documentation on this function is actually wrong (or misleading), telling that the buffer is not enough with ERROR_INSUFFICIENT_BUFFER. I found out about the problem after I tested it myself. I came up with a better solution. - Tequilater
None of your solutions provide anything that hasn't already been discussed in this thread, and again, there's no reason to call GetModuleFileName twice... I honestly think you should just drop it. - Idiot
You think so, but no one came up with using argv for this, so don't tell me it was already discussed here. - Tequilater
That's because argv is not a solution to anything. The solutions others have discussed are superior to yours, I'm afraid to say. - Idiot
How can you tell this if you don't even know what people have in mind. You could also use argv itself, but you'd possibly have to convert to unicode. And there is no "one solution", so there is no superior one - it always depends on the program. I just shared my idea which works best for me. - Tequilater
Well one of the problems with your solution is you're assuming NULL for the 1st param. And some solutions are better than others. You needlessly call the API twice. Call it once; that would be better. But still an incomplete solution. Your code basically just demonstrates how to superficially call GetModuleFileName. It does nothing to solve the buffer size problem. - Idiot
