How does Windows determine/handle the DOS short name of any given file?
5

13

I have a folder with these files:

alongfilename1.txt <--- created first
alongfilename3.txt <--- created second

When I run DIR /x in command prompt, I see these short names assigned:

ALONGF~1.TXT alongfilename1.txt
ALONGF~2.TXT alongfilename3.txt

Now, if I add another file:

alongfilename1.txt 
alongfilename2.txt <--- created third
alongfilename3.txt

I see this:

ALONGF~1.TXT alongfilename1.txt
ALONGF~3.TXT alongfilename2.txt
ALONGF~2.TXT alongfilename3.txt

Fine. It seems to be assigning the "~#" according to the date/time that I created the file. Is this correct?

Now, if I delete "alongfilename1.txt", the other two files keep their short names.

ALONGF~3.TXT alongfilename2.txt
ALONGF~2.TXT alongfilename3.txt

When will that ID (in this case, ~1) be released for use in another short name? Will it ever?

Also, is it possible that a file on my machine has a short name of X, whereas the same file has a short name of Y on another machine? I'm particularly concerned for installations whose custom actions utilize DOS short names.

Thanks, guys.

Whomever answered 27/11, 2008 at 15:32 Comment(0)
6

The short filename is created with the file. The algorithm works like this (usually, but see moocha's reply):

counter = 1
stripped_filename = strip_dots(strip_non_ascii_characters(filename))
shortfn = first_6_characters(stripped_filename)
while (file_exists(shortfn + "~" + counter + "." + extension)) {
    increment counter by 1
    if the counter gains a digit, shorten shortfn by 1 character
    /* e.g. if the counter reaches 9 and shortf~9.txt is taken, try short~10.txt next */
}
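A minimal Python sketch of the pseudocode above, for illustration only. The function name `short_name` and the `existing` set (standing in for the short names already present in the directory) are assumptions made for this example, and it also folds in the space stripping and 3-character extension mentioned in the comment below. It mirrors the simple counter scheme described here, not necessarily what any particular driver does.

```python
def short_name(long_name, existing):
    """Return an 8.3-style alias for long_name that is not in `existing`.

    `existing` is a set of upper-case short names already used in the
    directory (a stand-in for the file_exists() check in the pseudocode).
    """
    stem, dot, ext = long_name.rpartition(".")
    if not dot:  # no period at all: the whole name is the stem
        stem, ext = long_name, ""

    def clean(part):
        # Drop dots, spaces and non-ASCII characters, then upper-case.
        return "".join(c for c in part if c.isascii() and c not in ". ").upper()

    stem, ext = clean(stem), clean(ext)[:3]
    counter = 1
    while True:
        suffix = "~" + str(counter)
        # Stem plus suffix must fit in 8 characters, so a longer counter
        # leaves less room for the stem (6 chars for ~1, 5 for ~10, ...).
        candidate = stem[:8 - len(suffix)] + suffix
        if ext:
            candidate += "." + ext
        if candidate not in existing:
            return candidate
        counter += 1
```

Fed the question's three files in creation order (adding each result to `existing`), this sketch yields ALONGF~1.TXT, ALONGF~2.TXT and ALONGF~3.TXT. If ALONGF~1.TXT is then removed from the set, the next new file with the same prefix gets ALONGF~1.TXT again.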

This means that once the file is created, it will keep its short name until it's deleted.

As soon as the file is deleted, the short name may be used again.

If you move the file somewhere else, it may get a new short name. For example, say you move c:\somefilewithlongname.txt (short name "c:\somefi~1.txt") to d:\stuff\somefilewithlongname.txt. If d:\stuff already contains somefileelse.txt (short name "d:\stuff\somefi~1.txt"), the short name of the moved file will be somefi~2.txt. It seems that the short name is only persistent within a given directory on a given machine.

So: the short filenames will be generated by the filesystem, usually by the method outlined above. It is better to assume that short filenames are not persistent, as c:\longfi~1.txt on one machine might be "c:\longfilename.txt", whereas on another it might be "c:\longfish_story.txt"; also, when a file is deleted, the short name is immediately available again.
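Since the short name cannot be assumed, one option is to ask the filesystem for it at run time instead of predicting it. The sketch below does that from Python through the Win32 `GetShortPathNameW` function via `ctypes`. The helper name `get_short_path` and its fallback behavior on non-Windows systems are choices made for this example.

```python
import ctypes
import sys


def get_short_path(long_path):
    """Return the 8.3 form of an existing path as reported by Windows.

    On non-Windows platforms the path is returned unchanged (an
    assumption of this sketch). Returns None if the API call fails,
    for example because the path does not exist.
    """
    if sys.platform != "win32":
        return long_path

    get_short = ctypes.windll.kernel32.GetShortPathNameW
    get_short.argtypes = [ctypes.c_wchar_p, ctypes.c_wchar_p, ctypes.c_uint32]
    get_short.restype = ctypes.c_uint32

    size = 260  # initial buffer size in characters
    while True:
        buf = ctypes.create_unicode_buffer(size)
        needed = get_short(long_path, buf, size)
        if needed == 0:
            return None  # call failed
        if needed < size:
            return buf.value  # success: `needed` excludes the terminator
        size = needed  # buffer too small: retry with the reported size
```

An installer custom action could call something like this on the target machine rather than shipping a hard-coded short name. Note that the result may simply be the long name if the volume does not generate 8.3 aliases.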

Photovoltaic answered 27/11, 2008 at 15:43 Comment(1)
Also, spaces are stripped and the extension is shortened to 3 chars. - Specialist
9

If I were you, I would never rely on any version of any file system driver (be it Microsoft's, be it another OS's) to be consistent about the algorithm it uses to generate short file names. The exact behavior of the Microsoft Fastfat and NTFS drivers is not "officially" documented (except as somewhat high-level overviews) and is therefore not part of the API contract. What works today might not work tomorrow if you update the driver.

In addition, there is absolutely no requirement that short names contain tilde characters - see for example this post by Raymond Chen.

There's a treasure trove of info to be found about this topic in the MSDN blogs - for example:

Also, do not rely on the sole presence of alphanumerical characters. Look at the Linux VFAT driver which says, for example, that any combination of uppercase letters, digits, and the following characters is valid: $ % ' ` - @ { } ~ ! # ( ) & _ ^. NTFS will operate in compatibility mode with that...

Ihab answered 27/11, 2008 at 15:51 Comment(1)
Yes, and volumes provided by Samba will not use predictable short names. - Squirearchy
3

I believe MSDOS stores the association between the long and the short name in a per directory file.

It does not depend on the date/time.

If you move your files into a new directory, the numbering resets and the algorithm mentioned by Piskvor is applied again.

In the new directory (after a move), you will get:

ALONGF~1.TXT alongfilename1.txt
ALONGF~2.TXT alongfilename2.txt
ALONGF~3.TXT alongfilename3.txt

even though alongfilename2.txt has initially been created third.

Cutie answered 27/11, 2008 at 15:46 Comment(0)
0

This link says how NTFS does it. I would guess it's still the same idea on more recent versions.

In Windows 2000, both FAT and NTFS use the Unicode character set for their names, which contain several forbidden characters that MS-DOS cannot read. To generate a short MS-DOS-readable file name, Windows 2000 deletes all of these characters from the LFN and removes any spaces. Because an MS-DOS-readable file name can have only one period, Windows 2000 also removes all extra periods from the file name. Next, Windows 2000 truncates the file name, if necessary, to six characters and appends a tilde ( ~ ) and a number. For example, each non-duplicate file name is appended with ~1 . Duplicate file names end with ~2 , then ~3, and so on. After the file names are truncated, the file name extensions are truncated to three or fewer characters. Finally, when displaying file names at the command line, Windows 2000 translates all characters in the file name and extension to uppercase.
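The steps in the quoted description can be sketched as a single string transformation. This is only an illustration of the text above: the function name `mangle`, the `index` parameter, and the exact contents of the `FORBIDDEN` set are assumptions for the example, since the quote does not list the forbidden characters.

```python
# Characters this sketch treats as unreadable by MS-DOS (an assumption;
# the quoted text does not enumerate them).
FORBIDDEN = set('"*+,/:;<=>?[\\]|')


def mangle(long_name, index=1):
    """Apply the quoted steps to one long file name, ignoring collisions."""
    # Step 1: delete forbidden and non-ASCII characters, and remove spaces.
    kept = "".join(
        c for c in long_name
        if c.isascii() and c not in FORBIDDEN and c != " "
    )
    # Step 2: only one period may remain; keep the last one as the
    # separator between name and extension and drop the others.
    stem, dot, ext = kept.rpartition(".")
    if not dot:
        stem, ext = kept, ""
    stem = stem.replace(".", "")
    # Step 3: truncate to six characters and append a tilde and a number.
    short = stem[:6] + "~" + str(index)
    # Step 4: truncate the extension to three characters.
    if ext:
        short += "." + ext[:3]
    # Step 5: upper-case, as shown at the command line.
    return short.upper()
```

For instance, `mangle("my long.file.name.html")` gives `MYLONG~1.HTM` under these assumptions. The caller would pass `index=2`, `3`, and so on for duplicates.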

Coral answered 27/11, 2008 at 15:47 Comment(0)
0

When the files are provided by a network server which is running Samba, then the short names are generated by the server, and they do not follow a predictable pattern.

So it is not safe to assume that you can predict the form of the short name.

    G:\>dir /x *.txt

 Directory of G:\

08/25/2009  12:34 PM             1,848 S2XYYV~1.TXT strace_output.txt
03/01/2010  05:32 PM           325,428 TEY7IH~O.TXT tomcat-dump-march-1.txt
03/11/2010  12:01 AM             5,811 DI356A~S.TXT ddmget-output.txt
01/23/2009  01:03 PM           313,880 DLA94Q~K.TXT ddm-log-fn.txt
04/20/2010  07:42 PM             7,491 A50QZP~A.TXT april-20-2010.txt
Squirearchy answered 24/6, 2011 at 17:32 Comment(0)