Why is File.pathSeparatorChar a semicolon on Windows?
O

3

8

The javadoc states that File.pathSeparatorChar is:

The system-dependent path-separator character. This field is initialized to contain the first character of the value of the system property path.separator. This character is used to separate filenames in a sequence of files given as a path list. On UNIX systems, this character is ':'; on Microsoft Windows systems it is ';'.

But that seems strange, because a semicolon is not a forbidden character in Windows paths (for reference, the forbidden characters are \ / : * ? " < > |, cf. the rename feature of Windows Explorer).

For example, with the following code:

String test = "C:\\my;valid;path" + File.pathSeparator + "C:\\Windows";
String[] tests = test.split(File.pathSeparator);

On Windows, tests will contain four entries (C:\my, valid, path and C:\Windows), which isn't what I'd expect.
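
For anyone who wants to reproduce this on a non-Windows machine, here is a self-contained sketch of the snippet above. The separator is hard-coded to ; (the Windows value of File.pathSeparator) so the result is the same on any platform. The class name SplitDemo and the helper naiveSplit are made up for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class SplitDemo {
    // Splits a path list on every occurrence of the separator,
    // with no notion of quoting or escaping.
    static String[] naiveSplit(String pathList, String separator) {
        // String.split takes a regex, so quote the separator to be safe.
        return pathList.split(Pattern.quote(separator));
    }

    public static void main(String[] args) {
        String sep = ";"; // assumption: the Windows value of File.pathSeparator
        String test = "C:\\my;valid;path" + sep + "C:\\Windows";
        // prints [C:\my, valid, path, C:\Windows]
        System.out.println(Arrays.toString(naiveSplit(test, sep)));
    }
}
```

The folder name C:\my;valid;path is legal on Windows, yet it comes back as three separate entries.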

So the question is: why isn't this character a colon, like on Unix? I could force my code to use a colon, but that seems to defeat the purpose of having a constant in the JDK.

Edit: Reimeus explained why it can't be a colon on Windows (it's the drive separator), but what I'm really interested in is the reason why it's not a character that can't appear in a path, such as |.

Ovine answered 14/4, 2015 at 9:6 Comment(1)
To add to the discussion, I asked the question that arose from my answer as a separate question. One promising answer is that if a folder name contains a ;, it needs to be wrapped in quotes in the path list. This seems reasonable, even though you then can't use split to split the path list into parts.Lubberly
A
4

The PATH separator has been a semicolon for a very long time, presumably since the very first release of MS-DOS. (I'm assuming, as per Thorsten's answer, that Java simply deferred to the Windows convention, presumably because Java programmers are likely to assume that they can use pathSeparatorChar to parse the value of PATH rather than only to parse file lists produced by Java itself.)

The most obvious options for such a separator (by analogy with English) are the period, the comma, and the semicolon. The period would conflict with the 8.3 file name format. The choice of the semicolon over the comma may well have been arbitrary.

At any rate, semicolons were not legal characters in file names at that time, so there was no reason to prefer the comma. And, of course, since nowadays both commas and semicolons are legal, we wouldn't be any better off if they had. :-)

Albi answered 15/4, 2015 at 1:29 Comment(1)
Your analogy with English is a good point. Also, thanks for pointing out the 8.3 file name format.Eboat
L
15

You're confusing the path separator with the directory separator.

The path separator is what separates path entries in the PATH environment variable. It is ; on Windows. For example:

PATH=C:\Windows;C:\Program Files

The directory separator separates single folder names when specifying a file or folder name. It is \ on Windows. For example:

C:\Windows\Temp\Test.txt

After our discussion in the comments I finally understood your problem :-D The question "why" can probably only be answered by Microsoft. It is - I agree with you - not a smart idea to use a separator that's allowed in folder names. Maybe this comes from the old days of 8-character names?

The real question should be how you can determine whether the ; is part of the folder or acts as the separator. I'm going to ask that question myself, because I find this rather interesting.
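
One convention, raised in the comments on the question, is to wrap an entry that contains a ; in double quotes. Below is a minimal sketch of a splitter that honours such quotes. It assumes that convention and is not a statement of how Windows or the JDK actually parse PATH. The class name QuotedPathList and its methods are hypothetical. Since " cannot appear in a Windows file name, treating it purely as a delimiter loses nothing.

```java
import java.util.ArrayList;
import java.util.List;

class QuotedPathList {
    // Splits a path list on the separator, except where the separator
    // appears between double quotes. The quotes themselves are dropped.
    static List<String> split(String pathList, char separator) {
        List<String> entries = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < pathList.length(); i++) {
            char c = pathList.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;            // toggle quoted state
            } else if (c == separator && !inQuotes) {
                addIfNotEmpty(entries, current); // end of one entry
            } else {
                current.append(c);
            }
        }
        addIfNotEmpty(entries, current);         // last entry
        return entries;
    }

    // Adds the accumulated entry (skipping empty ones) and resets the buffer.
    private static void addIfNotEmpty(List<String> entries, StringBuilder current) {
        if (current.length() > 0) {
            entries.add(current.toString());
        }
        current.setLength(0);
    }

    public static void main(String[] args) {
        String pathList = "\"C:\\my;valid;path\";C:\\Windows";
        // prints [C:\my;valid;path, C:\Windows]
        System.out.println(split(pathList, ';'));
    }
}
```

This trades the one-line String.split call for a small state machine, which is the cost of supporting entries that contain the separator.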

Lubberly answered 14/4, 2015 at 9:9 Comment(7)
The Javadoc states "This character is used to separate filenames in a sequence of files given as a path list." I'm wondering how that can work, since a semicolon can be part of the path itself. In your example, what if you want to add C:\My;valid;folder to the path?Eboat
@Metoule not under Windows, at least; it is indeed the case for most Unix systemsContribute
Thus my question of why it's a semicolon ;)Eboat
You can not do that on Windows, as ; can never be part of a valid file or folder name. You can, however, separate multiple valid file or folder names using ;.Lubberly
That's not true. You can create a folder with a semicolon, thus my question.Eboat
Wow, I just noticed it does actually work to create a folder named Test;Test - in that case your question is a good one and I will upvote this. Well, my answer isn't wrong even though it may not help you as much as I hoped it would :-DLubberly
@Reimeus But he's right. I just created a folder named Test;Test1 on my Windows 7 system. If you wanted to add that folder to the path, how would you know that Test1 belongs to the folder name and is not a new folder in the list?Lubberly
E
2

A colon : is used to denote a drive letter on Windows (as in C:), so it cannot double as the path-list separator.

Electrolysis answered 14/4, 2015 at 9:8 Comment(7)
Ok, that does answer the question as I formulated it, but doesn't really explain why it's not another character which doesn't have another meaning, such as | (which is what I'm really interested in).Eboat
I guess the question "why" can only be answered by Microsoft. The more interesting question is how to really split a list of folders if the separator can be part of the folder name. Can you make the assumption that all the paths in your list need to be absolute?Lubberly
They'll all be relative, so I can use something else. But I got curious, and that's why I asked the question :)Eboat
@Metoule | does have another meaning: it's the pipe operator. In fact, all the other listed characters have their own purpose (directory separator, command switch, wildcard character, etc.), whereas ; has no special meaning.Electrolysis
Agreed, but as far as I can tell, the pipe operator only has meaning on the command line, and can't be used in a path. Thus if I'm only interested in using it as a path separator, it should work.Eboat
IMO, a colon would be an odd choice anyway. It has a meaning in English that isn't analogous to the meaning of a list delimiter. Come to think of it, I suppose that leads to another question: why does UNIX use a colon as a list delimiter? :-)Albi
@Metoule: using a pipe would make it inconvenient to set or modify PATH on the command line. (That probably isn't the original reason, however, because I don't think the pipe was a special command-line character at the time the semicolon was chosen.)Albi
