Do dotfiles have a file extension?
U

4

41

Do dotfiles, such as .htaccess, .gitignore, and .config, have a file extension and no filename, or are they considered to have a filename and no extension?


I'm trying to implement some utility functions in PHP, which is notorious for doing things wrong, and I noticed that PHP's pathinfo function considers dotfiles to have a file extension and no filename, whereas Node's path.extname considers dotfiles to have a filename and no extension.

I'm unclear as to whether a standard exists, or whether this amounts to developer preference.

Ubiquitarian answered 24/8, 2015 at 2:58 Comment(9)
Very interesting. Well, given that they both treat the dotfiles differently, it seems reasonable to assume that they're both operating off of different standards (whether or not a "true" standard exists appears irrelevant in this case). Thoughts? - Friedman
Interesting indeed. Boost.Filesystem, for example, would consider it to be a file extension with no filename. "path extension(const path& p) const; ... if p.filename() contains a dot but does not consist solely of one or two dots, returns the substring of p.filename() starting at the rightmost dot and ending at the path's end. Otherwise, returns an empty path object." - Stainless
@JoshBeam, PHP has a tendency to march to the beat of a different drummer, and then fall out of step and take a couple wrong turns from that different drummer as well. Just look at how it handles query strings, or any of its standard library. That said, I've always considered dotfiles to be extension-only files with no filename, so I was surprised that node would consider them to not have an extension. - Ubiquitarian
@Ubiquitarian This decision might be because the library designer thought it would make no sense to have a file with no file name. When it comes to cross-platform and cross-filesystem libraries, there's no universal standard. - Stainless
@Ubiquitarian I too have always considered dot files to be nameless, extension-only files based on the fact that when I see or use them, I only ever use one instance at a time. Name to me would imply possibility of multiple (named) instances. - Data
@vandsh, yes, multiple named instances makes sense. You would expect multiple files to be able to have the .js extension, but a .htaccess, for example, is the file itself. - Friedman
Unless you're running MS-DOS, the filesystem has no concept of a thing called an "extension". - Teratogenic
If you ask Microsoft, it's not a valid file name at all. Ever tried creating a .htaccess file with Windows Explorer? It just gives "You must type a file name" ... and it's like that from Win 2000 all the way up to Win 10. I know why I'm sticking to Linux for Webdev ... - Eared
@s1lv3r, It's literally just Explorer protecting the newbie users - you can create it from the command prompt or Notepad "Save As" dialog just fine. Sneaky trick: call it .htaccess. (note trailing dot) and it will accept it, stripping the last dot (trailing dots are not permitted in Windows). - Habsburg
A
21

You pays your money and you takes your pick: Yes, No, Maybe.

It comes down to your definition of 'extension'.

  • Is it "anything after the last dot in the name"? If so, those files have no name and are all extension.
  • Is it "anything after a dot that isn't the first character in the name"? If so, those files don't have an extension.
  • If you use some other definition, then the answer will need to be adjusted accordingly.
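
As a rough illustration, the first two definitions can be written down directly. This is only a sketch in Python; the function names are made up for this example and do not come from PHP, Node, or any library:

```python
def ext_last_dot(name: str) -> str:
    """Definition 1: everything from the last dot to the end of the name."""
    dot = name.rfind(".")
    return name[dot:] if dot != -1 else ""


def ext_skip_leading_dot(name: str) -> str:
    """Definition 2: same, but a dot in first position does not count."""
    dot = name.rfind(".")
    return name[dot:] if dot > 0 else ""


# Compare both definitions on a few names mentioned in this thread.
for name in (".htaccess", ".gitignore", "a.out", ".profile.old"):
    print(f"{name!r}: {ext_last_dot(name)!r} vs {ext_skip_leading_dot(name)!r}")
```

The two functions only disagree on names whose single dot is the first character, which is exactly the dotfile case.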

Remember that SCCS files used a prefix s. (amongst others; you'd see p. files too — and there were many transient file names with other prefixes). Does an SCCS file s.something have an extension or a prefix? (With s.source.c, it is reasonably straight-forward; there's a prefix, a name and an extension or suffix, or you could ignore the prefix as a special case and the name is s.source and the extension is .c.) What about the default executable name, a.out? What about a name such as ..dot; does it have an extension, and if so, what is it?

Note that the answer on DOS was more formalized. There the file system used to enforce (once upon another millennium or so) names with 8.3 and the extension was tangible. But that's a bygone era for the most part (and there are few who miss it).

Anthony Arnold and paxdiablo both noted that names ending with .tar.gz exist — what's the extension on such files?

If you treat the extension of somecode-8.76.tar.gz as anything other than .gz, you are opening yourself up to a bag'o'worms. The contained file is somecode-8.76.tar; that itself can be reasonably said to have an extension of .tar. Defining the extension of the whole gzipped tar file as .tar.gz raises the question of "why isn't it .76.tar.gz" and also means that you need to revisit the SCCS file naming convention. Absorbing the .76 portion of the name into either .76.tar.gz or .76.tar as a suffix is making life complex indeed. It's a valid question, but anything other than "an extension is the string from the last dot to the end of the name" is fraught indeed — or requires interpretation of the meaning of the extension, and gets into another complex area that it generally is better to avoid.
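
To see why the name alone cannot say where the "real" extension starts, here is a small Python sketch that keeps stripping the last-dot extension until none is left. The helper name peel_extensions is hypothetical; os.path.splitext is the standard library function:

```python
import os.path


def peel_extensions(name: str) -> list:
    """Strip the last-dot extension repeatedly, recording each step."""
    steps = []
    while True:
        stem, ext = os.path.splitext(name)
        if not ext:
            return steps
        steps.append((stem, ext))
        name = stem


for stem, ext in peel_extensions("somecode-8.76.tar.gz"):
    print(f"strip {ext} -> {stem}")
```

The loop strips .gz, then .tar, then .76, ending at somecode-8. Nothing in the string marks .76 as part of a version number rather than a suffix, so stopping after .tar takes knowledge from outside the name.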

Note that Unix at the O/S or file system level doesn't care about the extension on files. Programs can decide they care about extensions, but that's up to the program. The extension is an indicator of the file type; it is not definitive. That's why the file program exists to identify the contents of files. It looks at the contents of the file to identify the content; it doesn't pay attention to the file extension (so it doesn't have to decide what the extension is, either).
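
A toy version of content-based identification, sketched in Python. It checks only three well-known leading byte signatures; the real file program uses a far larger set of tests, and the sniff helper and the sample file name are assumptions made for this example:

```python
import gzip
import os
import tempfile

# A few well-known leading byte signatures ("magic numbers").
SIGNATURES = {
    b"\x1f\x8b": "gzip",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"%PDF-": "PDF document",
}


def sniff(path: str) -> str:
    """Guess a file's type from its first bytes, ignoring its name."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    for magic, label in SIGNATURES.items():
        if head.startswith(magic):
            return label
    return "unknown"


# Demo: gzip data stored under a misleading .txt name.
with tempfile.TemporaryDirectory() as tmp:
    misleading = os.path.join(tmp, "notes.txt")
    with gzip.open(misleading, "wb") as fh:
        fh.write(b"hello")
    print(sniff(misleading))  # gzip
```

Because the name is never inspected, the question of what counts as the extension does not arise at all.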

Abridge answered 24/8, 2015 at 3:7 Comment(7)
Yes, UNIX itself has no concept of extensions; that's something layered on from above, by program launchers for example. You may just as well ask whether the file extension for x.tar.gz is tar.gz or just gz :-) For dotfiles, characters 2-x have nothing to do with the type of the file, so I wouldn't call them an extension myself. - Malan
@paxdiablo: I agree that .htaccess (for instance, or .profile or .bashrc) is all name and no extension, but it does depend on the definition of extension, and any definition is potentially somewhat contentious. - Abridge
@JonathanLeffler True, but relying on extensions for anything other than convenience is also fraught with danger. Right now, we seem to universally accept . as the extension delimiter, and we generally rely on extensions as an easy indicator of the file format or usage. However, it's not set in stone and neither is it a formal standard. - Stainless
@JonathanLeffler Furthermore, how do you handle the case where .tar.gz == .tgz? This is domain specific, and trying to capture this kind of arbitrary language in a filesystem API is nonsense. I would rather do away with file extensions at a machine level, and just leave them as something that the user can have to help identify file types. - Stainless
@AnthonyArnold: the .tgz case is simple; the extension is .tgz. The name of the contained file isn't so obvious. The mapping from 'compressed = .tgz' to 'decompressed = .tar' is non-standard; that's one reason why the .tgz extension isn't used very widely (not least because it is hard to generalize for .tar.bz2, .tar.xz or .tar.lz etc; presumably you could decide on .tbz, .txz and .tlz [note: I just invented these right now; AFAIK, they are not used anywhere in the real world]), but the consensus seems to be to go with double-suffixes. I do use .tgz, but I'm unusual. - Abridge
It's worth noting that tmux configuration files are .tmux.conf, which could be argued to have both a name and extension, as well as the initial dot. - Politicize
@ZacCrites: thank you for an example that isn't a file I created (I have a number of files such as .default.mk and .profile.old lurking under my home directory, but I chose those names so I wouldn't count them). I did find a .zoid.log.yaml file when I went to look just now; also a .serverauth.4590 file; those are not names I chose. And yes, I'd argue immediately that these all have a name beginning with a dot and an extension (or two). - Abridge
T
7

The reason why someone would want to know the extension of a file is that he wants to know the type (or some other metadata) of the file, right?

The name of a dotfile has nothing to do with its type. The part after the dotfile's first dot is its name, not an extension. But a dotfile can also have an extension (for example .mongorc.js or some other hidden file on a UNIX system).

So I would say, the way node's path.extname is doing it is the right way.

Return the extension of the path, from the last '.' to end of string in the last portion of the path. If there is no '.' in the last portion of the path or the first character of it is '.', then it returns an empty string.
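
For comparison, Python's os.path.splitext also treats leading dots in the last path component as part of the name, so for the sample paths below it gives the results that the quoted rule describes. A quick sketch; the paths are made up for illustration:

```python
import os.path

samples = [
    "/home/user/.gitignore",   # dotfile, no extension
    "/home/user/.mongorc.js",  # dotfile with an extension
    "/var/www/index.html",     # ordinary name with an extension
    "/usr/local/bin/tool",     # no dot at all
]

for path in samples:
    root, ext = os.path.splitext(path)
    print(f"{os.path.basename(path)!r} -> extension {ext!r}")
```

Here .gitignore and tool come back with an empty extension, while .mongorc.js yields .js and index.html yields .html.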


There are two different things here:

  • On the one hand, there is the mechanism of determining some metadata from a filename extension, for example to open the file with a "default" application on non-Unix-like systems.

  • Dotfiles, on the other hand, are just hidden files (a normal name with a dot in front, so that ls does not show them) that come from Unix-like systems. So .gitignore is just a file named gitignore and marked hidden; there's no extension here.

Tribasic answered 24/8, 2015 at 3:27 Comment(3)
I'm going to play devil's advocate on this one. File extensions are typically used as a hint of how the file should be handled by the operating system for their default action. This is just convenience so that when you double click on a .txt file it opens in your favorite text editor. If .gitignore were treated by the operating system as having no file extension, there wouldn't be a consistent way to tell the OS to open the file in your favorite custom .gitignore editor. The OS should then treat it as an extensionless file just like a file named file and ask how you want to handle it. - Ubiquitarian
@Ubiquitarian It would be nice to assign default file handlers according to regular expressions. That way, you're free to use whatever naming scheme you like instead of being tied to the "extension" model. - Stainless
@Ubiquitarian I tried to explain my opinion a bit more in an edit of my answer. - Tribasic
S
3

I feel I have to leave an answer addressing the user's actual problem.

What do you actually need the extension for, in this case? Files like .gitignore, .htaccess etc. are matched against the full name of the file. Not the "extension" (or lack of) but the full name.

I would argue the case for doing away with any checking of file extensions, and checking against the name of the file. For example, in PHP:

$name = basename("/path/to/.htaccess"); // ".htaccess"
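
The same idea sketched in Python, for readers who want something to run: compare the last path component against a set of known names instead of parsing an extension. SPECIAL_NAMES and is_special are hypothetical names chosen for this example:

```python
import os.path

# Hypothetical set of special file names an application might care about.
SPECIAL_NAMES = {".htaccess", ".gitignore"}


def is_special(path: str) -> bool:
    """True if the last path component exactly matches a known name."""
    return os.path.basename(path) in SPECIAL_NAMES


print(is_special("/path/to/.htaccess"))       # True
print(is_special("/path/to/notes.htaccess"))  # False
```

Matching on the whole name sidesteps the extension question entirely, and it avoids treating notes.htaccess as if it were an .htaccess file.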
Stainless answered 24/8, 2015 at 3:34 Comment(0)
T
3

I'd say they're just hidden... In Unix/Linux, a file/directory is considered hidden if its first character is "." (dot), so it's just the first character - not an extension/suffix.

Furthermore, directories can also be hidden - including the "." and ".." (current and parent) directories - and an extension/suffix isn't usually something we associate with directories. In fact, many programs create such hidden directories for themselves (e.g. .foo) rather than just one file (e.g. .foorc).
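
The hiding convention is just a test on the first character, applied to files and directories alike. A minimal Python sketch, using a throwaway temporary directory and made-up entry names:

```python
import os
import tempfile


def is_hidden(name: str) -> bool:
    """Unix convention: an entry is hidden if its name starts with a dot."""
    return name.startswith(".")


with tempfile.TemporaryDirectory() as tmp:
    # One hidden file, one hidden directory, one ordinary file.
    open(os.path.join(tmp, ".foorc"), "w").close()
    os.mkdir(os.path.join(tmp, ".foo"))
    open(os.path.join(tmp, "readme.txt"), "w").close()

    entries = sorted(os.listdir(tmp))
    visible = [e for e in entries if not is_hidden(e)]
    print(entries)  # ['.foo', '.foorc', 'readme.txt']
    print(visible)  # ['readme.txt']
```

Note that os.listdir never returns the special entries "." and "..", so they do not appear in either list here.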

Finally, Unix/Linux has never been that concerned with suffixes anyway - most programs are able to read "their" files without relying on one. Instead, Unix often uses tests from the "magic" file (/etc/magic) and the file command to determine the file type. (The notable exceptions are compression programs like gzip and bzip2, which replace the original with a suffixed compressed version.)

I would add that the "rc" ending (for "run-command") - in for example .bashrc, .wgetrc, .zshrc and so on - could be thought of as the suffix/extension for these types of files, even though there is no dot between the name and the suffix (some - like .rtorrent.rc - do actually have a dot).

Tinfoil answered 24/8, 2015 at 11:59 Comment(3)
directories can also be hidden - including the "." and "..". - Ah, .. is just the hidden version of ., I see... - Halflength
Thomas Weller: I can't tell if you were kidding or not; but just in case: .. is the parent of the current directory; . is the current directory. - Collect
Not quite... ;-) They're both hidden. The current directory is called "." and the parent directory is called "..". But since hidden files/directories start with a dot in Unix - which both "." and ".." do - they're also both hidden... I.e. they don't show up if you do a simple ls... you must use ls -a to see them - along with other hidden files/directories. - Tinfoil
