Is this too much code for a header only library?

Asked 7/4, 2010 at 19:14 Answered 8/4, 2010 at 6:39

It seems like I had to inline quite a bit of code here. I'm wondering if it's bad design practice to leave this entirely in a header file like this:

#include <list>
#include <string>
#include <boost/noncopyable.hpp>
#include <boost/make_shared.hpp>
#include <boost/iterator/iterator_facade.hpp>
#include <Windows.h>
#include "../Exception.hpp"

namespace WindowsAPI { namespace FileSystem {

class NonRecursiveEnumeration;
class RecursiveEnumeration;
struct AllResults;
struct FilesOnly;

template <typename Filter_T = AllResults, typename Recurse_T = NonRecursiveEnumeration>
class DirectoryIterator;

template <typename Recurse_T>
struct FileData;

class NonRecursiveEnumeration : public boost::noncopyable
{
    WIN32_FIND_DATAW currentData;
    HANDLE hFind;
    std::wstring root;
public:
    NonRecursiveEnumeration() : hFind(INVALID_HANDLE_VALUE) {
    };
    NonRecursiveEnumeration(const std::wstring& pathSpec) {
        std::wstring::const_iterator lastSlash =
            std::find(pathSpec.rbegin(), pathSpec.rend(), L'\\').base();
        if (lastSlash != pathSpec.end())
            root.assign(pathSpec.begin(), lastSlash);
        hFind = FindFirstFileW(pathSpec.c_str(), &currentData);
        if (hFind == INVALID_HANDLE_VALUE)
            WindowsApiException::ThrowFromLastError();
        while (!wcscmp(currentData.cFileName, L".") || !wcscmp(currentData.cFileName, L"..")) {
            increment();
        }
    };
    void increment() {
        BOOL success =
            FindNextFile(hFind, &currentData);
        if (success)
            return;
        DWORD error = GetLastError();
        if (error == ERROR_NO_MORE_FILES) {
            FindClose(hFind);
            hFind = INVALID_HANDLE_VALUE;
        } else {
            WindowsApiException::Throw(error);
        }
    };
    ~NonRecursiveEnumeration() {
        if (hFind != INVALID_HANDLE_VALUE)
            FindClose(hFind);
    };
    bool equal(const NonRecursiveEnumeration& other) const {
        if (this == &other)
            return true;
        return hFind == other.hFind;
    };
    const std::wstring& GetPathRoot() const {
        return root;
    };
    const WIN32_FIND_DATAW& GetCurrentFindData() const {
        return currentData;
    };
};

//Not implemented yet
class RecursiveEnumeration : public boost::noncopyable
{
};

template <typename Recurse_T>
struct FileData //Serves as a proxy to the WIN32_FIND_DATA struture inside the iterator.
{
    const Recurse_T* impl;
    template <typename Filter_T, typename Recurse_T>
    FileData(const DirectoryIterator<Filter_T, Recurse_T>* parent) : impl(parent->impl.get()) {};
    DWORD GetAttributes() const {
        return impl->GetCurrentFindData().dwFileAttributes;
    };
    bool IsDirectory() const {
        return (GetAttributes() & FILE_ATTRIBUTE_DIRECTORY) != 0;
    };
    bool IsFile() const {
        return !IsDirectory();
    };
    bool IsArchive() const {
        return (GetAttributes() & FILE_ATTRIBUTE_ARCHIVE) != 0;
    };
    bool IsReadOnly() const {
        return (GetAttributes() & FILE_ATTRIBUTE_READONLY) != 0;
    };
    unsigned __int64 GetSize() const {
        ULARGE_INTEGER intValue;
        intValue.LowPart = impl.GetCurrentFindData().nFileSizeLow;
        intValue.HighPart = impl.GetCurrentFindData().nFileSizeHigh;
        return intValue.QuadPart;
    };
    std::wstring GetFolderPath() const {
        return impl->GetPathRoot();
    };
    std::wstring GetFileName() const {
        return impl->GetCurrentFindData().cFileName;
    };
    std::wstring GetFullFileName() const {
        return GetFolderPath() + GetFileName();
    };
    std::wstring GetShortFileName() const {
        return impl->GetCurrentFindData().cAlternateFileName;
    };
    FILETIME GetCreationTime() const {
        return impl->GetCurrentFindData().ftCreationTime;
    };
    FILETIME GetLastAccessTime() const {
        return impl->GetCurrentFindData().ftLastAccessTime;
    };
    FILETIME GetLastWriteTime() const {
        return impl->GetCurrentFindData().ftLastWriteTime;
    };
};

struct AllResults
{
    template <typename Recurse_T>
    bool operator()(const FileData<Recurse_T>&) {
        return true;
    };
}; 

struct FilesOnly
{
    template <typename Recurse_T>
    bool operator()(const FileData<Recurse_T>& arg) {
        return arg.IsFile();
    };
};

#pragma warning(push)
#pragma warning(disable: 4355)
template <typename Filter_T, typename Recurse_T>
class DirectoryIterator : public boost::iterator_facade<DirectoryIterator<Filter_T>, const FileData<Recurse_T>, std::input_iterator_tag>
{
    friend class boost::iterator_core_access;
    boost::shared_ptr<Recurse_T> impl;
    FileData<Recurse_T> derefData;
    Filter_T filter;
    void increment() {
        do {
            impl->increment();
        } while (! filter(derefData));
    };
    bool equal(const DirectoryIterator& other) const {
        return impl->equal(*other.impl);
    };
    const FileData<Recurse_T>& dereference() const {
        return derefData;
    };
public:
    typedef FileData<Recurse_T> DataType;
    friend struct DataType;
    DirectoryIterator(Filter_T functor = Filter_T()) :
        impl(boost::make_shared<Recurse_T>()),
        derefData(this),
        filter(functor) {
    };
    explicit DirectoryIterator(const std::wstring& pathSpec, Filter_T functor = Filter_T()) :
        impl(boost::make_shared<Recurse_T>(pathSpec)),
        derefData(this),
        filter(functor) {
    };
};
#pragma warning(pop)

}}

Lennalennard answered 7/4, 2010 at 19:14 Comment(6)

If it's header only, I think your best bet is to split it into multiple files. Like "detail/DirectoryIterator.hpp", etc, so at least it's not contained in one single file. This is what Boost does. By the way, Boost has a filesystem library, just in case you didn't know. – Freeborn 7/4, 2010 at 19:22

@GMan: +1. Yes, boost has a filesystem library, but it is designed for cross platform compatibility which requires opening a whole can of worms I don't want (it has it's own path processor, for example). With this I think I can make client code clearer that doesn't care about cross platform portability. – Lennalennard 7/4, 2010 at 19:25

Nice to see you're getting good use out of: #2532374 – Liszt 7/4, 2010 at 19:30

@Billy: Hey, no need to thank me - you're using Jerry Coffin's answer. I didn't do anything there but post a pointer to the Boost stuff (and I didn't even remember that I had done that - I just remembered the question and Jerry Coffin's post). – Liszt 7/4, 2010 at 19:53

Your flag-checking functions should be using the & operator (bitand), not the | operator (bitor). – Jerlenejermain 7/4, 2010 at 20:46

@Rob Kennedy: Good point. What the hell was I smoking? Fixed. – Lennalennard 7/4, 2010 at 21:19

I have much more code in some of mine, if that's of any consolation. and so do all C++ Standard Library implementations, Boost and Microsoft (for example, ATL).

Dexterdexterity answered 7/4, 2010 at 19:21 Comment(0)

The only part that strikes me as being open to much question would be implementations of the functions in DirectoryIteratorImpl. It's not a template so it doesn't really have to be in a header, and it has a couple of somewhat longer routines (the "real" constructor and the increment).

The rest are either templates or else composed of such trivial functions you'd want them inline in any case (e.g., the members of FileData). Those will end up in a header in any case.

Nescience answered 7/4, 2010 at 19:37 Comment(2)

Daily vote limit reached; come back in 4 hours. <-- Damn! I'll upvote in 4 hours. – Lennalennard 7/4, 2010 at 19:41

@GMan: Just been out since like 8 this morning. *Bill runs off to meta to petition for more votes. (Just kidding) – Lennalennard 7/4, 2010 at 20:29

As far as the length of the header goes, you can have as much code as you'd like in your header files. The trade off is the amount of code that must be recompiled each time your program is built; code placed in your CPP files can be compiled into object files and linked in on each subsequent build.

I would suggest that each of the method definitions for DirectoryIteratorImpl should be moved to a .cpp file. If you're not defining a method inline inside a class definition, there's no reason for it to be included in the header file.

An unrelated aside: Avoid writing inline DirectoryIteratorImpl(); - actually write your inline functions inline, or don't mark them inline. From the C++ FAQ Lite:

It's usually imperative that the function's definition (the part between the {...}) be placed in a header file. If you put the inline function's definition into a .cpp file, and if it is called from some other .cpp file, you'll get an "unresolved external" error from the linker.

If your functions are "too big" to write in the header file, they're too big to inline and the compiler will likely ignore your inline suggestion anyways.

Insolence answered 7/4, 2010 at 19:30 Comment(8)

#1. You didn't answer my question. I know the code gets put into every file that uses this code -- I was wondering if that would lead to too much code bloat. #2. I can't make the functions be a literal part of the class because half of them depend on the full definition of class FileData being available. Why do you think I declared all FileData's members implicitly inline? :) – Lennalennard 7/4, 2010 at 19:33

@billy Perhaps the body of the question should have specified. Also, my suggestion stands - nix the 'inline' definition if you can't declare the function inline. – Insolence 7/4, 2010 at 19:37

@meagar: I think that's sort of implied. The implications of making a function inline should be inherently obvious. I was asking if that is a good idea. The only reason it'd ever be a bad idea would be if it led to too much code bloat. – Lennalennard 7/4, 2010 at 19:46

@billy I don't think "code bloat" means what you think it means. You cannot solve code bloat by adding more characters to the file, ie, additional inline keywords. – Insolence 7/4, 2010 at 19:55

I don't understand what the problem is with marking a function inline and placing the definition later in the header. The FAQ entry pointed doesn't seem to indicate there's a problem with that. – Liszt 7/4, 2010 at 19:58

@meagar: Code bloat means a large binary. Since the header only library must use internal linkage that code may be duplicated several times in the resultant binary. – Lennalennard 7/4, 2010 at 20:3

@Michael The problem is the headers get very large and must be fully recompiled when the project is built. @billy Then we have different definitions; I refer to the size of the source code only when talking about code bloat. – Insolence 7/4, 2010 at 20:47

@meagar: I understand potential issue regarding the header size, but that would apply whether the inline was, well, inline or separate (but still in the header). The "unrelated aside" made it sound like there was maybe some issue (style or otherwise) with having inline functions declared and defined separately, even in the same header. – Liszt 7/4, 2010 at 20:55

It seems you are programming for Windows here, shall we assume that you are using Visual Studio ?

Anyway, I don't think there is something as too much code in headers.

It's a matter of trade offs mostly:

slower compilation (but we have multicores and precompiled headers)
more frequent recompilation (again, multicores)
perhaps code bloat...

The only point that is annoying (in my opinion) is the latest... and I'll need help here: are we sure the functions are going to be inlined, isn't it possible that the compiler and linker decide not to inline them and transform them into a regular call ?

Frankly, I would not worry too much about that. A number of Boost libraries are header-only even for their non-template parts simply because it makes integration easier (no linking required).

Lavabo answered 8/4, 2010 at 6:39 Comment(0)

Recommended topics

Hot tags