Raise compile-time error if a string has whitespace

I have a base class that is intended to be inherited by other users of the code I'm writing, and one of the abstract functions returns a name for the object. Due to the nature of the project that name cannot contain whitespace.

class MyBaseClass {

  public:

    // Return a name for this object. This should not include whitespace.
    virtual const char* Name() = 0;

};

Is there a way to check at compile-time if the result of the Name() function contains whitespace? I know compile-time operations are possible with constexpr functions but I'm not sure of the right way to signal to code users that their function returns a naughty string.

I'm also unclear on how to get a constexpr function to actually be executed by the compiler to perform such a check (if constexpr is even the way to go with this).

Jocose answered 10/6, 2021 at 17:13 Comment(10)
en.cppreference.com/w/cpp/string/byte/isspace – Thurber
@Thurber compile-time. – Bogart
I don't believe there's any way to check a string at compile time. – Gunfire
@MarkRansom std::isspace is defined with #define directives so it's available in constexpr contexts. – Jocose
That checks a single character, not a string. I stand by my statement until proven otherwise. – Gunfire
I'm using const char*, not std::string, and for loops and array indexing are also available in constexpr... Hence my issue is more about checking the return value of a virtual function, not just a string (const char*) in general. – Jocose
Do you have a way to prevent a derived class from reading its name from a file (at program start-up)? – Quadrate
@Quadrate declaring the pure virtual function as constexpr as in my selected answer accomplishes this. Unless there's a way to read files in a constexpr context. Which wouldn't even surprise me at this point. – Jocose
@MichaelHoffmann So this shouldn't surprise you: std::embed – Loquacity
Wow. C++ is just a super efficient version of JavaScript now isn't it? – Jocose

I think this is possible in C++20.

Here is my attempt:

#include <string_view>
#include <algorithm>
#include <stdexcept>

constexpr bool is_whitespace(char c) {
    // Include your whitespaces here. The example contains the characters
    // documented by https://en.cppreference.com/w/cpp/string/wide/iswspace
    constexpr char matches[] = { ' ', '\n', '\r', '\f', '\v', '\t' };
    return std::any_of(std::begin(matches), std::end(matches), [c](char c0) { return c == c0; });
}

struct no_ws {
    consteval no_ws(const char* str) : data(str) {
        std::string_view sv(str);
        if (std::any_of(sv.begin(), sv.end(), is_whitespace)) {
            throw std::logic_error("string cannot contain whitespace");
        }
    }
    const char* data;
};

class MyBaseClass {
  public:
    // Return a name for this object. This should not include whitespace.
    constexpr const char* Name() { return internal_name().data; }
  private:
    constexpr virtual no_ws internal_name() = 0;
};

class Dog : public MyBaseClass {
    constexpr no_ws internal_name() override {
        return "Dog";
    }
};

class Cat : public MyBaseClass {
    constexpr no_ws internal_name() override {
        return "Cat";
    }
};

class BadCat : public MyBaseClass {
    constexpr no_ws internal_name() override {
        return "Bad cat";
    }
};

There are several ideas at play here:

  • Let's use the type system as documentation as well as a constraint. To that end, create a class (no_ws in the above example) that represents a string without whitespace.

  • For the type to enforce the constraints at compile-time, it must evaluate its constructor at compile time. So let's make the constructor consteval.

  • To ensure that derived classes don't break the contract, modify the virtual method to return no_ws.

  • If you want to keep the interface (i.e., returning const char*), make the virtual method private and call it from a public non-virtual method. This technique is known as the non-virtual interface (NVI) idiom.

Of course, this only checks a fixed set of whitespace characters and is locale-independent. I think it would be very tricky to handle locales at compile time, so a better approach (engineering-wise) might be to explicitly specify the set of ASCII characters allowed in names (a whitelist instead of a blacklist).
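
A minimal sketch of the whitelist idea, assuming names are restricted to ASCII letters, digits, and underscores. That character set is an assumption chosen for illustration, and is_allowed_name_char and is_valid_name are hypothetical helper names.

```cpp
#include <string_view>

// Hypothetical helper: true only for characters on the allow-list.
// The set (ASCII letters, digits, underscore) is an assumption; adjust as needed.
constexpr bool is_allowed_name_char(char c) {
    return (c >= 'a' && c <= 'z') ||
           (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') ||
           c == '_';
}

// Hypothetical helper: true if every character of the name is on the allow-list.
constexpr bool is_valid_name(std::string_view name) {
    for (char c : name) {
        if (!is_allowed_name_char(c)) return false;
    }
    return true;
}

// Evaluated by the compiler; no locale is involved.
static_assert(is_valid_name("Dog"));
static_assert(is_valid_name("Cat_2"));
static_assert(!is_valid_name("Bad cat"));   // space is not on the list
static_assert(!is_valid_name("Bad\tcat"));  // neither is tab
```

A check like this could replace the std::any_of test inside the no_ws constructor, throwing when is_valid_name returns false.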

The above example does not compile, since "Bad cat" contains whitespace. Commenting out the BadCat class allows the code to compile.

Live demo on Compiler Explorer

Loquacity answered 10/6, 2021 at 18:7 Comment(3)
A useful application of C++20's constexpr virtual. Bravo! – Worked
This is almost exactly what I need, very impressive. But is there an easy and consteval-friendly way to check a string_view for any whitespace, not just the space character? – Jocose
Also, for anyone else attempting to use this: some compilers may report expression ‘<throw-expression>’ is not a constant expression rather than reporting the logic_error. But if the line is evaluated at all it means the condition to throw the exception was true and the string is invalid. This is the case for me with g++ version 11. – Jocose

Unless the names themselves are all specified at compile-time, there's no way to assert they contain no whitespace characters prior to a runtime check.
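
For names that only become known at run time, a check along these lines is one option. It is a sketch under assumptions, not a definitive implementation: contains_whitespace and validated_name are hypothetical helpers, and std::isspace follows the current C locale.

```cpp
#include <algorithm>
#include <cctype>
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical helper: run-time whitespace test.
// The lambda takes unsigned char because passing a negative char value
// to std::isspace is undefined behavior.
inline bool contains_whitespace(std::string_view s) {
    return std::any_of(s.begin(), s.end(),
                       [](unsigned char c) { return std::isspace(c) != 0; });
}

// Hypothetical helper: returns the name unchanged, or throws if it is invalid.
inline std::string validated_name(std::string name) {
    if (contains_whitespace(name)) {
        throw std::invalid_argument("name cannot contain whitespace: " + name);
    }
    return name;
}
```

A base class could call validated_name once, for example when an object is registered, so that an invalid name is reported early rather than wherever the name is first used.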

Towne answered 10/6, 2021 at 17:40 Comment(3)
What if I change the pure virtual declaration to constexpr so the return values must be specified at compile-time? – Jocose
A constexpr function is not guaranteed to actually be called at compile time; it only grants the compiler permission to call it then, if it so chooses. constexpr functions can also be used at runtime. And besides, polymorphic function calls don't work at compile time anyway. – Proteus
@RemyLebeau This is no longer true in C++20. Virtual functions can now be declared constexpr. https://mcmap.net/q/379031/-can-virtual-functions-be-constexpr – Jocose
