Concat two `const char` string literals
U

5

17

Is it possible to concat two string literals using a constexpr? Or put differently, can one eliminate macros in code like:

#include <iostream>

#define nl(str) str "\n"

int main()
{
  std::cout <<
      nl("usage: foo")
      nl("print a message")
      ;

  return 0;
}

Update: There is nothing wrong with using "\n"; however, I would like to know whether one can use constexpr to replace that type of macro.

Uralaltaic answered 8/11, 2012 at 15:36 Comment(16)
What's wrong with "usage: foo\n" "print a message\n"?Lorrinelorry
Probably best to use std::endl rather than \nPeta
@R.MartinhoFernandes Or even "usage: foo\nprint a message\n"?Guiana
@Douglas probably not. If you want to print a newline, why would you print a newline and flush?Lorrinelorry
@DouglasLeeder In a larger program, yes, it's probably better. Here, it makes absolutely no difference.Guiana
@R.MartinhoFernandes So that you can see roughly how far you've gotten if the program crashes.Guiana
@R.MartinhoFernandes nothing is wrong with that. However I would like to understand whether constexpr could be used to replace those kind of macros.Uralaltaic
@DouglasLeeder: std::endl is overused when you just want '\n'. So I don't think std::endl should be used in place of '\n'.Cogswell
@DouglasLeeder: No, that doesn't make more sense. Strings are useful for more than insertion into an ostream.Suttle
@JamesKanze flushing often costs a lot in performance. For things you want to see in case of a crash such as error messages you can use a separate output channel like std::cerr. There's rarely a need to use std::endl; it's very much overused in introductory C++ materials.Pfeffer
@Pfeffer If the purges cause a performance problem, you do change to '\n'. But std::endl is the default. (In my own code, if I'm outputting several lines with no intervening operations, I'll use '\n' on all but the last. But I would consider this as an "advanced technique". Beginners should use std::endl, period, until they understand the issues well enough to make a knowledgeable choice.)Guiana
@Douglas : See also this question: What is the C++ iostream endl fiasco?Uncontrollable
@JamesKanze using std::endl in favor of \n defeats the whole point of buffered streams; it's similar to introducing namespaces but then invoking using namespace std;. People should rely on streams doing the right (tm) thing and should not flush them unless they know why they want to flush.Uralaltaic
@MichaWiedenmann People who understand buffering will mix \n and std::endl as most appropriate. People who don't understand buffering (and it's generally not the first thing you explain when teaching C++) should use std::endl by default, on the principle of least surprise.Guiana
@James : And then those people post questions on SO asking why on earth their code is so slow. ;-]Uncontrollable
See the code in my answer here: stackoverflow.com/questions/15858141/…Enterogastrone
H
1
  1. Yes, it is entirely possible to create compile-time constant strings, and manipulate them with constexpr functions and even operators. However,

  2. The compiler is not required to perform constant initialization of any object other than objects with static or thread storage duration. In particular, temporary objects (which are not variables, and have something less than automatic storage duration) are not required to be constant-initialized, and as far as I know no compiler does that for arrays. See 3.6.2/2-3, which define constant initialization, and 6.7/4 for some more wording with respect to block-scope variables with static storage duration. Neither of these applies to temporaries, whose lifetime is defined in 12.2/3 and following.

So you could achieve the desired compile-time concatenation with:

static const auto conc = <some clever constexpr thingy>;
std::cout << conc;

but you can't make it work with:

std::cout <<  <some clever constexpr thingy>;

Update:

But you can make it work with:

std::cout << *[]() -> const /* type of s */ * {
             static constexpr auto s = /* constexpr call */;
             return &s;}()
          << " some more text";

But the boilerplate punctuation is way too ugly to make it any more than an interesting little hack.


(Disclaimer: IANALL, although sometimes I like to play one on the internet. So there might be some dusty corners of the standard which contradict the above.)

(Despite the disclaimer, and pushed by @DyP, I added some more language-lawyerly citations.)

Homely answered 8/11, 2012 at 19:25 Comment(15)
Could you point out for me where the Standard says temporaries have dynamic storage duration? Cannot find it..Hagans
I'd add 6.7/4, as we deal with block-scope variables here. And this only permits an implementation to do "early" init of local static-storage-duration variables (it's required to be initialized before first block entry).Hagans
@DyP: 12.2/3 "Temporary objects are destroyed as the last step in evaluating the full-expression (1.9) that (lexically) contains the point where they were created." There are some exceptions following that, but nothing which would allow the temporary to become permanent.Homely
For me, dynamic storage duration is (erroneously?) related to new and delete - where I would expect (effectively) all compilers to put a temporary on the stack.Hagans
@DyP: quite right, I actually meant "automatic", but 12.2/3 seems to say that temporaries have even shorter lives than that. I can't find a phrase to describe temporary object lifetimes other than that, so I edited the response accordingly. Regardless of the standard, which probably does allow a compiler to constant initialize a constexpr temporary -- otherwise user-defined string literals would be a lot less interesting -- I'm pretty sure that compilers don't actually do it, except maybe for user-defined string literals, which are not yet widely implemented.Homely
I've edited my answer as I recalled you can enforce evaluation of constexpr e.g. in template arguments or wherever a constant expression is required. Edit: it seems from the assembly it still doesn't work? What's going on?Hagans
String literals are always lvalues, so I don't see how they could ever be used as constant expressions. (Character literals are a different story -- see e.g. boost::mpl::string<>.)Uncontrollable
@ildjarn, what's wrong with lvalues? liveworkspace.org/code/b7a79af9fab4cb4b72deb5e93c36aba2 clearly shows a string literal used as a constant expression. (Actually, a character from the string literal, but it's definitely a constexpr function with a string literal as an argument.)Homely
The fact that a string literal's characters (as previously mentioned) and size are constant expressions has no reflection on the string literal itself.Uncontrollable
@ildjarn, I don't use the string literal's size. Read that code more carefully. The 32 comes from the ascii value of ' '.Homely
Read that code more carefully -- S has its value because of the string literal's size...Uncontrollable
@ildjarn, but so what? The point is that the size of the std::array (which must be a constant expression) comes from a function whose argument is a string literal.Homely
@ildjarn, perhaps you're reacting to my statement that compile time strings are possible. I stand by it, but the strings' types will probably be more like boost::mpl::string's (I'm guessing). They'll still be character arrays, NUL-terminated if desired, and initializable from string-literals. Anyway, if I can get the length and every individual character out of a string literal, that's good enough for me. I don't know what aspect of a string literal non-const-expressiveness would be, then.Homely
Yes, I was referring to that statement. But, I'm glad we had this discussion, as your particular wording gave me a good idea for an approach to solving this. Thanks :-]Uncontrollable
@ildjarn: I was planning to clean this up a bit, but what the heck. liveworkspace.org/code/c54e3506022c53bf537428c2c26c0502Homely
O
15

A little bit of constexpr, sprinkled with some TMP and a topping of indices gives me this:

#include <array>

template<unsigned... Is> struct seq{};
template<unsigned N, unsigned... Is>
struct gen_seq : gen_seq<N-1, N-1, Is...>{};
template<unsigned... Is>
struct gen_seq<0, Is...> : seq<Is...>{};

template<unsigned N1, unsigned... I1, unsigned N2, unsigned... I2>
constexpr std::array<char const, N1+N2-1> concat(char const (&a1)[N1], char const (&a2)[N2], seq<I1...>, seq<I2...>){
  return {{ a1[I1]..., a2[I2]... }};
}

template<unsigned N1, unsigned N2>
constexpr std::array<char const, N1+N2-1> concat(char const (&a1)[N1], char const (&a2)[N2]){
  return concat(a1, a2, gen_seq<N1-1>{}, gen_seq<N2>{});
}

Live example.

I'd flesh this out some more, but I have to get going and wanted to drop it off before that. You should be able to work from that.

Oribel answered 8/11, 2012 at 17:31 Comment(14)
I also considered this approach, but used another one because of the "implementation quantities" (Annex B). Though I'm not absolutely sure, I think this limits the length of strings you can work on to 256 (or 1024) chars, whereas a string literal itself can be >65k chars long.Hagans
@Dyp (and Xeo): this suffers from the same problem as DyP's clever solution, which is that while it produces the expected output, it actually creates the strings at run-time. In order to get it not to do that, as far as I know, you have to do something like static const auto s = _call to clever constexpr_. I compiled both of these with clang 3.2 and g++ 4.7.2 (which 'sorry's on DyP's) to look at the assembly code generated.Homely
@DyP, I was just rereading that section, in fact. Afaics, it allows compilers to do constant initialization of temporaries, but it certainly doesn't require them to do so, and neither clang nor gcc does. However, it is quite possible that other text in the standard would also get in the way of constant initialization (beyond just proving that the restrictions in 3.6.2/3 don't apply, which might in itself be tricky).Homely
@Oribel when I make the fcts constexpr and assign constexpr auto s = concat(...); it does not compile on clang 3.1 ("read of uninitialized object") but I don't understand why. When I add a constexpr ctor to gen_seq<0, Is...>, it compiles fine.Hagans
@DyP: Derp, I actually forgot to make them constexpr. Fixed the code and it compiles fine on GCC 4.7.2. Clang 3.1 prob has a bug with constexpr here and 3.2 should compile just as fine.Oribel
@xeo, it compiles fine on clang 3.2, which surprised me by putting the concatenated string in .rodata (and then calling strlen on it to get its length!). gcc-4.7.2 also compiles it, but it constructs the string at runtime (movb $104, (%rsp); movb $101, 1(%rsp)... I noticed that you changed main in liveworkspace to store the result of concat in a variable, but you didn't make the variable static as indicated in my comment above. If the variable is not static, the compiler is not required to do constant initialization. (Making the change gets gcc to do the concat at compile time.)Homely
@rici: Actually, the compiler should be. constexpr variables require compile-time evaluation of the initializer.Oribel
@xeo: 7.1.5(9). constexpr variable declarations require that the initialization use only constant expressions, but they don't require the variable to be constant-initialized (3.6.2(2)). Only variables of static or thread storage need to be constant-initialized. The compiler may constant-initialize (as clang 3.2 does in this case) but doesn't require it, so gcc 4.7.2 is not, imo, incorrect.Homely
@rici: That doesn't make sense... what if you wanted to use a constexpr variable as a template argument right in the next line?Oribel
@xeo: using the constexpr variable as a template argument is independent from how it is initialized at run-time. You can use a static constexpr member at compile time without ever odr-using it, so that it doesn't even exist at run-time. (I've been caught out by that one; if you actually use it at run-time, it needs to be defined.)Homely
@rici: constexpr size_t size = calc_size(); array<int, size> a; -- size can't possibly be initialized at runtime.Oribel
let us continue this discussion in chatHomely
@xeo: Ok, then. In your example, size might not even exist at runtime, never mind be initialized. That's more important for larger objects, though. It's an oversimplification, but I would say that constexpr means that the object exists at compile-time; and static means that the object exists throughout the entire run-time. The two are completely independent. Without static, the lifetime (and storage consumption) of the object might be limited to the lifetime of its scope, which in the case of a large array might well be a good thing.Homely
The link to the example appears to be brokenBarbe
S
1
  • You cannot return a (plain) array from a function.
  • You cannot create a new const char[n] inside a constexpr (§7.1.5/3 dcl.constexpr).
  • An address constant expression must refer to an object of static storage duration (§5.19/3 expr.const) - this disallows some tricks with objects of types having a constexpr ctor assembling the array for concatenation and your constexpr fct just converting it to a ptr.
  • The arguments passed to a constexpr are not considered to be compile-time constants so you can use the fct at runtime, too - this disallows some tricks with template metaprogramming.
  • You cannot get the single char's of a string literal passed to a function as template arguments - this disallows some other template metaprogramming tricks.

So (as far as I know), you cannot get a constexpr that is returning a char const* of a newly constructed string or a char const[n]. Note most of these restrictions don't hold for an std::array as pointed out by Xeo.
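A small sketch of the parameter restriction listed above, with hypothetical names (`literal_length`, `three_ints`, `slot_count`): a length that travels in the array type is usable as a template argument, while the value of an ordinary parameter is not.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: the length arrives through the array type (N),
// not through the value of a function parameter.
template <std::size_t N>
constexpr std::size_t literal_length(const char (&)[N]) {
    return N - 1;  // drop the terminating '\0'
}

// Fine: the call is a constant expression, so it can size a std::array.
using three_ints = std::array<int, literal_length("abc")>;
static_assert(std::tuple_size<three_ints>::value == 3, "size comes from the literal");

// Not fine (left commented out): inside the function, n is an ordinary
// parameter, so it cannot be used as a template argument.
// constexpr std::size_t broken(std::size_t n) { return std::array<int, n>{}.size(); }

inline std::size_t slot_count() { return three_ints{}.size(); }
```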

And even if you could return some char const*, a return value is not a literal, and only adjacent string literals are concatenated. This happens in translation phase 6 (§2.2), which I would still call a preprocessing phase. constexpr functions are evaluated later (ref?). (f(x) f(y), where f is a function, is a syntax error afaik.)

But you can return from your constexpr fct an object of some other type (with a constexpr ctor or that is an aggregate) that contains both strings and can be inserted/printed into a basic_ostream.


Edit: here's the example. It's rather long o.O Note that you can streamline this if all you need is an additional "\n" at the end of a string. (This is more of a generic approach that I just wrote down from memory.)

Edit2: Actually, you cannot really streamline it. Creating the arr data member as an "array of const char_type" with the '\n' included (instead of an array of string literals) uses some fancy variadic template code that's actually a bit longer (but it works, see Xeo's answer).

Note: as ct_string_vector (the name's not good) stores pointers, it should be used only with strings of static storage duration (such as literals or global variables). The advantage is that a string does not have to be copied & expanded by template mechanisms. If you use a constexpr variable to store the result (as in the example main), your compiler should complain if the passed parameters are not of static storage duration.

#include <algorithm>    // std::copy
#include <cstddef>
#include <iostream>
#include <iterator>
#include <type_traits>  // std::remove_const and friends

template < typename T_Char, std::size_t t_len >
struct ct_string_vector
{
    using char_type = T_Char;
    using stringl_type = char_type const*;

private:
    stringl_type arr[t_len];

public:
    template < typename... TP >
    constexpr ct_string_vector(TP... pp)
        : arr{pp...}
    {}

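    // const is required since C++14: constexpr no longer implies const,
    // and length() is called on a constexpr (hence const) object in main
    constexpr std::size_t length() const
    {  return t_len;  }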

    template < typename T_Traits >
    friend
    std::basic_ostream < char_type, T_Traits >&
    operator <<(std::basic_ostream < char_type, T_Traits >& o,
        ct_string_vector const& p)
    {
        std::copy( std::begin(p.arr), std::end(p.arr),
            std::ostream_iterator<stringl_type>(o) );
        return o;
    }
};

template < typename T_String >
using get_char_type =
    typename std::remove_const < 
    typename std::remove_pointer <
    typename std::remove_reference <
    typename std::remove_extent <
        T_String
    > :: type > :: type > :: type > :: type;

template < typename T_String, typename... TP >
constexpr
ct_string_vector < get_char_type<T_String>, 1+sizeof...(TP) >
make_ct_string_vector( T_String p, TP... pp )
{
    // can add an "\n" at the end of the {...}
    // but then have to change to 2+sizeof above
    return {p, pp...};
}

// better version of adding an '\n':
template < typename T_String, typename... TP >
constexpr auto
add_newline( T_String p, TP... pp )
-> decltype( make_ct_string_vector(p, pp..., "\n") )
{
    return make_ct_string_vector(p, pp..., "\n");
}

int main()
{
    // ??? (still confused about requirements of constant init, sry)
    static constexpr auto assembled = make_ct_string_vector("hello ", "world");
    enum{ dummy = assembled.length() }; // enforce compile-time evaluation
    std::cout << assembled << std::endl;
    std::cout << add_newline("first line") << "second line" << std::endl;
}
Somnambulate answered 8/11, 2012 at 15:36 Comment(3)
The enum isn't necessary; static constexpr auto will do it. In fact, static const auto will make it constant-initialized if possible. But neither of those will let the second std::cout line act in the same way as the macro in the OP, not even to the extent of optimizing add_newline("first line") into a compile-time literal.Homely
@Homely Sry, but I still don't get why static constexpr is sufficient. After all, this is a block-scope static variable and therefore, 6.7/4 holds ("An implementation is permitted.."). Maybe a chat (cannot figure out how to start it o.O)?Hagans
@DyP: 6.7/4 says "Constant initialization (3.6.2) of a block-scope entity with static storage duration, if applicable, is performed before its block is first entered." So if constant initialization is applicable, it's applied. The "An implementation is permitted... of other block-scope variables..." statement applies to initializations for which the conditions in 3.6.2 do not apply. At least, that's my interpretation, but like I said in the disclaimer, IANALL.Homely
S
1

At first glance, C++11 user-defined string literals appear to be a much simpler approach. (If, for example, you're looking for a way to globally enable and disable newline injection at compile time)

Suttle answered 8/11, 2012 at 15:52 Comment(1)
Wait, maybe not... literal concatenation won't preserve the boundaries :(Suttle
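The boundary problem mentioned in the comment can be seen with a small sketch; the suffix `_len` and the function `joined_length` are hypothetical. Adjacent literals carrying the same suffix are concatenated first, and the literal operator is then called once on the merged text, so it never sees where one piece ended.

```cpp
#include <cstddef>

// Hypothetical suffix: yields the length that the literal operator receives.
constexpr std::size_t operator""_len(const char*, std::size_t n) { return n; }

static_assert("ab"_len == 2, "a single literal reports its own length");

// The two literals are merged into "abcd" before the operator runs, so
// there is one call with length 4, not two calls with length 2 each.
static_assert("ab"_len "cd"_len == 4, "boundaries between the pieces are lost");

inline std::size_t joined_length() { return "ab"_len "cd"_len; }
```

A per-literal newline injection would need the operator to run once per piece, which this concatenation rule prevents.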
C
0

Nope, for constexpr you need a legal function in the first place, and functions can't do pasting etc. of string literal arguments.

If you think about the equivalent expression in a regular function, it would be allocating memory and concatenating the strings - definitely not amenable to constexpr.

Corina answered 8/11, 2012 at 16:25 Comment(0)
