Transforming a string_view in-place

std::transform, as of C++20, is declared constexpr. I have a bunch of string utility functions that take std::string arguments, but a lot of the usage ends up just passing in small, short, character literal sequences at compile-time. I thought I would leverage this fact and declare versions that are constexpr and take std::string_views instead of creating temporary std::string variables just to throw them away...

ORIGINAL STD::STRING VERSION:

[[nodiscard]] std::string ToUpperCase(std::string string) noexcept {
    std::transform(string.begin(), string.end(), string.begin(), [](unsigned char c) -> unsigned char { return std::toupper(c, std::locale("")); });
    return string;
}

NEW STD::STRING_VIEW VERSION:

[[nodiscard]] constexpr std::string_view ToUpperCase(std::string_view stringview) noexcept {
    std::transform(stringview.begin(), stringview.end(), stringview.begin(), [](unsigned char c) -> unsigned char { return std::toupper(c, std::locale("")); });
    return stringview;
}

But MSVC complains:

error C3892: '_UDest': you cannot assign to a variable that is const

Is there a way to call std::transform with a std::string_view and put it back into the std::string_view or am I going to have to create a local string and return that, thereby defeating the purpose of using std::string_view in the first place?

[[nodiscard]] constexpr std::string ToUpperCase(std::string_view stringview) noexcept {
    std::string copy{stringview};
    std::transform(stringview.begin(), stringview.end(), copy.begin(), [](unsigned char c) -> unsigned char { return std::toupper(c, std::locale("")); });
    return copy;
}
Ashcroft answered 5/1, 2022 at 19:39 Comment(6)
A string_view is an immutable view into a sequence of characters. It looks but doesn't touch. That's the whole reason it doesn't need to copy anything to form a view of a const char[] C-style string literal. - Waddell
There is no way to avoid copying, because string_view only gives const access, and that is a good thing. How would you want to avoid copying in auto upper = ToUpperCase("foo");? - Solution
std::string_view refers to a constant contiguous sequence. Why not try std::span<char>? - Xenogenesis
@Xenogenesis - because it would have different semantics: in-place modification. - Solution
@Solution which is what it sounds like he wants to me. He specifically asks at the end if there is a way to modify it in place. - Viridity
Returning a string_view (which is non-owning) is already a dangling reference bug, never mind what the guts of the function do. - Printmaking

As said in one comment, span is a better vocabulary type for this because individual elements can be modified, giving a sort of mutable string view (msv). Also, I wouldn't make it nodiscard, because it can be useful even without assigning the result:

#include <algorithm>  // for std::transform
#include <cassert>
#include <cctype>     // for the single-argument std::toupper
#include <span>
#include <string>
#include <string_view>

constexpr auto ToUpperCase(std::span<char> msv) noexcept {
    std::transform(msv.begin(), msv.end(), msv.begin(), [](unsigned char c) -> unsigned char { return std::toupper(c); });
    return msv;
}

int main() {
    auto a = std::string{"compiler"};
    auto&& tmp = ToUpperCase(a);
    auto b = std::string{tmp.begin(), tmp.end()};
    assert( a == "COMPILER");
    assert( b == "COMPILER");
}

https://godbolt.org/z/zPr968PYr


Somewhat departing from your original aim, I think the following is more elegant, although it is prone to template bloat and ugly compilation errors. It has the same effect in the cases provided.

Also, I don't like the design of span (or string_view, for that matter).

(Exercise: add Concepts)

template<class StringRange>
constexpr StringRange&& ToUpperCase(StringRange&& stringview) noexcept {
    std::transform(stringview.begin(), stringview.end(), stringview.begin(),
        [](unsigned char c) -> unsigned char { return std::toupper(c); });
    return std::forward<StringRange>(stringview);
}

https://godbolt.org/z/e9aWKMerE

I find myself using this idiom quite a bit lately.

Troy answered 7/1, 2022 at 20:34 Comment(0)

You can't in-place transform a std::string_view - what if it has been constructed from char const*?

a lot of the usage ends up just passing in small, short, character literal sequences at compile-time.

...but you can lift string literals to the type level

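#include <algorithm>  // std::copy_n
#include <array>
#include <cstddef>    // std::size_t
#include <utility>    // std::index_sequence

namespace impl {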
    template<std::size_t n> struct Str {
        std::array<char, n> raw{};
        constexpr Str(char const (&src)[n + 1]) { std::copy_n(src, n, raw.begin()); }
    };
    template<std::size_t n> Str(char const (&)[n]) -> Str<n - 1>;
}
template<char... cs> struct Str { static char constexpr value[]{cs..., '\0'}; };
template<impl::Str s>
auto constexpr str_v = []<std::size_t... is>(std::index_sequence<is...>) {
    return Str<s.raw[is]...>{};
}(std::make_index_sequence<s.raw.size()>{});

...and add a special case. In general, this hack can be avoided with range/tuple polymorphic algorithms.

[[nodiscard]] constexpr auto ToUpperCase(auto str) {
    // ConstexprToUpper is assumed to be a user-supplied constexpr helper; std::toupper is not constexpr
    for (auto&& c: str) c = ConstexprToUpper(c);
    return str;
}
template<char... cs> [[nodiscard]] constexpr auto ToUpperCase(Str<cs...>) {
    return Str<ConstexprToUpper(cs)...>{};
}

So, to use that transformation optimized for character literal sequences, now write ToUpperCase(str_v<"abc">) instead of ToUpperCase("abc"sv). If you always want string_view as output, return std::string_view{Str<ConstexprToUpper(cs)...>::value} in that overload.

Locoweed answered 7/1, 2022 at 13:14 Comment(5)
Could you explain what you have achieved? Usage example of the 2 overloads of ToUpperCase() and in which way they are better than the straightforward implementation OP showed?Solution
@Solution yes: my version provides "way to call std::transform with a std::string_view and put it back into the std::string_view" instead of "create a local string and return that, thereby defeating the purpose of using std::string_view in the first place".Locoweed
Could you add an example of doing it in your answer? Also, are you creating the uppercase of string literals at compile time?Solution
@Solution I've edited my answer to show the usage. Yes, the transformation happens at compile time.Locoweed
Thank you, +1 since it shows advanced techniques, although I would not use this in production.Solution
