Why doesn't std::string_view have assign() and clear() methods?

The implementation of these methods seems straightforward to me, and they would make usage of std::string and std::string_view more interchangeable. After all, std::string_view has constructors which leave the object in the same state as these methods would. One could work around the missing methods like this:

std::string s {"abcd"};
std::string_view v {s.c_str()};
std::cout << "ctor:   " << v << std::endl; // "abcd"
v = {s.c_str() + 1, 2};
std::cout << "assign: " << v << std::endl; // "bc"
v = {}; // empty view; avoid v = {nullptr}, which constructs from a null const char* (undefined behavior, ill-formed since C++23)
std::cout << "clear:  " << v << std::endl; // ""

So, what are the reasons for not including these two obvious methods in the standard?

UPDATE: One general question in your comments seems to be "What's the point?", so I'll give you some context. I'm parsing a large string with the result being a structure of substrings. That result structure is a natural candidate for string views, so I don't have to copy all those strings, which are even overlapping. Part of the result are maps to string views, so I might need to construct them empty when I get the key and fill them later when I get the value. While parsing I need to keep track of intermediate strings, which involves updating and resetting them. Now they could be replaced by string views as well, and that's how I happened on those missing functions. Of course I could continue using strings or replace them by plain old ptr-ptr or ptr-size pairs, but that's exactly what std::string_view is for, right?
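To make the use case above concrete, here is a minimal sketch of such a parser. The function name parse_pairs and the "key=value;key=value" input format are assumptions made up for illustration; they are not taken from the actual project. It shows the "construct empty, fill later" pattern using nothing but default construction and plain assignment:

```cpp
#include <map>
#include <string_view>

// Hypothetical helper: parse "key=value;key=value" into views over `input`.
// No characters are copied; the viewed buffer must outlive the returned map.
std::map<std::string_view, std::string_view> parse_pairs(std::string_view input)
{
    std::map<std::string_view, std::string_view> result;
    while (!input.empty()) {
        // Split off the next ';'-terminated field and advance past it.
        const auto field_end = input.find(';');
        const std::string_view field = input.substr(0, field_end);
        input.remove_prefix(field_end == std::string_view::npos
                                ? input.size()
                                : field_end + 1);

        const auto eq = field.find('=');
        const std::string_view key = field.substr(0, eq);

        // "Construct empty": operator[] value-initializes the mapped view.
        std::string_view& value = result[key];

        // "Assign": re-point the view once the value is known.
        if (eq != std::string_view::npos)
            value = field.substr(eq + 1);
    }
    return result;
}
```

A key without '=' simply keeps its empty view, which is the state a clear() would have produced.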

Carloscarlota answered 24/1, 2019 at 11:53 Comment(7)
To be clear: You would expect that these make the object "look at" a different string (not actually modify the currently viewed string)? - Grillparzer
Yes, just changing the view, not the underlying string. - Carloscarlota
That's semantically different from the corresponding std::string methods (which is probably the reason you're looking for). Just like changing a pointer to a nullptr is different from deleting the pointed-to object. - Grillparzer
Afaik, string_view's primary purpose is to replace const string& in function parameters. It's a good question what the point of string_view::clear and string_view::assign would be. - Nonessential
@MaxLanghof I think I understand what you're saying but I don't agree about the semantical difference. You clear a string and you clear a view, afterwards both are empty. Same semantics. Whether or not there's some memory deallocation or whatever else happening under the hood doesn't actually matter. Otherwise, std::vector<int>::clear() would be semantically different from std::vector<int*>::clear(), no? - Carloscarlota
@Carloscarlota No, that vector analogy doesn't hold. You remove (and deallocate) all the elements. Both int and int* are trivial to deallocate, there is no semantic difference. For the record, I agree with you that there are "interpretations" of assign and clear where there is no pronounced semantical difference. But you have to agree that it is potentially confusing whether clear also clears the viewed string itself - after all std::string does that. - Grillparzer
@MaxLanghof My point was that vector owns the pointer, which it deallocates (just like the int values), but not the pointee, which it doesn't deallocate. You and I and everyone else knows that. Same with string_view in my opinion, the pointee would never be deallocated on clear(). Why would string_view do that, it's just a view on an existing object, not a smart pointer. Yet people seem to agree with you and expect it to operate on the underlying string. Anyway, your point is clear, even though I don't agree. Thanks for your input, but let's put the semantics discussion to rest. - Carloscarlota
E
4

This is only ever really going to be speculation, but general consensus seems to be that these operations would be middlingly unclear.

Personally I think "clearing a view" makes perfect sense (and let's also not forget that remove_prefix and remove_suffix exist! Though see below...), but I also agree that there are other interpretations, which may be common, which make less sense. Recall that string_view is intended to complement const std::string&, not std::string, and neither of the functions you name is a part of std::string's constant interface.

To be honest, the fact that we need this conversation at all is, itself, probably a good reason to just not have the function in the first place.

From the final proposal for string_view, the following passage is not about assign or clear specifically but does act as a relevant view [lol] into the minds of the committee on this subject:

s/remove_prefix/pop_front/, etc.

In Kona 2012, I proposed a range<> class with pop_front, etc. members that adjusted the bounds of the range. Discussion there indicated that committee members were uncomfortable using the same names for lightweight range operations as container operations. Existing practice doesn't agree on a name for this operation, so I've kept the name used by Google's StringPiece.

This proposal did in fact include a clear(), which was unceremoniously struck off the register in a later, isolated, rationale-starved proposal.

Now, one might argue that the functions could therefore have been provided under different names, but that was never proposed, and it's hard to imagine what alternative names would resolve this problem without being simply bad names for the operations.

Since we can assign a new string_view easily enough, including an empty one, the whole problem is solved by simply not bothering to address it.
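For anyone who still wants named operations, a rough sketch of free-function stand-ins follows. The names clear_view and assign_view are hypothetical (nothing like them exists in the standard library); they are thin wrappers over the plain assignment described above and never touch the viewed characters:

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for a clear() member: re-point the view at nothing.
constexpr void clear_view(std::string_view& v) noexcept
{
    v = {};  // default-constructed view: data() == nullptr, size() == 0
}

// Hypothetical stand-in for an assign() member: view the range [p, p + n).
// The caller guarantees that the range is valid and outlives the view.
constexpr void assign_view(std::string_view& v, const char* p, std::size_t n) noexcept
{
    v = std::string_view{p, n};
}
```

Since both bodies are one-line assignments, the wrappers add naming only, which is arguably the point the committee made by leaving them out.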

Exhibitive answered 24/1, 2019 at 15:19 Comment(7)
That's good information on the remove_prefix and remove_suffix. That proposal also addresses why it exists even though it is not in string (utility, supposedly), although not why if it's added to the view, why not also add it to string. - Eparchy
"the whole problem is solved by simply not bothering to address it" Well, I was just curious about the rationale behind not including those functions. I was hoping for someone to chime in with the appropriate passage from the standard instead of people voicing their (to me unjustified) discomfort with those hypothetical functions. And there it is and what does it say? They omitted them because some "committee members were uncomfortable" with it. Oh well... Seeing the discussion here it probably was justified... Marked as answered. - Carloscarlota
@Carloscarlota To clarify, it is the committee that has solved the problem by not addressing it, by simply not adding those functions. This was a fair question; I just don't think we can do any better than the above. Certainly there is no rationale in normative or non-normative standard text regarding this. - Exhibitive
@LightnessRacesinOrbit, is that the final proposal? That seems to have a clear? - Eparchy
@JeffGarrett Oh, man, so it does ... (a) someone lied to me, and (b) the plot thickens. I'll have to revisit this in the morning. Good spot! - Exhibitive
But then: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4288.html Unfortunately without any explanation... - Carloscarlota
@Carloscarlota Lol. - Exhibitive
C
5

The std::string interface has a bad reputation for its bloated API. That is why std::string_view is very unlikely to get as many methods as std::string just because it would be convenient or would make the two types more interchangeable.

More importantly, these types aren't meant to be interchangeable. What does it mean for a view on a container of characters to be "cleared"? As clear() is present on nearly every standard container and actually removes the elements there, having a std::string_view::clear() would be quite confusing.

In addition, a view on some data is meant for temporary consumption, e.g. a read-only function parameter. Why would you want to assign to it anyway? Here is an example function signature that uses std::string_view:

// Called e.g. with "x86_64-Darwin-16.7.0"
std::string_view extractOSName(std::string_view configStr)
{
    // Parse input, return a new view. Lifetime/ownership managed by caller.
    // No need to re-assign anything, let alone "clearing" them.
}
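
One possible body for that signature, as a sketch only. It assumes the config string has the form "<arch>-<os>-<version>", which is a guess based on the example value in the comment, and it returns an empty view when no '-' is present:

```cpp
#include <string_view>

// Assumed input format: "<arch>-<os>-<version>", e.g. "x86_64-Darwin-16.7.0".
// Returns a view into the caller's characters; nothing is copied.
std::string_view extractOSName(std::string_view configStr)
{
    const auto first = configStr.find('-');
    if (first == std::string_view::npos)
        return {};  // no separator at all: empty view

    // Second separator, or npos if the OS name runs to the end.
    const auto second = configStr.find('-', first + 1);
    const auto length = second == std::string_view::npos
                            ? std::string_view::npos
                            : second - first - 1;
    return configStr.substr(first + 1, length);
}
```

The parameter is never re-pointed; the result is produced with substr() alone.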
Corrida answered 24/1, 2019 at 12:2 Comment(3)
I agree that the std::string API is totally overblown, but assign() and clear() functions are pretty much standard in all the containers. And std::string_view acts as a container in the sense that it shouldn't matter if you're looking at a std::string_view or a std::string const &. That's what I meant by interchangeable. But I don't understand what could be confusing about std::string_view::clear(). What else could it do than clearing the view, so it's empty afterwards? Nobody would expect it to actually delete the underlying string, right? - Carloscarlota
You say that "it shouldn't matter if you're looking at a std::string_view or a std::string const &", but doesn't that imply that assign doesn't make any sense in the first place, because a std::string const& can't be modified? On the clear subject: I think views are meant to be short-lived objects primarily for function parameters. Re-setting the object they view might simply not be the scenario they were designed for? But that's nothing but a guess. - Corrida
I guess one would have to discriminate between constructing a string view and using it. Construction might not be as straightforward as a single constructor call, see the context I provided in my question. There are also std::string_view::remove_prefix/suffix() functions, which are non-const. But using a std::string_view const & should be pretty much indistinguishable from using a std::string const &, I think. - Carloscarlota
E
3

The implementation of these methods seems straightforward to me and they would make usage of std::string and std::string_view more interchangeable.

std::string_view is not intended to be a replacement for std::string. It is intended as a replacement for const std::string&, and assign and clear are not member functions you can call through a const std::string&.
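
As a small illustration of that read-only role, the sketch below takes its argument as std::string_view where older code would have taken const std::string&. The function count_spaces is a made-up example, not something from the question:

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Read-only parameter: accepts a std::string, a string literal or a sub-view
// alike, without constructing a temporary std::string for any of them.
std::size_t count_spaces(std::string_view text) noexcept
{
    std::size_t n = 0;
    for (const char c : text)
        if (c == ' ')
            ++n;
    return n;
}
```

Inside such a function only the const part of the string interface is needed, which is the part std::string_view mirrors.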

Eparchy answered 24/1, 2019 at 15:19 Comment(3)
Good point but neither are remove_prefix and remove_suffix - Exhibitive
@LightnessRacesinOrbit and they're also not present in std::string with confusingly different semantics. - Highup
Perhaps assign and clear could be present in natural string_view::reset overloads. That would be consistent with optionals, smart pointers etc. - Highup
N
2

std::string_views are references to strings elsewhere in memory. But unlike the good (or bad) old C char *, std::string_views rather have a pointer and a size. That's why they can refer to a specific part of some other string, and that's why remove_prefix() and remove_suffix() are meaningful operations on string_views.
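
A short sketch of why these two operations fit a pointer-plus-size view: trimming only moves the start pointer and shrinks the size. The helper name trim_spaces is an assumption for illustration:

```cpp
#include <string_view>

// Hypothetical helper: strip leading and trailing spaces by shrinking the
// view from both ends. The viewed characters are neither copied nor modified.
constexpr std::string_view trim_spaces(std::string_view v) noexcept
{
    while (!v.empty() && v.front() == ' ')
        v.remove_prefix(1);  // advance the start pointer, reduce the size
    while (!v.empty() && v.back() == ' ')
        v.remove_suffix(1);  // reduce the size only
    return v;
}
```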

However, you don't need an assign() method for std::string_view, as that would be no different from the assignment operator. std::string has assign() to cover the many cases that its assignment operator cannot handle, because an assignment operator cannot take more than one argument. A few examples:

std::string A { "This is a somewhat lengthy string" };
std::list<char> L { '1', 'A', 'Z' };
std::string B;

B.assign(80, ' ');               // Fill with a line of 80 spaces
B.assign(A, 10, 8);              // Copy out a substring of another string
B.assign(L.begin(), L.end());    // Assign from an STL container

Of these three cases, only the middle one is meaningful for std::string_view. Both the first and the last would require allocating storage for the characters, which is exactly what std::string_view cannot do.

The middle case, however, can easily be achieved by using std::string_view::substr(). For example:

std::string A { "This is a somewhat lengthy string" };
std::string_view V { "Directly pointing to A C string" };
std::string_view B;

B = std::string_view{A}.substr(10, 8);    // Reference substring of A
B = B.substr(0, 4);                       // Further shorten the reference
B = V.substr(0, 8);                       // Reassign with another reference
B = B.substr(0, 0);                       // "Clear" the string_view

So both clear() and assign() on a std::string_view can be done using the assignment operator instead. This is not true for std::string: the middle assignment, B.assign(A, 10, 8);, overwrites the already allocated B with data from A. A solution using the assignment operator, such as B = A.substr(10, 8);, creates a temporary std::string in the substr() call and then typically frees the memory that was previously allocated to B. This is often inefficient, which is why the substring assign() overload was added to std::string. std::string_view::substr(), on the other hand, never copies any string data, so the assignment B = std::string_view{A}.substr(10, 8); is cheap: it only copies a pointer and a size.

Please note, however, that the following code is an error: A.substr() returns a temporary std::string, and V is left dangling as soon as that temporary is destroyed at the end of the statement:

std::string A { "This is a somewhat lengthy string" };
std::string_view V { A.substr(10, 8) };

So, if you need to set a std::string_view to a substring of a string, always convert to a std::string_view first (via std::string_view(string)) and then perform the required substr() method call on the std::string_view. Of course, make sure, that the original string outlives the newly generated std::string_view, or you end up with dangling references.
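
One way to make that rule harder to break is sketched below. The helper view_of is hypothetical; it wraps the "convert first, then substr()" order and deletes the overload for temporaries, so the dangling variant fails to compile instead of failing at run time:

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: view a substring of `s` without copying any characters.
// Throws std::out_of_range if pos > s.size(), like std::string_view::substr().
inline std::string_view view_of(const std::string& s, std::size_t pos, std::size_t count)
{
    return std::string_view{s}.substr(pos, count);
}

// Temporaries are rejected: a view into them would dangle immediately.
std::string_view view_of(std::string&&, std::size_t, std::size_t) = delete;
```

With this in place, view_of(A, 10, 8) works for a named string A, while view_of(A.substr(10, 8), 0, 8) is a compile error. The caller is still responsible for keeping A alive and unmodified while the view is in use.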

Newsstand answered 13/10, 2021 at 9:31 Comment(2)
Nice workaround using substr(), but you're arguing on my behalf. If assign() and clear() can be expressed in terms of substr(), why not have them in the first place? Plus, I guess remove_prefix()/remove_suffix() could be expressed in a similar manner. - Carloscarlota
@Carloscarlota std::string_view::substr() can do what both std::string_view::assign() and std::string_view::clear() would do, and it is even useful for things that they could not do. So why create two methods instead of just one? - Newsstand
S
0

string_view could be an awesome string-interface wrapper for plain C strings (char arrays) when the backing store is not under your control but you still want to read the contents, and re-point or shrink the view, without the ability (or need) to meddle with the backing store, i.e. in contexts where dynamic memory allocation is out of the question. Note that string_view itself only gives read-only access to the characters.

Spalato answered 30/9, 2021 at 11:37 Comment(0)
A
0

We don't have assign() or clear() methods for string_view, but we luckily have a combination of the two: swap(string_view& other). If you don't need the contents of the original string_view after swap, you can simply forget about it, e.g.

string_view sv {"original text"};
if (want_new) {
  string_view tmp {"new text"};
  sv.swap(tmp);
}

I use this pattern to extend the string_view class, e.g. to wrap a memory mapped file into a string_view. My derived class has a constructor that takes the file name, and a destructor that unmaps and closes the file (error handling skipped for brevity):

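#include <fcntl.h>      // open
#include <sys/mman.h>   // mmap, munmap
#include <unistd.h>     // close
#include <filesystem>
#include <string_view>

namespace fs = std::filesystem;
using std::string_view;

class mmmap_string_view: public string_view {
public:
  explicit mmmap_string_view(fs::path path) {
    fd = open(path.c_str(), O_RDONLY);
    auto sz = fs::file_size(path);
    auto *addr = mmap(NULL, sz, PROT_READ, MAP_PRIVATE, fd, 0);
    string_view tmp {static_cast<const char *>(addr), sz};
    swap(tmp);  // re-point the base-class view at the mapping
  }
  // Copies would unmap and close the same file twice, so forbid them.
  mmmap_string_view(const mmmap_string_view&) = delete;
  mmmap_string_view& operator=(const mmmap_string_view&) = delete;
  ~mmmap_string_view() {
     munmap(const_cast<char *>(data()), size());
     close(fd);
  }
private:
  int fd = -1;
};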
Arris answered 1/10, 2024 at 10:31 Comment(0)
