How to implement a string that solely allocates on the stack
Asked Answered
C

7

10

In a project about a decade ago, we found that std::vector's dynamic allocations caused a serious performance drain. In this case it was many small vectors allocated, so the quick solution was to write a vector-like class wrapping around a stack-based pre-allocated char array used as the raw storage for its capacity. The result was a static_vector<typename T, std::size_t Max>. Such a thing is easy enough to write if you know a few basics, and you can find quite a few such beasts on the web. In fact, boost has one, too, now.

Working on an embedded platform now, we happen to need a static_basic_string. That would be a string that pre-allocates a fixed maximum amount of memory on the stack and uses that as its capacity.

At first I thought this should be fairly easy (it could be based it on the existing static_vector, after all), but looking again at std::basic_string's interface I am not so sure anymore. It is way more complex than std::vector's interface. Especially implementing the family of find() functions std::basic_string comes with is more than just a tedious task.

That got me thinking again. After all, this is what allocators were created for: replace allocation based on new and delete with some other means. However, to say that the allocator interface is unwieldy would be an understatement. There are a few articles out there explaining it, but there is a reason I have seen so very few home-grown allocators in the last 15 years.

So here is my question:

If you had to implement a basic_string lookalike, how would you do it?

  • Write your own static_basic_string?
  • Write an allocator to pass to std::basic_string?
  • Do something I did not think of?
  • Use something from boost I am not aware of?

As always, there is a rather essential restriction for us: Being on an embedded platform, we are tied to GCC 4.1.2, so we can only employ C++03, TR1, and boost 1.52.

Chil answered 14/10, 2014 at 8:47 Comment(20)
This might be better suited for programmers.se since it's entirely in the designing phaseSoniasonic
@Marco: Mhmm. Looking at programmers.stackexchange.com/help/on-topic, "software architecture and design" is indeed listed there. However. 1) It's buried within a lot of other topics that have nothing to do with my question and seems to indicate they are talking about architecture on a very different level, and 2) I don't consider this a pure design question, because the interface is fixed (it's std::basic_string's interface) and I am more asking about how to implement it. Yes, this is fuzzy, but I do believe the question belongs here more than there. ICBWT.Chil
Why do you think allocators are so scary?.. I've used them on a couple of occasions; after all, this is what they are for. Derive "stack_string" from "basic_string", include allocator with built-in buffer as a member, pass that allocator to basic_string ctor.Mcadams
Is dynamic allocation totally out of the question, or would it be acceptable to do one dynamic allocation per std::basic_string instance, regardless of how its content changes later?Cop
@arunasr I think you should elaborate on this and turn it into an answer (probably with non-public inheritance and a few using declarations).Cop
Custom allocators are the way to go here. However, you should view your problem from a higher level: analyze the memory requirements and the data flow of the whole app. Profile it and get data. Then, find a strategic point where you possibly benefit from an "arena allocator" which both safes RAM and time. Otherwise, you may implement an allocator which initially allocates from the stack and when it exceeds the range, re-alloates the storage on the heap. See also: #11648702Metatarsal
It is inconceivable to me, that you can use boost in your project and , at the same time, have problems with memory allocation.Araby
I still can't imagine that this is a net positive over a std::string with large initial capacity and some restriction at the call site on maximum string size?! The cost of a single dynamic allocation is really worse than this level of complexity and the doubtlessly fragile code that follows? I guess if you're creating a metric frakton of these things but, then, try to re-use instead? More information on the use case and the original problems you encountered may help us to provide a canonical solution that isn't just NIHing.Acetum
Related: #784444Vyse
@Araby "inconceivable...use boost...and...have problems with memory allocation" - care to say something less than totally vague about it?Restorative
@TonyD using boost usually means enormous binaries and lots of allocations from the heap, in my experience, both undesirable on embedded devices.Araby
@Araby IMHO, given how numerous and enormously varied the boost libraries are, and that many of them are header-only templates with low overheads related to their actual use, that's a remarkably wild generalisation. The IT equivalent of "anyone who eats salads gets too much vitamin A".Restorative
@Metatarsal / @LightnessRacesinOrbit: consider a struct with lots of small string members - having them near-contiguous in memory - typically on the same page(s) of cache - sounds very advantageous compared to having each stored by pointers that - even if from the same custom pool - might be impractical to avoid allocating "discontiguously" without other undesirable coordination overheads.Restorative
@CouchDeveloper: The main issue here is that the string must never reallocate. The reasons for this are tied to a platform-dependent interface, but rest assured that we do need strings that never reallocate to satisfy our criterion. Of course, having been at the receiving end of memory fragmentation on a platform with rather limited memory plus having to deal with lots (hundreds) of such strings makes it an optimization target, too. But the main issue is that in this setup reallocations must not happen.Chil
@Lightness: See my comment above. Reallocations must be prevented definitely. The current workaround is to reserve twice the length we figure we need, but if that fails we will get crashes. In order to rule out that crashes are caused by this, I was given the task to eliminate this problem.Chil
@TonyD You have not addressed the "enormous binaries" issue at all, but have labeled my statement as "wild". There's a lot of allocations inside boost, but I am not going to make a detailed analysis about them.Araby
@user1095108: No offense, but these allegations are wild. We are on an embedded platform and make extensive use of boost and the more template-y parts of the standard library as well as writing our own template-heavy code. (Yes, sometimes codebloat from templates is an issue. Surprisingly, we have found that enabling inlining reduced the codebloat enough to currently not to be a headache anymore.) Whether problems are caused by boost, by code from the standard library, or by our own templates, simply doesn't matter. It's a problem with the platform, not with boost.Chil
Can you give a real example of use which is causing a problem? By reusing objects in a different way you max fix the problem.Seaborne
What's an example of a function showing up in your profiler?Seaborne
@Neil: Please read this comment.Chil
T
4

The first question is: how much of the extra interface do you use? Most of std::string's additional interfaces can be trivially implemented using functions in <algorithm> (e.g. std::find, std::find_if and std::search), and in a lot of cases, there are large chunks of it that won't be used anyway. Just implement it on an as needed basis.

I don't think you can make this work with a custom allocator. The only way to get the memory "on stack" would be to declare it as a member of the custom allocator, which would create all sorts of problems when copying them. And allocators must be copiable, and the copies must be idempotent.

Perhaps you can find a free implementation on the net of std::string which uses the small string implementation; then modify it so that the cutoff size (beyond which it uses dynamic allocation) is larger than any strings you actually use. (There are several open-source implementations of the standard library available; the one delivered with g++ still uses COW, but I suspect that most of the others use SSO.)

Tours answered 14/10, 2014 at 9:53 Comment(16)
Regarding the custom allocator: he wrote that he wants to wrap around an existing stack-allocated buffer, which should be possible, right?Upstage
@Upstage Maybe. There's still the problem of scope: when he returns a string, the copy will use the same allocator as what is being copied. If that in turn uses a stack-allocated buffer, he's going to be in deep trouble.Tours
In fact, I have no idea how to make allocators use local stack space, but I wondered whether someone could point out an easy way. Regarding finding an implementation that employs SSO: As I said, I am not worried about the memory management. I have implemented this already (for that static_vector) and I could even base static_basic_string on that, rather than implementing it again.Chil
Your advice to drop the parts of the interface that are harder to do, namely the find() family, seems good enough to me. I just asked all the developers here today when they had last used one of string's find functions, and it turned out that some had never used them, others once in a decade. So I guess I will just skip them for now.Chil
@Chil I'm about the same, and I do a lot of parsing. 99% of the time, I'll just use the functions in <algorithm>. Similarly, things like insert or replace seem to be pretty rare as well; I'll typically be building up a copy, and just be appending.Tours
@James: Thanks. Insert and replace can take a while to iron out the off-by-one errors :), but other than that I see no problem with them. However, finding the last occurrence of a substring is not that easy of an algorithm if you want to do it efficiently. However, the overwhelming majority here said that they haven't used this in years and, should they really need it, they can always copy the string into a standard library one and do it there. I guess I'll go with your advice then.Chil
It is possible to do it with custom silly allocator that simply returns the buffer, as I have shown below. It works on GCC and LLVM flavours of STL as expected (or unexpected, depending on your view). However, it is far too much trouble. There is no way to derive from basic_string that would work on a pre-allocated buffer. You have to instantiate the buffer, the allocator with that buffer and then the string with that allocator. It is far easier to implement a small custom class which replicates the desired subset of basic_string interface, as suggested in this answer.Mcadams
@arunasr Will it work in all cases. I thought of that myself, but then thought that it would cause problems when the implementation wanted to increase the capacity: it would do a second allocation, before freeing the first, and expect to have a pointer to different memory. (An insert in the middle, for example, where the std::string itself thought that it needed to reallocate. I would imagine that it copies the beginning, then the new characters, and then the end. If the new pointer points to the same buffer, copying the new characters will overwrite some of the end.)Tours
@JamesKanze There are 2 cases when the string needs to grow: append and insert. Both tested positive on GCC and LLVM STL, strange though it may be.Mcadams
@arunasr I don't think that append would be a problem: none of the existing characters move, so the copy would be into itself. insert might work, depending on the order of the copy, and the size of the insertion with respect to the number of characters behind it---if the implementation copies all of the old characters first, and the number of char's behind the insertion is not greater than the number being inserted, it will work. And of course, if the implementation uses something like memmove, and copies the new characters last, it will also workTours
@JamesKanze We have discussed to death (as I have done with sbi) what might or might not work... I have shown that it does. The code I've provided works with two different STL implementations. That does not mean it will work with MSVS, for instance, for all the reasons you and sbi have provided (and I agree with). It is easy enough to test that and prove one way or the other if you have MS compiler.Mcadams
@arunasr We must be doing something different, because when I try it, it doesn't work with g++, but it does with VC++. (But I've not tested it exhaustively with either; just one simple test.)Tours
@arunasr Just an update: it works with VC++ as long as the strings aren't longer than the SSO. With strings of 100 or more char's (and a buffer of 1000), it crashes with both g++ and VC++.Tours
@JamesKanze Perhaps g++ version... mine is 4.8.1. I also found that g++ uses allocator incorrectly to deallocate; it provides the wrong size of the block to deallocate. It also uses allocator default constructor, rebind and comparison in a weird way. I am very put off using g++ std::basic_string by what my investigation has shown... CLANG/LLVM works so much better.Mcadams
@arunasr Or perhaps we tested different things. In my case, I created a fairly long string in main, then inserted a string I'd returned from a function into it.Tours
@JamesKanze I did exactly what's below, nothing more, nothing less. With g++ and clang++ -stdlib=libc++Mcadams
A
1

It is easy, write a stack allocator, here's an example:

https://codereview.stackexchange.com/questions/31528/a-working-stack-allocator

With allocators, you can just as easily allocate, for example, from a memory-mapped file, i.e. from the disk drive, or from a static array of chars.

Araby answered 14/10, 2014 at 9:18 Comment(4)
This solution allocates strings inside a custom stack data structure, not on a true stack that compiler uses to remember return address. It still may work as a solution. I upvote.Draught
This code is in C++14. The requirement was C++03...still, the principle holds, it can be re-written. Note that std::basic_string uses the allocator in weird and wonderful ways, so you will have to add things like comparison operators. Also note that std::basic_string (and other containers) use copy constructor of the allocator liberally, and the stack_store does not handle that gracefully.Mcadams
@arunasr It is c++11 code. At least this is what I compile it with.Araby
@Araby The point is, it is not C++03/C++98. And it will not compile if used with basic_string because it does not implement operator== and struct rebind. I am not sure how it will handle copy operation, but by the look of it, it is not going to be pretty.Mcadams
B
1

There are lots of basic_string implementations, some entirely based on dynamic allocation, some other on dynamic allocation only for string wider than a given length (in fact, they use their own internal buffer when it fits).

Using an allocator is probably not the way to go, since the interface between the string and the allocator assumes that the allocator object is part of the container, but the allocated memory comes from outside the container itself. You can arrange it by implementing an allocator using the POSIX alloca, with all the drawbacks.

The problem when implementing strings on stack is that you cannot let it dynamically grow (may be the stack at that time has something more over) but you also have to take care of operations like += that can make a string longer and longer.

So you end up by preallocating (as an array or as a alloca supplied buffer, within your class or within an allocator does not change the problem) an amount of bytes you'll mostly waste either but not filling them all, or by not using them if the string has grown too much and requires to be dynamic.

There is probably a trade-off tightened to the memory-to-cache communication process (usually runs with 128 bytes or 4KBytes), but it is strongly hardware dependent, so the complexity to afford will not most likely pay for.

A more affordable solution can be an allocator that still allocates on the heap, but has the capability to retain and reuse the returned blocks (up to certain limit) reducing the need to ask the system to allocated / deallocate.

But performance, in this case, may not necessarily benefit if the underlying system already implement new/delete in that way.

Basir answered 14/10, 2014 at 9:32 Comment(4)
Yes, this certainly requires overallocation. However, the strings in question are usually limited to 16 or 32 characters, so this shouldn't be a problem. Also, this all is to interface with a C API which presumes pre-allocated buffers, which must not be re-allocated, and thus also overallocate.Chil
If that's the case, you need a size_t plus a space that can either contain 32 chars or a pointer (the discriminator is size()<32). Due to the peculiarity of the data, you probably will make it an independent self-contained class, eventually implicitly convertible to ad from std::string. The use of a custom allocators together with basic_string can be dagerous: what happens if a basic_string (that assumes data are outside of it and can be moved around) tries to "move"?Basir
What you describe is SSO. But we can't use that, because it falls back on dynamic memory – which we cannot use here.Chil
Yep, so it seems. Well, that's what I am doing now.Chil
P
1

LLVM ADT has the SmallString class. It also has SmallVector and many other useful classes.

While the current LLVM code base moves towards using C++11, (not-so-)old versions of LLVM support C++03.

Presber answered 14/10, 2014 at 9:42 Comment(3)
Interesting. Can this be used as a drop-in replacement for std::basic_string?Chil
@Chil its interface looks similar to the basic_string's, but it doesn't have some constructorsPresber
Mhmm. I'd rather have a something I can drop in, recompile, and be done. (As I wrote, the interface is to be std::basic_string.) Thanks anyway!Chil
R
1

An excellent starting point is Alexandrescu's policy-based string class, described in this Dr Dobbs article. It includes a SSO policy that basically does what you want (search the page for SmallStringOpt), and is easy to modify if/as you deem necessary. It's predates C++11 so you're fine there too.

Restorative answered 14/10, 2014 at 11:4 Comment(8)
The intrinsics of memory management for this I have covered. It's the rewriting of some of the other operations that made me question my approach.Chil
@Chil "rewriting of some of the other operations" - when you instantiate Alexandrescu's template with the SSO policy, you get all the other std::string-style operations. How does that not address your concern?Restorative
Ah, OK then, if that is a drop-in replacement for std::basic_string, this should work. Thanks, I'll look at it.Chil
@Chil Yup - first paragraph - "...article provides and explains a policy-based basic_string implementation that gives you twelve Standard-compliant canned implementations featuring various optimizations and tradeoffs...". Cheers.Restorative
Thanks, I could indeed "borrow" those implementations from there (+1). However, as James has pointed out, I could just as well leave them unimplement for the time being, because they are rarely ever needed.Chil
@sbi: yeah - I saw that conversation - pretty surreal. I'd be conscious of future developers' expectations, but I'm sure you can weight all that without me.... Cheers.Restorative
This isn't to be used everywhere, it's a special part of the code that serves a specific API. Strings to interact with that API must not reallocate. If we can drop in something that allows all the string operations commonly done on these strings, we're fine. Since few can remember ever having used this in their whole professional carrier, and since nobody here ever used it in the last few years, and since that API isn't for general use, I think we're good here. And if this becomes a problem, we can still implement it.Chil
@TonyD Future developers' expectations is an issue, and it depends somewhat on the organization. Where I work, there would be no problem: if a developer did use a function which wasn't there, he'd complain to me, and I'd add it. On an as needed basis; in the environment where I'm working, it wouldn't be an issue. But I've worked in other places where it would be. It depends on the work environment. (Typically, if there is a large body of programming users, it will be a problem. If you're not too many, and the communications are good, it won't be.)Tours
A
0

I'd use a combination of implementation-defined VLAs and standard algorithms, I should think.

Acetum answered 14/10, 2014 at 9:46 Comment(0)
M
-1

This is a working code, but NOT THE RECOMMENDED WAY.

This code has a lot of traces to show what it is doing. It does not check that the size of the allocation request does not exceed the buffer. You can this check if necessary. Note that std::basic_string tries to allocate more than necessary.

#include <string>
#include <iostream>

template<typename T, size_t S>
class fixed_allocator
{
  typedef std::allocator<T> _base;

  std::ostream& trace() const { return std::cerr << "TRACE fixed_allocator " << (void*)this ; }

public:
  typedef typename _base::value_type value_type;
  typedef typename _base::pointer pointer;
  typedef typename _base::const_pointer const_pointer;
  typedef typename _base::reference reference;
  typedef typename _base::const_reference const_reference;
  typedef typename _base::size_type size_type;
  typedef typename _base::difference_type difference_type;

  template<class C> struct rebind {
    typedef fixed_allocator<C, S*sizeof(C)/sizeof(T)> other;
  };
  T* buffer_;

  fixed_allocator(T* b) : buffer_(b) { trace() << "ctor: p="  << (void*)b << std::endl; }

  fixed_allocator() : buffer_(0) { trace() << "ctor: NULL" << std::endl; };
  fixed_allocator(const fixed_allocator &that) : buffer_(that.buffer_) { trace() << "ctor: copy " << (void*)buffer_ << " from " << (void*) &that << std::endl; };

  pointer allocate(size_type n, std::allocator<void>::const_pointer hint=0) {
    trace() << "allocating on stack " << n << " bytes" << std::endl;
    return buffer_;
  }

  bool operator==(const fixed_allocator& that) const { return that.buffer_ == buffer_; }
  void deallocate(pointer p, size_type n) {/*do nothing*/}
  size_type max_size() const throw() { return S; }
};

int main()
{
  char buffer_[256];
  fixed_allocator<char, 256> ator(buffer_);
  std::basic_string<char, std::char_traits<char>, fixed_allocator<char, 256> > str(ator);
  str.assign("ipsum lorem");
  std::cout << "String: '" << str << "' length " << str.size() << std::endl;
  std::cout << " has 'l' at " << str.find("l") << std::endl;
  str.append(" dolor sit amet");
  std::cout << "String: '" << str << "' length " << str.size() << std::endl;
  str.insert(0, "I say, ");
  std::cout << "String: '" << str << "' length " << str.size() << std::endl;
  str.insert(7, "again and again and again, ");
  std::cout << "String: '" << str << "' length " << str.size() << std::endl;
  str.append(": again and again and again, ");
  std::cout << "String: '" << str << "' length " << str.size() << std::endl;
  return 0;
}

This code has been tested on GCC and LLVM and performs as expected (or unexpected).

The syntax in unwieldy. It is not possible to derive from basic_string and embed the buffer. Far better way is to create a small specialised buffer_string class with the required subset of basic_string interface.

Mcadams answered 14/10, 2014 at 11:28 Comment(10)
Have you tried manipulating the string after assigning to it for the first time so that it would need to grow?Chil
The string is created empty and then manipulated (assigned). Does that count as "manipulation so that it needs to grow"? I did not try anything more than you see in the code, but I expect basic_string to simply call "allocate", possibly with the hint, and all it will get is the same buffer.Mcadams
str.append(" dolor sit amet"); worked perfectly well, if you allocate big enough buffer. basic_string tried to allocate 51 bytes at this, which cause "stack smash" exception with the original program. Having increased the buffer to 60, no problem.Mcadams
My point is that, when the string needs to reallocate, you simply return the pointer to the current data. However, the string will copy the old data to the newly allocated memory. Yes, I forgot that this would appear to work fine if the start of the string isn't changed. But it's still wrong. (Unless I am – again – missing something, inserting would unveil this flaw.)Chil
Don't be lazy! Cut, paste, compile and play. str.insert(0, "I say, "); works perfectly well.Mcadams
Sigh. Then that's because of the string overallocating (capacity()>size()). Just look at what std::basic_string actually does when the size needs to get extended, and then look at how this fits with your allocator. It cannot work, even though it might appear to. "Appears to work" is simply one way UB can express itself.Chil
In my tests, append, insert(0, ...) and insert(7, ...) work with both GCC and CLANG LLVM. Trace messages show it is resizing the buffer (growing); GCC grows 4 times (25, 36, 51, 77, 129 bytes) and CLANG grows once (48, 129 bytes). There are no overruns. basic_string appears to correctly move the tail of the string before copying the inserted string into the gap. I really do not understand what else you want...(sigh)Mcadams
I might be wrong, of course. From what I understand: If the string contains "lorem ipsum", and you insert "blah " at index 5, then, if the resulting size exceeds the capacity, the string will 1) allocate a new buffer and then either 2a) copy the old buffer to the new buffer and perform insertion there, or 2b) copy the first part of the old buffer over, append the new string, and then append the remainder of the old buffer. Finally it will 3) discard the old buffer. 2a would work with your allocator 2b (more efficient) would fail. The platforms you checked seem to employ 2a.Chil
I would not have expected platforms to do that. But, again, I might be wrong. It could be that, for some reason I didn't think of, vendors are required to use the less efficient copy-first-insert-later. Or my logic module is broken. Whatever is the case here, based on my current knowledge I would not do what you do.Chil
Let us continue this discussion in chat.Mcadams

© 2022 - 2024 — McMap. All rights reserved.