Array placement-new requires unspecified overhead in the buffer?
S

7

69

5.3.4 [expr.new] of the C++11 Feb draft gives the example:

new(2,f) T[5] results in a call of operator new[](sizeof(T)*5+y,2,f).

Here, x and y are non-negative unspecified values representing array allocation overhead; the result of the new-expression will be offset by this amount from the value returned by operator new[]. This overhead may be applied in all array new-expressions, including those referencing the library function operator new[](std::size_t, void*) and other placement allocation functions. The amount of overhead may vary from one invocation of new to another. —end example ]

Now take the following example code:

void* buffer = malloc(sizeof(std::string) * 10);
std::string* p = ::new (buffer) std::string[10];

According to the above quote, the second line new (buffer) std::string[10] will internally call operator new[](sizeof(std::string) * 10 + y, buffer) (before constructing the individual std::string objects). The problem is that if y > 0, the pre-allocated buffer will be too small!

So how do I know how much memory to pre-allocate when using array placement-new?

void* buffer = malloc(sizeof(std::string) * 10 + how_much_additional_space);
std::string* p = ::new (buffer) std::string[10];

Or does the standard somewhere guarantee that y == 0 in this case? Again, the quote says:

This overhead may be applied in all array new-expressions, including those referencing the library function operator new[](std::size_t, void*) and other placement allocation functions.

Sugden answered 4/1, 2012 at 0:3 Comment(16)
Question from chat.stackoverflow.com/transcript/message/2270516#2270516Sugden
For clarity purposes, what is "f" ?Sherrill
@JaredKrumsie: The C++11 standard doesn't clarify. Apparently it simply represents any arbitrary value of any arbitrary type. For the purpose of this particular question, I suppose it must represent a char*.Sugden
I don't think you can know that at all. I think placement new was always thought of as a tool for using your own memory manager rather than as something that lets you pre-allocate memory. Anyway, why don't you simply loop through the array with regular new? I don't think it will influence performance much, because placement new is basically a no-op, and constructors for all objects in the array have to be called separately anyway.Ianthe
@Ianthe that's not as simple as it looks! If one of the constructors throws midway through the loop you have to clean up the objects you already constructed, something array-new forms do for you. But everything seems to indicate placement-array-new cannot be safely used.Labret
@FredOverflow: Thanks a ton for clarifying the question.Sugden
What is the point of the x and y additional space (had to find the x value here, since it wasn't included)? If it is for when an exception occurs, then it should be stated in the standard. If it is compiler implementation specific, that makes it totally useless from a portability standpoint.Peacock
@Adrian: The point of the space is presumably so that the implementation can tell how many destructors to call. Without that unspecified space, it would be nearly impossible for delete[] to know how many objects there are.Sugden
Same idea. To be able to determine how many objects to call the destruction on. However, since delete[] requires this, this should be defined in the standard somewhere as to what it contains, or at least that it should be either defined as a class/struct that has particular properties.Peacock
@Adrian: An implementation could also place other information there, such as alignment, or whatever it needs. If it was defined in the standard, it would be impossible to implement correctly in a standards compliant way. That's why they have implementation defined details...Sugden
True, but it would make for portable code if all of these things could be stipulated in the document and have a means of determining these values through code. I find that C++ is still somewhat of an experimental language, so I think the designers don't want to paint themselves into a corner. But that's the reason for communication between the different stake holders and the designers, to attempt to keep that from happening.Peacock
@Adrian: it's also plausible that it's designed this way so that an implementation could store the number of destructors to call in one place, and have 0 overhead in another place in the same program, if the value is known elsewhere.Sugden
That is what would make sense and how I thought that it was done. However, if that were the case, it should be an implementation detail of the operator new[] and operator delete[] in whatever scope they are located in to deal with this extra overhead internally rather then having this overhead passed along with the minimal required space. I think that was the original intent, but if a constructor throws an exception, this can cause a problem if it's not known how many elements have been constructed. What's really missing from C++ is a way to define how to construct an array of elements.Peacock
This overhead may be applied in all array new-expressions, including those referencing the library function operator new[](std::size_t, void*) Ugh, that's horrible (and I'm not sure I believe it - it's nonsensical).Czarina
Huh, still in the latest draft standard: eel.is/c++draft/expr.new#15. I still don't believe it.Czarina
Wow, ended up here watching youtu.be/IAdLwUXRUvg?t=1337. This is extremely scary, and it follows that placement new for arrays is practically unusable. Are there warnings in place on major compilers to avoid such catastrophic failure?Pigfish
B
52

Update

Nicol Bolas correctly points out in the comments below that this has been fixed such that the overhead is always zero for operator new[](std::size_t, void* p).

This fix was done as a defect report in November 2019, which makes it retroactive to all versions of C++.

Original Answer

Don't use operator new[](std::size_t, void* p) unless you know a priori the answer to this question. The answer is an implementation detail and can change with compiler/platform, though it is typically stable for any given platform. E.g. this is something specified by the Itanium ABI.

If you don't know the answer to this question, write your own placement array new that can check this at run time:

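#include <cstddef>
#include <iostream>
#include <new>
#include <string>

// Placement array new that checks the requested size n against the buffer size.
inline
void*
operator new[](std::size_t n, void* p, std::size_t limit)
{
    if (n <= limit)
        std::cout << "life is good\n";
    else
        throw std::bad_alloc();
    return p;
}

int main()
{
    alignas(std::string) char buffer[100];
    std::string* p = new(buffer, sizeof(buffer)) std::string[3];
}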

By varying the array size and inspecting n in the example above, you can infer y for your platform. For my platform y is 1 word. The sizeof(word) varies depending on whether I'm compiling for a 32 bit or 64 bit architecture.
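The measurement described above can be wrapped in a small helper. This is only a sketch: Probe, array_new_overhead and the 64 bytes of slack are names and numbers made up for illustration, and the slack is an assumption about how large the overhead can get on a given platform, not a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical helper: carries the buffer capacity in and the requested size out.
struct Probe {
    std::size_t limit;      // bytes available in the buffer
    std::size_t requested;  // bytes the new-expression asked for
};

// User-defined placement form. It records the size it was given and
// refuses requests that would not fit in the buffer.
void* operator new[](std::size_t n, void* p, Probe& probe) {
    probe.requested = n;
    if (n > probe.limit) throw std::bad_alloc();
    return p;
}

// Matching placement delete; only called if an element constructor throws.
void operator delete[](void*, void*, Probe&) noexcept {}

// Returns the overhead y, in bytes, observed for one `new T[N]` expression.
template <class T, std::size_t N>
std::size_t array_new_overhead() {
    // Assumption: 64 bytes of slack is enough for the overhead on this platform.
    alignas(std::max_align_t) unsigned char buffer[sizeof(T) * N + 64];
    Probe probe{sizeof(buffer), 0};
    T* p = new (static_cast<void*>(buffer), probe) T[N];
    for (std::size_t i = N; i > 0; --i) p[i - 1].~T();  // destroy by hand
    assert(probe.requested >= sizeof(T) * N);
    return probe.requested - sizeof(T) * N;
}
```

Calling array_new_overhead<std::string, 3>() and array_new_overhead<int, 3>() on the same platform may give different results, since the overhead can depend on whether the element type has a trivial destructor.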

Buttonhole answered 4/1, 2012 at 4:13 Comment(16)
There's another good idea I never considered! I believe the spec says it may vary from call to call, but this handles even that correctly!Sugden
How does this account for alignment, though? Is the offset guaranteed to fit into the required alignment?Catechol
Can't you infer y simply by taking the difference of the (bytecast) pointers buffer and p?Catechol
@Kerrek SB: You are correct that I was careless with alignment. I've added alignas to the client code to make things right. The placement new expression should take care of alignment with respect to the "cookie" and the "data" respectively. For example here is how the Itanium ABI does it (sourcery.mentor.com/public/cxx-abi/abi.html#array-cookies). And yes, you can infer y as you suggest. Be aware that y may be dependent on the alignment of the new'd type, and on whether or not that type has a trivial destructor (and other platforms may have other details).Buttonhole
@HowardHinnant: I'm still baffled that the placement version requires any cookie at all. What's it for? What's in it? After all, the only way you can destroy those array elements is by hand, isn't it? Your link even says that there's no cookie for the placement version (size_t, void*). Do you think the non-zeroness of the cookie should be a defect report?Catechol
@Kerrek SB: Well that's a good question and I'm not sure I have a good answer for it. I suppose that some hypothetical user-written placement delete, which is called in case there is an exception thrown during the default construction of each element, might make use of the cookie during clean up. But I don't have a good example of such a case in my back pocket. And even if such a hypothetical user-written placement delete existed, it would necessarily be platform dependent. On the bright side, it is legal for sizeof(y) to be 0. :-)Buttonhole
If you would like to submit a defect report on this, it should be aimed at the CWG (as opposed to the LWG). Here is the CWG issues list: open-std.org/jtc1/sc22/wg21/docs/cwg_active.html . And your best strategy for submitting an issue is to email the author of that list. I don't know if an issue demanding y == 0 always would be successful if for no other reason than backwards compatibility with established ABI's such as the Itanium ABI. Breaking ABI at this low level is very daunting.Buttonhole
@HowardHinnant: Thanks! I posted a DR to the standard mailing list for now, let's see if it makes it past the moderators! Unfortunately I can't reproduce any case where y is ever non-zero, but I don't have access to an Itanium, alas.Catechol
It seems that here is already a defect report on this matter! D'oh...Catechol
Wow, after 7 years of no action maybe you're the person to write that requested paper! :-)Buttonhole
@HowardHinnant: Who, me? I made a DR post on the mailing list, but no response as of yet.Catechol
@KerrekSB: The cookie is needed when the compiler decides it needs the length of the allocated array. Note well: when it so decides, and thus it is not always present. Notably, when objects with trivial destructors are involved, the cookie might just be omitted. This happens with any placement overloads, which means that all of them should be avoided for arrays, not only the "buffer" overload for void*.Fulsome
@Fulsome Array placement-new (void* operator new[]( std::size_t count, void* ptr );) does not allocate memory. It's a no-op.Samaveda
@Fulsome that there's a defect in the standard that has been ignored for far too long.Samaveda
@HowardHinnant: FWIW, this has been changed in C++20. The non-allocating placement operator new[] doesn't deal with these offsets anymore.Plumber
Thanks! I've updated the answer with this information.Buttonhole
C
9

Update: After some discussion, I understand that my answer no longer applies to the question. I'll leave it here, but a real answer is definitely still called for.

I'll be happy to support this question with some bounty if a good answer isn't found soon.

I'll restate the question here as far as I understand it, hoping that a shorter version might help others understand what's being asked. The question is:

Is the following construction always correct? Is arr == addr at the end?

void * addr = std::malloc(N * sizeof(T));
T * arr = ::new (addr) T[N];                // #1

We know from the standard that #1 causes the call ::operator new[](???, addr), where ??? is an unspecified number no smaller than N * sizeof(T), and we also know that that call only returns addr and has no other effects. We also know that arr is offset from addr correspondingly. What we do not know is whether the memory pointed to by addr is sufficiently large, or how we would know how much memory to allocate.


You seem to confuse a few things:

  1. Your example calls operator new[](), not operator new().

  2. The allocation functions do not construct anything. They allocate.

What happens is that the expression T * p = new T[10]; causes:

  1. a call to operator new[]() with size argument 10 * sizeof(T) + x,

  2. ten calls to the default constructor of T, effectively ::new (p + i) T().

The only peculiarity is that the array-new expression asks for more memory than what is used by the array data itself. You don't see any of this and cannot make use of this information in any way other than by silent acceptance.


If you are curious how much memory was actually allocated, you can simply replace the array allocation functions operator new[] and operator delete[] and make it print out the actual size.
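A minimal sketch of that replacement follows. It records the size instead of printing it, to avoid doing I/O inside the allocator. The names g_last_array_request, g_sink and allocating_overhead are invented for this example, and the reading is only meaningful if T's constructors do not themselves use array new.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Hypothetical global: size passed to the most recent array allocation.
static std::size_t g_last_array_request = 0;

// Keeps the pointer observable so the compiler cannot elide the allocation.
static void* volatile g_sink = nullptr;

// Replacement for the global array allocation function.
void* operator new[](std::size_t n) {
    g_last_array_request = n;
    if (void* p = std::malloc(n != 0 ? n : 1)) return p;
    throw std::bad_alloc();
}

// Replacements for the matching deallocation functions (unsized and sized).
void operator delete[](void* p) noexcept { std::free(p); }
void operator delete[](void* p, std::size_t) noexcept { std::free(p); }

// Returns how many bytes beyond count * sizeof(T) were requested by `new T[count]`.
template <class T>
std::size_t allocating_overhead(std::size_t count) {
    T* p = new T[count];
    std::size_t requested = g_last_array_request;  // read before anything else allocates
    g_sink = p;
    delete[] p;
    assert(requested >= count * sizeof(T));
    return requested - count * sizeof(T);
}
```

For example, allocating_overhead<std::string>(10) reports the x that the implementation added for that particular expression.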


Update: As a random piece of information, you should note that the global placement-new functions are required to be no-ops. That is, when you construct an object or array in-place like so:

T * p = ::new (buf1) T;
T * arr = ::new (buf10) T[10];

Then the corresponding calls to ::operator new(std::size_t, void*) and ::operator new[](std::size_t, void*) do nothing but return their second argument. However, you do not know what buf10 is supposed to point to: It needs to point to 10 * sizeof(T) + y bytes of memory, but you cannot know y.

Catechol answered 4/1, 2012 at 0:5 Comment(8)
You should expand on the difference between what new does and the operator new function. Until the linked convo, I thought new was simply syntactic sugar. Also, the calls to operator new instead of operator new[] were typos. I did it AGAIN in this comment :(Sugden
@MooingDuck: I, and others, have done so countless times before on SO. I recommend a Good Book, or searching SO.Catechol
But what about new(buf) T[10]? How do you make buf big enough? (Coming from the chat discussion I know this is the actual intended question, but it was not made clear :( )Labret
@R.MartinhoFernandes: You're absolutely right; I've amended the answer, and basically I don't have an answer to the question now. I won't delete this unless someone takes exception to it, but we definitely need a proper answer still.Catechol
Just to clarify, we agree that ::new(buf) T[n] requires exactly sizeof(T[n]) bytes, right? And that it's the unqualified call, new(buf) T[n], that is unspecified?Gifted
@GMan: No! On the contrary: We have no idea how much memory is required by ::new (buf) T[n]! That's what the initial quote from 5.3.4 says: We call ::operator new[](sizeof(T) * n + y, buf), with no knowledge about y.Catechol
@KerrekSB: I think there's a contradiction in your answer. First you say We also know that arr is offset from addr correspondingly for T * arr = ::new (addr) T[N]; then you say arr == buf10 for T * arr = ::new (buf10) T[10]; ... which is it?Chlorosis
@etherice: You're right, I'm not really sure why I wrote that. The global placement allocation function is a no-op, but you can't control the amount of space that's required. So buf10 needs to point to 10 * sizeof(T) + y bytes of memory, but you cannot know y. I'll edit this.Catechol
H
7

Calling any version of operator new[] () won't work too well with a fixed size memory area. Essentially, it is assumed that it delegates to some real memory allocation function rather than just returning a pointer to the allocated memory. If you already have a memory arena where you want to construct an array of objects, you want to use std::uninitialized_fill() or std::uninitialized_copy() to construct the objects (or some other form of individually constructing the objects).

You might argue that this means that you have to destroy the objects in your memory arena manually as well. However, calling delete[] array on the pointer returned from the placement new won't work: it would use the non-placement version of operator delete[] ()! That is, when using placement new you need to manually destroy the object(s) and release the memory.
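As a rough illustration of this approach, the sketch below fills raw storage with std::uninitialized_fill_n and later destroys the elements by hand. fill_raw and destroy_range are hypothetical helper names, and the caller is assumed to supply storage that is suitably aligned and at least n * sizeof(T) bytes long.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <string>

// Hypothetical helper: copy-constructs n objects of type T into raw storage.
// No array cookie is involved, so exactly n * sizeof(T) bytes are used.
template <class T>
T* fill_raw(void* raw, std::size_t n, const T& value) {
    T* first = static_cast<T*>(raw);
    // If a copy constructor throws, the algorithm destroys what it already built.
    std::uninitialized_fill_n(first, n, value);
    return first;
}

// Hypothetical helper: runs the destructors in reverse order.
// Releasing the storage itself remains the caller's job.
template <class T>
void destroy_range(T* first, std::size_t n) noexcept {
    while (n > 0) first[--n].~T();
}
```

A typical use would be to std::malloc the storage, call fill_raw, use the array, then call destroy_range followed by std::free.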

Hypolimnion answered 4/1, 2012 at 5:58 Comment(5)
Good point about placement operator delete[](). @Mooing Duck: pay attention to it.Niple
I'm aware that placement-newed objects have to be deleted manually. uninitialized_fill is a good idea, but you seem to be saying that the overloaded operator new for arrays that takes a buffer in the C++ spec won't work for what it's designed for. Is that what you're saying? (That is what chat determined.)Sugden
placement operator new[]() works for what it is intended for: allocating memory in a way that uses additional arguments, and constructing objects in this memory. What doesn't seem to work portably is the version which only takes a void* to already allocated memory. Given that you wouldn't know where the objects end up, it seems questionable anyway.Bottommost
The entire point is that only the standard delete[] operator requires the information that is stored in the extra bytes (both for going through the array, invoking each element's destructor, and for passing the size of the array to the deallocation function, if it needs it). The interesting question for me is now whether the standard actually says so, or if we've found a defect.Ignatia
I don't think this qualifies as a defect. However, I agree that the standard may be enhanced to remove the possibility of using more memory than the objects need.Bottommost
B
7

As mentioned by Kerrek SB in comments, this defect was first reported in 2004, and it was resolved in 2012 as:

The CWG agreed that EWG is the appropriate venue for dealing with this issue.

Then the defect was reported to EWG in 2013, but closed as NAD (presumably means "Not A Defect") with the comment:

The problem is in trying to use array new to put an array into pre-existing storage. We don't need to use array new for that; just construct them.

which presumably means that the suggested workaround is to use a loop with a call to non-array placement new once for each object being constructed.
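One way to write that loop, including the cleanup that an array new-expression would otherwise perform when a constructor throws, is sketched below. construct_each and the Fragile test type are invented for illustration; the buffer is assumed to be suitably aligned and at least n * sizeof(T) bytes long.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>

// Hypothetical helper: default-constructs n objects one at a time with
// non-array placement new, so no array overhead is ever requested.
template <class T>
T* construct_each(void* buffer, std::size_t n) {
    T* first = static_cast<T*>(buffer);
    std::size_t done = 0;
    try {
        for (; done < n; ++done)
            ::new (static_cast<void*>(first + done)) T();
    } catch (...) {
        // Roll back: destroy the elements that were already constructed.
        while (done > 0) first[--done].~T();
        throw;
    }
    return first;
}

// Hypothetical test type whose constructor can be told to fail.
struct Fragile {
    static int live;     // objects currently alive
    static int built;    // constructor calls so far
    static int fail_at;  // constructor call that throws; 0 means never
    Fragile() {
        if (++built == fail_at) throw std::runtime_error("ctor failed");
        ++live;
    }
    ~Fragile() { --live; }
};
int Fragile::live = 0;
int Fragile::built = 0;
int Fragile::fail_at = 0;
```

The rollback in the catch block is the part that is easy to forget when replacing array new with a hand-written loop.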


A corollary not mentioned elsewhere on the thread is that this code causes undefined behaviour for all T:

T *ptr = new T[N];
::operator delete[](ptr);

Even if we comply with the lifetime rules (i.e. T either has trivial destruction, or the program does not depend on the destructor's side-effects), the problem is that ptr has been adjusted for this unspecified cookie, so it is the wrong value to pass to operator delete[].

Billboard answered 10/3, 2016 at 20:42 Comment(0)
P
4

Note that C++20 changes this answer.

C++17's (and before) [expr.new]/11 clearly says that this function may be passed a size that includes an unspecified amount of overhead:

When a new-expression calls an allocation function and that allocation has not been extended, the new-expression passes the amount of space requested to the allocation function as the first argument of type std​::​size_­t. That argument shall be no less than the size of the object being created; it may be greater than the size of the object being created only if the object is an array.

This permits, but does not require, that the size given to the array allocation function could be increased from sizeof(T) * size.

C++20 explicitly disallows this. From [expr.new]/15:

When a new-expression calls an allocation function and that allocation has not been extended, the new-expression passes the amount of space requested to the allocation function as the first argument of type std​::​size_­t. That argument shall be no less than the size of the object being created; it may be greater than the size of the object being created only if the object is an array and the allocation function is not a non-allocating form ([new.delete.placement]).

Emphasis added: the final clause, "and the allocation function is not a non-allocating form", is the new wording. Even the non-normative note you quoted was changed:

This overhead may be applied in all array new-expressions, including those referencing a placement allocation function, except when referencing the library function operator new[](std​::​size_­t, void*).
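To check how a particular compiler behaves, one can measure the pointer difference suggested in the comments on an earlier answer. The sketch below assumes a compiler that implements the revised wording, in which case the offset should be zero. placement_array_offset is a made-up name, and the 64 bytes of slack exist only so that the probe stays within the buffer if the compiler does add overhead.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Returns how far the array returned by `::new (buffer) T[N]` sits from the
// start of the buffer. With the revised rule this is expected to be 0.
template <class T, std::size_t N>
std::ptrdiff_t placement_array_offset() {
    // Assumption: 64 bytes of slack covers any overhead an older compiler adds.
    alignas(T) unsigned char buffer[sizeof(T) * N + 64];
    T* p = ::new (static_cast<void*>(buffer)) T[N];
    std::ptrdiff_t offset = reinterpret_cast<unsigned char*>(p) - buffer;
    for (std::size_t i = N; i > 0; --i) p[i - 1].~T();  // destroy by hand
    return offset;
}
```

A result of zero for placement_array_offset<std::string, 10>() means that, for that compiler, a buffer of exactly sizeof(std::string) * 10 bytes is enough.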

Plumber answered 21/3, 2021 at 19:21 Comment(1)
But other forms of placement new (i.e not the specified non-allocating form) may still incur an overhead?Metallurgy
I
1

After reading the corresponding sections of the standard, I am starting to think that placement new for array types is simply a useless idea, and that the only reason the standard allows it is the generic way in which the new-expression is described:

The new-expression attempts to create an object of the type-id (8.1) or new-type-id to which it is applied. The type of that object is the allocated type. This type shall be a complete object type, but not an abstract class type or array thereof (1.8, 3.9, 10.4). [Note: because references are not objects, references cannot be created by new-expressions. ] [Note: the type-id may be a cv-qualified type, in which case the object created by the new-expression has a cv-qualified type. ]

new-expression: 
    ::(opt) new new-placement(opt) new-type-id new-initializer(opt)
    ::(opt) new new-placement(opt) ( type-id ) new-initializer(opt)

new-placement: ( expression-list )

new-type-id:
    type-specifier-seq new-declarator(opt)

new-declarator:
    ptr-operator new-declarator(opt)
    direct-new-declarator

direct-new-declarator:
    [ expression ]
    direct-new-declarator [ constant-expression ]

new-initializer: ( expression-list(opt) )

To me it seems that array placement new simply stems from the compactness of the definition (all possible uses in one scheme), and that there was no good reason to forbid it.

This leaves us with a useless operator, which needs memory allocated before it is known how much of it will be needed. The only solutions I see would be to either overallocate memory and hope that the compiler will not want more than supplied, or re-allocate memory in an overridden array placement new function/method (which rather defeats the purpose of using array placement new in the first place).


To answer question pointed out by Kerrek SB: Your example:

void * addr = std::malloc(N * sizeof(T));
T * arr = ::new (addr) T[N];                // #1

is not always correct. In most implementations arr!=addr (and there are good reasons for it) so your code is not valid, and your buffer will be overrun.

About those "good reasons": note that the standard's authors release you from some housekeeping when you use the array new operator, and array placement new is no different in this respect. You do not need to tell delete[] the length of the array, so this information must be kept with the array itself. Where? Exactly in this extra memory. Without it, delete[] would require keeping the array length separately (as the STL does using loops and non-placement new).

Ianthe answered 4/1, 2012 at 22:54 Comment(3)
There is no placement-delete, though, so that last argument doesn't really work...Catechol
This is true, but i guess placement or not it should still produce binary-identical structure in memory.Ianthe
Not in the least! The binary structure isn't mandated anywhere, and it isn't even the same for all standard array-news -- rather, it depends on the type.Catechol
S
1

This overhead may be applied in all array new-expressions, including those referencing the library function operator new[](std::size_t, void*) and other placement allocation functions.

This is a defect in the standard. Rumor has it they couldn't find a volunteer to write an exception to it (Message #1173).

The non-replaceable array placement-new cannot be used with delete[] expressions, so you need to loop through the array and call each destructor.

The overhead is targeted at the user-defined array placement-new functions, which allocate memory just like the regular T* tp = new T[length]. Those are compatible with delete[], hence the overhead that carries the array length.

Samaveda answered 26/4, 2016 at 3:51 Comment(0)
