Variable Sized Struct C++
Asked Answered
C

11

21

Is this the best way to make a variable sized struct in C++? I don't want to use vector because the length doesn't change after initialization.

struct Packet
{
    unsigned int bytelength;
    unsigned int data[];
};

Packet* CreatePacket(unsigned int length)
{
    Packet *output = (Packet*) malloc((length+1)*sizeof(unsigned int));
    output->bytelength = length;
    return output;
}

Edit: renamed variable names and changed code to be more correct.

Cyclonite answered 27/3, 2009 at 3:48 Comment(1)
I removed the C tag from this question, since you seem to be after C++ answers, and the techniques differ.Garonne
D
9

Some thoughts on what you're doing:

  • Using the C-style variable length struct idiom allows you to perform one free store allocation per packet, which is half as many as would be required if struct Packet contained a std::vector. If you are allocating a very large number of packets, then performing half as many free store allocations/deallocations may very well be significant. If you are also doing network accesses, then the time spent waiting for the network will probably be more significant.

  • This structure represents a packet. Are you planning to read/write from a socket directly into a struct Packet? If so, you probably need to consider byte order. Are you going to have to convert from host to network byte order when sending packets, and vice versa when receiving packets? If so, then you could byte-swap the data in place in your variable length struct. If you converted this to use a vector, it would make sense to write methods for serializing / deserializing the packet. These methods would transfer it to/from a contiguous buffer, taking byte order into account.

  • Likewise, you may need to take alignment and packing into account.

  • You can never subclass Packet. If you did, then the subclass's member variables would overlap with the array.

  • Instead of malloc and free, you could use Packet* p = ::operator new(size) and ::operator delete(p), since struct Packet is a POD type and does not currently benefit from having its default constructor and its destructor called. The (potential) benefit of doing so is that the global operator new handles errors using the global new-handler and/or exceptions, if that matters to you.

  • It is possible to make the variable length struct idiom work with the new and delete operators, but not well. You could create a custom operator new that takes an array length by implementing static void* operator new(size_t size, unsigned int bitlength), but you would still have to set the bitlength member variable. If you did this with a constructor, you could use the slightly redundant expression Packet* p = new(len) Packet(len) to allocate a packet. The only benefit I see compared to using global operator new and operator delete would be that clients of your code could just call delete p instead of ::operator delete(p). Wrapping the allocation/deallocation in separate functions (instead of calling delete p directly) is fine as long as they get called correctly.

Duodecimal answered 27/3, 2009 at 5:15 Comment(0)
B
7

If you never add a constructor/destructor, assignment operators or virtual functions to your structure using malloc/free for allocation is safe.

It's frowned upon in c++ circles, but I consider the usage of it okay if you document it in the code.

Some comments to your code:

struct Packet
{
    unsigned int bitlength;
    unsigned int data[];
};

If I remember right declaring an array without a length is non-standard. It works on most compilers but may give you a warning. If you want to be compliant declare your array of length 1.

Packet* CreatePacket(unsigned int length)
{
    Packet *output = (Packet*) malloc((length+1)*sizeof(unsigned int));
    output->bitlength = length;
    return output;
}

This works, but you don't take the size of the structure into account. The code will break once you add new members to your structure. Better do it this way:

Packet* CreatePacket(unsigned int length)
{
    size_t s = sizeof (Packed) - sizeof (Packed.data);
    Packet *output = (Packet*) malloc(s + length * sizeof(unsigned int));
    output->bitlength = length;
    return output;
}

And write a comment into your packet structure definition that data must be the last member.

Btw - allocating the structure and the data with a single allocation is a good thing. You halve the number of allocations that way, and you improve the locality of data as well. This can improve the performance quite a bit if you allocate lots of packages.

Unfortunately c++ does not provide a good mechanism to do this, so you often end up with such malloc/free hacks in real world applications.

Bifocal answered 27/3, 2009 at 4:3 Comment(3)
Hi this seems to be the best solution, but my compiler (gcc on mingw) won't let me do sizeof(Packet.data). It will however, let me do Packet test; sizeof(test.data);Cyclonite
sizeof(Packet) will be a multiple of the alignment requirement and not the actual size of the struct. For example struct Foo { uint64_t magic; uint32_t length; uint8_t buf[]; } has a sizeof(Foo) == 16 and sizeof(Foo) - sizeof(Foo::buf) == 16, too. The size you want would be offsetof(Foo, buf) == 12.Illiteracy
@Cyclonite you can do sizeof(Packet::data) instead.Oligochaete
R
4

This is OK (and was standard practice for C).

But this is not a good idea for C++.
This is because the compiler generates a whole set of other methods automatically for you around the class. These methods do not understand that you have cheated.

For Example:

void copyRHSToLeft(Packet& lhs,Packet& rhs)
{
    lhs = rhs;  // The compiler generated code for assignement kicks in here.
                // Are your objects going to cope correctly??
}


Packet*   a = CreatePacket(3);
Packet*   b = CreatePacket(5);
copyRHSToLeft(*a,*b);

Use the std::vector<> it is much safer and works correctly.
I would also bet it is just as efficient as your implementation after the optimizer kicks in.

Alternatively boost contains a fixed size array:
http://www.boost.org/doc/libs/1_38_0/doc/html/array.html

Reredos answered 27/3, 2009 at 4:38 Comment(10)
The problem with the boost array template for this user is that the size for boost::array<> is determined at compile time.Binder
If I use vector though, won't that be non-contiguous from the length member?Cyclonite
It may. There is no guarantee on that. But why would you want to make sure of that ?Backfield
@unknown - You didn't mention that as a requirement in your question - you might want to clarify (and mention why you need the data to be contiguous with the length).Binder
Well, I'm not sure it needs to be a requirement, but it just makes more sense to me to make it contiguous. Doesn't it?Cyclonite
No, if it is not a requirement then expecting or requiring a specific memory layout is silly. Let the compiler work that out you work on how to use the object correctlyReredos
He is worried about there being a second allocation: new vector<int>(50); will cause two allocation: one for the vector object and one for the array of 50 ints maintained by the vector object.Halfsole
sizeof(Class) returns the size of the class with all it's "hidden" members, so you can safely malloc(sizeof(Class) + extra) and write to the extra bytes after the Class pointer.Oligochaete
@YoYoYonnY You can but you its not as simple as you make out. There are a couple of issues here. 1: Class is not initialized. 2a: If you copy it (using normal C++) it will break. 2b: If you assign it (using normal C++) it will break. 3: Is your extra correctly aligned (depends on sizeof(Class). 4: You can not use delete (so destructor will not be called for Class). Now 1/3/4 can be easily solved using POD structure but 2 is not so easy to solve as you have bypassed the standard C++ mechanisms.Reredos
@MartinYork All of these except alignment issues are solved by placement new, and malloc/realloc/calloc always allocate memory with maximum alignment (Not sure about new uint8_t[] though). You have to explicitly free this memory as usual. You will have explicitly call the destructor, but there's nothing preventing you from creating a scoped/smart pointer that automatically destructs your class and deallocates the memory for you.Oligochaete
B
3

You can use the "C" method if you want but for safety make it so the compiler won't try to copy it:

struct Packet
{
    unsigned int bytelength;
    unsigned int data[];

private:
   // Will cause compiler error if you misuse this struct
   void Packet(const Packet&);
   void operator=(const Packet&);
};
Burhans answered 27/3, 2009 at 14:3 Comment(0)
B
1

I'd probably just stick with using a vector<> unless the minimal extra overhead (probably a single extra word or pointer over your implementation) is really posing a problem. There's nothing that says you have to resize() a vector once it's been constructed.

However, there are several The advantages of going with vector<>:

  • it already handles copy, assignment & destruction properly - if you roll your own you need to ensure you handle these correctly
  • all the iterator support is there - again, you don't have to roll your own.
  • everybody already knows how to use it

If you really want to prevent the array from growing once constructed, you might want to consider having your own class that inherits from vector<> privately or has a vector<> member and only expose via methods that just thunk to the vector methods those bits of vector that you want clients to be able to use. That should help get you going quickly with pretty good assurance that leaks and what not are not there. If you do this and find that the small overhead of vector is not working for you, you can reimplement that class without the help of vector and your client code shouldn't need to change.

Binder answered 27/3, 2009 at 4:45 Comment(0)
W
1

There are already many good thoughts mentioned here. But one is missing. Flexible Arrays are part of C99 and thus aren't part of C++, although some C++ compiler may provide this functionality there is no guarantee for that. If you find a way to use them in C++ in an acceptable way, but you have a compiler that doesn't support it, you perhaps can fallback to the "classical" way

Whitethroat answered 27/3, 2009 at 9:22 Comment(1)
C99 is included in C++11. Though I realize your comment predates C++11.Esp
H
0

If you are truly doing C++, there is no practical difference between a class and a struct except the default member visibility - classes have private visibility by default while structs have public visibility by default. The following are equivalent:

struct PacketStruct
{
    unsigned int bitlength;
    unsigned int data[];
};
class PacketClass
{
public:
    unsigned int bitlength;
    unsigned int data[];
};

The point is, you don't need the CreatePacket(). You can simply initialize the struct object with a constructor.

struct Packet
{
    unsigned long bytelength;
    unsigned char data[];

    Packet(unsigned long length = 256)  // default constructor replaces CreatePacket()
      : bytelength(length),
        data(new unsigned char[length])
    {
    }

    ~Packet()  // destructor to avoid memory leak
    {
        delete [] data;
    }
};

A few things to note. In C++, use new instead of malloc. I've taken some liberty and changed bitlength to bytelength. If this class represents a network packet, you'll be much better off dealing with bytes instead of bits (in my opinion). The data array is an array of unsigned char, not unsigned int. Again, this is based on my assumption that this class represents a network packet. The constructor allows you to create a Packet like this:

Packet p;  // default packet with 256-byte data array
Packet p(1024);  // packet with 1024-byte data array

The destructor is called automatically when the Packet instance goes out of scope and prevents a memory leak.

Helbon answered 27/3, 2009 at 4:2 Comment(5)
You would be better off using a default value in the constructor and have a single one than having duplicated code with a default constructorZebra
Good point. I've been doing a lot of C# lately, which doesn't allow default parameters. I'll update the post.Helbon
Your initialization of the data member isn't going to work, because it's not a pointer. If you change it to a pointer, you lose the contiguous layout of the length with the data, which I think is the O.P.s goal.Misti
I say the opposite - don't ever use new[]. It's no more lightweight then vector but you have to remember to delete it yourself. The C++ FAQ agrees with me: parashift.com/c++-faq-lite/containers.htmlBurhans
This doesn't work. Variable sized structs must be created with malloc, which means no constructor. If you declare them as variables, the variable size part is set to zero. The space just isn't allocated. And initializing a member array from a heap pointer just doesn't make sense (or compile). This code would work if data were a pointer, but then you might as well be using std::vector.Esp
B
0

You probably want something lighter than a vector for high performances. You also want to be very specific about the size of your packet to be cross-platform. But you don't want to bother about memory leaks either.

Fortunately the boost library did most of the hard part:

struct packet
{
   boost::uint32_t _size;
   boost::scoped_array<unsigned char> _data;

   packet() : _size(0) {}

       explicit packet(packet boost::uint32_t s) : _size(s), _data(new unsigned char [s]) {}

   explicit packet(const void * const d, boost::uint32_t s) : _size(s), _data(new unsigned char [s])
   {
        std::memcpy(_data, static_cast<const unsigned char * const>(d), _size);
   }
};

typedef boost::shared_ptr<packet> packet_ptr;

packet_ptr build_packet(const void const * data, boost::uint32_t s)
{

    return packet_ptr(new packet(data, s));
}
Brunildabruning answered 27/3, 2009 at 10:44 Comment(2)
How is that lighter than std::vector? You've used more memory and put an extra level of indirection into it.Burhans
No exceptions. No validation. You are certain that the memory you have is the memory actually allocated (vector may reserve more). The packet structure can evolve. Not vector. Which means you'd have to put a vector into the message in the end...Brunildabruning
L
0

There's nothing whatsoever wrong with using vector for arrays of unknown size that will be fixed after initialization. IMHO, that's exactly what vectors are for. Once you have it initialized, you can pretend the thing is an array, and it should behave the same (including time behavior).

Lovely answered 27/3, 2009 at 20:23 Comment(0)
M
0

Disclaimer: I wrote a small library to explore this concept: https://github.com/ppetr/refcounted-var-sized-class

We want to allocate a single block of memory for a data structure of type T and an array of elements of type A. In most cases A will be just char.

For this let's define a RAII class to allocate and deallocate such a memory block. This poses several difficulties:

  • C++ allocators don't provide such API. Therefore we need to allocate plain chars and place the structure in the block ourselves. For this std::aligned_storage will be helpful.
  • The memory block must be properly aligned. Because in C++11 there doesn't seem to be API for allocating an aligned block, we need to slightly over-allocate by alignof(T) - 1 bytes and then use std::align.
// Owns a block of memory large enough to store a properly aligned instance of
// `T` and additional `size` number of elements of type `A`.
template <typename T, typename A = char>
class Placement {
 public:
  // Allocates memory for a properly aligned instance of `T`, plus additional
  // array of `size` elements of `A`.
  explicit Placement(size_t size)
      : size_(size),
        allocation_(std::allocator<char>().allocate(AllocatedBytes())) {
    static_assert(std::is_trivial<Placeholder>::value);
  }
  Placement(Placement const&) = delete;
  Placement(Placement&& other) {
    allocation_ = other.allocation_;
    size_ = other.size_;
    other.allocation_ = nullptr;
  }

  ~Placement() {
    if (allocation_) {
      std::allocator<char>().deallocate(allocation_, AllocatedBytes());
    }
  }

  // Returns a pointer to an uninitialized memory area available for an
  // instance of `T`.
  T* Node() const { return reinterpret_cast<T*>(&AsPlaceholder()->node); }
  // Returns a pointer to an uninitialized memory area available for
  // holding `size` (specified in the constructor) elements of `A`.
  A* Array() const { return reinterpret_cast<A*>(&AsPlaceholder()->array); }

  size_t Size() { return size_; }

 private:
  // Holds a properly aligned instance of `T` and an array of length 1 of `A`.
  struct Placeholder {
    typename std::aligned_storage<sizeof(T), alignof(T)>::type node;
    // The array type must be the last one in the struct.
    typename std::aligned_storage<sizeof(A[1]), alignof(A[1])>::type array;
  };

  Placeholder* AsPlaceholder() const {
    void* ptr = allocation_;
    size_t space = sizeof(Placeholder) + alignof(Placeholder) - 1;
    ptr = std::align(alignof(Placeholder), sizeof(Placeholder), ptr, space);
    assert(ptr != nullptr);
    return reinterpret_cast<Placeholder*>(ptr);
  }

  size_t AllocatedBytes() {
    // We might need to shift the placement of for up to `alignof(Placeholder) - 1` bytes.
    // Therefore allocate this many additional bytes.
    return sizeof(Placeholder) + alignof(Placeholder) - 1 +
           (size_ - 1) * sizeof(A);
  }

  size_t size_;
  char* allocation_;
};

Once we've dealt with the problem of memory allocation, we can define a wrapper class that initializes T and an array of A in an allocated memory block.

template <typename T, typename A = char,
          typename std::enable_if<!std::is_destructible<A>{} ||
                                      std::is_trivially_destructible<A>{},
                                  bool>::type = true>
class VarSized {
 public:
  // Initializes an instance of `T` with an array of `A` in a memory block
  // provided by `placement`. Callings a constructor of `T`, providing a
  // pointer to `A*` and its length as the first two arguments, and then
  // passing `args` as additional arguments.
  template <typename... Arg>
  VarSized(Placement<T, A> placement, Arg&&... args)
      : placement_(std::move(placement)) {
    auto [aligned, array] = placement_.Addresses();
    array = new (array) char[placement_.Size()];
    new (aligned) T(array, placement_.Size(), std::forward<Arg>(args)...);
  }

  // Same as above, with initializing a `Placement` for `size` elements of `A`.
  template <typename... Arg>
  VarSized(size_t size, Arg&&... args)
      : VarSized(Placement<T, A>(size), std::forward<Arg>(args)...) {}

  ~VarSized() { std::move(*this).Delete(); }

  // Destroys this instance and returns the `Placement`, which can then be
  // reused or destroyed as well (deallocating the memory block).
  Placement<T, A> Delete() && {
    // By moving out `placement_` before invoking `~T()` we ensure that it's
    // destroyed even if `~T()` throws an exception.
    Placement<T, A> placement(std::move(placement_));
    (*this)->~T();
    return placement;
  }

  T& operator*() const { return *placement_.Node(); }
  const T* operator->() const { return &**this; }

 private:
  Placement<T, A> placement_;
};

This type is moveable, but obviously not copyable. We could provide a function to convert it into a shared_ptr with a custom deleter. But this will need to internally allocate another small block of memory for a reference counter (see also How is the std::tr1::shared_ptr implemented?).

This can be solved by introducing a specialized data type that will hold our Placement, a reference counter and a field with the actual data type in a single structure. For more details see my refcount_struct.h.

Murage answered 7/2, 2021 at 12:26 Comment(0)
L
-1

You should declare a pointer, not an array with an unspecified length.

Laveta answered 27/3, 2009 at 4:10 Comment(1)
But then you have two separately-managed chunks of memory. The OP is allocating the whole structure, array and all, in a single contiguous chunk.Duodecimal

© 2022 - 2024 — McMap. All rights reserved.