How do realloc and memcpy work?

G

8

30

I have two questions.

Do realloc() and memcpy() copy the entries in an array to another in a way faster than just iterating on each element O(N) ? If the answer is yes then what do you think is its complexity ?
If the size allocated is smaller than the original size, does realloc() copy the entries to somewhere else or just leave them as they are decreasing the size of the array ?

Gonna answered 12/12, 2008 at 13:43 Comment(0)

P

21

1 - No. They copy a block at a time. See http://www.embedded.com/design/configurable-systems/4024961/Optimizing-Memcpy-improves-speed for a pretty good analysis.

2 - This is implementation dependent. See http://www.gnu.org/software/libtool/manual/libc/Changing-Block-Size.html for glibc details. "In several allocation implementations, making a block smaller sometimes necessitates copying it"

Patricide answered 12/12, 2008 at 14:28 Comment(0)

E

26

Let's take a little closer look at memcpy and, while we're at it, at "big O" or Landau notation.

First, big-O. As i've talked about elsewhere, it's worth remembering the definition of big O, which is that some function g(n) is said to be O(f(n)) when there exists a constant k for which g(n) ≤ kf(n). What the constant does is lets you ignore the little details in favor of the important part. As everyone has noted, memcpy of n bytes will be O(n) in most any normal architecture, because no matter what you have to move those n bytes, one chunk at a time. So, a first, naive implementation of memcpy in C could be written

unsigned char *
memcpy(unsigned char * s1, unsigned char * s2, long size){
    long ix;
    for(ix=0; ix < size; ix++)
        s1[ix] = s2[ix];
    return s1;
}

This is in fact O(n), and might make you wonder why we even bother with a library routine. however, the thing about the libc functions is that they are the place where platform-specific utilities get written; if you want to optimize for the architecture, this is one of the places you can do it. So, depending on the architecture, there may be a more efficient implementation options; for example, in the IBM 360 archiecture, there is a MOVL instruction that moves data is big chunks using very highly optimized microcode. So in place of that loop, a 360 implementation of memcpy might instead look something like

LR 3,S1      LOAD S1 ADDR in Register 3
LR 4,S2      
MOVL 3,4,SIZE

(No guarantees that's exactly right 360 code by the way, but it'll serve for an illustration.) This implementation looks like instead of doing n steps around the loop as the C code did, it just executes 3 instructions.

What really happens, though, is that it's executing O(n) micro instructions under the covers. What's different between the two is the constant k; because the microcode is much faster, and because there's only three decode steps on the instructions, it is dramatically faster than the naive version, but it's still O(n) -- it's just the constant is smaller.

And that's why you can make good use of memcpy -- it's not asymptotically faster, but the implementation is as fast as someone could make it on that particular architecture.

Explosion answered 27/12, 2008 at 4:19 Comment(0)

P

21

1 - No. They copy a block at a time. See http://www.embedded.com/design/configurable-systems/4024961/Optimizing-Memcpy-improves-speed for a pretty good analysis.

2 - This is implementation dependent. See http://www.gnu.org/software/libtool/manual/libc/Changing-Block-Size.html for glibc details. "In several allocation implementations, making a block smaller sometimes necessitates copying it"

Patricide answered 12/12, 2008 at 14:28 Comment(0)

E

5

There is absolutely no way to copy N items faster than O(N). However, it might be able to copy multiple items at once, or use special processor instructions - so it still might be faster than you could do it yourself.
I don't know for sure, but I'd assume that the memory is completely reallocated. That's the safest assumption, and it's probably implementation dependent anyway.

Ebro answered 12/12, 2008 at 14:2 Comment(0)

A

5

The performance of memcpy can't really be better than O(N) but it can be optimized so that it outperforms manual copying; for example, it might be able to copy 4 bytes in the time it takes you to copy 1 byte. Many memcpy implementations are written in assembly using optimized instructions that can copy multiple elements at a time which is usually faster than copying data one byte at a time.
I don't quite understand this question, if you use realloc to decrease the size of memory and it succeeds (returns non-NULL), the new location will contain the same data as the old location up to the size of the new request. If the memory location was changed as a result of calling realloc (not usual when decreasing the size) the contents will be copied, otherwise no copying needs to happen as the memory hasn't moved.

Alitta answered 12/12, 2008 at 14:4 Comment(0)

M

2

It can be conjectured that memcpy could be written such that it would move large number of bits around. e.g. It's entirely possible to copy the data using SSE instructions, if it is advantageous.

As other said, it won't be faster than O(n), but memory systems often have a preferred block size, and also it's possible to, say, write the size of a cache line at a time.

Moonrise answered 17/12, 2008 at 19:35 Comment(0)

R

0

Presuming you are talking about glibc, and since your questions are implementation dependent, it's probably best just to check the source:

malloc.c

memcpy.c

The way I read it, the answers would be:

O(N) --- there is no way to copy items in better than linear time.
Occasionally large items will be copied when realloc() is used to shrink them.

Roa answered 27/12, 2008 at 2:21 Comment(0)

E

0

The x86 has special instructions for scanning and matching a byte/word in a block of memory as well and one that can be used to copy a block of memory (it is a CISC cpu after all). A lot of C compilers that implement inline assembly language and a pragma to do inlining of entire functions have for many many years taken advantage of this in their library functions.

The ones used for mem copy are movsb/movsw in combination to the rep instruction.

CMPS/MOVS/SCAS/STOS
REP, REPE, REPNE, REPNZ, REPZ

Setup registers with src/trg addresses and int count and away you go.

Eleaseeleatic answered 27/12, 2008 at 5:51 Comment(0)

R

0

Some of the important points related to realloc(check on dev c++) : void *realloc(void *ptr, size_t size);

The realloc() function shall change the size of the memory object pointed to by ptr to the size specified by size.
The contents of the object shall remain unchanged up to the lesser of the new and old sizes.
If the new size is larger, the contents of the newly allocated portion of the object are unspecified.
If size is 0 and ptr is not a null pointer, the object pointed to is freed.
If ptr is a null pointer, realloc() shall be equivalent to malloc() for the specified size.
If ptr does not match a pointer returned earlier by calloc(), malloc(), or realloc() or if the space has previously been deallocated by a call to free() or realloc(), the behavior is undefined.

Regina answered 18/2, 2012 at 3:13 Comment(0)

Recommended topics

Hot tags