Preference between memcpy and dereference

Asked 11/9, 2012 at 6:22 Answered 20/11, 2020 at 12:17

When copying a struct of known type in memory, would you prefer memcpy or dereference-and-assign, and why? Specifically, in the following code:

#include <stdio.h>
#include <string.h>

typedef struct {
    int foo;
    int bar;
} compound;

void copy_using_memcpy(compound *pto, compound *pfrom)
{
    memcpy(pto, pfrom, sizeof(compound));
}
void copy_using_deref(compound *pto, compound *pfrom)
{
    *pto = *pfrom;
}

int main(int argc, const char *argv[])
{
    compound a = { 1, 2 };
    compound b = { 0 };
    compound *pa = &a;
    compound *pb = &b;

    // method 1
    copy_using_memcpy(pb, pa);
    // method 2
    copy_using_deref(pb, pa);
    printf("%d %d\n", b.foo, b.bar);

    return 0;
}

Would you prefer method 1 or method 2? I looked at the assembly generated by gcc, and it seems that method 2 uses fewer instructions than method 1. Does that imply that method 2 is preferable in this case? Thank you.

Chavers answered 11/9, 2012 at 6:22 Comment(2)

I'd just use dereference+assignment instead of memcpy(). Simpler code. If there's any difference, it should be practically negligible with a good compiler. – Desiccator 11/9, 2012 at 6:31

Thank you Shao-Chuan Wang, your code triggers a bug in icc that I reported a while ago, but with a not-so-minimal example :) – Paleozoology 11/9, 2012 at 6:34

I can't think of any good reason to use memcpy() rather than an assignment when copying a struct (as long as you don't need to do a deep copy or something involving the struct hack or a flexible array member, none of which apply in this case).

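For a struct like this one they have the same effect (the only difference is that assignment is not required to preserve the values of padding bytes), and the assignment (a) likely gives the compiler more opportunities for optimization, and (b) has less risk of getting the size wrong.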

Some very old C compilers probably didn't support struct assignment, but that's no longer a significant concern.

(There are additional reasons to prefer assignment in C++, but your question is about C.)

Incidentally, the parentheses in

(*pto) = (*pfrom);

are unnecessary; the unary * binds tightly enough that this:

*pto = *pfrom;

is both correct and sufficiently clear to most readers.

Yuriyuria answered 11/9, 2012 at 6:30 Comment(1)

On second thought, I found a case where we need to use memcpy instead of pointer dereference: https://mcmap.net/q/24855/-array-of-zero-length In this case, the size of the struct does not reflect the exact size of the "data" that it "contains". – Chavers 11/9, 2012 at 17:52

For the exact same reason you mentioned, I'd prefer method 2 (the dereferencing one). Memcpy does a byte-by-byte copy AND has the overhead of a function call, while the dereferencing does the copy only, and doesn't have the extra overhead.

Dereferencing and assigning is also more readable, especially when you omit the superfluous parentheses:

*dest = *src;

Unbuild answered 11/9, 2012 at 6:31 Comment(8)

Memcpy doesn't necessarily have the overhead of a function call. :) – Jiggle 11/9, 2012 at 6:32

Memory copies are often inlined by the compiler, no call. – Desiccator 11/9, 2012 at 6:32

@AlexeyFrunze well, think of TCC :) – Unbuild 11/9, 2012 at 6:32

Unless it's the ancient Turbo C(++), which is of approximately zero significance these days, I've never used it. – Desiccator 11/9, 2012 at 6:34

@AlexeyFrunze No, it's the Tiny C Compiler, the only C compiler I know about that is widely used as a library, but does no optimizations. – Unbuild 11/9, 2012 at 6:35

That's OK for TCC, both TCCs. :) – Desiccator 11/9, 2012 at 6:36

@AlexeyFrunze very well said :) – Unbuild 11/9, 2012 at 6:38

We could say, it has the overhead of a function call, which might be zero overhead thanks to inlining. Meanwhile, the copy-by-dereferencing might also have the overhead of a function call, because the implementation is within its rights to implement struct copying by calling memcpy (or some other compiler-generated function that does the same thing). If the struct is above a certain size that could be sensible even if the memcpy isn't inlined, and it's certainly simple for the implementer. – Lilianaliliane 11/9, 2012 at 9:50

I tried running this with Google Benchmark, using a larger struct with nine fields:

#include <benchmark/benchmark.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    int foo;
    int bar;
    int a;
    int b;
    int c;
    int d;
    int e;
    int f;
    int g;
} compound;

static void copy_using_memcpy(benchmark::State& state) {
    compound a = {0, 0, 0, 0, 0, 0, 0, 0, 0};
    compound b = {0, 0, 0, 0, 0, 0, 0, 0, 0};
    compound* pa = &a;
    compound* pb = &b;
    for (auto _ : state) memcpy(pa, pb, sizeof(compound));
}
static void copy_using_deref(benchmark::State& state) {
    compound a = {0, 0, 0, 0, 0, 0, 0, 0, 0};
    compound b = {0, 0, 0, 0, 0, 0, 0, 0, 0};
    compound* pa = &a;
    compound* pb = &b;
    for (auto _ : state) *pa = *pb;
}

BENCHMARK(copy_using_memcpy);
BENCHMARK(copy_using_deref);

BENCHMARK_MAIN();

The result:

> g++ benchmark.cc -lbenchmark -lpthread && ./a.out
2020-11-20T20:12:12+08:00
Running ./a.out
Run on (16 X 1796.56 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x8)
  L1 Instruction 32 KiB (x8)
  L2 Unified 512 KiB (x8)
  L3 Unified 4096 KiB (x1)
Load Average: 0.29, 0.15, 0.10
------------------------------------------------------------
Benchmark                  Time             CPU   Iterations
------------------------------------------------------------
copy_using_memcpy       2.44 ns         2.44 ns    282774223
copy_using_deref        1.77 ns         1.77 ns    389126375

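In the original example, with only two fields, the time is roughly the same. Note that the command above passes no optimization flag to g++, so these numbers reflect an unoptimized build.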

Myrmeco answered 20/11, 2020 at 12:17 Comment(0)
