Nim: Addresses of parameters and mutability

3

10

I'm trying to make up my mind about Nim's policy behind the "expression has no address" error. In particular, I have a C function which takes a pointer (plus a length etc.) to some data buffer. I know that this function will not modify the data. Simplified:

type
  Buffer = object
    data: seq[float]

proc wrapperForCCall(buf: Buffer) =
  # accessing either buf.addr or buf.data.addr produces
  # Error: expression has no address
  # workaround:
  var tmp = buf.data          # costly copy
  callToC(tmp.len, tmp.addr)  # now it works

On the one hand this makes sense, since a parameter seems to behave exactly like a let binding, which also "has no address". On the other hand, I'm puzzled by this statement in the manual:

var parameters are never necessary for efficient parameter passing.

As far as I can see, the only way to avoid copying the data is by either:

  • passing the parameter as buf: var Buffer
  • passing a reference, i.e., using a ref object.

In both cases this suggests that my function modifies the data. Furthermore, it introduces mutability at the call site (i.e. users can no longer use let bindings for their buffers). The key question for me is: since "I know" that callToC is read-only, can I convince Nim to allow immutability without a copy? I see that this is dangerous, since I have to know for sure that the call does not mutate the data. Would this require some sort of "unsafe address" mechanism that forces a pointer to immutable data?

And my final mystery of parameter addresses: I tried to make the necessity of the copy explicit by changing the type to Buffer {.bycopy.} = object. In this case the copy already happens at call time, and I would expect to have access to the address now. Why is the access denied in this case as well?

Merous answered 8/5, 2015 at 20:46 Comment(0)
7

You can avoid the deep copy of buf.data by using shallowCopy, e.g.:

var tmp: seq[float]
shallowCopy tmp, buf.data

The {.byCopy.} pragma only affects the calling convention (i.e. whether an object gets passed on the stack or via a reference).

You cannot take the address of buf or any part of it that isn't behind a ref or ptr because passing a value as a non-var parameter is a promise that the callee does not modify the argument. The shallowCopy builtin is an unsafe feature that circumvents that guarantee (I remember suggesting that shallowCopy should properly be renamed to unsafeShallowCopy to reflect that and to have a new shallowCopy where the second argument is a var parameter also).

Ensemble answered 8/5, 2015 at 21:24 Comment(3)
Thanks a lot, looks like this is exactly what I was looking for. What I do not fully understand yet: what exactly is "copied" by shallowCopy here? If I understand you correctly (i.e., if the performance is similar to the no-copy case), it is rather a kind of "unsafeAlias"? – Merous
shallowCopy is the equivalent of a plain assignment in C, whereas = in Nim does a (deep) structural copy for strings, seqs, and non-ref, non-ptr objects containing them (somewhat similar to how assignment for std::string and std::vector works in C++). – Ensemble
Ah, I see, for a seq, string, or any other ref type it will just "copy" the pointer. – Merous
6

Let's start by clarifying the following:

var parameters are never necessary for efficient parameter passing.

This is generally true, because in Nim complex values like objects, sequences and strings will be passed by address (a.k.a. by reference) to procs accepting read-only parameters.

When you need to pass a sequence to an external C/C++ function, things get a bit more complicated. The most common way to do this is to rely on the openarray type, which will automatically convert the sequence to a pair of data pointer and a size integer:

# Let's say we have the following C function:

{.emit: """

#include <stdio.h>

void c_call_with_size(double *data, size_t len)
{
  printf("first value: %f; size: %zu \n", data[0], len);
}

""".}

# We can import it like this:

proc c_call(data: openarray[float]) {.importc: "c_call_with_size", nodecl.}

# The usage is straightforward:

type Buffer = object
  data: seq[float]

var b = Buffer(data: @[1.0, 2.0])

c_call(b.data)

There won't be any copies in the generated C code.

Now, if the wrapped C library doesn't accept a pair of data/size arguments as in the example here, I'd suggest creating a tiny C wrapper around it (you can create a header file or just use the emit pragma to create the necessary adapter functions or #defines).

Alternatively, if you really want to get your hands dirty, you can extract the underlying buffer from the sequence with the following helper proc:

proc rawBuffer[T](s: seq[T]): ptr T =
  {.emit: "result = `s`->data;".}

Then, it will be possible to pass the raw buffer to C like this:

{.emit: """

#include <stdio.h>

void c_call(double *data)
{
  printf("first value: %f \n", data[0]);
}

""".}

proc c_call(data: ptr float) {.importc: "c_call", nodecl.}

var b = Buffer(data: @[1.0, 2.0])
c_call(b.data.rawBuffer)
Prune answered 10/5, 2015 at 9:36 Comment(4)
One important thing I have learned from your answer: when writing a C binding for a function which takes a mutable array, never forget the var prefix. I just had a case where the author of some C bindings had forgotten the vars. The result: Nim allows me to pass immutable let arrays, but they are indeed modified. – Merous
@zah, why does your helper proc use emit instead of simply returning addr s[0]? Or is it not the same? – Resolved
@kerim, taking the address of the first element would work only if you are dealing with a var sequence type. Otherwise, the indexing operator returns values, whose address cannot be taken. – Prune
@kerim, I see, just make sure to look at the generated C code to confirm that you are not taking the address of a copy returned by the index operator. In any case, my "dirtier" version generates simpler C code. – Prune
3

Nim now has an unsafeAddr operator, which lets you take the address even of let bindings and parameters, so the shallowCopy workaround is no longer needed. Obviously one has to be very careful that nothing mutates the data behind the pointer.
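
A minimal sketch of how this could look for the wrapper from the question. The callToC proc here is a hypothetical stand-in written in Nim (it only reads the buffer and sums it), not the real C function, and the empty-buffer check is an assumption about how the wrapper should behave:

```nim
type
  Buffer = object
    data: seq[float]

# Hypothetical read-only callee standing in for the imported C function.
# It views the pointer as an unchecked array and only reads from it.
proc callToC(n: int, p: ptr float): float =
  let values = cast[ptr UncheckedArray[float]](p)
  for i in 0 ..< n:
    result += values[i]

proc wrapperForCCall(buf: Buffer): float =
  # An empty seq has no first element to take the address of.
  if buf.data.len == 0:
    return 0.0
  # unsafeAddr yields a pointer to the first element without copying
  # the seq; the callee must not write through this pointer.
  result = callToC(buf.data.len, unsafeAddr buf.data[0])

let b = Buffer(data: @[1.0, 2.0, 3.5])  # a let binding is sufficient
echo wrapperForCCall(b)
```

The caller keeps its let binding and no var parameter is needed. The promise that the data is not modified now rests entirely on the callee, since the compiler can no longer check it.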

Merous answered 17/7, 2017 at 17:54 Comment(0)
