What is the model of value vs. reference in Nim?
NOTE: I am not asking about difference between pointer and reference, and for this question it is completely irrelevant.

One thing I couldn't find explicitly stated -- what model does Nim use?

Like C++ -- where you have values and with new you create pointers to data (in such case the variable could hold pointer to a pointer to a pointer to... to data)?

Or like C# -- where you have POD types as values, but user-defined objects are referenced (implicitly)?

The only thing I spotted is that dereferencing is automatic, like in Go.

To rephrase: you define a new type, let's say Student (with name, university, address). You write:

var student ...?
  1. to make student hold actual data (of Student type/class)
  2. to make student hold a pointer to the data
  3. to make student hold a pointer to a pointer to the data

Or are some of these options impossible?

Glorification answered 28/2, 2014 at 13:45 Comment(2)
Can the person who upvoted this rephrase it? I am having trouble understanding the question..Rosalynrosalynd
@SimonWhitehead, I updated the question (from the answer to the update I will distill the answer to my original question :-D).Glorification
Y
39

By default, Nim passes data by value. When you declare a var of a specific type, the compiler allocates the required space for it on the stack. That is to be expected, since Nim compiles to C and complex types are just structures. But as in C or C++, you can have pointers too. The ptr keyword gives you an unsafe (untraced) pointer, mostly for interfacing with C code, and the ref keyword gives you a safe, garbage-collected reference (both are documented in the References and pointer types section of the Nim manual).
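As a minimal sketch of the three kinds of variable described above (the Point type and all variable names are hypothetical, chosen only for illustration), the following shows that assigning an object copies it, assigning a ref shares it, and a ptr can point at an existing value:

```nim
type
  Point = object            # hypothetical value type
    x, y: int

var a = Point(x: 1, y: 2)
var b = a                   # assignment copies the whole object
b.x = 10
doAssert a.x == 1           # `a` is unaffected by the change to `b`

var r: ref Point
new(r)                      # allocate a garbage-collected Point
r.x = 1
var s = r                   # copies the reference, not the object
s.x = 10
doAssert r.x == 10          # `r` and `s` refer to the same object

var p: ptr Point = addr a   # untraced pointer to the existing value `a`
p.x = 5                     # field access dereferences automatically
doAssert a.x == 5
```

Note that the ptr is only valid for as long as `a` itself is alive, which is why it is considered unsafe.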

However, note that even when a proc takes a parameter by value, the compiler is free to pass it by reference internally if it decides that is faster and still safe. In practice, the only time I've used references is when I was exporting Nim types to C and had to make sure both C and Nim pointed to the same memory. Remember that you can always check the generated C code in the nimcache directory. You will see there that a var parameter in a proc is just a pointer to its C structure.
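To illustrate the difference between a plain parameter and a var parameter, here is a small sketch. The proc names birthdayCopy and birthdayInPlace are made up for this example; the Person type mirrors the one used later in this answer:

```nim
type
  Person = object
    age: int
    name: string

proc birthdayCopy(p: Person): Person =
  # A plain parameter is immutable inside the proc,
  # so we copy it into `result` and change the copy.
  result = p
  inc result.age

proc birthdayInPlace(p: var Person) =
  # A `var` parameter refers to the caller's variable,
  # so this change is visible to the caller.
  inc p.age

var a = Person(age: 3, name: "foo")
let older = birthdayCopy(a)
doAssert a.age == 3 and older.age == 4   # the original is untouched

birthdayInPlace(a)
doAssert a.age == 4                      # the original was modified
```

Whether the compiler passes `p` to birthdayCopy by copy or by hidden pointer is an implementation detail; the observable semantics stay those of a value.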

Here is an example of a type with a constructor that creates it on the stack and returns it by value, and the corresponding pointer-like version:

type
  Person = object
    age: int
    name: string

proc initPerson(age: int, name: string): Person =
  result.age = age
  result.name = name

proc newPerson(age: int, name: string): ref Person =
  new(result)
  result.age = age
  result.name = name

when isMainModule:
  var
    a = initPerson(3, "foo")
    b = newPerson(4, "bar")

  echo a.name & " " & $a.age
  echo b.name & " " & $b.age

As you can see the code is essentially the same, but there are some differences:

  • The typical way to differentiate initialisation is to use init for value types, and new for reference types. Also, note that Nim's own standard library breaks this convention in places, since some of the code predates it (e.g. newStringOfCap does not return a reference to a string type).
  • Depending on what your constructors actually do, the ref version allows you to return a nil value, which you can treat as an error, while the value constructor forces you to raise an exception or change the constructor to use the var form mentioned below so you can return a bool indicating success. Failure tends to be treated in different ways.
  • In C-like languages there is an explicit syntax to access either the value of a pointer itself or the memory it points to (dereferencing). Nim has one as well: the empty subscript notation ([]). However, the compiler will try to insert it automatically to avoid cluttering the code, which is why the example doesn't use it. To prove this you can change the code to read:

    echo b[].name & " " & $b[].age

    This compiles and works as expected. But the following change yields a compiler error, because you can't dereference a non-reference type:

    echo a[].name & " " & $a[].age

  • The current trend in the Nim community is to get rid of single letter prefixes to differentiate value vs reference types. In the old convention you would have a TPerson and an alias for the reference value as PPerson = ref TPerson. You can find a lot of code still using this convention.

  • Depending on what exactly your object and constructor need to do, instead of having an initPerson that returns the value you could also have an init(x: var Person, ...). But using the implicit result variable allows the compiler to optimise this, so the choice is mostly a matter of taste, or of needing to return a bool to the caller.
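
The two failure-reporting styles mentioned in the list above could look roughly like this. It is only a sketch: the rule that a negative age is invalid and the name newPersonOrNil are assumptions made up for the example.

```nim
type
  Person = object
    age: int
    name: string

proc init(x: var Person, age: int, name: string): bool =
  ## Value version: fills `x` in place and reports success as a bool.
  ## The "negative age is invalid" rule is a hypothetical check.
  if age < 0:
    return false
  x.age = age
  x.name = name
  result = true

proc newPersonOrNil(age: int, name: string): ref Person =
  ## Reference version: returns nil to signal failure.
  if age < 0:
    return nil
  new(result)
  result.age = age
  result.name = name

var p: Person
doAssert init(p, 3, "foo")
doAssert p.name == "foo"

var q: Person
doAssert not init(q, -1, "bad")          # failure reported through the bool

doAssert newPersonOrNil(-1, "bad").isNil # failure reported through nil
doAssert newPersonOrNil(4, "bar").age == 4
```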
Yap answered 28/2, 2014 at 15:40 Comment(5)
So basically you have C++-like -- var student = Student('Joe Doe') and you have data on stack, or var student= new Student('Joe Doe') and you have pointer to data allocated on heap, correct?Glorification
Yes. I have extended the answer with a typical Nimrod way of initialising both types with constructor procs. The example was tested with the git version of Nimrod, may not work directly with the last stable 0.9.2 release.Yap
+1, absolutely great, thank you. I only miss 2 pieces to understand this model -- can you take a pointer to a (see your code)? And depending on the answer -- if you did NOT define newPerson, could you still create pointers/references to the Person object?Glorification
You get a pointer to a var through the addr operator (see nimrod-lang.org/manual.html#the-addr-operator), but it is not garbage collected like ref. The newPerson proc is just a convenience proc, nothing stops you from writing its body directly, define var c: ref Person and call new(c) to allocate it.Yap
Thank you again! The link you provided also explained a lot to me.Glorification
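
The two suggestions from the comments above (taking the address of an existing variable, and allocating a ref without a newPerson helper) can be sketched as follows. The variable names are arbitrary and the Person type is repeated so the snippet stands on its own:

```nim
type
  Person = object
    age: int
    name: string

var a = Person(age: 3, name: "foo")

# `addr` yields an untraced `ptr Person` to the existing variable.
let p: ptr Person = addr a
p.age = 30
doAssert a.age == 30        # the write went through to `a`

# Without a constructor proc: declare the ref, then allocate it.
var c: ref Person
doAssert c.isNil            # a ref starts out as nil
new(c)
c.age = 4
c.name = "bar"
doAssert c[].name == "bar"  # explicit dereference also works
```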
M
6

It can be either.

type Student = object ...

is roughly equivalent to

typedef struct { ... } Student;

in C, while

type Student = ref object ...

or

type Student = ptr object ...

is roughly equivalent to

typedef struct { ... } *Student;

in C (with ref denoting a reference that is traced by the garbage collector, while ptr is not traced).
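
As a rough sketch of how this maps onto the three options in the question (the StudentRef alias and the variable names are hypothetical), a variable can hold the data, a reference to the data, or a reference to a reference:

```nim
type
  Student = object            # value type: the variable holds the data itself
    name: string
  StudentRef = ref Student    # hypothetical alias for a traced reference

var s1 = Student(name: "Joe") # option 1: the actual data

var s2: StudentRef            # option 2: a reference to the data
new(s2)
s2.name = "Joe"

var s3: ref StudentRef        # option 3: a reference to a reference
new(s3)
s3[] = s2

s3[][].name = "Jane"          # two explicit dereferences reach the object
doAssert s2.name == "Jane"    # s2 and s3[] refer to the same Student
doAssert s1.name == "Joe"     # the value-typed variable is independent
```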

Malia answered 28/2, 2014 at 16:19 Comment(4)
Hmm, isn't it possible to define Student as you did (as data, not pointer) in your first example, and later write something like this var student = new Student('Joe Doe') (like in C++) and have pointer to Student?Glorification
First of all, new in Nimrod works differently from C++. It only allocates memory, it does not invoke any constructors (like many other things in Nimrod, this behavior is inherited from Pascal/Modula-2 instead). Second, new(student) will require student to be a ref type (because it's defined as proc new*[T](a: var ref T)) and using new(Student) returns an instance of a ref type (because that is defined as proc new*(T: typedesc): ref T).Malia
When you say "student has to be a ref type" -- is there analogy to C#, when you have to define a type in advance as reference type or value type? In other words, if Student is a value type (type Student = object...), you cannot call new Student...?Glorification
No, you'd have to look at C/C++ for an analogy (or, obviously, Pascal/Modula-2). ref T and ptr T are approximately equivalent to T* in C/C++ (with the caveat that ref T is garbage-collected, a concept that C/C++ don't have). It's really not more complicated than that, it's just a different syntax (closer to what you find in the Algol family of languages), but largely the same thing. In C#, the closest analogy would be the difference between a class and a struct.Malia
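
A short sketch of the point made in these comments, assuming a Student declared as a plain object: new(Student) still works and hands back a ref Student, while new(v) on a value-typed variable does not compile. The field and variable names are made up for the example.

```nim
type
  Student = object      # a plain value type, not declared as `ref object`
    name: string

# `new` applied to the type allocates an instance and returns a `ref Student`.
# It only allocates memory; no constructor is invoked.
let s = new(Student)
s.name = "Joe Doe"
doAssert s is ref Student
doAssert s.name == "Joe Doe"

# `new(v)` requires `v` to be a ref variable, so this is rejected at compile time.
var v: Student
doAssert not compiles(new(v))
```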
