What is reification?

I know that Java implements parametric polymorphism (Generics) with erasure. I understand what erasure is.

I know that C# implements parametric polymorphism with reification. I know that this lets you write overloads such as

public void dosomething(List<string> input) {}
public void dosomething(List<int> input) {}

or find out at runtime what the type parameter of some parameterised type is, but I don't understand what reification itself is.

  • What is a reified type?
  • What is a reified value?
  • What happens when a type/value is reified?
Faliscan answered 7/8, 2015 at 11:13 Comment(4)
It's not an answer, but may help someway: beust.com/weblog/2011/07/29/erasure-vs-reification - Stipend
@Stipend that seems to answer the question "what is erasure" fairly well, and seems to basically answer "what is reification" with "not erasure" - a common theme I found when initially searching for an answer before posting here. - Faliscan
...and there was me thinking reification is the process of converting a switch construct back to an if/else, when it had previously been converted from an if/else to a switch... - Greegree
Res, reis is Latin for thing, so reification is literally thingification. I have nothing useful to contribute as far as C#'s use of the term, but the fact in and of itself that they used it makes me smile. - Peltry
U
232

Reification is the process of taking an abstract thing and creating a concrete thing.

The term reification in C# generics refers to the process by which a generic type definition and one or more generic type arguments (the abstract thing) are combined to create a new generic type (the concrete thing).

To phrase it differently, it is the process of taking the definition of List<T> and int and producing a concrete List<int> type.

To understand it further, compare the following approaches:

  • In Java generics, a generic type definition is transformed to essentially one concrete generic type shared across all allowed type argument combinations. Thus, multiple (source code level) types are mapped to one (binary level) type - but as a result, information about the type arguments of an instance is discarded in that instance (type erasure).

    1. As a side effect of this implementation technique, the only generic type arguments that are natively allowed are those types that can share the binary code of their concrete type; which means those types whose storage locations have interchangeable representations; which means reference types. Using value types as generic type arguments requires boxing them (placing them in a simple reference type wrapper).
    2. No code is duplicated in order to implement generics this way.
    3. Type information that could have been available at runtime (using reflection) is lost. This, in turn, means that specialization of a generic type (the ability to use specialized source code for any particular generic argument combination) is very restricted.
    4. This mechanism doesn't require support from the runtime environment.
    5. There are a few workarounds to retain type information that a Java program or a JVM-based language can use.
  • In C# generics, the generic type definition is maintained in memory at runtime. Whenever a new concrete type is required, the runtime environment combines the generic type definition and the type arguments and creates the new type (reification). So we get a new type for each combination of the type arguments, at runtime.

    1. This implementation technique allows any kind of type argument combination to be instantiated. Using value types as generic type arguments does not cause boxing, since these types get their own implementation. (Boxing still exists in C#, of course - but it happens in other scenarios, not this one.)
    2. Code duplication could be an issue - but in practice it isn't, because sufficiently smart implementations (this includes Microsoft .NET and Mono) can share code for some instantiations.
    3. Type information is maintained, which allows specialization to an extent, by examining type arguments using reflection. However, the degree of specialization is limited, as a result of the fact that a generic type definition is compiled before any reification happens (this is done by compiling the definition against the constraints on the type parameters - thus, the compiler has to be able to "understand" the definition even in the absence of specific type arguments).
    4. This implementation technique depends heavily on runtime support and JIT-compilation (which is why you often hear that C# generics have some limitations on platforms like iOS, where dynamic code generation is restricted).
    5. In the context of C# generics, reification is done for you by the runtime environment. However, if you want to more intuitively understand the difference between a generic type definition and a concrete generic type, you can always perform a reification on your own, using the System.Type class (even if the particular generic type argument combination you're instantiating didn't appear in your source code directly).
  • In C++ templates, the template definition is maintained in memory at compile time. Whenever a new instantiation of a template type is required in the source code, the compiler combines the template definition and the template arguments and creates the new type. So we get a unique type for each combination of the template arguments, at compile time.

    1. This implementation technique allows any kind of type argument combination to be instantiated.
    2. This is known to duplicate binary code but a sufficiently smart tool-chain could still detect this and share code for some instantiations.
    3. The template definition itself is not "compiled" - only its concrete instantiations are actually compiled. This places fewer constraints on the compiler and allows a greater degree of template specialization.
    4. Since template instantiations are performed at compile time, no runtime support is needed here either.
    5. This process is lately referred to as monomorphization, especially in the Rust community. The word is used in contrast to parametric polymorphism, which is the name of the concept that generics come from.
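To make the Java side of this comparison concrete, here is a minimal Java sketch of erasure and of one workaround of the kind mentioned in point 5 of the Java list. The class ErasureDemo and the TypeToken helper are hypothetical names invented for this illustration; they are not part of the answer or of any library. The "super type token" trick shown is one commonly used workaround, assumed here to be the sort of thing the answer has in mind.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class ErasureDemo {

    // Hypothetical "super type token": an anonymous subclass records its
    // generic superclass in class metadata, which is not erased.
    abstract static class TypeToken<T> {
        Type captured() {
            ParameterizedType superType =
                    (ParameterizedType) getClass().getGenericSuperclass();
            return superType.getActualTypeArguments()[0];
        }
    }

    // Both lists are instances of the one erased ArrayList class,
    // so their runtime classes are identical.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    // Recovers the type argument from the anonymous subclass's metadata.
    static String capturedTypeName() {
        return new TypeToken<List<String>>() { }.captured().getTypeName();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(capturedTypeName());
    }
}
```

The first call shows that an instance carries no record of its type argument; the second shows that the information can still be recovered when it is baked into a class declaration rather than an instance.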
Ulphi answered 7/8, 2015 at 11:34 Comment(5)
Great comparison with C++ templates... they seem to fall somewhere in between C#'s and Java's generics. You have different code and structure for handling different specific generic types like in C#, but it's all done in compile-time like in Java. - Felonry
Also, in C++ this makes it possible to introduce template specialization, where each (or just some) concrete types can have different implementations. Obviously not possible in Java, but neither in C#. - Telford
@Telford though one reason for using that is to reduce the amount of produced code with pointer types, and C# does something comparable with reference types behind the scenes. Still, that's only one reason for using that, and there are definitely times when template specialisation would be nice. - Sheepish
For Java, you may want to add that while type information is erased, casts are added by the compiler, making the bytecode indistinguishable from pre-generics bytecode. - Sauncho
"This is known to duplicate binary code but a sufficiently smart tool-chain could still detect this and share code" - I keep seeing this repeated, but IIRC this was only a problem for the first couple of years of C++ templates. I mean, has any toolchain not supported this in this millennium? Pretty sure this problem was solved before .NET was even invented. - Replay
S
27

Reification means generally (outside of computer science) "to make something real".

In programming, something is reified if we're able to access information about it in the language itself.

For two completely non-generics-related examples of something C# does and doesn't have reified, let's take methods and memory access.

OO languages generally have methods (and many languages that don't have them have functions, which are similar though not bound to a class). As such, you can define a method in such a language, call it, perhaps override it, and so on. Not all such languages let you actually deal with the method itself as data in a program. C# (and really, .NET rather than C#) does let you make use of MethodInfo objects representing the methods, so in C# methods are reified. Methods in C# are "first class objects".
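For readers coming from the Java side of the question, the same idea can be sketched with Java's reflection API, where java.lang.reflect.Method plays a role comparable to MethodInfo. This is only an illustration under that assumption; the class MethodAsData and its helper methods are hypothetical names, not anything from the answer.

```java
import java.lang.reflect.Method;

// Hypothetical demo class; the names are illustrative only.
class MethodAsData {

    // Looks up String.toUpperCase() as a Method object, then invokes it on the input.
    static String shout(String input) {
        try {
            Method toUpper = String.class.getMethod("toUpperCase");
            return (String) toUpper.invoke(input);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // The Method object can be inspected like any other value.
    static String describe() {
        try {
            Method toUpper = String.class.getMethod("toUpperCase");
            return toUpper.getReturnType().getSimpleName() + " " + toUpper.getName();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(shout("reified"));  // REIFIED
        System.out.println(describe());        // String toUpperCase
    }
}
```

Here the method is a value the program can look up, query, and call, which is the sense of "reified" used in this answer.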

All practical languages have some means to access the memory of a computer. In a low-level language like C we can deal directly with the mapping between the numeric addresses used by the computer and the memory they refer to, so the likes of int* ptr = (int*) 0xA000000; *ptr = 42; is reasonable (as long as we have good reason to believe that accessing memory address 0xA000000 in this way won't blow something up). In C# this isn't reasonable (we can just about force it in .NET, but with the .NET memory management moving things around it's not very likely to be useful). C# does not have reified memory addresses.

So, as reified means "made real", a "reified type" is a type we can "talk about" in the language in question.

In generics this means two things.

One is that List<string> is a type just as string or int are. We can compare that type, get its name, and enquire about it:

Console.WriteLine(typeof(List<string>).FullName); // System.Collections.Generic.List`1[[System.String, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]]
Console.WriteLine(typeof(List<string>) == (42).GetType()); // False
Console.WriteLine(typeof(List<string>) == Enumerable.Range(0, 1).Select(i => i.ToString()).ToList().GetType()); // True
Console.WriteLine(typeof(List<string>).GenericTypeArguments[0] == typeof(string)); // True

A consequence of this is that we can "talk about" the types of a generic method's parameters (or those of a method of a generic class) within the method itself:

public static void DescribeType<T>(T element)
{
  Console.WriteLine(typeof(T).FullName);
}
public static void Main()
{
  DescribeType(42);               // System.Int32
  DescribeType(42L);              // System.Int64
  DescribeType(DateTime.UtcNow);  // System.DateTime
}
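For contrast, here is a rough Java counterpart of the DescribeType example above. Because T is erased, Java has no equivalent of typeof(T), so this sketch assumes the caller passes a Class<T> token explicitly. The class DescribeTypeDemo and the method describeType are hypothetical names chosen for the illustration.

```java
// Hypothetical demo class; the names are illustrative only.
class DescribeTypeDemo {

    // Java cannot write T.class, so the runtime type arrives as an explicit argument.
    static <T> String describeType(Class<T> type, T element) {
        // type.cast checks the element against the token at runtime,
        // which a bare erased T could not do.
        T checked = type.cast(element);
        return type.getName() + " -> " + checked;
    }

    public static void main(String[] args) {
        System.out.println(describeType(Integer.class, 42));  // java.lang.Integer -> 42
        System.out.println(describeType(Long.class, 42L));    // java.lang.Long -> 42
    }
}
```

The extra parameter is the cost of erasure: the information that C# keeps for every type parameter has to be supplied by hand.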

As a rule, doing this too much is "smelly", but it has many useful cases. For example, look at:

public static TSource Min<TSource>(this IEnumerable<TSource> source)
{
  if (source == null) throw Error.ArgumentNull("source");
  Comparer<TSource> comparer = Comparer<TSource>.Default;
  TSource value = default(TSource);
  if (value == null)
  {
    using (IEnumerator<TSource> e = source.GetEnumerator())
    {
      do
      {
        if (!e.MoveNext()) return value;
        value = e.Current;
      } while (value == null);
      while (e.MoveNext())
      {
        TSource x = e.Current;
        if (x != null && comparer.Compare(x, value) < 0) value = x;
      }
    }
  }
  else
  {
    using (IEnumerator<TSource> e = source.GetEnumerator())
    {
      if (!e.MoveNext()) throw Error.NoElements();
      value = e.Current;
      while (e.MoveNext())
      {
        TSource x = e.Current;
        if (comparer.Compare(x, value) < 0) value = x;
      }
    }
  }
  return value;
}

This doesn't do lots of comparisons between the type of TSource and various types for different behaviours (generally a sign that you shouldn't have used generics at all) but it does split between a code path for types that can be null (should return null if no element found, and must not make comparisons to find the minimum if one of the elements compared is null) and the code path for types that cannot be null (should throw if no element found, and doesn't have to worry about the possibility of null elements).

Because TSource is "real" within the method, this comparison can be made either at runtime or jitting time (generally at jitting time, and certainly the above case would do so at jitting time and not produce machine code for the path not taken) and we have a separate "real" version of the method for each case. (Though as an optimisation, the machine code is shared for different methods for different reference-type type parameters, because it can be without affecting this, and hence we can reduce the amount of machine code jitted.)

(It's not common to talk about reification of generic types in C# unless you also deal with Java, because in C# we just take this reification for granted; all types are reified. In Java, non-generic types are referred to as reified because that is a distinction between them and generic types.)

Sheepish answered 7/8, 2015 at 11:52 Comment(5)
You don't think being able to do what Min does above is useful? It's very hard to fulfil its documented behaviour otherwise. - Sheepish
I consider the bug to be the (un)documented behaviour and the implication that that behaviour is useful (as an aside, the behaviour of Enumerable.Min<TSource> is different in that it doesn't throw for non-reference types on an empty collection, but returns default(TSource), and is documented only as "Returns the minimum value in a generic sequence." I would argue both should throw on an empty collection, or that a "zero" element should be passed in as a baseline, and the comparator/comparison function should always be passed in) - Faliscan
That would be a lot less useful than the current Min, which matches common db behaviour on nullable types without attempting the impossible on non-nullable types. (The baseline idea isn't impossible, but not very useful unless there's a value you can know would never be in the source). - Sheepish
Thingification would have been a better name for this. :) - Diaphysis
@Diaphysis a thing can be unreal. - Sheepish
F
18

As duffymo already noted, "reification" isn't the key difference.

In Java, generics are basically there to improve compile-time support - they allow you to use strongly typed collections (for example) in your code, and have type safety handled for you. However, this only exists at compile-time - the compiled bytecode no longer has any notion of generics; all the generic types are transformed into "concrete" types (using Object if the generic type is unbounded), adding type conversions and type checks as needed.

In .NET, generics are an integral feature of the CLR. When you compile a generic type, it stays generic in the generated IL. It's not just transformed into non-generic code as in Java.

This has several impacts on how generics work in practice. For example:

  • Java has SomeType<?> to allow you to pass any concrete implementation of a given generic type. C# cannot do this - every specific (reified) generic type is its own type.
  • Unbounded generic types in Java mean that their value is stored as an Object. This can have a performance impact when using primitive values in such generics, because they have to be boxed. In C#, when you use a value type in a generic type, it stays a value type.
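The wildcard point in the first bullet can be illustrated with a small Java sketch. The class WildcardDemo and the method countNonNull are hypothetical names made up for this example. It shows a single method accepting differently parameterized lists through List<?>.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class WildcardDemo {

    // List<?> accepts any parameterization; elements can only be read as Object.
    static int countNonNull(List<?> items) {
        int count = 0;
        for (Object item : items) {
            if (item != null) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("a", null, "b");
        List<Integer> numbers = Arrays.asList(1, 2, 3);
        System.out.println(countNonNull(words));    // 2
        System.out.println(countNonNull(numbers));  // 3
    }
}
```

Both calls go through the same method and the same runtime List type, which is exactly what erasure makes cheap for Java to allow.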

To give a sample, let's suppose you have a List generic type with one generic argument. In Java, List<String> and List<Integer> will end up being the exact same type at runtime - the generic types only really exist for compile-time code. All calls to e.g. GetValue will be transformed to (String)GetValue and (Integer)GetValue respectively.
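The compiler-inserted casts described above can be observed directly in Java. The following is a minimal sketch with hypothetical names (InsertedCastDemo, readFailsWithClassCast). It deliberately bypasses the compile-time checks through a raw type, which is an assumption of the demonstration rather than something you would do in real code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; the names are illustrative only.
class InsertedCastDemo {

    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean readFailsWithClassCast() {
        List<String> strings = new ArrayList<>();
        List raw = strings;   // raw view of the very same list object
        raw.add(42);          // succeeds at runtime: the storage only knows Object
        try {
            // The compiler emits a cast to String for this read; it fails here,
            // not at the add above.
            String first = strings.get(0);
            return false;     // not reached
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readFailsWithClassCast());  // true
    }
}
```

The failure surfaces where the value is read, not where it is stored, because the only type check that exists at runtime is the cast the compiler added at the call site.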

In C#, List<string> and List<int> are two different types. They are not interchangeable, and their type-safety is enforced at runtime as well. No matter what you do, new List<int>().Add("SomeString") will never work - the underlying storage in List<int> is really an integer array, while in Java, it is necessarily an object array. In C#, there are no casts involved, no boxing etc.

This should also make it obvious why C# can't do the same thing as Java with SomeType<?>. In Java, all generic types "derived from" SomeType<?> end up being the exact same type. In C#, all the various specific SomeType<T>s are their own separate types. In Java, if you remove the compile-time checks, it's possible to pass SomeType<Integer> instead of SomeType<String> (and really, all that SomeType<?> means is "ignore compile-time checks for the given generic type"). In C#, it's not possible, not even for derived types (that is, you can't do List<object> list = (List<object>)new List<string>(); even though string is derived from object).

Both implementations have their pros and cons. There have been a few times when I'd have loved to be able to just allow SomeType<?> as an argument in C# - but it simply doesn't make sense the way C# generics work.

Felonry answered 7/8, 2015 at 11:34 Comment(4)
Well, you can make use of the types List<>, Dictionary<,> and so on in C#, but the gap between that and a given concrete list or dictionary takes quite a bit of reflection to bridge. Variance on interfaces does help in some of the cases where we might once have wanted to bridge that gap easily, but not all. - Sheepish
@JonHanna You can use List<> to instantiate a new specific generic type - but it still means creating the specific type you want. But you can't use List<> as an argument, for example. But yes, at least this allows you to bridge the gap using reflection. - Felonry
The .NET Framework has three hard-coded generic constraints that aren't storage-location types; all other constraints must be storage-location types. Further, the only times a generic type T can satisfy a storage-location-type constraint U are when T and U are the same type, or U is a type that can hold a reference to an instance of T. It would not be possible to meaningfully have storage location of type SomeType<?> but would in theory be possible to have a generic constraint of that type. - Howbeit
It's not true that compiled Java bytecode has no notion of generics. It's just that class instances have no notion of generics. This is an important difference; I've previously written about this at programmers.stackexchange.com/questions/280169/…, if you're interested. - Duisburg
M
3

Reification is an object-oriented modeling concept.

Reify is a verb that means "make something abstract real".

When you do object-oriented programming it's common to model real world objects as software components (e.g. Window, Button, Person, Bank, Vehicle, etc.).

It's also common to reify abstract concepts into components as well (e.g. WindowListener, Broker, etc.)

Meridithmeriel answered 7/8, 2015 at 11:20 Comment(5)
Reification is a general concept of "making something real" that while it does apply to object-oriented modelling as you say, does also have a meaning in the context of implementation of generics. - Sheepish
So I've been educated by reading these answers. I'll amend my answer. - Meridithmeriel
This answer does nothing to address the OP's interest in generics and parametric polymorphism. - Conjoin
This comment does nothing to address anyone's interest or boost your rep. I see you offered nothing whatsoever. Mine was the first answer, and it did define reification as something broader. - Meridithmeriel
Your answer may have been the first, but you answered a different question, not the one asked by the OP, which would have been clear from the content of the question and its tags. Maybe you didn't read the question thoroughly before you wrote your answer, or maybe you didn't know that the term "reification" has an established meaning in the context of generics. Either way, your answer is not useful. Downvote. - Fogbound
