How to avoid massive amounts of boilerplate in Julia with new types?

I'm considering writing a type similar to those defined in NamedArrays and Images. Let's say I just want essentially an Array with a piece of metadata, say a user-friendly name that I'll write at the top of a file when I write the array to disk. (This detail is not relevant; I'm just contriving an example.)

So I might do

# An Array plus one piece of metadata
type MyNamedArray
    data::Array
    name::ASCIIString
end

# Write the name first, then the raw array contents
function mywrite(f, x::MyNamedArray)
    write(f, x.name)
    write(f, x.data)
end

or something, and no other behavior needs to be different from the base Array behavior.

In my head, it's "obvious" that I just want every existing function that operates on Arrays to operate on the data field of this type. In another language, e.g. Java, I might have just subclassed Array and added name as an instance field to the subclass, which would automatically preserve compatibility with all the existing Array operations. But in Julia, if I try a solution such as the one above, I now need to define a ton more functions, e.g. as @TimHoly and 'davidavdav' have done in the linked packages.

Of course I am aware that being forced to write out some of these functions by hand is useful for realizing things that you haven't thought through. E.g. in the example MyNamedArray I give above, one could object by pointing out that I haven't defined the name of x::MyNamedArray * y::MyNamedArray. But what if I just don't care about that, and want code that "just works," without so much boilerplate? (See e.g. looping over symbols to push new method definitions in NamedArrays and manually writing out a hundred lines of definitions in Images. The vast majority of these definitions are boilerplate / the "obvious" definition.)

Specifically, to continue the example I cited: for MyNamedArray, the default could be that x*y is no longer a MyNamedArray. That is, since every function just defaults to the "inherited" behavior of applying the same function to the underlying data, we can simply forget the metadata in all pre-existing functions.
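A rough sketch of that "forward to the data field and drop the metadata" default, in the style of looping over symbols. It assumes current Julia syntax (`struct` and `String` rather than the `type` and `ASCIIString` used in the question), and the list of forwarded functions is an arbitrary illustration, not a complete set:

```julia
using LinearAlgebra  # standard library; makes sure matrix `*` is available

# Hypothetical wrapper mirroring the question's type, in current syntax.
struct MyNamedArray
    data::Array
    name::String
end

# Forward a hand-picked list of one-argument functions to the wrapped array.
for f in (:size, :length, :sum, :maximum, :minimum)
    @eval Base.$f(x::MyNamedArray) = $f(x.data)
end

# Binary operators forward the same way; the result is a plain Array,
# so the name is simply forgotten.
for op in (:+, :-, :*)
    @eval Base.$op(x::MyNamedArray, y::MyNamedArray) = $op(x.data, y.data)
end

x = MyNamedArray([1.0 2.0; 3.0 4.0], "x")
y = MyNamedArray([1.0 0.0; 0.0 1.0], "identity")

z = x * y   # ordinary matrix product of the two data fields, no name attached
```

Every function that is not in one of the lists still fails with a MethodError, which is exactly the boilerplate problem described above.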

Note, I find Tomas Lycken's answer here insightful, and so are the question and answers here.

The best synthesis I can come up with is "you just have to suck it up and write out the functions, or write a macro that does that for you." If this is the case, so be it; I'm just wondering if I'm missing a better option, particularly a better way to design the solution to make it more Julian and avoid the boilerplate.

Damle answered 7/6, 2016 at 18:32 Comment(0)

You can get most of the way there by simply subtyping AbstractArray: http://docs.julialang.org/en/latest/manual/interfaces/#abstract-arrays. In fact, you can do one better and subtype DenseArray, which additionally requires defining a stride (and probably pointer) function… allowing your custom array to work with BLAS. That's just a handful of methods you need to define. It's not 100%, since many authors still tend to restrict methods to accept only Array when they could easily accept any AbstractArray. This has gotten significantly better in the past two years, and it's still improving.
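For illustration, here is a minimal sketch of what that handful of methods might look like. It assumes current Julia syntax and a hypothetical parametric version of the question's MyNamedArray; it covers only the basic AbstractArray interface, not the extra DenseArray/BLAS part:

```julia
# Hypothetical name-carrying array. Parametric so that the data field is
# concretely typed and the element type and rank are known to dispatch.
struct MyNamedArray{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
    name::String
end

# The core of the informal AbstractArray interface.
Base.size(A::MyNamedArray) = size(A.data)
Base.IndexStyle(::Type{<:MyNamedArray}) = IndexLinear()
Base.getindex(A::MyNamedArray, i::Int) = A.data[i]
Base.setindex!(A::MyNamedArray, v, i::Int) = (A.data[i] = v)

A = MyNamedArray([1.0 2.0; 3.0 4.0], "demo")

s = sum(A)     # generic fallback built on the methods above
a21 = A[2, 1]  # Cartesian indexing is derived from the linear getindex
B = A .* 2     # broadcasting works too; the result is a plain Array
```

Because `similar` is not overloaded here, results such as `B` come back as ordinary Arrays without the name, which matches the "forget the metadata" default from the question.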

In general, a pattern I've found very useful here is to define interfaces in terms of abstract supertypes and loosen method signatures as much as possible. If dispatch restrictions aren't necessary, you can allow any type and just lean upon duck-typing. If you only restrict dispatch to specific leaf-types when telling Julia how it should quack or when leaning on its internal implementation, then your work becomes much more extensible and re-usable.
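A small sketch of the difference, using a hypothetical `spread` helper written three ways (the function names are made up for illustration):

```julia
# Restrictive: only the concrete built-in Array type is accepted.
spread_strict(x::Array) = maximum(x) - minimum(x)

# Loosened: any AbstractArray works (ranges, views, custom array types).
spread_loose(x::AbstractArray) = maximum(x) - minimum(x)

# Duck-typed: anything that supports maximum and minimum is accepted.
spread_duck(x) = maximum(x) - minimum(x)

r = 1:4                 # a range is an AbstractArray but not an Array
a = spread_loose(r)     # fine
b = spread_duck((1, 5, 2))   # a tuple is not an array at all, still fine

# spread_strict(r) would throw a MethodError: no method matches a UnitRange.
strict_ok = applicable(spread_strict, r)
```

The implementation is identical in all three; only the signature decides how re-usable it is.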

Rolandrolanda answered 7/6, 2016 at 19:25 Comment(9)
A lot of good suggestions here; I'll tinker a bit and report back. Thanks! - Damle
I think this raises a related question about the logical relationship between implementing interfaces and subtyping. In other languages with explicit interfaces the distinction is clear to me already. In Julia is it just a "dispatch vs. runtime" distinction? In other words, if you declare a type a subtype of a supertype, then methods defined on the supertype will dispatch on the subtype as well (provided you haven't overridden them). But then the implementation of those methods may fail at runtime if you haven't implemented the informal interface? - Damle
Yes, that's duck typing. It really just moves the error down one level: instead of throwing a MethodError for sum(::MyNamedArray), you get a MethodError when it tries to iterate or index into your array within the implementation of sum. It's a runtime missing method error either way. - Rolandrolanda
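To make that concrete, a small sketch with a hypothetical type that declares itself an AbstractArray but implements none of the interface (current Julia syntax assumed):

```julia
# Hypothetical type: claims to be a vector of Float64, implements nothing.
struct EmptyPromise <: AbstractArray{Float64,1} end

x = EmptyPromise();   # trailing semicolon: displaying it at the REPL would also fail

# Dispatch is satisfied: a sum method for AbstractArray exists and applies.
dispatch_ok = applicable(sum, x)

# The failure only appears inside the generic implementation, once it needs
# one of the interface methods that was never defined.
err = try
    sum(x)
catch e
    e
end
```

Here `err` is still a MethodError, just for a method further down the call chain rather than for `sum` itself.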
So far a test example is partially working, although oddly I had to implement more methods than the referenced doc said I needed, which makes me wonder if I've misunderstood something. It does seem that one could write a collection of macros to automatically implement certain interfaces in a pass-through type like this; if I do this I'll post the example. - Damle
I appreciate the duck typing explanation (it has never made sense to me before). If you have insight into it, why is this considered a "good thing" in dynamic languages, versus verifying at compile-time that the interface is actually implemented? One answer, as in my question, could be "I just want to pick up and go and not worry about pedantic safety checks," and maybe another is "the Julia devs haven't implemented formal interfaces; they may in the future or they may not"; is there also a stronger positive reason that more experience with dynamically typed languages would help to grok? - Damle
Compile time is not well defined in Julia. Definitions and execution can be interleaved… and JIT compilation is really just an implementation detail. Duck typing is simple and very extensible. Anybody can make their own iterable type that works just like the builtin types in for loops, for example. - Rolandrolanda
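As a sketch of that last point, here is a hypothetical `Countdown` type made iterable. It assumes the current iteration protocol (`Base.iterate`), which differs from the start/next/done functions Julia used when this thread was written:

```julia
# Hypothetical iterable: counts down from n to 1.
struct Countdown
    n::Int
end

# Defining iterate is all a for loop needs: return (item, next_state),
# or nothing when the sequence is exhausted.
Base.iterate(c::Countdown, state=c.n) = state < 1 ? nothing : (state, state - 1)

# Optional extras that let generic code such as collect preallocate.
Base.length(c::Countdown) = max(c.n, 0)
Base.eltype(::Type{Countdown}) = Int

for i in Countdown(3)
    println(i)   # prints 3, then 2, then 1
end

v = collect(Countdown(3))   # generic functions pick it up as well
t = sum(Countdown(4))
```

No declaration ties `Countdown` to any "iterable" supertype; defining the method is the whole contract.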
It might be nice if Julia had a feature from CLU, where you can require that the type implements a particular method (and hence get the error earlier). - Ess
@MattB. The only problem is you don't realize what you haven't done until it's "too late," and then (as I have been) you end up stress-testing to chase down extra bits of interface that you didn't realize also needed to be implemented, and there's no requirement that someone implement an interface just because the option is provided. Since you and every language designer obviously know this, I was thinking I might still be "missing" something. - Damle
I agree that Julia, with the way method dispatch can change after code execution, is sort of an odd case, but I'm not convinced that a formal interface definition couldn't work and help. No one would need to use it, of course. Incidentally, this connects to @ScottJones's suggestion. I'm also thinking about writing some macros to encode the connection between an abstract type and its informal interface slightly more formally; if I do this I will post them. But again I feel that I must be missing or not grokking something: if this is really "good," why haven't better coders already done it? - Damle
