How do multimethods solve the namespace issue?
Asked Answered
P

7

32

I am researching programming language design, and I am interested in the question of how to replace the popular single-dispatch message-passing OO paradigm with the multimethods generic-function paradigm. For the most part, it seems very straightforward, but I have recently become stuck and would appreciate some help.

Message-passing OO, in my mind, is one solution that solves two different problems. I explain what I mean in detail in the following pseudocode.

(1) It solves the dispatch problem:

=== in file animal.code ===

   - Animals can "bark"
   - Dogs "bark" by printing "woof" to the screen.
   - Cats "bark" by printing "meow" to the screen.

=== in file myprogram.code ===

import animal.code
for each animal a in list-of-animals :
   a.bark()

In this problem, "bark" is one method with multiple "branches" which operate differently depending upon the argument types. We implement "bark" once for each argument type we are interested in (Dogs and Cats). At runtime we are able to iterate through a list of animals and dynamically select the appropriate branch to take.

(2) It solves the namespace problem:

=== in file animal.code ===

   - Animals can "bark"

=== in file tree.code ===

   - Trees have "bark"

=== in file myprogram.code ===

import animal.code
import tree.code

a = new-dog()
a.bark() //Make the dog bark

…

t = new-tree()
b = t.bark() //Retrieve the bark from the tree

In this problem, "bark" is actually two conceptually different functions which just happen to have the same name. The type of the argument (whether dog or tree) determines which function we actually mean.


Multimethods elegantly solve problem number 1. But I don't understand how they solve problem number 2. For example, the first of the above two examples can be translated in a straightforward fashion to multimethods:

(1) Dogs and Cats using multimethods

=== in file animal.code ===

   - define generic function bark(Animal a)
   - define method bark(Dog d) : print("woof")
   - define method bark(Cat c) : print("meow")

=== in file myprogram.code ===

import animal.code
for each animal a in list-of-animals :
   bark(a)

The key point is that the method bark(Dog) is conceptually related to bark(Cat). The second example does not have this attribute, which is why I don't understand how multimethods solve the namespace issue.

(2) Why multimethods don't work for Animals and Trees

=== in file animal.code ===

   - define generic function bark(Animal a)

=== in file tree.code ===

   - define generic function bark(Tree t)

=== in file myprogram.code ===

import animal.code
import tree.code

a = new-dog()
bark(a)   /// Which bark function are we calling?

t = new-tree
bark(t)  /// Which bark function are we calling?

In this case, where should the generic function be defined? Should it be defined at the top-level, above both animal and tree? It doesn't make sense to think of bark for animal and tree as two methods of the same generic function because the two functions are conceptually different.

As far as I know, I haven't found any past work that's solved this problem yet. I have looked at Clojure multimethods, and CLOS multimethods and they have the same problem. I am crossing my fingers and hoping for either an elegant solution to the problem, or a persuading argument on why it's actually not a problem in real programming.

Please let me know if the question needs clarification. This is a fairly subtle (but important) point I think.


Thanks for the replies sanity, Rainer, Marcin, and Matthias. I understand your replies and completely agree that dynamic dispatch and namespace resolution are two different things. CLOS does not conflate the two ideas, whereas traditional message-passing OO does. This also allows for a straightforward extension of multimethods to multiple inheritance.

My question specifically is in the situation when the conflation is desirable.

The following is an example of what I mean.

=== file: XYZ.code ===

define class XYZ :
   define get-x ()
   define get-y ()
   define get-z ()

=== file: POINT.code ===

define class POINT :
   define get-x ()
   define get-y ()

=== file: GENE.code ===

define class GENE :
   define get-x ()
   define get-xx ()
   define get-y ()
   define get-xy ()

==== file: my_program.code ===

import XYZ.code
import POINT.code
import GENE.code

obj = new-xyz()
obj.get-x()

pt = new-point()
pt.get-x()

gene = new-point()
gene.get-x()

Because of the conflation of namespace resolution with dispatch, the programmer can naively call get-x() on all three objects. This is also perfectly unambiguous. Each object "owns" its own set of methods, so there is no confusion as to what the programmer meant.

Contrast this to the multimethod version:


=== file: XYZ.code ===

define generic function get-x (XYZ)
define generic function get-y (XYZ)
define generic function get-z (XYZ)

=== file: POINT.code ===

define generic function get-x (POINT)
define generic function get-y (POINT)

=== file: GENE.code ===

define generic function get-x (GENE)
define generic function get-xx (GENE)
define generic function get-y (GENE)
define generic function get-xy (GENE)

==== file: my_program.code ===

import XYZ.code
import POINT.code
import GENE.code

obj = new-xyz()
XYZ:get-x(obj)

pt = new-point()
POINT:get-x(pt)

gene = new-point()
GENE:get-x(gene)

Because get-x() of XYZ has no conceptual relation to get-x() of GENE, they are implemented as separate generic functions. Hence, the end programmer (in my_program.code) must explicitly qualify get-x() and tell the system which get-x() he actually means to call.

It is true that this explicit approach is clearer and easily generalizable to multiple dispatch and multiple inheritance. But using (abusing) dispatch to solve namespace issues is an extremely convenient feature of message-passing OO.

I personally feel that 98% of my own code is adequately expressed using single-dispatch and single-inheritance. I use this convenience of using dispatch for namespace resolution much more so than I use multiple-dispatch, so I am reluctant to give it up.

Is there a way to get me the best of both worlds? How do I avoid the need to explicitly qualify my function calls in a multi-method setting?


It seems that the consensus is that

  • multimethods solve the dispatch problem but do not attack the namespace problem.
  • functions that are conceptually different should have different names, and users should be expected to manually qualify them.

I then believe that, in cases where single-inheritance single-dispatch is sufficient, message-passing OO is more convenient than generic functions.

This sounds like it is open research then. If a language were to provide a mechanism for multimethods that may also be used for namespace resolution, would that be a desired feature?

I like the concept of generic functions, but currently feel they are optimized for making "very hard things not so hard" at the expense of making "trivial things slightly annoying". Since the majority of code is trivial, I still believe this is a worthwhile problem to solve.

Pairoar answered 4/3, 2012 at 7:27 Comment(2)
What problem do you see here? The method should work correctly for both types.Hyonhyoscine
In fact, it's the other way around: multimethods get namespacing right. In particular, they make multiple inheritance work. Think about what happens when at some point you discover that you need to define a Treant class that models something that is both a Tree and an Animal. With multimethods, there is no problem, since they are namespaced independently from class. With class-namespaced methods, you get a conflict.Keynesianism
K
21

Dynamic dispatch and namespace resolution are two different things. In many object systems classes are also used for namespaces. Also note that often both the class and the namespace are tied to a file. So these object systems conflate at least three things:

  • class definitions with their slots and methods
  • the namespace for identifiers
  • the storage unit of source code

Common Lisp and its object system (CLOS) works differently:

  • classes don't form a namespace
  • generic functions and methods don't belong to classes and thus are not defined inside classes
  • generic functions are defined as top-level functions and thus are not nested or local
  • identifiers for generic functions are symbols
  • symbols have their own namespace mechanism called packages
  • generic functions are 'open'. One can add or delete methods at any time
  • generic functions are first-class objects
  • mathods are first-class objects
  • classes and generic functions are also not conflated with files. You can define multiple classes and multiple generic functions in one file or in as many files as you want. You can also define classes and methods from running code (thus not tied to files) or something like a REPL (read eval print loop).

Style in CLOS:

  • if a functionality needs dynamic dispatch and the functionality is closely related, then use one generic function with different methods
  • if there are many different functionalities, but with a common name, don't put them in the same generic function. Create different generic functions.
  • generic functions with the same name, but where the name is in different packages are different generic functions.

Example:

(defpackage "ANIMAL" (:use "CL")) 
(in-package "ANIMAL")

(defclass animal () ())
(deflcass dog (animal) ())
(deflcass cat (animal) ()))

(defmethod bark ((an-animal dog)) (print 'woof))
(defmethod bark ((an-animal cat)) (print 'meow)) 

(bark (make-instance 'dog))
(bark (make-instance 'dog))

Note that the class ANIMAL and the package ANIMAL have the same name. But that is not necessary so. The names are not connected in any way. The DEFMETHOD implicitly creates a corresponding generic function.

If you add another package (for example GAME-ANIMALS), then the BARK generic function will be different. Unless these packages are related (for example one package uses the other).

From a different package (symbol namespace in Common Lisp), one can call these:

(animal:bark some-animal)

(game-animal:bark some-game-animal)

A symbol has the syntax

PACKAGE-NAME::SYMBOL-NAME

If the package is the same as the current package, then it can be omitted.

  • ANIMAL::BARK refers to the symbol named BARK in the package ANIMAL. Note that there are two colons.
  • AINMAL:BARK refers to the exported symbol BARK in the package ANIMAL. Note that there is only one colon. Exporting, importing and using are mechanisms defined for packages and their symbols. Thus they are independent of classes and generic functions, but it can be used to structure the namespace for the symbols naming those.

The more interesting case is when multimethods are actually used in generic functions:

(defmethod bite ((some-animal cat) (some-human human))
  ...)

(defmethod bite ((some-animal dog) (some-food bone))
  ...)

Above uses the classes CAT, HUMAN, DOG and BONE. Which class should the generic function belong to? What would the special namespace look like?

Since generic functions dispatch over all arguments, it does not make direct sense to conflate the generic function with a special namespace and make it a definition in a single class.

Motivation:

Generic functions were added in the 80s to Lisp by developers at Xerox PARC (for Common LOOPS) and at Symbolics for New Flavors. One wanted to get rid of an additional calling mechanism (message passing) and bring dispatch to ordinary (top-level) functions. New Flavors had single dispatch, but generic functions with multiple arguments. The research into Common LOOPS then brought multiple dispatch. New Flavors and Common LOOPS were then replaced by the standardized CLOS. These ideas then were brought to other languages like Dylan.

Since the example code in the question does not use anything generic functions have to offer, it looks like one has to give up something.

When single dispatch, message passing and single inheritance is sufficient, then generic functions may look like a step back. The reason for this is, as mentioned, that one does not want to put all kinds of similar named functionality into one generic function.

When

(defmethod bark ((some-animal dog)) ...)
(defmethod bark ((some-tree oak)) ...)

look similar, they are two conceptually different actions.

But more:

(defmethod bark ((some-animal dog) tone loudness duration)
   ...)

(defmethod bark ((some-tree oak)) ...)

Now suddenly the parameter lists for the same named generic function looks different. Should that be allowed to be one generic function? If not, how do we call BARK on various objects in a list of things with the right parameters?

In real Lisp code generic functions usually look much more complicated with several required and optional arguments.

In Common Lisp generic functions also not only have a single method type. There are different types of methods and various ways to combine them. It makes only sense to combine them, when they really belong to a certain generic function.

Since generic functions are also first class objects, they can be passed around, returned from functions and stored in data structures. At this point the generic function object itself is important, not its name anymore.

For the simple case where I have an object, which has x and y coordinates and can act as a point, I would inherit for the objects's class from a POINT class (maybe as some mixin). Then I would import the GET-X and GET-Y symbols into some namespace - where necessary.

There are other languages which are more different from Lisp/CLOS and which attempt(ed) to support multimethods:

There seems to be many attempts to add it to Java.

Knopp answered 4/3, 2012 at 8:35 Comment(3)
Upvoted for a clear and concise explanation of the philosophy of CLOS. I am, however, interested in the (IMO common) situation where the conflation is desirable. I have included a clarification in the original post.Pairoar
Is this tradeoff strictly necessary? Do I have to give up namespace resolution if I want multimethods? This is the heart of my question, is there any research on multimethods that comes with all the conveniences of message-passing?Pairoar
@user1156849: you can look at languages like MultiJava. All or C# and see how they support something like multimethods. They may not provide all the convenience of generic functions, though.Knopp
D
9

Your example for "Why multimethods won't work" presumes that you can define two identically-named generic functions in the same language namespace. This is generally not the case; for example, Clojure multimethods belong explicitly to a namespace, so if you have two such generic functions with the same name, you would need to make clear which you are using.

In short, functions that are "conceptually different" will always either have different names, or live in different namespaces.

Diploblastic answered 4/3, 2012 at 8:29 Comment(0)
B
3

Generic functions should perform the same "verb" for all classes its method is implemented for.

In the animals/tree "bark" case, the animal-verb is "perform a sound action" and in the tree case, well, I guess it's make-environment-shield.

That English happens to call them both "bark" is just a linguistic co-incidence.

If you have a case where multiple different GFs (generic functions) really should have the same name, using namespaces to separate them is (probably) the right thing.

Berube answered 4/3, 2012 at 11:45 Comment(0)
S
2

Message-passing OO does not, in general, solve the namespacing problem that you talk about. OO languages with structural type systems don't differentiate between a method bark in an Animal or a Tree as long as they have the same type. It's only because popular OO languages use nominal type systems (e.g., Java) that it seems like that.

Staphyloplasty answered 4/3, 2012 at 17:21 Comment(3)
Hi Asumu, could you elaborate on what you mean? It is obvious that Java solves the namespace problem because the type system allows the compiler to statically determine which function to call. In dynamic languages with structural type systems (e.g. Python/Ruby), the compiler does not differentiate between Animal:bark and Tree:bark. However at runtime, you are still assured that the correct method is called because of the dispatching mechanism. In other words, if you use it like Java, it will still work like it does in Java. The lack of a nominal type system does not prevent that.Pairoar
Yes, the correct implementation will be called. However, suppose that you have an invariant that a bark method should only be called on Animals in a part of your code. Java's type system lets you enforce this. However, an untyped language or a structually typed language (e.g., OCaml) wouldn't be able to differentiate at compile-time. Thus, it's a matter of interface.Staphyloplasty
I understand what you mean now. Yes, the message-passing scheme in the absence of a nominal typing scheme might lead to some strange bugs. But it is very convenient nonetheless. In the case of Python/Ruby I am already sacrificing many compile-time guarantees so it is not asking too much to sacrifice another.Pairoar
K
2

Because get-x() of XYZ has no conceptual relation to get-x() of GENE, they are implemented as separate generic functions

Sure. But since their arglist is the same (just passing the object to the method), then you 'could' implement them as different methods on the same generic function.

The only constraint when adding a method to a generic function, is that the arglist of the method matches the arglist of the generic function.

More generally, methods must have the same number of required and optional parameters and must be capable of accepting any arguments corresponding to any &rest or &key parameters specified by the generic function.

There's no constraint that the functions must be conceptually related. Most of the time they are (overriding a superclass, etc.), but they certainly don't have to be.

Although even this constraint (need the same arglist) seems limiting at times. If you look at Erlang, functions have arity, and you can define multiple functions with the same name that have different arity (functions with same name and different arglists). And then a sort of dispatch takes care of calling the right function. I like this. And in lisp, I think this would map to having a generic function accept methods that have varying arglists. Maybe this is something that is configurable in the MOP?

Although reading a bit more here, it seems that keyword arguments might allow the programmer to achieve having a generic function encapsulate methods with completely different arity, by using different keys in different methods to vary their number of arguments:

A method can "accept" &key and &rest arguments defined in its generic function by having a &rest parameter, by having the same &key parameters, or by specifying &allow-other-keys along with &key. A method can also specify &key parameters not found in the generic function's parameter list--when the generic function is called, any &key parameter specified by the generic function or any applicable method will be accepted.

Also note that this sort of blurring, where different methods stored in the generic function do conceptually different things, happens behind the scenes in your 'tree has bark', 'dogs bark' example. When defining the tree class, you'd set an automatic getter and setter method for the bark slot. When defining the dog class, you'd define a bark method on the dog type that actually does the barking. And both of these methods get stored in a #'bark generic function.

Since they are both enclosed in the same generic function, you'd call them in exactly the same way:

(bark tree-obj) -> Returns a noun (the bark of the tree)
(bark dog-obj) -> Produces a verb (the dog barks)

As code:

CL-USER> 
(defclass tree ()
  ((bark :accessor bark :initarg :bark :initform 'cracked)))
#<STANDARD-CLASS TREE>
CL-USER> 
(symbol-function 'bark)
#<STANDARD-GENERIC-FUNCTION BARK (1)>
CL-USER> 
(defclass dog ()
  ())
#<STANDARD-CLASS DOG>
CL-USER> 
(defmethod bark ((obj dog))
  'rough)
#<STANDARD-METHOD BARK (DOG) {1005494691}>
CL-USER> 
(symbol-function 'bark)
#<STANDARD-GENERIC-FUNCTION BARK (2)>
CL-USER> 
(bark (make-instance 'tree))
CRACKED
CL-USER> 
(bark (make-instance 'dog))
ROUGH
CL-USER> 

I tend to favor this sort of 'duality of syntax', or blurring of features, etc. And I do not think that all methods on a generic function have to be conceptually similar. That's just a guideline IMO. If a linguistic interaction in the English language happens (bark as noun and verb), it's nice to have a programming language that handles the case gracefully.

Kingmaker answered 4/3, 2012 at 17:41 Comment(2)
Thanks for replying. I also favor this "duality of syntax" as you call it. Even though it conflates namespace resolution with dispatch, it is a major advantage of message-passing systems. The reason why the approach you gave is troublesome for generic functions is that every time I declare a new getter function for some arbitrary class, I have to make sure to correspondingly declare a generic function at the topmost level, if I want to allow it to be used that way.Pairoar
@user1156849 The 'duality of syntax' term is from Doug Hoyte's Let Over Lambda book. CLOS actually is doing what you say under the hood. When a defclass form has a slot with an :accessor argument, it automatically writes the code that adds the getter to the generic function. That's one of my favorite CLOS features. No getter/setter boilerplate code that the programmer has to repeatedly write.Kingmaker
L
1

You are working with several concepts, and mixing them, like : namespaces, global generic functions, local generic functions (methods), method invocation, message passing, etc.

In some circumstances, those concepts may overlap sintactically, been difficult to implement. It seems to me that you are also mixing a lot of concepts in your mind.

Functional languages, are not my strength, I have done some work with LISP.

But, some of this concepts, are used in other paradigms, such as Procedural, & Object (Class) Orientation. You may want to check how this concepts are implemented, and, later, return to your own programming language.

For example, something that I consider very important its the use of namespace ( "modules" ), as a separate concept from Procedural Programming, and avoid identifier clashes, as the ones, you mention. A programming language with namespace like yours, would be like this:

=== in file animal.code ===

define module animals

define class animal
  // methods doesn't use "bark(animal AANIMAL)"
  define method bark()
  ...
  end define method
end define class

define class dog
  // methods doesn't use "bark(dog ADOG)"
  define method bark()
  ...
  end define method
end define class

end define module

=== in file myprogram.code ===

define module myprogram

import animals.code
import trees.code

define function main
  a = new-dog()
  a.bark() //Make the dog bark

  …

  t = new-tree()
  b = t.bark() //Retrieve the bark from the tree
end define function main

end define module

Cheers.

Limit answered 30/4, 2012 at 23:16 Comment(0)
S
1

This is the general question of where to put the dispatch table many programming languages are trying to address in a convenient way.

In case of OOP we put it into the class definition (we have type+function concretion this way, spiced with inheritance it gives all the pleasures of architecture issues).

In case of FP we put it inside the dispatching function (we have a shared centralized table, this is usually not that bad, but not perfect as well).

I like interface-based approach, when I can create the virtual table separately of any data type AND of any shared function definition (protocol in Clojure).

In Java (sorry) it will look like this:

Let's assume ResponseBody is an interface.

public static ResponseBody create(MediaType contentType,
     long contentLength, InputStream content) {

    return new ResponseBody() {
      public MediaType contentType() {
        return contentType;
      }

      public long contentLength() {
        return contentLength;
      }

      public BufferedSource source() {
        return streamBuffered(content);
      }
    };
}

The virtual table gets created for this specific create function. This completely solves namespace problem, you can also have a non-centralized type-based dispatch (OOP) if you want to.

It is also becomes trivial to have a separate implementation without declaring new data types for testing purposes.

Sherwood answered 13/1, 2018 at 23:37 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.