How can you extend Java to introduce passing by reference?
D

11

28

Java is pass-by-value. How could you modify the language to introduce passing by reference (or some equivalent behavior)?

Take for example something like

public static void main(String[] args) {
    String variable = "'previous String reference'";
    passByReference(ref variable);
    System.out.println(variable); // I want this to print 'new String reference'
}

public static void passByReference(ref String someString) {
    someString = "'new String reference'";
}

which (without the ref) compiles to the following bytecode

  public static void main(java.lang.String[]);
    Code:
       0: ldc           #2                  // String 'previous String reference'
       2: astore_1
       3: aload_1
       4: invokestatic  #3                  // Method passByReference:(Ljava/lang/String;)V
       7: return

  public static void passByReference(java.lang.String);
    Code:
       0: ldc           #4                  // String 'new String reference'
       2: astore_0
       3: return

The code at 3: loads the reference onto the stack from the variable variable.

One possibility I'm considering is to have the compiler determine that a method is pass-by-reference, possibly via a ref keyword, and change the method to accept a Holder object that stores the same reference as our variable. When the method completes, having possibly changed the reference in the holder, the caller's variable is reassigned to the holder's current reference.

It should compile to an equivalent of this

public static void main(String[] args) {
    String variable = "'previous String reference'";
    Holder holder = Holder.referenceOf(variable);
    passByReference(holder);
    variable = (String) holder.getReference(); // I don't think this cast is necessary in bytecode
    System.out.println(variable);
}

public static void passByReference(Holder someString) {
    someString.setReference("'new String reference'");
}

where Holder might be something like

public class Holder {
    Object reference;
    private Holder (Object reference) {
        this.reference = reference;
    }
    public Object getReference() {
        return this.reference;
    }
    public void setReference(Object reference) {
        this.reference = reference;
    }
    public static Holder referenceOf(Object reference) {
        return new Holder(reference);
    }
}

Where can this fail or how could you improve it?

Divinadivination answered 22/1, 2014 at 4:42 Comment(13)
Are you familiar with Jasmin? I've always liked the answer there - under "Implementing Call-by-reference for your language using the JVM instruction set." Spoiler: They called "reference" - "value".Unheard
@ElliottFrisch Thanks for the link, I wasn't familiar with Jasmin. It seems I'm suggesting something similar to their solution with wrapper classes.Divinadivination
Unfortunately, they haven't updated the homepage for almost ten years. I own the book.Unheard
Why not just returning the new value? Or why not using a different programming language that matches your requirements instead of trying to change this one?Sonnie
@holger You might want to return another value from the method. It's a theoretical language design question.Divinadivination
It doesn’t sound like a theoretical language design question; your entire question is about how to implement it, though you already answer it yourself giving an entire solution right inside the question. So what is your question actually? That you are opening a can of worms introducing a language feature that makes local variables non-local needs no discussion. That was understood over a decade ago when Java was created and this language design decision, not to support such thing, was made. As said, you can use a different language if you don’t like it.Sonnie
I recommend an annotation to indicate that a parameter should be pass-by-ref. Since it would be hard to modify the compiler, you could write a bytecode manipulator.Unalienable
@Unalienable That would look nice as well.Divinadivination
I may write such a bytecode manipulator.Unalienable
It would use one-element arrays instead of custom classes.Unalienable
A similar concept is used for SOAP-based web services to provide WSDL-conformant OUT and INOUT operation parameters. As parameters might get updated with new values, these changes need to be returned appropriately to the caller. Java therefore provides the Holder class, which is similar to your suggestion. So, unless you propose a language addition (maybe in a C++-like way by adding & at the end of a variable, or by including an explicit ref keyword), this (and the array way) will probably be the only working solutions, IMO.Clements
@RomanVottner Interesting SOAP workaround. And yes, I'm not trying to achieve this in base Java. I'm trying to see how I can modify byte code (compilation) to achieve it.Divinadivination
@RomanVottner, I prefer one-element arrays over generics here for two reasons: primitives would in practice be "double-boxed" since they'd have to be boxed before being treated generically, and arrays throw immediately on ill-typed assignment to their members.Gardant
O
14

To answer your question:

Where can this fail?

  1. Final variables and enum constants
  2. 'Special' references such as this
  3. References that are returned from method calls, or constructed inline using new
  4. Literals (Strings, integers, etc.)

...and possibly others. Basically, your ref keyword must only be usable if the parameter source is a non-final field or local variable. Any other source should generate a compilation error when used with ref.

An example of (1):

final String s = "final";
passByReference(ref s);  // Should not be possible

An example of (2):

passByReference(ref this);  // Definitely impossible

An example of (3):

passByReference(ref toString());  // Definitely impossible
passByReference(ref new String("foo"));  // Definitely impossible

An example of (4):

passByReference(ref "literal");  // Definitely impossible

And then there are assignment expressions, which seem to me like something of a judgement call:

String s;
passByReference(ref (s="initial"));  // Possible, but does it make sense?

It's also a little strange that your syntax requires the ref keyword for both the method definition and the method invocation. I think the method definition would be sufficient.

Obscurant answered 29/1, 2014 at 19:47 Comment(3)
Thanks for the answer. That's the kind of situations I was looking for. There are solutions at compile time for all of these (like C# does). Basically, you only allow references coming from non-final variables.Divinadivination
FYI - I'm not going to claim the list I provided is exhaustive, there may be others. I did add one additional example that's more of a grey area (assignment expressions)Obscurant
In Java, assignment expressions used as values, like (s = "initial"), actually push the value of the right-hand side on the stack twice, pop once to assign to the variable, and pop again to use as the value of the expression. There's probably a way to detect that too.Divinadivination
G
21

The usual idiom I've seen for pass-by-reference in Java is to pass a single-element array, which will both preserve run-time type-safety (unlike generics which undergo erasure) and avoid the need to introduce a new class.

public static void main(String[] args) {
    String[] holder = new String[1];

    // variable optimized away as holder[0]
    holder[0] = "'previous String reference'";

    passByReference(holder);
    System.out.println(holder[0]);
}

public static void passByReference(String[] someString) {
    someString[0] = "'new String reference'";
}
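
The run-time type-safety point can be illustrated with a small sketch. The class and method names here (ArrayHolderDemo, Box) are made up for illustration; only the array and generics behavior comes from the language itself. Storing a value of the wrong type fails immediately with an array, while a generic holder accepts it because of erasure:

```java
// Hypothetical demo class; names are illustrative only.
class ArrayHolderDemo {
    // A simple generic holder, for comparison with the one-element array.
    static class Box<T> {
        T value;
    }

    // Stores an Integer through an Object[] view of a String[].
    // The array checks its element type at run time and rejects the store.
    static boolean arrayRejectsWrongType() {
        Object[] holder = new String[1];
        try {
            holder[0] = Integer.valueOf(17);
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    // Stores an Integer through a raw view of a Box<String>.
    // Because of erasure, the store itself succeeds without any exception.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean genericAcceptsWrongType() {
        Box<String> box = new Box<>();
        Box raw = box;
        raw.value = Integer.valueOf(17);
        return raw.value instanceof Integer;
    }

    public static void main(String[] args) {
        System.out.println(arrayRejectsWrongType());
        System.out.println(genericAcceptsWrongType());
    }
}
```

With the generic holder, the mistake would only surface later, when some code reads the value back as a String.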
Gardant answered 22/1, 2014 at 4:56 Comment(3)
This is good for base java. I'll be honest, though, I'd much rather keep my parameters the type they actually are and not have to retrieve an array element. However, if we could change the bytecode to make it equivalent to the code you're showing, then that would be fine.Divinadivination
@Sotirios, that's really what I meant: equivalent base Java code post-transformation. You could use a ref keyword on parameters at both declaration and invocation as a trigger for this transformation, and doing so would allow interoperability with existing base Java code using this idiom.Gardant
Ah ok, I'll check to see how hard/possible it is to do that at the bytecode level. Thanks.Divinadivination
A
9

Your attempt to modify the language ignores the fact that this "feature" was explicitly left out to prevent well-known side-effect bugs from happening in the first place. The recommended way to do what you are trying to achieve in Java is to use data-holder classes:

public class Holder<T> {
  protected T value;

  public T getValue() {
    return value;
  }

  public void setValue(T value) {
    this.value = value;
  }
}

A thread-safe version would be the AtomicReference.

Now, storing a single String in a class seems like overkill, and most likely it is; usually, however, you have a data-holder class for several related values rather than for a single String.

The big benefit of this approach is that what happens inside the method is very explicit. So even if you are programming on a Monday morning after an eventful weekend and the coffee machine just broke down, you still can tell easily what the code is doing (KISS), preventing several bugs from even happening in the first place, just because you forgot about that one feature of method foo.

If you think about what your approach can do that the data-holder version cannot, you'll soon realize that you are implementing something just because it is different, but effectively it has no real value.

Amide answered 25/1, 2014 at 17:4 Comment(4)
What problems would exist with pass-by-reference which are not even worse when using holder classes? In .NET, if an object passes one of its fields to an outside method as a byref, it can be certain that anything which is going to happen to that field as a consequence of that method call will happen before it returns (unless the outside code is granted "unsafe" permissions). By contrast, once a Java reference to a mutable object has been exposed to outside code, there's no way of telling how or when or by whom the object might arbitrarily be modified any time in the future.Gabon
Assume you have the function add(myInt), but unknown to you, what myInt points to after calling that function is not what it was when you handed it over. There might be reasons for that, but effectively it makes the code hard to understand and bugs difficult to find. That is why one of the core principles of Java is to be very explicit about what the code is doing and a pass-by-reference violates that. For the same reason every book on Java tells you to not make mutable class-variables public.Amide
I agree with your argument about why it was left out of the language, but I don't agree with the argument about the alternative being bug-prone. The solution I'm looking for would explicitly have markers in the source code that a method is pass by reference, something like a new keyword ref or an annotation like @Ref.Divinadivination
Even if you have annotations, you still don't know what happens inside. This is bug-prone not because it causes bugs in general, but because there will be that one day, where you just forget about that one specialty of this function. If you have something like myInt = add(myInt, 1) it is very obvious what this does and you don't need to actually remember what add might do, it is explicit enough to get it just from reading the code. Even on Mondays before your first coffee.Amide
K
7

You can use the AtomicReference class (from java.util.concurrent.atomic) as the holder object.

public static void main(String[] args) {
    String variable="old";
    AtomicReference<String> at=new AtomicReference<String>(variable);
    passByReference(at);
    variable=at.get();
    System.out.println(variable);
}

public static void passByReference(AtomicReference<String> at) {
  at.set("new");
}
Kinslow answered 22/1, 2014 at 5:52 Comment(4)
Is there a specific reason you would need the reference to be atomic?Divinadivination
Well assignment is atomic: you would probably want to preserve that property. In any case the AtomicReference class is already there. Saves you having to add a holder class.Gabi
@EJP Sure, but what Thread could intercept the reference changing?Divinadivination
AtomicReference doesn't add value here unless you need getAndSet or compareAndSet. Also, if the variable is of primitive type, it has to be boxed to be treated generically; a one-element array already acts as a mutable box, so you don't end up with two references where one will suffice.Gardant
M
4

Oddly enough, I've been thinking about this problem myself recently. I was considering whether it might be fun to create a dialect of VB that ran on the JVM - I decided it wouldn't be.

Anyway, there are two main cases where this is likely to be useful and well defined:

  • local variables
  • object attributes

I'm assuming that you're writing a new compiler (or adapting an existing one) for your new dialect of Java.

Local variables are typically handled by code similar to what you're proposing. I'm most familiar with Scala, which doesn't support pass-by-reference, but does support closures, which have the same issues. In Scala, there's a class scala.runtime.ObjectRef, which resembles your Holder class. There are also similar {...}Ref classes for primitives, volatile variables, and similar.

If the compiler needs to create a closure that updates a local variable, it "upgrades" the variable to a final ObjectRef (which can be passed to the closure in its constructor), and replaces uses of that variable by gets and updates by sets, on the ObjectRef. In your compiler, you could upgrade local variables whenever they're passed by reference.
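
As a rough Java sketch of that "upgrade" (the ObjectRef class below is a hypothetical stand-in, not the real scala.runtime.ObjectRef, and UpgradeDemo is just a name for the example), the local variable becomes a final holder, and every read and write goes through it:

```java
// Hypothetical demo of "upgrading" a local variable to a holder.
class UpgradeDemo {
    // Stand-in for a compiler-provided holder class (assumption, not Scala's actual class).
    static final class ObjectRef<T> {
        T elem;
        ObjectRef(T elem) { this.elem = elem; }
    }

    static String run() {
        // Imagined source form (not legal Java):
        //   String s = "before"; Runnable r = () -> s = "after";
        final ObjectRef<String> s = new ObjectRef<>("before"); // the upgraded variable
        Runnable r = () -> s.elem = "after"; // the update becomes a write through the holder
        r.run();
        return s.elem; // reads go through the holder as well
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "after"
    }
}
```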

You could use a similar trick with object attributes. Suppose that Holder implements an interface ByRef. When your compiler sees an object attribute being passed by reference, it could create an anonymous subclass of ByRef that reads and updates the object attribute in its get and set methods. Again, Scala does something similar to this for lazily evaluated parameters (like references, but read-only).
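
A minimal sketch of what that might look like, assuming a ByRef interface with get and set methods as described above. The Person class, the rename method, and the imagined source syntax rename(ref p.name) are all hypothetical:

```java
// Assumed shape of the ByRef interface mentioned above.
interface ByRef<T> {
    T get();
    void set(T value);
}

// Hypothetical class with an attribute we want to pass by reference.
class Person {
    String name = "old name";
}

class FieldRefDemo {
    // The callee as a compiler might emit it for `void rename(ref String name)`.
    static void rename(ByRef<String> name) {
        name.set("new name");
    }

    static String run() {
        final Person p = new Person();
        // What a compiler might generate for a call like rename(ref p.name):
        // an anonymous ByRef that reads and writes the attribute directly.
        rename(new ByRef<String>() {
            @Override public String get() { return p.name; }
            @Override public void set(String value) { p.name = value; }
        });
        return p.name;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "new name"
    }
}
```

Unlike the copy-in/copy-out approach for local variables, writes here reach the attribute immediately, with no copy-back step after the call.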

For extra brownie points, you could extend the technique to JavaBean properties and even Map, List and Array elements.

One side effect of this is that at the JVM level, your methods have unexpected signatures. If you compile a method with signature void doIt(ref String), at the bytecode level, you'll end up with the signature void doIt(ByRef) (you might expect this to be something like void doIt(ByRef<String>), but of course generics use type erasure). This can cause problems with method overloading, as all by-ref parameters compile to the same signature.
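
The erasure issue can be observed with reflection. This is only a sketch: ByRef is the assumed interface from above, and doIt stands in for the compiled form of a by-ref method.

```java
import java.lang.reflect.Method;

// Assumed shape of the ByRef interface discussed above.
interface ByRef<T> {
    T get();
    void set(T value);
}

class ErasureDemo {
    // Stand-in for the compiled form of `void doIt(ref String s)`.
    static void doIt(ByRef<String> s) { }

    // Adding this overload would not compile, because both methods
    // have the same erasure, doIt(ByRef):
    // static void doIt(ByRef<Integer> i) { }

    // Looks the method up by its erased parameter type and returns that type's simple name.
    static String erasedParameterType() {
        try {
            Method m = ErasureDemo.class.getDeclaredMethod("doIt", ByRef.class);
            return m.getParameterTypes()[0].getSimpleName();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(erasedParameterType()); // prints "ByRef"
    }
}
```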

It may be possible to do this with bytecode manipulation, but there are pitfalls, like the fact that the JVM permits applications to re-use local variables - so at the bytecode level, it may not be clear whether a parameter is being re-assigned, or its slot re-used, if the application was compiled without debugging symbols. Also, the compiler may elide aload instructions if there's no possibility of a value having changed within the outer method - if you don't take steps to avoid this, changes to your reference variable may not be reflected in the outer method.

Mandorla answered 31/1, 2014 at 12:17 Comment(0)
O
1

Think about how it might be implemented with a primitive type, say int. Java - the JVM, not just the language - does not have any "pointer" type to a local variable, on the frame (method stack) or the operand stack. Without that, it is not possible to truly pass by reference.

Other languages that support pass-by-reference use pointers (I believe, though I don't see any other possibility). C++ references (like int&) are pointers in disguise.

I've thought of creating a new set of classes that extend Number, containing int, long, etc. but not immutable. This could give some of the effect of passing primitives by reference - but they won't be auto-boxed, and some other features might not work.
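
A minimal sketch of such a class, for int only. MutableInt and MutableIntDemo are hypothetical names, and this is just one way it could look:

```java
// Hypothetical mutable counterpart to Integer.
class MutableInt extends Number {
    private static final long serialVersionUID = 1L;
    private int value;

    MutableInt(int value) { this.value = value; }

    void set(int value) { this.value = value; }

    // Number requires these four conversions.
    @Override public int intValue() { return value; }
    @Override public long longValue() { return value; }
    @Override public float floatValue() { return value; }
    @Override public double doubleValue() { return value; }
}

class MutableIntDemo {
    // The callee changes the wrapped value; the caller sees the change
    // because both hold a reference to the same MutableInt object.
    static void increment(MutableInt n) {
        n.set(n.intValue() + 1);
    }

    static int run() {
        MutableInt counter = new MutableInt(41);
        increment(counter);
        return counter.intValue();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints 42
    }
}
```

As noted, there is no auto-boxing here: the caller has to wrap and unwrap the int explicitly.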

Without support in the JVM, you can't have real pass-by-reference. Sorry, but that's my understanding.

BTW, there are already several reference-type classes (like the Holder you'd like), such as ThreadLocal<> (which has get() and set()) or the Reference subclasses, like WeakReference (which I think only have get()).

Edit: After reading some other answers, I'd suggest that ref be a form of auto-boxing. Thus:

class ReferenceHolder<T> {
    T referrent;
    static <T> ReferenceHolder<T> valueOf(T object) {
        return new ReferenceHolder<T>(object);
    }
    ReferenceHolder(T object) { referrent = object; }
    T get()            { return referrent; }
    void set(T value)  { referrent = value; }
}

class RefTest {
    static void main() {
        String s = "Hello";
        // This is how it is written...
        change(s);
        // but the compiler converts it to...
        ReferenceHolder<String> $tmp = ReferenceHolder.valueOf(s);
        change($tmp);
        s = $tmp.get();
    }
    // This is how it is written...
    static void change(ref Object s) {
        s = "Goodbye";              // won't work
        s = 17;             // *Potential ClassCastException, but not here*
    }
    // but the compiler treats it as:
    static <T> void change(ReferenceHolder<T> obj) {
        obj.set((T) "Goodbye");     // this works
        obj.set((T) 17);    // *Compiler can't really catch this*
    }
}

But see where there is potential for putting the wrong kind of type in the ReferenceHolder? If genericized properly, the compiler may be able to warn sometimes, but as you likely want the new code to resemble normal code as much as possible, there is the possibility of a ClassCastException with each auto-ref call.

Occur answered 24/1, 2014 at 15:54 Comment(3)
The JVM doesn't matter if the byte code is effectively doing the job. If at compilation time I can process annotations (or new keywords like ref), then I could create (or modify) the byte code so that it is equivalent to passing by reference.Divinadivination
So, you're considering wrapping a ref int in a SignedInt (for example), passing a reference (by value) to that object and modifying the value stored in the object. Then, when the method returns, you copy the wrapped (and modified) value back into the local variable. I guess the crux of the issue is this last assignment. A sort of auto-boxing and auto-unboxing, which is where you can look for ideas (or, you probably already have).Occur
Yes, that's what I want to eventually transform the byte code to.Divinadivination
G
1

I think you can accomplish most of what you want by building an agent and using cglib.

Many of the examples given here can work. I'd recommend using the template you proposed because it will compile with the normal compiler.

public void doSomething(@Ref String var)

Then, behind the scenes, you use cglib to rewrite the annotated methods, which is easy. You'll also have to rewrite the caller, which I think will be much more complicated in cglib. javassist uses more of a "source code" oriented approach, and might be better suited for rewriting the callers.

Gastrula answered 30/1, 2014 at 7:41 Comment(0)
M
0

To answer your question about how to extend the language, my picks would be:

  • Use one of the holder techniques that several other answers describe.
  • Use annotations to attach metadata about which arguments should be passed by reference, and then use a bytecode manipulation library such as cglib to implement your idea in the bytecode itself.

Though this whole idea seems strange.

Mitchiner answered 26/1, 2014 at 16:22 Comment(0)
G
0

There are several ways to write Java code as effectively pass-by-reference, even within the standard pass-by-value conventions.

One approach is to use instance or static variables whose scope includes a particular method, in lieu of explicit parameters. The variables which are being modified could be included in the comments, if you really want to mention their names at the beginning of a method.

The disadvantage with this approach is that the scope of these variables would need to encompass the entire class in question, rather than only the method. If you would like to restrict the variables' scopes more precisely, you could always modify them using getter and setter methods rather than as parameters.

Having worked with both Java and C/C++, I don't think Java's supposed inflexibility in being pass-by-value only is a big deal--for any programmers who know what happens to the variables, there are reasonable workarounds that can accomplish the same things functionally.

Grane answered 27/1, 2014 at 16:47 Comment(1)
I'm not trying to do this in plain Java. I'm trying to figure some ways to manipulate byte code to achieve it, possibly without affecting Java's syntax and/or the types themselves. For example, if the method is declared as public void doSomething(@Ref String var), then the argument should be passed by reference. I don't want to change the source code to look like public void doSomething(Holder<String> var).Divinadivination
R
0

The question is about the language itself, but the answers seem to mention practical tricks, so I'm adding to the list.

There is an option to hold data in atomics, such as AtomicReference<T>

  • new AtomicReference<>("my data")

It also seems that Pair and other sorts of tuples make good holders, so it often happens that you already have a holder in your project.

  • Tuples.of(....) (reactor.util.function.Tuples)
  • Pair.of(a, b) (spring, apache commons etc)
  • AbstractMap.SimpleEntry (jdk)
  • new Object[]{"my data"} can be a holder too
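
For instance, a sketch using only the JDK option from the list. The class and method names are made up, and the entry's key is ignored:

```java
import java.util.AbstractMap;
import java.util.Map;

// Hypothetical demo: using a JDK map entry as a mutable holder.
class EntryHolderDemo {
    // The callee replaces the held value through the entry.
    static void passByReference(Map.Entry<String, String> holder) {
        holder.setValue("new");
    }

    static String run() {
        // The key is unused here; only the value slot acts as the holder.
        Map.Entry<String, String> holder = new AbstractMap.SimpleEntry<>("variable", "old");
        passByReference(holder);
        return holder.getValue();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "new"
    }
}
```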
Roee answered 17/4, 2022 at 7:26 Comment(0)
M
-1

Java is (in fact) pass by reference. When the method is called, the reference(pointer) to the object is passed and when you modify the object you can see the modification when you return from the method. The problem with your example is that java.lang.String is immutable.

And what you are actually achieving with your example is output parameters.

Here is a slightly different version of Jeffrey Hantin's answer:

public static void main(String[] args) {
  StringBuilder variable = new StringBuilder("'previous String reference'");
  passByReference(variable);
  System.out.println(variable); // I want this to print 'new String reference'
}

public static void passByReference(StringBuilder someString) {
  String nr = "'new String reference'";
  // Replace the whole current content; the end index is exclusive.
  someString.replace(0, someString.length(), nr);
}
Malchus answered 30/1, 2014 at 5:59 Comment(3)
No, Java is very much pass by value. The reference of the object isn't passed. A copy of the value of the reference to the object is passed. In your example, in passByReference, if you changed the reference of someString to a new object, ie. someString = new StringBuilder(), that would not be visible from the calling code. I want to extend the language so that that change is visible.Divinadivination
Also, I want the types of the variables in the source code to remain the same. They can be changed in the byte code if necessary.Divinadivination
Sorry, -1 because of "Java is (in fact) pass by reference."Cooper
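
The distinction made in these comments can be checked with a short sketch (class and method names are illustrative only). Mutating the object a parameter refers to is visible to the caller, while reassigning the parameter itself is not:

```java
// Hypothetical demo of mutation versus reassignment of a parameter.
class PassByValueDemo {
    // Mutates the object that both caller and callee refer to.
    static void mutate(StringBuilder sb) {
        sb.append(" mutated");
    }

    // Reassigns only the callee's local copy of the reference.
    static void reassign(StringBuilder sb) {
        sb = new StringBuilder("replaced");
    }

    static String run() {
        StringBuilder variable = new StringBuilder("original");
        mutate(variable);   // visible to the caller: same object
        reassign(variable); // not visible: the caller's variable is unchanged
        return variable.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "original mutated"
    }
}
```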
