Package-private scope in Scala visible from Java

I just found out about a pretty weird behaviour of Scala scoping when bytecode generated from Scala code is used from Java code. Consider the following snippet using Spark (Spark 1.4, Hadoop 2.6):

import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.broadcast.Broadcast;

public class Test {
    public static void main(String[] args) {
        JavaSparkContext sc = 
            new JavaSparkContext(new SparkConf()
                                .setMaster("local[*]")
                                .setAppName("test"));

        Broadcast<List<Integer>> broadcast = sc.broadcast(Arrays.asList(1, 2, 3));

        broadcast.destroy(true);

        // fails with java.io.IOException: org.apache.spark.SparkException: 
        // Attempted to use Broadcast(0) after it was destroyed
        sc.parallelize(Arrays.asList("task1", "task2"), 2)
          .foreach(x -> System.out.println(broadcast.getValue()));
    }
}

This code fails, which is expected since I deliberately destroy the Broadcast before using it. But in my mental model it should not even compile, let alone run.

Indeed, Broadcast.destroy(Boolean) is declared as private[spark], so it should not be visible from my code. I'll try looking at the bytecode of Broadcast, but that's not my specialty, which is why I'm posting this question. Also, sorry, I was too lazy to create an example that does not depend on Spark, but at least you get the idea. Note that I can call various package-private methods of Spark; it's not just about Broadcast.

Any idea what's going on?

Basin answered 11/6, 2016 at 8:37 Comment(0)

If we reconstruct this issue with a simpler example:

package yuvie

class X {
  private[yuvie] def destory(d: Boolean) = true
}

And inspect the compiled class with javap:

[yuvali@localhost yuvie]$ javap -p X.class 
Compiled from "X.scala"
public class yuvie.X {
  public boolean destory(boolean);
  public yuvie.X();
}

We see that a private[package] member in Scala is emitted as public in the bytecode, so Java sees it as public. Why? Because Java's package-private is not equivalent to Scala's package-qualified private. There is a nice explanation in this post:

The important distinction is that 'private [mypackage]' in Scala is not Java package-private, however much it looks like it. Scala packages are truly hierarchical, and 'private [mypackage]' grants access to classes and objects up to "mypackage" (including all the hierarchical packages that may be between). (I don't have the Scala spec reference for this and my understanding here may be hazy, I'm using [4] as a reference.) Java's packages are not hierarchical, and package-private grants access only to classes in that package, as well as subclasses of the original class, something that Scala's 'private [mypackage]' does not allow.

So, 'private [mypackage]' is both more and less restrictive than Java package-private. For both reasons, JVM package-private can't be used to implement it, and the only option that allows the uses that Scala exposes in the compiler is 'public.'
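
The access flag that javap prints can also be read from Java through reflection. The following is only a minimal sketch: the class `AccessFlags`, its nested `Sample` class, and the `visibility` helper are hypothetical names made up for illustration, and `Sample` is a plain Java stand-in rather than a class compiled by scalac.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class AccessFlags {

    // Hypothetical stand-in with one method per access level of interest.
    static class Sample {
        public boolean exposed(boolean d) { return d; }
        boolean packagePrivate(boolean d) { return d; }
        private boolean hidden(boolean d) { return d; }
    }

    // Returns the JVM-level visibility of a declared method as a string.
    static String visibility(Class<?> cls, String name, Class<?>... params) {
        Method m;
        try {
            // getDeclaredMethod also finds non-public methods declared on cls.
            m = cls.getDeclaredMethod(name, params);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + name, e);
        }
        int mods = m.getModifiers();
        if (Modifier.isPublic(mods)) return "public";
        if (Modifier.isProtected(mods)) return "protected";
        if (Modifier.isPrivate(mods)) return "private";
        // No access flag set means default (package-private) access.
        return "package-private";
    }

    public static void main(String[] args) {
        for (String name : new String[] {"exposed", "packagePrivate", "hidden"}) {
            System.out.println(name + " -> " + visibility(Sample.class, name, boolean.class));
        }
    }
}
```

Pointed at the compiled yuvie.X class and the method name destory, the same helper would be expected to report public, in line with the javap output above.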

Broderickbrodeur answered 11/6, 2016 at 9:27 Comment(6)
Thanks for the answer. Don't you think this is a bit dangerous for API writers? Functionality they never wanted to expose ends up plainly visible from Java. I wonder if they could use some annotation trick to generate warnings for users who try to call a member that was intended to be private. - Basin
@Basin If you plan to interop with Java then yes, I definitely think it is something you have to take into consideration, especially if this exposes internals you don't want clients to invoke. Although in this particular case, you could also call the public Broadcast.destroy method, shooting yourself in the foot equivalently. - Broderickbrodeur
Yup, what I meant is that now that I know all Spark internals declared as package-private are exposed through the Java API, I think there should probably be more Java wrappers to hide functionality that wasn't intended to be public. My example was just to show the method was actually called. - Basin
@Basin This would perhaps be a good question for the Spark mailing list. - Broderickbrodeur
Anyone who wants to access stuff they're not supposed to could just use reflection anyway. - Haemo
@Haemo You're right, but the effort of using reflection usually makes people think it's not worth it, unlike simply invoking a method which is right there. I agree that this can be problematic. - Broderickbrodeur
