[L array notation - where does it come from?

I've often seen messages that use [L then a type to denote an array, for instance:

[Ljava.lang.Object; cannot be cast to [Ljava.lang.String;

(The above being an arbitrary example I just pulled out.) I know this signifies an array, but where does the syntax come from? Why the beginning [ but no closing square bracket? And why the L? Is it purely arbitrary or is there some other historical/technical reason behind it?
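
For reference, here is a minimal sketch that provokes a message of this kind. The class and method names are made up for illustration, and the exact wording of the message differs between JVM versions (newer ones add the word "class" and module details), so only the two array names should be relied on.

```java
final class CastMessageDemo {
    // Attempts an invalid array downcast and returns the exception message.
    static String failedCastMessage() {
        Object[] objects = new Object[] { "a", "b" };
        try {
            // Fails at run time: the array's actual type is Object[], not String[].
            String[] strings = (String[]) objects;
            return "no exception, length " + strings.length;
        } catch (ClassCastException e) {
            return String.valueOf(e.getMessage());
        }
    }

    public static void main(String[] args) {
        System.out.println(failedCastMessage());
    }
}
```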

Valuate answered 23/2, 2011 at 0:51 Comment(2)
check out this postCarinacarinate
There is really no reason to use this format in messages for human readers.Emmi
H
92

[ stands for an array, and Lsome.type.Here; represents the element type of the array. This is similar to the type descriptors used internally in bytecode, described in §4.3 of the Java Virtual Machine Specification. The only difference is that the real descriptors use / rather than . to separate package names.

For primitives, for instance, an array of ints is [I and a two-dimensional array of ints is [[I (strictly speaking, Java doesn't have real two-dimensional arrays, but you can make arrays that consist of arrays).
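
A small sketch to check these names yourself. The class and helper names are only illustrative; Class.getName() is the standard call that reports the dotted form of the descriptor for array classes.

```java
final class ArrayNameDemo {
    // Returns the name that Class.getName() reports for the object's runtime class.
    static String runtimeName(Object o) {
        return o.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println(runtimeName(new int[0]));     // [I
        System.out.println(runtimeName(new int[0][0]));  // [[I
        System.out.println(runtimeName(new double[0]));  // [D
        System.out.println(runtimeName(new String[0]));  // [Ljava.lang.String;
    }
}
```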

Since a class may have any name, a class name inside a descriptor would be hard to tell apart from whatever follows it, so it is delimited: it starts with L, followed by the class name, and ends with a ;

Descriptors are also used to represent the types of fields and methods.

For instance:

(IDLjava/lang/Thread;)Ljava/lang/Object;

... corresponds to a method whose parameters are int, double, and Thread, and whose return type is Object.
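
If you want to generate such a descriptor rather than read it off javap output, one option is java.lang.invoke.MethodType (Java 7+). This is only a sketch; the class and helper names are invented for the example.

```java
import java.lang.invoke.MethodType;

final class MethodDescriptorDemo {
    // Builds the JVM method descriptor for the given return and parameter types.
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // Prints (IDLjava/lang/Thread;)Ljava/lang/Object;
        System.out.println(descriptor(Object.class, int.class, double.class, Thread.class));
    }
}
```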

edit

You can also see this in .class files using the Java disassembler, javap:

C:>more > S.java
class S {
  Object  hello(int i, double d, long j, Thread t ) {
   return new Object();
  }
}
^C
C:>javac S.java

C:>javap -verbose S
class S extends java.lang.Object
  SourceFile: "S.java"
  minor version: 0
  major version: 50
  Constant pool:
const #1 = Method       #2.#12; //  java/lang/Object."<init>":()V
const #2 = class        #13;    //  java/lang/Object
const #3 = class        #14;    //  S
const #4 = Asciz        <init>;
const #5 = Asciz        ()V;
const #6 = Asciz        Code;
const #7 = Asciz        LineNumberTable;
const #8 = Asciz        hello;
const #9 = Asciz        (IDJLjava/lang/Thread;)Ljava/lang/Object;;
const #10 = Asciz       SourceFile;
const #11 = Asciz       S.java;
const #12 = NameAndType #4:#5;//  "<init>":()V
const #13 = Asciz       java/lang/Object;
const #14 = Asciz       S;

{
S();
  Code:
   Stack=1, Locals=1, Args_size=1
   0:   aload_0
   1:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   4:   return
  LineNumberTable:
   line 1: 0


java.lang.Object hello(int, double, long, java.lang.Thread);
  Code:
   Stack=2, Locals=7, Args_size=5
   0:   new     #2; //class java/lang/Object
   3:   dup
   4:   invokespecial   #1; //Method java/lang/Object."<init>":()V
   7:   areturn
  LineNumberTable:
   line 3: 0


}

And in the raw class file (look at line 5):

[Image of the raw class file; not reproduced here]

Reference: Field Descriptors in the JVM specification

Hemostat answered 23/2, 2011 at 0:55 Comment(1)
Java doesn't have real two-dimensional arrays, but you can make arrays that consist of arrays; [[I just means array-of-array-of-int.Ember
I
62

JVM array descriptors.

[Z = boolean
[B = byte
[S = short
[I = int
[J = long
[F = float
[D = double
[C = char
[L = any non-primitive type (Object); followed by the class name and a ;

To get the element (component) type of an array, call:

[Object].getClass().getComponentType();

It will return null if the object is not an array. To determine whether it is an array, just call:

[Any Object].getClass().isArray()

or

[Any Class].class.isArray();
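
A runnable sketch of those calls, assuming nothing beyond the standard java.lang.Class API (the wrapper class and method names are made up for the example):

```java
final class ComponentTypeDemo {
    // Returns the element type of an array object, or null if o is not an array.
    static Class<?> componentTypeOf(Object o) {
        return o.getClass().getComponentType();
    }

    public static void main(String[] args) {
        System.out.println(componentTypeOf(new String[0]));  // class java.lang.String
        System.out.println(componentTypeOf(new int[0][0]));  // class [I
        System.out.println(componentTypeOf("not an array")); // null
        System.out.println(new int[0].getClass().isArray()); // true
        System.out.println(String.class.isArray());          // false
    }
}
```

Note that for an array of arrays the component type is itself an array class, so int[][] reports int[] rather than int.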
Inferential answered 20/9, 2012 at 4:5 Comment(1)
Nice! This is what I was looking forCulicid
C
12

This is used in the JNI (and the JVM internally in general) to indicate a type. Primitives are denoted with a single letter (Z for boolean, I for int, etc), [ indicates an array, and L is used for a class (terminated by a ;).

See here: JNI Types

EDIT: To elaborate on why there is no terminating ] - this code is to allow the JNI/JVM to quickly identify a method and its signature. It's intended to be as compact as possible to make parsing fast (=as few characters as possible), so [ is used for an array which is pretty straightforward (what better symbol to use?). I for int is equally obvious.

Convexoconcave answered 23/2, 2011 at 0:55 Comment(8)
You're answering a different question. In fact, OP has explicitly stated he's not asking "what does it mean".Uncrowned
@Nikita If you read through that doc, you'll find that the "L" means "Fully qualified class", and "[L" indicates a very specific type of array (an array of FQCs), not just any array.Swordbill
@Nikita: The question is "where does it come from"? Well, it comes from the JNI.Convexoconcave
@Convexoconcave The question is 'why'. And that's a very interesting question, I'd like to know the answer too. While the question "in which chapter of JVM spec it's specified" is not.Uncrowned
I think the question is where "L" and "Z" and these other arbitrary-sounding abbreviations come from.Magruder
@Convexoconcave Yeah. That addresses the question (part of it), so I remove downvote.Uncrowned
These are not JNI specific but JVM internal representation.Hemostat
@Hemostat is right, this was part of the JVM specification before JNI even existed. JNI is reusing the representation in the JVM spec, not the other way around.Coarctate
C
9

[L array notation - where does it come from?

From the JVM spec. This is the representation of type names that is specified in the class file format and other places.

  • The '[' denotes an array. In fact, the array type name is [<typename> where <typename> is the name of the base type of the array.
  • 'L' is actually part of the base type name; e.g. String is "Ljava.lang.String;". Note the trailing ';'!!

And yes, the notation is documented in other places as well.

Why?

There is no doubt that the internal type name representation was chosen because it is:

  • compact,
  • self-delimiting (this is important for representations of method signatures, and it's why the 'L' and the trailing ';' are there), and
  • made of printable characters (for legibility ... if not readability).
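
To illustrate the self-delimiting point, here is a rough sketch of a splitter for the parameter part of a method descriptor. It is not the JVM's own parser, the names are invented, and it does no validation beyond what it needs. Because primitives are one letter and class names run from 'L' to ';', it never needs a separator between parameters.

```java
import java.util.ArrayList;
import java.util.List;

final class DescriptorSplitter {
    // Splits the parameter list of a method descriptor into one descriptor per parameter.
    static List<String> parameterDescriptors(String d) {
        if (!d.startsWith("(")) {
            throw new IllegalArgumentException("not a method descriptor: " + d);
        }
        List<String> result = new ArrayList<>();
        int i = 1;
        while (i < d.length() && d.charAt(i) != ')') {
            int start = i;
            // Each '[' adds one array dimension.
            while (i < d.length() && d.charAt(i) == '[') {
                i++;
            }
            if (i >= d.length()) {
                throw new IllegalArgumentException("truncated descriptor: " + d);
            }
            if (d.charAt(i) == 'L') {
                // A class name runs up to the terminating ';'.
                int semicolon = d.indexOf(';', i);
                if (semicolon < 0) {
                    throw new IllegalArgumentException("missing ';' in: " + d);
                }
                i = semicolon;
            }
            i++; // consume the primitive letter or the ';'
            result.add(d.substring(start, i));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [I, D, J, Ljava/lang/Thread;]
        System.out.println(parameterDescriptors("(IDJLjava/lang/Thread;)Ljava/lang/Object;"));
    }
}
```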

But it is unclear why they decided to expose the internal type names of array types via the Class.getName() method. I think they could have mapped the internal names to something more "human friendly". My best guess is that it was just one of those things that they didn't get around to fixing until it was too late. (Nobody is perfect ... not even the hypothetical "intelligent designer".)
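
For what it's worth, java.lang.Class does offer friendlier names through other methods, which you can compare side by side. A small sketch (the wrapper class is illustrative; getTypeName() needs Java 8 or later):

```java
final class ReadableNameDemo {
    // Returns the four standard name forms for a class, in a fixed order.
    static String[] names(Class<?> c) {
        return new String[] {
            c.getName(),          // internal-style name for arrays
            c.getTypeName(),      // source-like name
            c.getCanonicalName(), // canonical name as defined by the language spec
            c.getSimpleName()     // name without the package
        };
    }

    public static void main(String[] args) {
        for (String name : names(String[][].class)) {
            System.out.println(name);
        }
        // Prints:
        // [[Ljava.lang.String;
        // java.lang.String[][]
        // java.lang.String[][]
        // String[][]
    }
}
```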

Coarctate answered 23/2, 2011 at 2:21 Comment(0)
P
7

I think it's because C was taken by char, so the next letter in "class" is L.

Plumb answered 6/9, 2015 at 19:19 Comment(2)
Great idea. But do you have any actual references to show you are correct?Marcela
nice.. but L could have used for Long..why used J for LongBeautiful
E
3

Another source for this would be the documentation of Class.getName(). Of course, all these specifications are congruent, since they are made to fit each other.

Erigena answered 23/2, 2011 at 1:21 Comment(0)
T
2

L comes from "Lvalue" which is a term borrowed from C.

In von Neumann programming languages, like Java or C, Lvalues are things that can be assigned to; they denote a memory location. They are called Lvalues because in languages with C syntax they stand on the left side of the assignment operator, assuming that operator is =, as it is in Java or C.

Essentially, in von Neumann languages the left side of an assignment is a variable, which is always a memory location. The assignment operator assigns the Rvalue, the value of the expression on the right, to the memory location denoted by the Lvalue.

We usually don't talk about primitive types as Lvalues, although they do stand on the left side of an assignment: they are substitute names for the memory location of a value, and at the asm/microcode level they have to be loaded from and stored to memory in order to change value. Thus, while they are Lvalues, we do not call them that, as we are idiots who wrongly borrow terminology. That makes object references the only "Lvalue" in the JVM.

Anyway, think of it as C technical jargon that leaked into the JVM and got twisted, the same as throwing NullPointerException on a null reference, etc.

Tijuana answered 3/11, 2023 at 23:27 Comment(0)
