NoClassDefFoundError "wrong name" for a class in the java.lang package

3

11

I'm running Cassandra 2.2.11 (and won't be upgrading) on a host. Periodically, in a cron job, I run nodetool commands for monitoring. nodetool is implemented as just another java process that uses JMX to talk to the Cassandra java process. I launch five or so commands every minute.

Once in a while (not in any recognizable pattern), the execution of nodetool will fail with a NoClassDefFoundError that refers to a class from java.lang. For example,

java.lang.NoClassDefFoundError: java/lang/Thread (wrong name: java/lang/Thread)
    at java.lang.Class.getDeclaredFields0(Native Method)
    at java.lang.Class.privateGetDeclaredFields(Class.java:2583)
    at java.lang.Class.getDeclaredField(Class.java:2068)
    at java.util.concurrent.FutureTask.<clinit>(FutureTask.java:476)
    at java.util.concurrent.ScheduledThreadPoolExecutor.scheduleWithFixedDelay(ScheduledThreadPoolExecutor.java:590)
    at sun.rmi.transport.tcp.TCPChannel.free(TCPChannel.java:347)
    at sun.rmi.server.UnicastRef.free(UnicastRef.java:431)
    at sun.rmi.server.UnicastRef.done(UnicastRef.java:448)
    at sun.rmi.registry.RegistryImpl_Stub.lookup(Unknown Source)
    at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:132)
    at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:205)
    at javax.naming.InitialContext.lookup(InitialContext.java:417)
    at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1955)
    at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1922)
    at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:287)
    at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:270)
    at org.apache.cassandra.tools.NodeProbe.connect(NodeProbe.java:183)
    at org.apache.cassandra.tools.NodeProbe.<init>(NodeProbe.java:150)
    at org.apache.cassandra.tools.NodeTool$NodeToolCmd.connect(NodeTool.java:302)
    at org.apache.cassandra.tools.NodeTool$NodeToolCmd.run(NodeTool.java:242)
    at org.apache.cassandra.tools.NodeTool.main(NodeTool.java:158)

In this stack trace, the error happens during class initialization for FutureTask. I've also seen

java.lang.NoClassDefFoundError: java/lang/Object (wrong name: java/lang/Object)
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
    at java.lang.Class.getDeclaredMethod(Class.java:2128)
    at java.lang.invoke.MethodHandleImpl$Lazy.<clinit>(MethodHandleImpl.java:614)
    [...]

but also

java.lang.NoClassDefFoundError: java/lang/String (wrong name: java/lang/String)
    at java.lang.Class.getDeclaredFields0(Native Method)
    at java.lang.Class.privateGetDeclaredFields(Class.java:2583)
    at java.lang.Class.getDeclaredField(Class.java:2068)
    at java.io.ObjectStreamClass.getDeclaredSUID(ObjectStreamClass.java:1703)
    at java.io.ObjectStreamClass.access$700(ObjectStreamClass.java:72)
    at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:484)
    at java.io.ObjectStreamClass$2.run(ObjectStreamClass.java:472)
    [...]

So it's not only happening during class initialization, but, in the few samples I've collected, something in the reflection implementation does seem to be the culprit.

Java is at version 8

java version "1.8.0_144"

The nodetool launcher always uses the same classpath. And there are no weird classes in there (or additional class loaders). The same installation is done across hundreds of identical nodes (on Linux).

My top search results for NoClassDefFoundError wrong name refer to executions where a simplified class name was used to launch java, rather than the fully qualified name. That's not the issue here. Also, the names in the error messages are identical.

So what can cause such "wrong name" NoClassDefFoundError errors for "bootstrap" classes?

Wifehood answered 26/10, 2017 at 0:9 Comment(2)
This question is probably a duplicate, but my understanding is that the NoClassDefFoundError error you are seeing may be misleading. It means that the class loader was unable to load java.lang.Thread, not that the class was missing.Spiroid
@TimBiegeleisen What would lead a JVM to fail to load "bootstrap" classes like java.lang.Object when it's past the point where it should have loaded them?Wifehood
S
1

I think a lack of resources is causing the problem, for example through a connector timeout. Have you looked at the log from your example? Is NodeProbe already connected through JMX, or still trying to connect, when the error occurs? Errors like that are very typical and can trigger other intermittent errors (usually at the OS or network level), which would include your String- and even Object-based errors, so in conclusion it makes sense. You should check your resources at the moment the error happens. I know this is something of a catch-22, since the resource monitor may itself be causing the lack of resources, but it does happen.

Smriti answered 2/11, 2017 at 10:31 Comment(1)
A lack of what resource? How would a connector timeout cause such an error? That should just be a classic SocketException, not a NoClassDefFoundError. None of this seems correct.Wifehood
S
1

Since none of the basic Java library classes can be found, I think there is a problem with your Java installation, or you have not set the CLASSPATH and JAVA_HOME environment variables. Try setting them:

export JAVA_HOME="/usr/lib/jvm/java-8-oracle"
export CLASSPATH="/usr/lib/jvm/java-8-oracle/lib"

If that doesn't work, try reinstalling Java and setting the environment variables again.

Sparling answered 4/11, 2017 at 15:25 Comment(1)
If that was the case, it would be failing repeatedly, not just once every few hundred/thousand runs. Our classpath is set explicitly, we don't rely on the environment variable. There's nothing in our environment that requires JAVA_HOME. This answer is wrong.Wifehood
M
1

According to the stack traces, the exception is being thrown in calls to getDeclaredFields0. However, this is not where the exception came from originally. According to the OpenJDK source code, there is nothing in the codebase that throws an exception with "wrong name" in the exception message. The message has come from somewhere else.

I strongly suspect that this is actually re-reporting a problem that happened the first time that some class was loaded or initialized. What happens is that the classloader finds the problem the first time, marks the offending internal class object as "bad", and then throws the error. According to the javadoc, applications should not attempt to recover from this. But if one does, and then attempts to use the "bad" class in some way, the original problem will be reported again as a NoClassDefFoundError with the original reason.
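The re-reporting behaviour described above is easy to observe for the class initialization case. The following is only a minimal sketch: the class names ReReportDemo and Broken are hypothetical, and it shows a failing static initializer, not the "wrong name" case from the question. The first use of the class surfaces the original failure; later uses are reported as NoClassDefFoundError.

```java
public class ReReportDemo {
    // Hypothetical class whose static initializer always fails.
    static class Broken {
        // Not a compile-time constant, so reading it triggers class initialization.
        static final int VALUE = compute();

        private static int compute() {
            throw new IllegalStateException("original failure");
        }
    }

    /** Touches Broken and returns the simple name of whatever Throwable was raised. */
    static String touch() {
        try {
            return String.valueOf(Broken.VALUE);
        } catch (Throwable t) {
            // Deliberately "recovering", which is what the javadoc advises against.
            return t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println("first use:  " + touch());
        System.out.println("second use: " + touch());
    }
}
```

If a caller swallows the first error, only the later NoClassDefFoundError is visible, and on HotSpot its message does not necessarily repeat the original exception's details.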

So what does the reason mean?

It is hard to tell because we don't have the stacktrace for the original exception; i.e. the one where the classloading / initialization first failed. If you can find that stacktrace, we can track down the 3rd-party library that did it. It is almost certainly happening in a classloader.

The obvious meaning is that the class name recorded inside a class file doesn't match the name under which the class was requested. However, we'd need to examine the classloader code to be sure.
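For reference, a "wrong name" error can be provoked deliberately by handing the VM a class file under a name it does not declare. This is a sketch under stated assumptions, not a reproduction of the nodetool failure: WrongNameDemo, MislabelingLoader and the requested name demo.NotTheRealName are made up for illustration, and the class file of javax.management.remote.JMXConnector is used only because it is readable from a standard JDK.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class WrongNameDemo {
    /** Hypothetical loader that defines a class file under a caller-chosen name. */
    static class MislabelingLoader extends ClassLoader {
        Class<?> define(String requestedName, byte[] classFile) {
            return defineClass(requestedName, classFile, 0, classFile.length);
        }
    }

    /** Reads the raw class file for the given internal name, e.g. "a/b/C". */
    static byte[] classFileOf(String internalName) {
        try (InputStream in = ClassLoader.getSystemResourceAsStream(internalName + ".class")) {
            if (in == null) {
                throw new IOException("class file not found: " + internalName);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            for (int n; (n = in.read(buf)) != -1; ) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Defines the bytes under a mismatched name and returns the resulting error message. */
    static String provoke() {
        byte[] bytes = classFileOf("javax/management/remote/JMXConnector");
        try {
            new MislabelingLoader().define("demo.NotTheRealName", bytes);
            return "no error";
        } catch (NoClassDefFoundError e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(provoke());
    }
}
```

Note that here the two names in the message differ, which is the usual case. In the question they are identical, so this sketch explains the wording of the message but not why it was raised for java/lang/Thread.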

So why is it happening intermittently?

Possibly because the application JVM has many classloaders and only a subset of them have "polluted" their class namespace with this bad class.

That could be bad news. It suggests there may be some kind of synchronization issue in the core of the application.

Anyhow, there is not enough evidence to draw sound conclusions.

Bottom line

Based on the evidence, I would guess that this is a result of some kind of "code weaving" or "byte code engineering" that has gone wrong. As a further guess, I would say that some child classloader is not delegating properly, and has mistakenly attempted to process a built-in class. (It could even be that the classloader in question knows that it should never process a "java.lang.*" class and it has an obscure way of saying this.)

Why? Possibly because someone / something explicitly added the "rt.jar" to some classpath that it shouldn't be on.

For further diagnosis, the first thing we need is the original stacktrace that tells us which classloader did the initial damage.
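One cheap check for the "rt.jar on the wrong classpath" guess is to inspect the class path the failing JVM actually sees. The sketch below is an assumption-laden diagnostic, not part of nodetool: ClasspathAudit is a hypothetical helper that would have to be run with the same launcher settings, and the sun.boot.class.path property is specific to Java 8 era HotSpot and may be absent elsewhere.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class ClasspathAudit {
    /** Returns the class path entries whose file name is exactly "rt.jar". */
    static List<String> runtimeJarsOn(String classPath) {
        List<String> hits = new ArrayList<>();
        if (classPath == null) {
            return hits;
        }
        for (String entry : classPath.split(File.pathSeparator)) {
            if (new File(entry).getName().equals("rt.jar")) {
                hits.add(entry);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Application class path: rt.jar is not expected to show up here.
        System.out.println("rt.jar on java.class.path: "
                + runtimeJarsOn(System.getProperty("java.class.path")));

        // Java 8 HotSpot only (assumption): the bootstrap class path; may print null.
        System.out.println("sun.boot.class.path: "
                + System.getProperty("sun.boot.class.path"));

        // A class defined by the bootstrap loader reports a null class loader.
        System.out.println("loader of java.lang.Thread: " + Thread.class.getClassLoader());
    }
}
```

A non-empty first list, or a non-null loader for java.lang.Thread, would support the guess; clean output would make it less likely.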

Muniments answered 4/11, 2017 at 23:15 Comment(5)
I'm not aware of any custom class loading in Cassandra's command line tool nodetool. It's a regular main method that executes command objects. Does the RMI stack do its own classloading (e.g. for generating proxies)? I don't think so. There are no other stack traces. What I've posted (minus a log line from Cassandra) is all there is in standard out and error. I like the idea of another error being swallowed; I'll look for such catch statements.Wifehood
I found this in the JDK source, but good luck tracing the source.Wifehood
Ah yes. I had a typo in my "grep". So this means that ultimately the original exception does come from the Hotspot code. However, you still need to provide the original stacktrace to get some clues about how / why this has happened.Muniments
"Does the RMI stack do its own classloading" - yes. Both explicitly and under the hood.Muniments
The more I think about it, the more I think your suspicion is unlikely. The whole confusion here is due to these classes all being "bootstrap" classes. If java/lang/Object had failed to load, there's no way the program would have progressed to this point.Wifehood
