ActiveMQ won't restart because KahaDB is locked

Locally to start/stop ActiveMQ (5.6) on my dev machine I just run ./activemq start and ./activemq stop respectively.

On our QA machine we have it installed as a service and run service activemq start and service activemq stop respectively.

I just went to bounce the QA machine and issued service activemq stop, then service activemq start to restart it.

I see a process ID being created, and if I run ps -aef | grep activemq I see the living, breathing process of an ActiveMQ server.

But if I go to http://<qa-server>:8161/admin I get the typical error that you see when a server is down:

Firefox can't establish a connection to the server at :8161.

Edit: I have now tried both the ./activemq start and service activemq start methods, and both produce the same issue: I see a process being created, but nothing in the web admin tool.

I checked ActiveMQ's home directory and don't see any type of logs/ directory, so I'm not even sure where to begin debugging the issue.

Either AMQ is not restarting, or its web admin app isn't restarting or functioning properly; either way I have no idea where to start. Thanks in advance!

Edit:

I see the following error in data/activemq.log:

2012-10-07 11:37:14,501 | INFO | Database /qa-server/kahadb/lock is locked... waiting 10 seconds for the database to be unlocked. Reason: java.io.IOException: File '/qa-server/kahadb/lock' could not be locked. | org.apache.activemq.store.kahadb.MessageDatabase | main
2012-10-07 11:37:24,504 | INFO | Database /qa-server/kahadb/lock is locked... waiting 10 seconds for the database to be unlocked. Reason: java.io.IOException: File '/qa-server/kahadb/lock' could not be locked. | org.apache.activemq.store.kahadb.MessageDatabase | main
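The log message names the lock file itself, so the path can be read straight out of the log. A minimal sketch, assuming the message format matches the excerpt above; `lock_path_from_log` is a hypothetical helper name, not an ActiveMQ command:

```shell
#!/bin/sh
# Hypothetical helper: print the lock file path named in the most recent
# "could not be locked" message of an ActiveMQ log.
# Assumes the message format shown in the log excerpt from the question.
lock_path_from_log() {
  # Capture the quoted path between "File '" and "' could not be locked",
  # then keep only the last match in the file.
  sed -n "s/.*File '\([^']*\)' could not be locked.*/\1/p" "$1" | tail -n 1
}
```

For the log in the question, `lock_path_from_log data/activemq.log` would print /qa-server/kahadb/lock. If `lsof` is installed, running `lsof` on that path lists the local processes that have the file open; on a shared NFS directory, a holder running on another host will not show up there.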

Perrine answered 7/10, 2012 at 15:26 Comment(10)
Looks like AMQ is not restarting. Check the PID before stop and after start. Still the same? Kill the process. - Ragtime
Thanks again @Ragtime - please see my comment underneath Bobby Fisher's answer. I have verified that the PID exists after starting ActiveMQ, and verified that the PID no longer exists after stopping it. Furthermore I've verified that I don't have multiple PIDs trying to compete with one another, such as multiple AMQ instances vying with each other because of all the start/stops I've issued, etc. - Perrine
Maybe the stop didn't release the lock. Stop AMQ, cleanup data/, start AMQ. - Ragtime
Thanks, but when you say "cleanup data/", that's a little confusing to me. Is data/ its log directory (in which case, why would erasing a log file release a lock)? Or do you mean something else by "cleanup data/"? Thanks again for all your help so far, and again +1. - Perrine
Also, I found this article but it didn't mean much to me. I know we use NFS, so maybe there's a "master" instance or something? - Perrine
Sorry, I thought data/ contained all AMQ data. Try to release/delete the lock from the internal DB. Check kahadb/. - Ragtime
I don't see any kahadb/ directory under my AMQ home dir. The only reference to "kaha" is ${AMQ_HOME}/lib/kahadb-5.6.0.jar. - Perrine
Check your log. It says /qa-server/kahadb/ - Ragtime
Ahhh, I do see a file called lock (I was looking in the wrong directory). So, by "clean lock", do you mean delete the file itself? Or is there some kind of command-line interface command I have to issue? I'd hate to break AMQ even further... - Perrine
@Ragtime - I think I am having the same problem as was asked in this question. If that turns out to be the case I will delete/close this question; however I need help confirming what to do. The user "jkysam" gave an answer that explains what could be happening to me. I'm just not sure what the solution is... - Perrine

It turns out there were multiple AMQ servers in our QA environment. When I shut down the first server, an exception was thrown for some reason, so it didn't release the lock cleanly. Possession of that lock then went to the other AMQ instance (the first server was the master, the 2nd server was the slave).

When I tried restarting the first server (the master), it wouldn't restart because the 2nd server had possession of the lock. I shut down the 2nd server and the lock was released, allowing me to restart them both.
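The symptom from the question (a live process but an unreachable web console) fits a broker that is still waiting for the lock. A rough check for that state, assuming the log format shown in the question; `waiting_for_lock` is a hypothetical helper, not part of ActiveMQ:

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the last line of the given log
# is a KahaDB "is locked... waiting" message, i.e. the broker is still
# blocked behind whoever holds the lock.
waiting_for_lock() {
  # -F treats the pattern as a fixed string, so the dots are literal.
  tail -n 1 "$1" | grep -qF 'is locked... waiting'
}
```

If `waiting_for_lock data/activemq.log` succeeds, the next thing to look for is another broker, possibly on a different host, that uses the same KahaDB directory.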

Perrine answered 9/10, 2012 at 20:7 Comment(0)

Check the data/ directory for logs.

And is this a typo? :8161.admin
Try :8161/admin

Ragtime answered 7/10, 2012 at 15:36 Comment(2)
Thanks @Ragtime (+1) - yes I was just about to post an exception I see from that log; please see update in original question. Thanks again! - Perrine
And yes, that was a typo (I'll fix it)! - Perrine
  1. Go to your ActiveMQ installation home folder (on Windows 7), i.e. apache-activemq-X.XX.X
  2. Right-click it and select Properties
  3. Change the access permissions on this folder for your logged-in user. Normally your logged-in user already has admin access, but if you simply unzipped ActiveMQ on your machine, the admin permission may be missing.

The activemq command should now run smoothly.

Fatidic answered 25/2, 2019 at 13:12 Comment(0)

Stopping a service does not guarantee that the service has actually stopped. Windows spawns threads, and after a certain amount of time it assumes that the service has stopped. Always check, and kill the process if you still see it running in the process list. Doing that may clear the locks it is holding.

Mop answered 7/10, 2012 at 15:53 Comment(3)
Thanks @BobbFisher (+1) - however I have checked the PIDs with ps -aef | grep activemq both before and after shutting the service down, with both methods (service activemq stop, and ./activemq stop), and verified in all scenarios that the PID is no longer there/alive... any thoughts? - Perrine
Looks like you are working in a Linux environment. Sorry about my assumption about the OS. However, looking at the log you posted, it looks like ActiveMQ is trying to take the lock on the database, but the very presence of the lock file in the KahaDB directory tells me that either some other process is holding a lock on it or the lock from ActiveMQ was not cleared for some reason. If you are sure that no other process is holding a lock on the DB, you may delete the lock file for ActiveMQ to start properly. - Mop
This is simply wrong. Once your service receives the stop signal in OnStop() or OnShutdown() it WILL get stopped, even if your service stalls in an endless loop. In that case, Windows waits until an internal timeout is reached (which can be very short or very long, up to tens of seconds) and kills it after that. Services can even request additional shutdown time, but I'm guessing that since most services are created by internet superheroes, only a few people actually know how they SHOULD work. If a service on your system doesn't actually stop, it is highly defective and should be uninstalled. - Stavanger

If none of the above works in your case, try these simpler steps:

  1. Open Task Manager and kill all currently running Java processes.
  2. Run the activemq batch file as Administrator.

Verify that it is running by opening the following in your local browser: http://localhost:8161/

Stradivarius answered 22/6, 2016 at 16:0 Comment(0)
