Oozie SSH Action

Oozie SSH Action Issue:

Issue: We are trying to run a few commands on a particular host machine in our cluster, and we chose the SSH action for this. We have been hitting the SSH error below for some time now. What might the real issue be? Please point me towards a solution.

logs:

AUTH_FAILED: Not able to perform operation [ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=no -o StrictHostKeyChecking=no -o ConnectTimeout=20 [email protected] mkdir -p oozie-oozi/0000000-131008185935754-oozie-oozi-W/action1--ssh/ ] | ErrorStream: Warning: Permanently added host,1.2.3.4 (RSA) to the list of known hosts. Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).

org.apache.oozie.action.ActionExecutorException: AUTH_FAILED: Not able to perform operation [ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=no -o StrictHostKeyChecking=no -o ConnectTimeout=20 [email protected] mkdir -p oozie-oozi/0000000-131008185935754-oozie-oozi-W/action1--ssh/ ] | ErrorStream: Warning: Permanently added 1.2.3.4,192.168.34.208 (RSA) to the list of known hosts. Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).

at org.apache.oozie.action.ssh.SshActionExecutor.execute(SshActionExecutor.java:589)
at org.apache.oozie.action.ssh.SshActionExecutor.start(SshActionExecutor.java:204)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:211)
at org.apache.oozie.command.wf.ActionStartXCommand.execute(ActionStartXCommand.java:59)
at org.apache.oozie.command.XCommand.call(XCommand.java:277)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:326)
at org.apache.oozie.service.CallableQueueService$CompositeCallable.call(CallableQueueService.java:255)
at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:175)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)

Caused by: java.io.IOException: Not able to perform operation [ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=no -o StrictHostKeyChecking=no -o ConnectTimeout=20 [email protected] mkdir -p oozie-oozi/0000000-131008185935754-oozie-oozi-W/action1--ssh/ ] | ErrorStream: Warning: Permanently added '1.2.3.4,1.2.3.4' (RSA) to the list of known hosts. Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).

at org.apache.oozie.action.ssh.SshActionExecutor.executeCommand(SshActionExecutor.java:340)
at org.apache.oozie.action.ssh.SshActionExecutor.setupRemote(SshActionExecutor.java:373)
at org.apache.oozie.action.ssh.SshActionExecutor$1.call(SshActionExecutor.java:206)
at org.apache.oozie.action.ssh.SshActionExecutor$1.call(SshActionExecutor.java:204)
at org.apache.oozie.action.ssh.SshActionExecutor.execute(SshActionExecutor.java:547)
... 10 more

2013-10-09 12:48:25,982 WARN org.apache.oozie.command.wf.ActionStartXCommand: USER[user] GROUP[-] TOKEN[] APP[Test] JOB[0000000-131008185935754-oozie-oozi-W] ACTION[0000000-131008185935754-oozie-oozi-W@action1] Suspending Workflow Job id=0000000-131008185935754-oozie-oozi-W
2013-10-09 12:48:27,204 WARN org.apache.oozie.command.coord.CoordActionUpdateXCommand: USER[user] GROUP[-] TOKEN[] APP[Test] JOB[0000000-131008185935754-oozie-oozi-W] ACTION[0000000-131008185935754-oozie-oozi-W@action1] E1100: Command precondition does not hold before execution, [, coord action is null], Error Code: E1100
2013-10-09 12:59:57,477 INFO org.apache.oozie.command.wf.KillXCommand: USER[user] GROUP[-] TOKEN[] APP[Test] JOB[0000000-131008185935754-oozie-oozi-W] ACTION[-] STARTED WorkflowKillXCommand for jobId=0000000-131008185935754-oozie-oozi-W
2013-10-09 12:59:57,685 WARN org.apache.oozie.command.coord.CoordActionUpdateXCommand: USER[user] GROUP[-] TOKEN[] APP[Test] JOB[0000000-131008185935754-oozie-oozi-W] ACTION[-] E1100: Command precondition does not hold before execution, [, coord action is null], Error Code: E1100
2013-10-09 12:59:57,686 INFO org.apache.oozie.command.wf.KillXCommand: USER[user] GROUP[-] TOKEN[] APP[Test] JOB[0000000-131008185935754-oozie-oozi-W] ACTION[-] ENDED WorkflowKillXCommand for jobId=0000000-131008185935754-oozie-oozi-W
2013-10-09 13:41:32,654 WARN org.apache.oozie.command.wf.KillXCommand: USER[user] GROUP[-] TOKEN[] APP[Test] JOB[0000000-131008185935754-oozie-oozi-W] ACTION[-] E0725: Workflow instance can not be killed, 0000000-131008185935754-oozie-oozi-W, Error Code: E0725
2013-10-09 13:41:45,199 WARN org.apache.oozie.command.wf.KillXCommand: USER[user] GROUP[-] TOKEN[] APP[Test] JOB[0000000-131008185935754-oozie-oozi-W] ACTION[-] E0725: Workflow instance can not be killed, 0000000-131008185935754-oozie-oozi-W, Error Code: E0725
2013-10-09 13:42:04,869 WARN org.apache.oozie.command.wf.ResumeXCommand: USER[user] GROUP[-] TOKEN[] APP[Test] JOB[0000000-131008185935754-oozie-oozi-W] ACTION[-] E1100: Command precondition does not hold before execution, [workflow's status is KILLED is not SUSPENDED], Error Code: E1100
2013-10-09 13:45:56,357 WARN org.apache.oozie.command.wf.KillXCommand: USER[user] GROUP[-] TOKEN[] APP[Test] JOB[0000000-131008185935754-oozie-oozi-W] ACTION[-] E0725: Workflow instance can not be killed, 0000000-131008185935754-oozie-oozi-W, Error Code: E0725

Approaches tried:

  1. Setting up passwordless SSH
  2. Setting the user proxies
  3. Giving permissions on the required folders

Thanks,

Kasa.

Semitrailer answered 9/10, 2013 at 12:45 Comment(1)
Facing the exact same issue. - Pelargonium

I just hit a similar problem. I had a case where I could run as USER:

ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=no -o StrictHostKeyChecking=no -o ConnectTimeout=20 [email protected] mkdir -p oozie-oozi/0000000-131008185935754-oozie-oozi-W/action1--ssh/

by hand on the command line and it worked, but when launched via Oozie as USER it failed.

In my case it failed because I had set up passwordless ssh between USER on the oozie server and USER on the remote machine. What one needs to do is set up passwordless ssh between the oozie user on the oozie server and USER on the remote machine. In other words, su to oozie on the oozie server and run the above command by hand. If it fails there, it will fail in Oozie. If it works, it should work in Oozie (assuming all else is correct, like dir permissions, etc.).

Take a look at what user your oozie server is running as:

ps -ef | grep oozie

Whatever user that is needs passwordless ssh to USER on the remote machine.
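
A minimal sketch of that manual check, assuming the server process runs as a local account named oozie. The helper name `oozie_probe_cmd` and the host `remote.example.com` are hypothetical; the ssh options are the ones from the failing log line, with the mkdir step swapped for a harmless `true`.

```shell
# Hypothetical helper: print the connectivity probe for a given remote account.
# It reuses the ssh options from the failing Oozie log line.
oozie_probe_cmd() {
    remote_user=$1
    remote_host=$2
    printf 'ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=no -o StrictHostKeyChecking=no -o ConnectTimeout=20 %s@%s true\n' \
        "$remote_user" "$remote_host"
}

# Example (not run here; needs sudo and a reachable host):
#   sudo -u oozie sh -c "$(oozie_probe_cmd USER remote.example.com)"
# An exit status of 0 suggests key-based login works for the oozie account.
```

Running the probe through `sudo -u oozie` rather than `su - oozie` should also work when the oozie account has no password or no interactive login shell, since sudo starts the command directly.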

Wisnicki answered 24/10, 2013 at 21:0 Comment(2)
Thanks @Wisnicki. I was having so much trouble running Oozie for practice on the Cloudera VM until I found this post. But when I try to set up the ssh keys, it prompts for the password of user 'oozie'. Can you let me know what the password for 'oozie' is on the Cloudera VM? - Zing
I haven't used the Cloudera VM in a long time, but you probably have root privileges, so you should be able to sudo to root and do it. - Wisnicki

What quux00 answered is right; I am just adding a few points. Because the ssh command in the ssh action is executed by the oozie user, you need to give the oozie user a bash login shell.

To do that, you need to change the /etc/passwd file on all the nodes of the cluster. Look for a line similar to the following in /etc/passwd:

oozie:x:488:487:Oozie User:/var/lib/oozie:/bin/false 

and change it to

oozie:x:488:487:Oozie User:/var/lib/oozie:/bin/bash

which gives the oozie user bash as its login shell. Then set up passwordless authentication between the oozie user and whichever user you want on any of the host machines.
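
As a hedged sketch, the login shell can be inspected before and after the change with a small helper, and changed with `usermod` instead of editing the file by hand. The helper name `login_shell` is hypothetical; it only reads the seventh colon-separated field of a passwd-format file.

```shell
# Hypothetical helper: print the login shell (7th colon-separated field)
# of a user in a passwd-format file. Defaults to /etc/passwd.
login_shell() {
    user=$1
    passwd_file=${2:-/etc/passwd}
    awk -F: -v u="$user" '$1 == u { print $7 }' "$passwd_file"
}

# Example (not run here):
#   login_shell oozie                  # inspect the current value
#   sudo usermod -s /bin/bash oozie    # change it without hand-editing /etc/passwd
```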

Then rerun the oozie job and let me know if it works. Hope it helps!

Newson answered 15/1, 2014 at 10:16 Comment(0)

This is a very tricky problem and I could only hack around it. I wasn't satisfied with the answers given, so here is my version. The following failed for me (I could see it in the logs):

ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=no -o StrictHostKeyChecking=no -o ConnectTimeout=20 [email protected] mkdir -p oozie-oozi/0000067-130808155814753-oozie-oozi-W/mysshjob--ssh/

But when I ran the same command with KbdInteractiveDevices=no removed, or changed to KbdInteractiveDevices=pam, it worked:

ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=pam -o StrictHostKeyChecking=no -o ConnectTimeout=20 [email protected] mkdir -p oozie-oozi/0000067-130808155814753-oozie-oozi-W/mysshjob--ssh/

Anyway, I think there was some issue with an old ssh key, so I tried the following and it works:

$ ssh-keygen -t dsa
$ cat ~/.ssh/id_dsa.pub > ~/.ssh/authorized_keys2
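
Two caveats on the commands above: `>` overwrites any keys already in the target file, and DSA keys are disabled by default in OpenSSH 7.0 and later. A more conservative sketch, assuming an RSA key and the more commonly used authorized_keys file, appends the key only when it is missing. The helper name `append_key` is hypothetical.

```shell
# Hypothetical helper: append a public key to an authorized_keys file
# only if that exact line is not already present, then tighten permissions.
append_key() {
    pub_file=$1
    auth_file=$2
    auth_dir=$(dirname "$auth_file")
    mkdir -p "$auth_dir"
    chmod 700 "$auth_dir"
    touch "$auth_file"
    key=$(cat "$pub_file")
    # -x matches whole lines, -F treats the key as a fixed string
    grep -qxF -- "$key" "$auth_file" || printf '%s\n' "$key" >> "$auth_file"
    chmod 600 "$auth_file"
}

# Example (not run here):
#   ssh-keygen -t rsa -b 4096 -N '' -f ~/.ssh/id_rsa
#   append_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```

Note that the authorized_keys file that matters is the one belonging to the remote account being logged in to; the local file is only relevant when the target is the same host.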
Dumpish answered 20/2, 2014 at 15:6 Comment(0)

After following the suggestion above, that is, finding this line in /etc/passwd:

oozie:x:488:487:Oozie User:/var/lib/oozie:/bin/false

and changing it to

oozie:x:488:487:Oozie User:/var/lib/oozie:/bin/bash

try these steps:

  1. Set up passwordless communication as the oozie user:

    sudo su - oozie
    ssh-keygen -t dsa

    Copy the generated public key to the remote account, for example apps@XXXXXXX.

  2. Try ssh apps@XXXXXXX; you should log in to the remote machine without a password prompt or error.

  3. Go to HUE, select the SSH action, and give your bash command, for example bash -x yourscript parameter.
  4. Save.
  5. Submit.
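
Steps 1 and 2 can be sketched as a small dry-run script. This assumes the oozie home directory is /var/lib/oozie (as in the passwd line above), uses an RSA key rather than DSA, and relies on `ssh-copy-id` being installed; the names `run` and `setup_oozie_key` and the host `remote.example.com` are hypothetical.

```shell
# Print (DRY_RUN=1, the default) or execute (DRY_RUN=0) one command.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper covering steps 1 and 2 for one remote account.
setup_oozie_key() {
    remote=$1                          # e.g. apps@remote.example.com
    key=/var/lib/oozie/.ssh/id_rsa     # assumed: oozie home is /var/lib/oozie
    # Generate the key pair as oozie (prompts for a passphrase; leave it empty)
    run sudo -u oozie ssh-keygen -t rsa -b 4096 -f "$key"
    # Install the public key on the remote account (asks for its password once)
    run sudo -u oozie ssh-copy-id -i "$key.pub" "$remote"
    # Confirm that non-interactive login now works
    run sudo -u oozie ssh -o BatchMode=yes "$remote" true
}

# Example:
#   setup_oozie_key apps@remote.example.com             # prints the three commands
#   DRY_RUN=0 setup_oozie_key apps@remote.example.com   # actually runs them
```

Printing the commands first makes it easy to review them before anything is changed on the Oozie server.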
Psychologist answered 17/6, 2016 at 11:7 Comment(0)
