Accessing Kerberos-secured WebHDFS without SPNEGO

I have a working application for managing HDFS using WebHDFS. I need to be able to do this on a Kerberos-secured cluster.

The problem is that there is no library or extension to negotiate the ticket for my app; I only have a basic HTTP client.

Would it be possible to create a Java service that handles the ticket exchange and, once it gets the service ticket, passes it to the app for use in an HTTP request? In other words, my app would ask the Java service to negotiate the tickets, the service would return the service ticket to my app as a string (or raw bytes), and the app would simply attach it to the HTTP request.
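For illustration, here is the kind of minimal sketch I have in mind for that service, built on nothing but the JDK's GSS-API (the class and method names are mine, and it assumes the JVM already holds a Kerberos TGT, e.g. from kinit, with -Djavax.security.auth.useSubjectCredsOnly=false):

import java.util.Base64;
import org.ietf.jgss.GSSContext;
import org.ietf.jgss.GSSException;
import org.ietf.jgss.GSSManager;
import org.ietf.jgss.GSSName;
import org.ietf.jgss.Oid;

public class TicketService {
    // Returns the value for an "Authorization" header, e.g. "Negotiate YIIC...",
    // which the basic HTTP client could attach to its request as-is.
    public static String negotiateHeader(String webHdfsHost) throws GSSException {
        GSSManager manager = GSSManager.getInstance();
        Oid spnego = new Oid("1.3.6.1.5.5.2");                 // SPNEGO mechanism OID
        GSSName server = manager.createName("HTTP@" + webHdfsHost,
                GSSName.NT_HOSTBASED_SERVICE);                 // service principal HTTP/<host>
        GSSContext context = manager.createContext(server, spnego, null,
                GSSContext.DEFAULT_LIFETIME);
        try {
            byte[] outToken = context.initSecContext(new byte[0], 0, 0);
            return "Negotiate " + Base64.getEncoder().encodeToString(outToken);
        } finally {
            context.dispose();
        }
    }
}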

EDIT: Is there a similarly elegant solution, like the one @SamsonScharfrichter described, for HttpFS? (To my knowledge, HttpFS does not support delegation tokens.)

EDIT2: I am still completely lost. I'm trying to figure out the hadoop-auth client without any luck, and I have already spent hours reading up on it. Could you please help me out again? The examples say to do this:

// establishing an initial connection

URL url = new URL("http://foo:8080/bar");
AuthenticatedURL.Token token = new AuthenticatedURL.Token();
AuthenticatedURL aUrl = new AuthenticatedURL();
HttpURLConnection conn = new AuthenticatedURL(url, token).openConnection();
....
// use the 'conn' instance
....

I'm already lost here. What initial connection do I need? And how can

new AuthenticatedURL(url, token).openConnection();

take two parameters? There is no constructor for such a case (I'm getting an error because of this). Shouldn't a principal be specified somewhere? It is probably not going to be this easy:

    URL url = new URL("http://<host>:14000/webhdfs/v1/?op=liststatus");
    AuthenticatedURL.Token token = new AuthenticatedURL.Token();

    HttpURLConnection conn = new AuthenticatedURL(url, token).openConnection(url, token);
Academicism asked 26/5, 2016 at 11:05. Comments (7):
I guess it might be possible to do that; however, it just reopens the security hole that Kerberos closed. I believe Knox / Sentry allow you to access data through some API endpoints, no? - Mathewson
Thanks for the hint, but I cannot use Knox. Whether it reopens the security hole is up for discussion once it is working. - Academicism
AFAIK all Hadoop GUIs and REST services use a signed cookie to cache the Kerberos credentials -- except WebHDFS, which requires explicitly managing the delegation token. Maybe it's possible to create the cookie with one HTTP library, then use it with another session -- you should try to run a "debug mode" connection with HttpFS to check whether there's a cookie involved. And hopefully your "basic HTTP client" is not too basic and lets you tinker with cookies. - Verve
Thanks a lot, I will update the thread once the solution is up and running. - Academicism
Now I suggest going back to the hadoop-auth client. The dependency problem is small (750 kB), and the solution I proposed deals with obtaining the Authorization parameter and cookie that fit HttpFS (HttpFS uses the hadoop-auth server-side code to authenticate). If you are concerned about the size of the dependencies, you could at least take the source from the hadoop-auth client and use that. - Tael
You'll also need to grab the "X-Hadoop-Delegation-Token" header if present. - Tael
Hey @MaBu, I've updated my answer, see if it helps. Apologies for providing the wrong parameters; it turns out the documentation is wrong, and I've submitted a patch. - Tael

Using Java code plus the Hadoop Java API to open a Kerberized session, get the delegation token for the session, and pass that token to the other app -- as suggested by @tellisnz -- has a drawback: the Java API requires quite a lot of dependencies (i.e. a lot of JARs, plus Hadoop native libraries). If you run your app on Windows, in particular, it will be a tough ride.

Another option is to use Java code plus WebHDFS to run a single SPNEGOed query and GET the delegation token, then pass it to the other app -- that option requires absolutely no Hadoop library on your server. The barebones version would be something like:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// SPNEGOed request: the JVM handles the Kerberos handshake transparently
URL urlGetToken = new URL("http://<host>:<port>/webhdfs/v1/?op=GETDELEGATIONTOKEN");
HttpURLConnection cnxGetToken = (HttpURLConnection) urlGetToken.openConnection();
BufferedReader httpMessage = new BufferedReader(new InputStreamReader(cnxGetToken.getInputStream()), 1024);
// pluck the "urlString" value out of the JSON reply
Pattern regexHasToken = Pattern.compile("urlString[\": ]+(.[^\" ]+)");
String httpMessageLine;
while ((httpMessageLine = httpMessage.readLine()) != null) {
    Matcher regexToken = regexHasToken.matcher(httpMessageLine);
    if (regexToken.find()) {
        String token = regexToken.group(1);
        System.out.println("Use that template: http://<Host>:<Port>/webhdfs/v1%AbsPath%?delegation=" + token + "&op=...");
    }
}
httpMessage.close();
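Once you have the delegation token, the other app needs nothing but its basic HTTP client -- it just appends the token to the URL. A hypothetical follow-up (host and file path are placeholders), assuming the token captured above is in a String named token:

URL urlRead = new URL("http://<host>:<port>/webhdfs/v1/tmp/sample.txt?op=OPEN&delegation=" + token);
HttpURLConnection cnxRead = (HttpURLConnection) urlRead.openConnection();
// then read cnxRead.getInputStream() exactly as you would against an unsecured cluster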

That's what I use to access HDFS from a Windows PowerShell script (or even an Excel macro). Caveat: on Windows you have to create your Kerberos TGT on the fly, by passing the JVM a JAAS config pointing to the appropriate keytab file. But that caveat also applies to the Java API anyway.
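For reference, a minimal JAAS config of the kind meant here could look like the following (the entry name is the default one Sun's JGSS looks up; the keytab path and principal are placeholders for your environment), passed to the JVM with -Djava.security.auth.login.config=jaas.conf and -Djavax.security.auth.useSubjectCredsOnly=false:

com.sun.security.jgss.krb5.initiate {
  com.sun.security.auth.module.Krb5LoginModule required
    useKeyTab=true
    keyTab="C:/secure/myapp.keytab"
    principal="myapp@EXAMPLE.COM"
    storeKey=true
    doNotPrompt=true;
};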

Verve answered 27/5, 2016 at 9:47. Comments (2):
Thanks a lot, I will try it this way and let you know how it all ended up. - Academicism
We have managed to solve this by using the delegation token for WebHDFS and the signed cookie for HttpFS. Thanks a lot for your help. - Academicism

You could take a look at the hadoop-auth client and create a service that does the first connection; then you might be able to grab the 'Authorization' and 'X-Hadoop-Delegation-Token' headers and the cookie from it and add them to your basic client's requests.

First, you'll need to have used kinit to authenticate your user before running the application. Otherwise, you're going to have to do a JAAS login for your user; this tutorial provides a pretty good overview of how to do that.
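A rough sketch of such a programmatic login, assuming a jaas.conf with an entry named "Client" is supplied via -Djava.security.auth.login.config (the entry name is an assumption; wrap the AuthenticatedURL code below inside the doAs block):

import java.security.PrivilegedExceptionAction;
import javax.security.auth.Subject;
import javax.security.auth.login.LoginContext;

LoginContext login = new LoginContext("Client");
login.login(); // Krb5LoginModule obtains the TGT (from the ticket cache or a keytab)
Subject.doAs(login.getSubject(), (PrivilegedExceptionAction<Void>) () -> {
    // run the AuthenticatedURL code below in here so it picks up
    // the Kerberos credentials of the logged-in subject
    return null;
});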

Then, to do the login to WebHDFS/HttpFS, we'll need to do something like:

import java.net.HttpURLConnection;
import java.net.URL;
import org.apache.hadoop.security.authentication.client.AuthenticatedURL;

URL url = new URL("http://yourhost:8080/your-kerberised-resource");
AuthenticatedURL.Token token = new AuthenticatedURL.Token();
// openConnection() runs the SPNEGO handshake and fills 'token' with the signed hadoop.auth value
HttpURLConnection conn = new AuthenticatedURL().openConnection(url, token);

String authorizationTokenString = conn.getRequestProperty("Authorization");
String delegationToken = conn.getRequestProperty("X-Hadoop-Delegation-Token");
...
// do what you have to do to get your basic client connection
...
myBasicClientConnection.setRequestProperty("Authorization", authorizationTokenString);
myBasicClientConnection.setRequestProperty("Cookie", "hadoop.auth=" + token.toString());
myBasicClientConnection.setRequestProperty("X-Hadoop-Delegation-Token", delegationToken);
Tael answered 26/5, 2016 at 12:55. Comments (2):
Thanks for the tip. I had in mind using webhdfs-java-client; I would probably modify it to return the service ticket once it's obtained. I was wondering if that would work. - Academicism
Yeah, that looks a little old. With hadoop-auth it could be as simple as the edit above. - Tael
