Hadoop / Ambari integration with Active Directory

One of the things you may want to do with a Hadoop environment is get it integrated with an existing Active Directory.  Depending on which distribution you're using, there are several ways to go.  My experience has been with Hortonworks and Ambari.

The documentation I started with rather inelegantly folds running your own KDC server and using an existing Active Directory into the same choose-your-own-adventure style set of instructions:

I found this documentation much more useful for my environment:

It's important to note that, "Before enabling Kerberos in the cluster, you must deploy the Java Cryptography Extension (JCE) security policy files on the Ambari Server and on all hosts in the cluster." -reference

I found this Java program helpful for validation: https://jsosic.wordpress.com/tag/java/

One of the pieces of information you'll need is your LDAP connection string.
dsquery is a great resource for this.

For example, the following returns the distinguished name of your account, which shows the organizational unit and container you're in:
dsquery user -name "[your login]"
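Once you have that distinguished name, the container part is everything after the first component. Here is a minimal shell sketch of that step, assuming the DN has no escaped commas in its first component; the function name and the example DN are made up for illustration.

```shell
# Strip the leading component (usually CN=...) from a distinguished name,
# leaving the OU/container path.
# Naive: assumes the first component contains no escaped commas ("\,").
container_dn() {
    # dsquery prints the DN wrapped in double quotes, so drop those first,
    # then remove everything up to and including the first comma.
    printf '%s\n' "$1" | tr -d '"' | sed 's/^[^,]*,//'
}

# Hypothetical DN, in the form dsquery prints it
container_dn '"CN=Jane Doe,OU=Hadoop,DC=example,DC=com"'
```

The remaining string is the kind of value the Container DN field asks for later on.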

Another piece of info you'll need to find is your certificate authority.  certutil (Windows Command) has you covered here.
certutil -config - -ping

(yes "- -ping"), here's why

Is LDAP running over SSL? Check with ldp.exe
Here's how to set it up, if it's not.

You can obtain the certificate information via AD directly (see the IBM article), or by running openssl:
openssl s_client -connect [Server]:[Port]
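The s_client output contains a lot besides the certificate itself. A small filter like the one below keeps only the PEM blocks so they can be saved to a file. This is a sketch: the function name is mine, and ad.example.com and port 636 are placeholders. Keep in mind that the server may present its own certificate rather than the CA certificate, so check what you actually captured.

```shell
# Print only the PEM certificate blocks found on standard input.
extract_pem() {
    sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p'
}

# Usage (host and port are placeholders):
#   openssl s_client -connect ad.example.com:636 -showcerts </dev/null 2>/dev/null \
#       | extract_pem > activedirectory.pem
```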

You then need to trust the certificate on all the Linux hosts.
From the IBM article:
  1. Create '/etc/pki/ca-trust/source/anchors/activedirectory.pem' and paste the certificate contents
  2. Trust CA cert: sudo update-ca-trust enable; sudo update-ca-trust extract; sudo update-ca-trust check
  3. Trust CA cert in Java: mycert=/etc/pki/ca-trust/source/anchors/activedirectory.pem; sudo keytool -importcert -noprompt -storepass changeit -file ${mycert} -alias ad -keystore /etc/pki/java/cacerts
  4. Finally, make sure every node in your cluster can reach the AD host.
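To confirm the import worked on a given host, you can ask keytool whether the alias is present. A minimal sketch, assuming the alias (ad), keystore path, and default password (changeit) from the steps above; the function name is made up.

```shell
# Succeed if the given alias exists in the given Java keystore.
# Third argument is the keystore password; defaults to "changeit".
keystore_has_alias() {
    keytool -list -alias "$1" -keystore "$2" -storepass "${3:-changeit}" >/dev/null 2>&1
}

# Usage:
#   keystore_has_alias ad /etc/pki/java/cacerts && echo "AD certificate is trusted by Java"
```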
More details on keytool:

Once you've got all the pre-requisites done and your configuration items noted down, enabling Kerberos is done via Ambari: Admin -> Kerberos.

Here's the information you'll need:

KDC type: Existing Active Directory
KDC host:
Realm name:
LDAP url:
Container DN:

What you put in here will be mapped to what is in the final krb5.conf files on each server.  You can see the details of what Ambari is going to do by going to the following: Ambari > Kerberos > Configs > Advanced krb5-conf.
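If you are unsure what shape these values take, the sketch below derives typical ones from an AD DNS domain and a domain controller host name. It rests on common conventions, not on anything specific to my environment: the Kerberos realm is usually the AD domain in upper case, LDAPS usually listens on port 636, and the base DN mirrors the domain components. The function names and example.com values are hypothetical, so verify against your own directory.

```shell
# Kerberos realm: by convention the AD DNS domain in upper case.
realm_from_domain() {
    printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'
}

# LDAPS URL for a domain controller; 636 is the standard LDAPS port.
ldaps_url() {
    printf 'ldaps://%s:636\n' "$1"
}

# Base DN built from the domain components: example.com -> DC=example,DC=com
base_dn_from_domain() {
    printf '%s\n' "$1" | sed 's/^/DC=/; s/\./,DC=/g'
}

realm_from_domain example.com      # candidate Realm name
ldaps_url ad.example.com           # candidate LDAP url
base_dn_from_domain example.com    # suffix for the Container DN
```

The Container DN is then the OU you created for the cluster, followed by that base DN.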

Kadmin (This is required)
Kadmin Host:
Admin principal:
Admin password:

I did have to change my encryption types to match the certificate as I was getting the following error until I did so:
kinit: Preauthentication failed while getting initial credentials

You can further configure additional items, like encryption types here:
Ambari > Kerberos > Configs > Advanced kerberos-env.

I modified the Encryption Types, in Advanced kerberos-env, and un-commented the following in Advanced krb5-conf:
#default_tgs_enctypes = {{encryption_types}}
#default_tkt_enctypes = {{encryption_types}}

A couple of commands were helpful while troubleshooting:

In order to identify what encryption types were supported I ran klist against one of the keytabs:
klist -kte  /etc/security/keytabs/kerberos.service_check.082616.keytab
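With many entries in a keytab, it is easier to boil that output down to the distinct encryption types. A small sketch, assuming the klist output ends each entry line with the encryption type in parentheses (which is what I saw with the -e switch); the function name is mine.

```shell
# Read `klist -kte` output on stdin and print each distinct encryption type.
keytab_enctypes() {
    # Capture the text inside the final pair of parentheses on each line;
    # header lines have none and are skipped.
    sed -n 's/.*(\([^()]*\))[[:space:]]*$/\1/p' | sort -u
}

# Usage:
#   klist -kte /etc/security/keytabs/kerberos.service_check.082616.keytab | keytab_enctypes
```

Comparing that list with the Encryption Types setting in Advanced kerberos-env is a quick way to spot a mismatch.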

I was also able to manually check what Ambari was attempting to do by running kinit:
kinit -c [Kerberos 5 cache name] -kt /etc/security/keytabs/kerberos.service_check.082616.keytab HDP_TEST-082616@DOMAIN 
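If you repeat that check often, it can be wrapped so it uses a throwaway credential cache and cleans up after itself. A rough sketch, assuming the MIT Kerberos client tools (kinit, klist, kdestroy) are on the path; the function name is made up, and the keytab and principal are whatever applies in your cluster.

```shell
# Try to obtain a ticket from a keytab using a temporary credential cache,
# so the current user's existing tickets are left alone.
check_keytab() {
    keytab=$1
    principal=$2
    cache=$(mktemp) || return 1
    if kinit -c "$cache" -kt "$keytab" "$principal"; then
        klist -c "$cache"      # show the ticket that was issued
        status=0
    else
        status=1
    fi
    # Clean up the temporary cache whether or not kinit succeeded.
    kdestroy -q -c "$cache" 2>/dev/null
    rm -f "$cache"
    return "$status"
}

# Usage:
#   check_keytab /etc/security/keytabs/kerberos.service_check.082616.keytab HDP_TEST-082616@DOMAIN
```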

During the initial configuration a csv file is produced by Ambari.  It contains details regarding hosts, principals, and keytabs.  Using this as a guide you can do some additional validation.
  1. Switch to one of the users that is running a kerberized service
  2. Validate the Kerberos ticket via klist (simply run klist without any switches)
Finally I was able to validate via the Ambari front end, under Services > Kerberos > Kerberos Clients.
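The csv file can also drive a scripted pass over the keytabs. The sketch below only lists host, principal, and keytab path per row. It ASSUMES those sit in columns 1, 3, and 6 and that no field contains a comma; check the header row of your own file and adjust the column numbers before relying on it. The function name is hypothetical.

```shell
# Print "host principal keytab" for each data row of the CSV.
# ASSUMPTION: columns 1, 3 and 6 hold host, principal and keytab path,
# and no field contains a comma. Verify against your file's header row.
list_principals() {
    awk -F, 'NR > 1 && NF >= 6 { print $1, $3, $6 }' "$1"
}

# Usage (the CSV path is a placeholder):
#   list_principals kerberos.csv | while read -r host principal keytab; do
#       echo "$host: $principal -> $keytab"
#   done
```

On each host, the listed keytab can then be inspected with klist -kt, as shown earlier.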

You can also see all the users Ambari creates by inspecting AD.  I created a new OU and a new account in AD before I got started in order to keep things organized. The new account was used for the Admin principal configuration.  I ended up with close to 30 new user accounts in my environment.

P.S. I found THIS very helpful for configuring the browser end.