The following steps describe how to configure Kerberos:
Kerberos is a computer network authentication protocol which works on the basis of "tickets" to allow nodes communicating over a non-secure network to prove their identity to one another in a secure manner. A Kerberos principal is a unique identity to which Kerberos can assign tickets. For Hadoop, the principals should be of the format username/fully.qualified.domain.name@REALM-NAME.COM. The term username in the username/fully.qualified.domain.name@REALM-NAME.COM principal refers to the username of an existing Unix account, such as hdfs or mapred.
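As a rough illustration of the principal format described above, the following Python sketch splits a principal string into its username, host, and realm parts. The function name `parse_principal` and the sample host name are hypothetical and exist only for this example.

```python
import re

# Pattern for principals of the form username/fully.qualified.domain.name@REALM
PRINCIPAL_RE = re.compile(r"^(?P<user>[^/@]+)/(?P<host>[^/@]+)@(?P<realm>[^/@]+)$")


def parse_principal(principal):
    """Split a three-part principal into (username, host, realm).

    Raises ValueError if the string does not follow the expected format.
    """
    match = PRINCIPAL_RE.match(principal)
    if match is None:
        raise ValueError("not a username/host@REALM principal: %r" % principal)
    return match.group("user"), match.group("host"), match.group("realm")


if __name__ == "__main__":
    # Hypothetical host name, used only as an example.
    print(parse_principal("hdfs/node1.example.com@REALM-NAME.COM"))
```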
Download the Kerberos source distribution (krb5-1.10.tar) before you begin.
NOTE: You must have administrative privileges to set up Kerberos.
Start by unpacking the Kerberos source distribution (krb5-1.10.tar) into a directory. For example, unpack it to the directory '/app/krb5-1.10'.
To build Kerberos, run the following commands:
- cd /app/krb5-1.10/src
- ./configure
- make
The next step is to install the binaries. This can be done by executing the following command:
- # make install
To install the binaries under a specific destination directory, use the following command:
- # make install DESTDIR=/path/to/destdir
The Kerberos distribution provides built-in regression tests. To test the build, use the following command:
- # make check
The job of the KDCs is to issue Kerberos tickets. The master KDC holds the master copy of the Kerberos database, which it propagates to the slave KDCs at regular intervals, so each KDC has a copy of the database. All database changes are made on the master KDC; the slave KDCs provide Kerberos ticket-granting services only.
Modify the configuration files /etc/krb5.conf and /usr/local/var/krb5kdc/kdc.conf to reflect the correct information, such as the hostnames and the realm name. Most of the tags in the configuration files have default values, but the values of some tags in the krb5.conf file must be specified.
The krb5.conf file contains Kerberos configuration information, including the locations of the KDCs and admin servers for the Kerberos realms of interest, defaults for the current realm and for Kerberos applications, and mappings of hostnames onto Kerberos realms. By default, the krb5.conf file is installed in /etc. The environment variable 'KRB5_CONFIG' can be used to change this location.
Replace the contents of the krb5.conf file with the following:
- # vi /etc/krb5.conf
- [logging]
default = FILE:/var/log/krb5lib.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmin.log
[libdefaults]
default_realm = queryiorealm
dns_lookup_realm = false
dns_lookup_kdc = false
ticket_lifetime = 24h
renew_lifetime = 7d
forwardable = true
kdc_timesync = 0
[realms]
queryiorealm = {
kdc = 192.168.0.1
admin_server = 192.168.0.1
}
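Since the realm name has to agree between [libdefaults] and [realms], a quick consistency check can catch typos before the KDC is started. The following Python sketch parses only the small subset of krb5.conf syntax used in the file above; it is not a full krb5.conf parser, and the function names `parse_krb5_conf` and `check_default_realm` are hypothetical.

```python
def parse_krb5_conf(text):
    """Parse the subset of krb5.conf syntax shown above into nested dicts.

    Repeated keys (for example several kdc lines) overwrite each other here.
    """
    config = {}
    section = None  # dict for the current [section]
    block = None    # dict for the current "name = { ... }" block, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("[") and line.endswith("]"):
            section = config.setdefault(line[1:-1], {})
            block = None
        elif line == "}":
            block = None
        elif "=" in line and section is not None:
            key, value = (part.strip() for part in line.split("=", 1))
            if value == "{":
                block = section.setdefault(key, {})
            elif block is not None:
                block[key] = value
            else:
                section[key] = value
    return config


def check_default_realm(config):
    """Return a list of problems found with the default realm settings."""
    realm = config.get("libdefaults", {}).get("default_realm")
    if realm is None:
        return ["default_realm is not set in [libdefaults]"]
    realm_conf = config.get("realms", {}).get(realm)
    if not isinstance(realm_conf, dict):
        return ["no [realms] entry for %s" % realm]
    return ["%s is missing for realm %s" % (key, realm)
            for key in ("kdc", "admin_server") if key not in realm_conf]
```

An empty list from `check_default_realm` means the default realm has a matching [realms] entry with both a kdc and an admin_server value.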
The kdc.conf file contains KDC configuration information, including the defaults used when issuing Kerberos tickets. By default, the kdc.conf file is installed in /usr/local/var/krb5kdc. This location can be changed by setting the environment variable 'KRB5_KDC_PROFILE'. A sample kdc.conf file is also available at "/usr/local/share/examples/krb5/kdc.conf".
Replace the contents of the kdc.conf file with the following:
- [kdcdefaults]
kdc_ports = 88
[realms]
queryiorealm = {
kadmind_port = 749
max_life = 12h 0m 0s
max_renewable_life = 7d 0h 0m 0s
master_key_type = des3-hmac-sha1
supported_enctypes = des3-hmac-sha1:normal des-cbc-crc:normal des-cbc-crc:v4
}
[logging]
kdc = FILE:/usr/local/var/krb5kdc/kdc.log
admin_server = FILE:/usr/local/var/krb5kdc/kadmin.log
The Kerberos database and the optional stash file are created with the "kdb5_util" command on the master KDC. The stash file is a local copy of the master key that resides in encrypted form on the KDC's local disk. It is used to authenticate the KDC to itself automatically before the kadmind and krb5kdc daemons start. If you choose to install a stash file, restrict access to it to root only. You can also choose not to install a stash file. kdb5_util prompts you for the master key of the Kerberos database; this key can be any string.
The following example shows how to create a Kerberos database and stash file on the KDC using the kdb5_util command.
- # /usr/local/sbin/kdb5_util create -r queryiorealm -s
Initializing database '/usr/local/var/krb5kdc/principal' for realm 'queryiorealm',
master key name 'K/M@queryiorealm'
You will be prompted for the database Master Password. It is important that you NOT FORGET this password.
Enter KDC database master key: // Type the master password.
Re-enter KDC database master key to verify: //Type it again.
#
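This creates five files in the directory specified in your kdc.conf file (the default directory is /usr/local/var/krb5kdc): two Kerberos database files, the Kerberos administrative database file, the administrative database lock file, and the stash file. If you do not want a stash file, run the above command without the -s option.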
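Because access to the stash file should be restricted to root, it can be worth verifying the file's ownership and mode after it has been created. The following Python sketch shows one way to do that on a POSIX system; the function names are hypothetical, and the path of the stash file is left to the caller because it depends on your configuration.

```python
import os
import stat


def is_owner_only(path):
    """Return True if group and others have no permissions on the file."""
    mode = os.stat(path).st_mode
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def is_owned_by_root(path):
    """Return True if the file's owner is the root user (uid 0)."""
    return os.stat(path).st_uid == 0
```

A stash file that passes both checks is readable by root only.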
Next, create an Access Control List (ACL) file and put the Kerberos principal of at least one administrator into it. The kadmind daemon uses the ACL file to control which principals can view and make privileged modifications to the Kerberos database files. The file name should match the value set for "acl_file" in the kdc.conf file; the default file name is '/usr/local/var/krb5kdc/kadm5.acl'.
Each entry in the ACL file names a Kerberos principal followed by a string of permission characters. The permission characters are:
Permission | Description |
---|---|
a | allows the addition of principals or policies in the database. |
A | disallows the addition of principals or policies in the database. |
d | allows the deletion of principals or policies in the database. |
D | disallows the deletion of principals or policies in the database. |
m | allows the modification of principals or policies in the database. |
M | disallows the modification of principals or policies in the database. |
c | allows the changing of passwords for principals in the database. |
C | disallows the changing of passwords for principals in the database. |
i | allows inquiries to the database. |
I | disallows inquiries to the database. |
s | allows the explicit setting of the key for a principal. |
S | disallows the explicit setting of the key for a principal. |
l | allows the listing of principals or policies in the database. |
L | disallows the listing of principals or policies in the database. |
* | All privileges. |
x | All privileges, identical to "*". |
The following is an example of a kadm5.acl file. Note that order is important: permissions are determined by the first matching entry.
- */admin@queryiorealm *
*/root@queryiorealm *@queryiorealm cil *1/admin@queryiorealm
*/*@queryiorealm i
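The first-match rule can be illustrated with a short Python sketch. It uses shell-style wildcard matching from Python's fnmatch module as a stand-in for kadmind's own matching rules, which it does not reproduce exactly, and it ignores target principals. The entries in `ACL` and the function name `permissions_for` are hypothetical.

```python
from fnmatch import fnmatchcase


def permissions_for(principal, acl_entries):
    """Return the permission string of the first entry matching principal.

    acl_entries is an ordered list of (pattern, permissions) pairs.
    Returns None when no entry matches.
    """
    for pattern, permissions in acl_entries:
        if fnmatchcase(principal, pattern):
            return permissions
    return None


# Hypothetical entries, listed from most specific to least specific.
ACL = [
    ("*/admin@queryiorealm", "*"),
    ("*/*@queryiorealm", "i"),
]
```

With this ordering, an admin principal is matched by the first entry and never reaches the more restrictive second one; reversing the two entries would leave it with inquiry rights only.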
Add at least one administrative principal to the Kerberos database, using kadmin.local on the master KDC. The administrative principal you create should be one that is listed in the ACL file.
For example:
- # /usr/local/sbin/kadmin.local
kadmin.local: addprinc admin/admin@queryiorealm
NOTICE: no policy specified for "admin/admin@queryiorealm"; assigning "default".
Enter password for principal admin/admin@queryiorealm: // Enter a password.
Re-enter password for principal admin/admin@queryiorealm: //Type it again.
Principal "admin/admin@queryiorealm" created.
kadmin.local:
A keytab is a file containing pairs of Kerberos principals and encrypted copies of their keys. Keytab files are unique to each host because their keys include the hostname. A keytab is used to authenticate a principal on a host to Kerberos without human interaction and without storing a password in a plain-text file. The kadmind keytab holds the key that the legacy administration daemons kadmind4 and v5passwdd use to decrypt administrators' or clients' Kerberos tickets and determine whether they should have access to the database.
You need to create the kadmin keytab with entries for the principals kadmin/admin and kadmin/changepw. (These principals are added to the Kerberos database automatically.) To create the kadmin keytab, run kadmin.local and use the ktadd command as follows:
- # /usr/local/sbin/kadmin.local
kadmin.local: ktadd -k /usr/local/var/krb5kdc/kadm5.keytab kadmin/admin kadmin/changepw
Entry for principal kadmin/admin with kvno 5, encryption type Triple DES cbc mode with HMAC/sha1 added to keytab WRFILE:/usr/local/var/krb5kdc/kadm5.keytab
Entry for principal kadmin/admin with kvno 5, encryption type DES cbc mode with CRC-32 added to keytab WRFILE:/usr/local/var/krb5kdc/kadm5.keytab
Entry for principal kadmin/changepw with kvno 5, encryption type Triple DES cbc mode with HMAC/sha1 added to keytab WRFILE:/usr/local/var/krb5kdc/kadm5.keytab
Entry for principal kadmin/changepw with kvno 5, encryption type DES cbc mode with CRC-32 added to keytab WRFILE:/usr/local/var/krb5kdc/kadm5.keytab
kadmin.local: quit
#
- With the -k argument, ktadd saves the extracted keytab as /usr/local/var/krb5kdc/kadm5.keytab.
To start the Kerberos daemons on the master KDC, use the following commands:
- # /usr/local/sbin/krb5kdc
# /usr/local/sbin/kadmind
Each daemon forks and runs in the background. If you want the daemons to start automatically at boot time, add them to the KDC's /etc/rc or /etc/inittab file. (A stash file is required for this.)
QueryIO adds Kerberos principals automatically: every QueryIO user is added as a principal in Kerberos.
Each principal is created with the same credentials that the user has in QueryIO, that is, the principal's username and password are the same as the QueryIO username and password.
Before enabling security and changing the configuration properties, stop all nodes in the cluster. All nodes must be stopped because a node restarted with security enabled cannot communicate with a node that is still running without security. This can be done through the QueryIO UI. All NameNodes and DataNodes must be stopped manually. To stop a node, select the node and click Stop.
All configuration files throughout the cluster must have the same content. To enable Hadoop security, append the following properties to the core-site.xml file for all QueryIO components on every host.
You can find core-site.xml on every registered host machine: $HOST_INSTALL_PATH/QueryIOPackage/hadoop-2.0.2-alpha/etc/$NODE_TYPE$-conf_$NODE_ID$/
($NODE_TYPE$ can be NameNode, DataNode, ResourceManager, NodeManager and $NODE_ID$ is the respective id of every node.)
<property> <name>hadoop.security.authentication</name> <value>kerberos</value> <!-- Giving value as "simple" disables security.--> </property>
<property> <name>hadoop.security.authorization</name> <value>true</value> </property>
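Because the same two properties have to be set in every core-site.xml, the edit can be scripted. The following Python sketch uses the standard library's ElementTree parser to set both properties in a configuration document passed in as text. It assumes the usual Hadoop layout of a <configuration> root with <property> children; the function names are hypothetical, and XML comments in the input are not preserved by this parser.

```python
import xml.etree.ElementTree as ET


def set_property(root, name, value):
    """Set a Hadoop-style <property> under root, adding it if it is absent."""
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
            return
    prop = ET.SubElement(root, "property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value


def enable_security(xml_text):
    """Return xml_text with the two security properties set."""
    root = ET.fromstring(xml_text)
    set_property(root, "hadoop.security.authentication", "kerberos")
    set_property(root, "hadoop.security.authorization", "true")
    return ET.tostring(root, encoding="unicode")
```

Updating an existing property instead of blindly appending a second copy keeps the file free of duplicate entries when the script is run more than once.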
You can find hdfs-site.xml on every registered host machine: $HOST_INSTALL_PATH/QueryIOPackage/hadoop-2.0.2-alpha/etc/$NODE_TYPE$-conf_$NODE_ID$/
($NODE_TYPE$ can be NameNode, DataNode, ResourceManager, NodeManager and $NODE_ID$ is the respective id of every node.)
Append the following properties to the hdfs-site.xml file:
<!-- General HDFS security config -->
<property> <name>dfs.block.access.token.enable</name> <value>true</value> </property>
<!-- NameNode security config -->
<property> <name>dfs.https.address</name> <value><fully qualified domain name of NN>:50470</value> </property>
<property> <name>dfs.https.port</name> <value>50470</value> </property>
<property> <name>dfs.namenode.keytab.file</name> <value>/etc/hadoop/conf/hdfs.keytab</value> <!-- path to the HDFS keytab --> </property>
<property> <name>dfs.namenode.kerberos.principal</name> <value>admin/_HOST@queryiorealm</value> </property>
<property> <name>dfs.namenode.kerberos.https.principal</name> <value>admin/_HOST@queryiorealm</value> </property>
<!-- Secondary NameNode security config -->
<property> <name>dfs.secondary.https.address</name> <value><fully qualified domain name of Standby NN>:50495</value> </property>
<property> <name>dfs.secondary.https.port</name> <value>50495</value> </property>
<property> <name>dfs.secondary.namenode.keytab.file</name> <value>/etc/hadoop/conf/hdfs.keytab</value> <!-- path to the HDFS keytab --> </property>
<property> <name>dfs.secondary.namenode.kerberos.principal</name> <value>admin/_HOST@queryiorealm</value> </property>
<property> <name>dfs.secondary.namenode.kerberos.https.principal</name> <value>admin/_HOST@queryiorealm</value> </property>
<!-- DataNode security config -->
<property> <name>dfs.datanode.data.dir.perm</name> <value>700</value> </property>
<property> <name>dfs.datanode.address</name> <value>0.0.0.0:1004</value> </property>
<property> <name>dfs.datanode.http.address</name> <value>0.0.0.0:1006</value> </property>
<property> <name>dfs.datanode.keytab.file</name> <value>/etc/hadoop/conf/hdfs.keytab</value> <!-- path to the HDFS keytab --> </property>
<property> <name>dfs.datanode.kerberos.principal</name> <value>admin/_HOST@queryiorealm</value> </property>
<property> <name>dfs.datanode.kerberos.https.principal</name> <value>admin/_HOST@queryiorealm</value> </property>
<property> <name>dfs.web.authentication.kerberos.principal</name> <value>admin/_HOST@queryiorealm</value> </property>
These properties have to be updated manually in the respective files.
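The principal values above use the _HOST placeholder, which Hadoop replaces with the fully qualified domain name of the local host when a daemon starts, so the same file content can be used on every host. The following Python sketch only mimics that substitution to show the resulting principal; it is not Hadoop's implementation, and the function name `expand_host` and the sample host name are hypothetical.

```python
import socket


def expand_host(principal, hostname=None):
    """Replace the _HOST placeholder with a lower-cased host name.

    When hostname is omitted, the local fully qualified domain name is used.
    """
    if hostname is None:
        hostname = socket.getfqdn()
    return principal.replace("/_HOST@", "/" + hostname.lower() + "@")


if __name__ == "__main__":
    # Hypothetical host name, used only as an example.
    print(expand_host("admin/_HOST@queryiorealm", "nn1.example.com"))
```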
Append the following options to the queryio.<cluster_component>.options property of every cluster component.
-Djava.security.krb5.realm=queryiorealm -Dsun.security.krb5.debug=true -Djava.security.krb5.kdc=192.168.0.1
Select the component and click Configure. For example, in the case of a DataNode, select the DataNode and click Configure. Then append the options to the "queryio.datanode.options" property and click Save. Repeat the process for all components.
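When the same options are appended for many components, it helps to avoid adding an option that is already present. The following Python sketch builds the new value of an options property from its existing value. The function name `append_options` and the existing value "-Xmx512m" in the usage line are hypothetical; the three Kerberos options are the ones listed above.

```python
KERBEROS_OPTIONS = [
    "-Djava.security.krb5.realm=queryiorealm",
    "-Dsun.security.krb5.debug=true",
    "-Djava.security.krb5.kdc=192.168.0.1",
]


def append_options(existing, new_options):
    """Append each option to the existing option string unless already present."""
    tokens = existing.split()
    for option in new_options:
        if option not in tokens:
            tokens.append(option)
    return " ".join(tokens)


if __name__ == "__main__":
    # "-Xmx512m" stands in for whatever the property already contains.
    print(append_options("-Xmx512m", KERBEROS_OPTIONS))
```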
Once Kerberos has been configured successfully, QueryIO can be integrated with Kerberos by setting the property useKerberos to true in the queryio.properties file, which is stored at "tomcat/webapps/queryio/conf".
You can now start the cluster through the QueryIO UI. Start all the NameNodes, DataNodes, ResourceManagers and NodeManagers. To start a NameNode, select the NameNode and click Start. To start a DataNode, select the DataNode and click Start, and so on.
If all the nodes in the cluster start successfully, your QueryIO cluster has been configured with Kerberos.