Run a Java program in Hadoop with Kerberos enabled.

The following steps are needed to run a Java program in Hadoop with Kerberos security enabled:


1. Create a text file named FileCount.java and save it in your home directory, e.g. /home/userid.

2. Copy and paste the code below into FileCount.java. Change the Hadoop host name from quickstart.cloudera:8020 to your cluster's NameNode host.

import java.io.*;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;

class FileCount {
    public static void main(String[] args) throws IOException, FileNotFoundException {
        // Default to the HDFS root; an optional first argument overrides the path.
        String path = "/";
        if (args.length > 0)
            path = args[0];

        // Configuration() picks up core-site.xml/hdfs-site.xml from the classpath.
        FileSystem fs = FileSystem.get(new Configuration());

        // List the directory and print how many entries it contains.
        FileStatus[] status = fs.listStatus(new Path("hdfs://quickstart.cloudera:8020" + path));
        System.out.println("File Count: " + status.length);
    }
}
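
As a side note, because /etc/hadoop/conf is included on the classpath in step 5, the NameNode URI does not strictly have to be hardcoded. The hypothetical variant below (a sketch only, not part of the original steps) relies on fs.defaultFS from core-site.xml instead:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class FileCountDefaultFs {
    public static void main(String[] args) throws IOException {
        String path = args.length > 0 ? args[0] : "/";

        // Configuration() loads core-site.xml/hdfs-site.xml from the classpath,
        // so FileSystem.get() already points at the cluster's default filesystem.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // A Path without a scheme is resolved against fs.defaultFS,
        // so no hdfs:// prefix is needed here.
        FileStatus[] status = fs.listStatus(new Path(path));
        System.out.println("File Count: " + status.length);
    }
}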

 

3. Next, compile your Java code with javac, using the classpath below. Make sure JDK 1.8 is already installed (check with java -version):

$ javac -classpath /opt/cloudera/parcels/CDH/jars/hadoop-common-2.6.0-cdh5.15.0.jar FileCount.java

4. Run the following command to list all the classpath entries the program will need at run time:

$ hadoop classpath

/etc/hadoop/conf:/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop/libexec/../../hadoop/lib/*:/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop/libexec/../../hadoop/.//*:/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop/libexec/../../hadoop-hdfs/./:/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop/libexec/../../hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop/libexec/../../hadoop-hdfs/.//*:/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop/libexec/../../hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop/libexec/../../hadoop-yarn/.//*:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/lib/*:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/.//*

5. Run your Java program as shown below. The classpath includes the hadoop classpath output from step 4, a few extra jars, and the current directory (where FileCount.class lives):

$ java -classpath “/opt/cloudera/parcels/CDH/jars/hadoop-common-2.6.0-cdh5.15.0.jar:/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop/lib/commons-logging-1.1.3.jar:/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop-hdfs/lib/guava-11.0.2.jar:/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop/lib/commons-collections-3.2.2.jar:/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop/lib/slf4j-api-1.7.5.jar:/usr/share/cmf/common_jars/:/etc/hadoop/conf:/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop/libexec/../../hadoop/lib/*:/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop/libexec/../../hadoop/.//*:/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop/libexec/../../hadoop-hdfs/./:/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop/libexec/../../hadoop-hdfs/lib/*:/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop/libexec/../../hadoop-hdfs/.//*:/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop/libexec/../../hadoop-yarn/lib/*:/opt/cloudera/parcels/CDH-5.15.0-1.cdh5.15.0.p0.21/lib/hadoop/libexec/../../hadoop-yarn/.//*:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/lib/*:/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/.//*:.” FileCount /user

 

If you do not yet have a valid Kerberos ticket, you may get a Kerberos error like the one below:

18/07/24 18:08:11 WARN security.UserGroupInformation: PriviledgedActionException as:userid (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
18/07/24 18:08:11 WARN ipc.Client: Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
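
This error means the client has no Kerberos TGT. To confirm that from code, a small hypothetical diagnostic like the one below (not part of the original steps) prints the Hadoop login user and whether it currently holds Kerberos credentials:

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.security.UserGroupInformation;

class KerberosCheck {
    public static void main(String[] args) throws IOException {
        // Load core-site.xml so Hadoop knows Kerberos authentication is enabled.
        UserGroupInformation.setConfiguration(new Configuration());

        // Inspect the current login user and its credentials.
        UserGroupInformation ugi = UserGroupInformation.getLoginUser();
        System.out.println("User: " + ugi.getUserName());
        System.out.println("Auth method: " + ugi.getAuthenticationMethod());
        System.out.println("Has Kerberos credentials: " + ugi.hasKerberosCredentials());
    }
}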

6. Run kinit to obtain a Kerberos ticket-granting ticket (TGT):

$ kinit

Password for userid@EXAMPLE.COM:  <enter your password>
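
kinit works well for interactive use, but a long-running or scheduled program can instead log in from a keytab inside the code. Here is a hypothetical sketch (the principal and keytab path are placeholders, not values from this setup):

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

class FileCountKeytab {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();

        // Tell the Hadoop security layer that Kerberos is in use, then obtain a
        // TGT from a keytab instead of prompting for a password (placeholder values).
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab("userid@EXAMPLE.COM",
                "/home/userid/userid.keytab");

        // Same listing as FileCount, now authenticated via the keytab.
        FileSystem fs = FileSystem.get(conf);
        FileStatus[] status = fs.listStatus(new Path("hdfs://quickstart.cloudera:8020/user"));
        System.out.println("File Count: " + status.length);
    }
}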

 

7. Run the same java command as in step 5. This time it succeeds:

18/07/24 18:10:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform… using builtin-java classes where applicable
File Count: 9
$ hdfs dfs -ls /user
Found 9 items

The count matches the output of hdfs dfs -ls, which shows that the Java program is working correctly on Hadoop with Kerberos enabled.

 

REFERENCE:

http://dewoods.com/blog/hadoop-kerberos-guide

 

 
