Run a Python program to access Hadoop webhdfs with Kerberos enabled

The following Python code makes REST calls to a secure, Kerberos-enabled Hadoop cluster, using the WebHDFS REST API to get file data (here, a directory listing):


  1. First run $ kinit userid@REALM to authenticate and obtain a Kerberos ticket for the user.
  2. Make sure the Python modules requests and requests_kerberos are installed. If they are not, install them, for example:

# pip install requests

# pip install requests-kerberos

3. Put the code below in a file and run it with Python, for example $ python followed by the file name.

# start of python code

import requests
from requests_kerberos import HTTPKerberosAuth, REQUIRED

# Use the Kerberos ticket obtained with kinit and require mutual authentication
kerberos_auth = HTTPKerberosAuth(mutual_authentication=REQUIRED, sanitize_mutual_error_response=False)
# LISTSTATUS returns the status of the entries under /tmp
webhdfs_url = "http://namenode:50070/webhdfs/v1/tmp?op=LISTSTATUS"
headers = {"X-Requested-By": "someuser"}
# verify=False only has an effect on https URLs, where it disables certificate verification
response = requests.get(webhdfs_url, headers=headers, auth=kerberos_auth, verify=False)

print("webhdfs response statuscode=", response.status_code)
print("webhdfs response responsetext=", response.text)

# end of python code
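
As an optional variant, the hard-coded URL in the script can be assembled from its parts, which makes it easier to point the same code at another path or operation. This is only a sketch assuming Python 3; build_webhdfs_url is a hypothetical helper name, not part of requests or Hadoop, and the host, port, and path are the same placeholder values used in the script.

```python
from urllib.parse import quote, urlencode


def build_webhdfs_url(host, port, path, op, **params):
    """Build a URL of the form http://host:port/webhdfs/v1<path>?op=<op>.

    Hypothetical helper for illustration only.
    """
    query = urlencode({"op": op, **params})
    # quote() escapes spaces and other unsafe characters but keeps "/" intact
    return "http://{}:{}/webhdfs/v1{}?{}".format(host, port, quote(path), query)


url = build_webhdfs_url("namenode", 50070, "/tmp", "LISTSTATUS")
print(url)  # http://namenode:50070/webhdfs/v1/tmp?op=LISTSTATUS
```

The resulting string can be passed to requests.get in place of webhdfs_url.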

4. After running the script, you should see output like the following:

webhdfs response statuscode= 200
webhdfs response responsetext= {"FileStatuses":{"FileStatus":[
{"accessTime":0,"blockSize":0,"childrenNum":7,"fileId":26479,"group":"group","length":0,"modificationTime":1532544496983,"owner":"userid","pathSuffix":"staging","permission":"700","replication":0,"storagePolicy":0,"type":"DIRECTORY"},
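
The response text is JSON, so the entries can be pulled out with the json module instead of being read by eye. The following is a minimal sketch: list_entries is a hypothetical helper name, and the sample string is a shortened, hand-made stand-in for a complete response, not real cluster output.

```python
import json


def list_entries(response_text):
    """Return (pathSuffix, type) pairs from a LISTSTATUS response body."""
    payload = json.loads(response_text)
    statuses = payload["FileStatuses"]["FileStatus"]
    return [(status["pathSuffix"], status["type"]) for status in statuses]


# Hand-made sample with the same shape as the output above (assumption: a
# complete response closes the FileStatus list and both enclosing objects)
sample = (
    '{"FileStatuses":{"FileStatus":['
    '{"owner":"userid","pathSuffix":"staging","permission":"700","type":"DIRECTORY"}'
    ']}}'
)

for name, kind in list_entries(sample):
    print(name, kind)  # staging DIRECTORY
```

In the script itself, response.text would be passed in place of sample.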



