Connect Microsoft Power BI Desktop to Cloudera Impala or Hive with Kerberos

Microsoft Power BI Desktop is free and can connect to a Cloudera Impala or Hive database with Kerberos security enabled. This post shows only the Impala driver, but the same procedure works with the Hive driver.

The basic steps are:

  • Install the MIT Kerberos client for Windows and make sure you have obtained a ticket for the Cloudera CDH cluster.
  • Install Power BI Desktop.
  • Create an ODBC System DSN. I used the client version of the MicroStrategy Impala ODBC driver (created by Simba Technologies).
  • While creating the ODBC DSN, use the following parameters: Host=hostname of an Impala data node, Port=21050, Database=default, Authentication Mechanism=Kerberos, Realm=your Kerberos realm, Host FQDN=_HOST, Service Name=impala, Transport Buffer Size=1000. Leave Delegate Kerberos Credentials, Use Keytab and Delegation UID blank.

  • Test the connection from the DSN setup dialog. The test must succeed before you continue; if it fails, Power BI will not be able to connect either. Make sure you have obtained a fresh Kerberos ticket in the MIT Kerberos client using your user ID and password.
  • Once the ODBC test succeeds, open Power BI Desktop and click Get Data.
  • Select More -> Other -> ODBC and click Connect. A dropdown lists your ODBC DSNs, including the Impala DSN you just created under the name you gave it.
  • Press OK. You should now see the databases and tables in your Cloudera Impala CDH cluster and can build any visualization in Power BI.
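
If the DSN test passes but you want to check the same settings outside Power BI, the parameters from the steps above can be written as an ODBC connection string. The following is only a minimal sketch in Python. The key names (Host, Port, Schema, AuthMech, KrbRealm, KrbFQDN, KrbServiceName) are assumptions based on the usual Simba-style Impala ODBC driver settings, so check them against your driver's documentation. The DSN name "ImpalaKerberos", the host name and the realm are made-up placeholders, and the helper functions are hypothetical. The optional list_tables function assumes the third-party pyodbc package is installed and that you already hold a valid Kerberos ticket.

```python
# Sketch only: key names are assumed from Simba-style Impala ODBC drivers.
# Verify each one against the documentation of the driver you installed.


def impala_kerberos_settings(host, realm, port=21050, database="default"):
    """Return the DSN parameters from the post as key/value pairs."""
    return {
        "Host": host,                # a node running the Impala daemon
        "Port": str(port),           # 21050, as used in the post
        "Schema": database,          # assumed key for the default database
        "AuthMech": "1",             # assumed: 1 selects Kerberos
        "KrbRealm": realm,
        "KrbFQDN": "_HOST",          # as in the post: Host FQDN=_HOST
        "KrbServiceName": "impala",
    }


def to_connection_string(settings, dsn=None, driver=None):
    """Join the settings into a semicolon-separated ODBC connection string."""
    parts = []
    if dsn:
        parts.append(f"DSN={dsn}")
    if driver:
        parts.append(f"Driver={{{driver}}}")  # renders as Driver={name}
    parts.extend(f"{key}={value}" for key, value in settings.items())
    return ";".join(parts)


def list_tables(conn_str):
    """Optional check: connect through ODBC and return the visible table names.

    Needs pyodbc and a valid Kerberos ticket; not called by default.
    """
    import pyodbc  # third-party, imported lazily so the sketch runs without it

    # Impala has no transactions, so the connection is opened with autocommit.
    conn = pyodbc.connect(conn_str, autocommit=True)
    try:
        return [row.table_name for row in conn.cursor().tables()]
    finally:
        conn.close()


if __name__ == "__main__":
    settings = impala_kerberos_settings("impalad01.example.com", "EXAMPLE.COM")
    print(to_connection_string(settings, dsn="ImpalaKerberos"))
```

Calling list_tables("DSN=ImpalaKerberos") with your own DSN name is a quick way to confirm that the ticket and the DSN work before you open Power BI. If it fails there, it will fail in Power BI as well.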
