Connect Excel to Cloudera Hive/Impala

Below procedure will help you connect Microsoft Excel to Cloudera Impala or Hive using ODBC driver.

  • First download and install the MIT Kerberos Client for windows from Kerberos for Windows Release 4.1 – current release
  • Make sure you get the Kerberos userid/password from the Cloudera Administrator and your are able to login and get a ticket using the MIT Client on Windows. If the MIT Client doesnt connect you wont be able to connect using Excel.
  • This link may be helpful for MAC users: http://computing.help.inf.ed.ac.uk/kerberos-mac-os-x
  • Check whether you have 32bit or 64bit Excel from the File->Account->About Excel menu.
  • Download and install the corresponding Cloudera 32bit or 64bit ODBC driver for Windows. Try Impala driver first before Hive driver as it is faster. Download from the website https://www.cloudera.com/downloads.html

Database Drivers 

Hive ODBC Driver Downloads
Hive JDBC Driver Downloads
Impala ODBC Driver Downloads
Impala JDBC Driver Downloads

  • Extract and Install the ODBC driver on Windows 10.
  • In Windows search for the ODBC Data Sources (32bit or 64bit) program and run it.
  • Add a new ODBC connection for Cloudera ODBC Driver for Impala or Hive.
  • If using Impala then the following parameters are needed for the ODBC connection:
  • Data Source Name=give any name you like
  • Host: your Impala hostname
  • Port=21050
  • Database=default
  • Mechanism=Kerberos (if using kerberos hadoop cluster)
  • Realm=Kerberos Realm
  • Host FQDN=the hostname for the impala service account e.g xyz.com (note that the driver will try to connect to the Impala host using the credential impala/xyz.com@XYZREALM.COM
  • Service Name=impala
  • Click yes on the Canonicalize Principal FQDN
  • Transport mode=SASL
  • Test the connection if successful and click ok to save.
  • Open Excel and click on Data->Get Data->From Other Sources->From ODBC
  • Select the Impala ODBC DSN you created from the drop down.
  • It will connect to Cloudera Impala and show the databases and tables to select.
  • Now you can select any Hive table and easily see the data in Excel.

In addition you can even open Microsoft Access and use the same Impala ODBC DSN show the data in Access and run SQL queries and other data manipulation in Access.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.