
Business Intelligence, ETL and Data Science tools

Open-source and free BI / ETL tools:

Talend = ETL tool, leader in Gartner Magic Quadrant

HUE = Hadoop Analytics server

Jupyter Notebook = Data science / BI notebook tool

Pentaho = desktop and server version

KNIME = Data science tool, leader in Gartner Magic Quadrant 2017, desktop version

Power BI = free desktop version

Oracle SQL Developer = Free SQL tool

 

Commercial BI tools:

Tableau = desktop and server version

MicroStrategy = desktop and server version

Qlik = desktop and server version

Microsoft SSIS/SSAS/SSRS

RapidMiner = Data Science tool

Query Cloudera Hadoop Hive using Oracle SQL Developer.

  1. First, download and install the popular free tool Oracle SQL Developer for Windows from the Oracle website.
  2. Read this Oracle blog post for an overview of connecting Oracle SQL Developer to Hadoop Hive: https://blogs.oracle.com/bigdataconnectors/move-data-between-apache-hadoop-and-oracle-database-with-sql-developer
  3. When configuring the Cloudera Hive JDBC drivers, download the 64-bit JDBC driver for Windows from the following page: https://www.cloudera.com/downloads/connectors/hive/jdbc/2-5-19.html
  4. Under Tools -> Preferences -> Database -> Third Party JDBC Drivers, add the .jar files from the Cloudera_HiveJDBC4 folder of the driver download (hive_jdbc_2.5.19.1053\2.5.19.1053 GA\Cloudera_HiveJDBC4). Note: do not use the Cloudera_HiveJDBC41 .jar files, as SQL Developer won't recognize them and won't enable the Hive connection. Make sure to select and add every .jar file in the folder (10 to 15 files), not just the folder path.
  5. Apache Hive connection setup:
    1. Click New Connection in SQL Developer.
    2. Click the Hive tab. If the ‘Hive’ tab does not appear next to the ‘Oracle’ tab, the Hive JDBC drivers most likely did not load correctly. Add them again under Tools -> Preferences -> Database -> Third Party JDBC Drivers.
    3. Enter a connection name, a Hive or Hue username such as hive, the hostname of the HiveServer2 instance, the port (default is 10000), and the database (for example, default).
  6. Test the connection. If the test succeeds, click Connect; you will then see the Hive tables and be able to run SQL queries, imports, exports, etc. on Hive from SQL Developer.
  7. The above connection was tested using Oracle SQL Developer version 17.3.0.271 on 64-bit Windows and Hive on Cloudera Hadoop CDH 5.12.x.
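
If the connection test in SQL Developer fails, it can help to try the same host, port, and database from a small standalone JDBC program. The following is only a minimal sketch, not part of the tested setup above: the class name `HiveConnectionCheck`, the helper `buildUrl`, and the host `hiveserver2.example.com` are hypothetical placeholders, and it assumes that the Hive JDBC driver .jar files are on the classpath, that the driver accepts the common `jdbc:hive2://host:port/database` URL form, and that the server accepts a plain username with an empty password.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class HiveConnectionCheck {

    // Builds a HiveServer2 JDBC URL from the same fields the Hive tab asks for.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // Placeholder values (assumptions): replace with your own host and user.
        String host = args.length > 0 ? args[0] : "hiveserver2.example.com";
        String user = args.length > 1 ? args[1] : "hive";
        String url = buildUrl(host, 10000, "default");
        System.out.println("Connecting to " + url);

        // Without a Hive JDBC driver on the classpath, getConnection throws
        // an SQLException, which is reported by the catch block below.
        try (Connection conn = DriverManager.getConnection(url, user, "");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
            // Print the first column of each row, which holds the table name.
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        } catch (SQLException e) {
            System.err.println("Connection failed: " + e.getMessage());
        }
    }
}
```

If this program lists the tables but SQL Developer does not, the problem is more likely in the driver registration from step 4 than in the network or the HiveServer2 settings.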