Install Python pip on RHEL/CentOS 7

When you first try to install pip, you may get an error like the one below:

    [root]# python --version
    Python 2.7.5
    [root]# yum install python-pip
    No package python-pip available.
    Error: Nothing to do

The python-pip package comes from the EPEL repository, so install epel-release first:

    [root]# yum install epel-release
    Installed: epel-release.noarch 0:7-11

Next, install pip:

    [root]# yum install python-pip
    Installed: python2-pip.noarch 0:8.1.2-10.el7

… Continue reading Install Python pip on RHEL/CentOS 7

Errors and debugging

In Jython, if you get a syntax error like the one below, check whether you used a capital letter, such as "If" instead of "if", in your code:

    javax.script.ScriptException: SyntaxError: no viable alternative at input 'xyz' in <script> at line number 116 at column number 15
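
The capitalized-keyword mistake can be reproduced without a Jython runtime. The sketch below is only an illustration: it uses the built-in compile() of standard Python rather than Jython, so the error text differs from the "no viable alternative" message, but the cause is the same. The helper name compiles and the two sample scripts are made up for this example.

```python
def compiles(source: str) -> bool:
    """Return True if `source` parses as Python, False if it raises SyntaxError."""
    try:
        # compile() only parses the code; it does not execute it.
        compile(source, "<script>", "exec")
        return True
    except SyntaxError:
        return False


# Hypothetical sample scripts: identical except for the capital "I".
good_script = "x = 5\nif x > 1:\n    print('big')\n"
bad_script = "x = 5\nIf x > 1:\n    print('big')\n"

print(compiles(good_script))  # the lowercase keyword parses
print(compiles(bad_script))   # "If" is treated as a plain name, so parsing fails
```

Running a check like this over a script before handing it to the script engine can point at the bad line sooner than the engine's own message.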

Useful SQL query examples

Example SQL queries that may be helpful. The following works in Impala SQL to round a Unix epoch time (in milliseconds) down to a 30-minute interval. For example, 19:15 and 19:25 show as 19:00, and 19:31 and 19:50 show as 19:30.

    SELECT from_timestamp(cast((epochtime div 1800000)*1800 as timestamp) + interval (epochtime % 1000) milliseconds, 'yyyy-MM-dd-HH:mm') as timeat30mininterval, … Continue reading Useful SQL query examples
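
To check the bucketing arithmetic outside of Impala, here is a minimal Python sketch of the same idea: integer-divide the epoch milliseconds by 1,800,000 (30 minutes) and multiply back to seconds. The function name floor_to_30min is made up for this example, and the output is formatted in UTC, which is an assumption; your Impala time zone settings may differ.

```python
from datetime import datetime, timezone

HALF_HOUR_MS = 30 * 60 * 1000  # 1,800,000 ms, the divisor used in the SQL


def floor_to_30min(epoch_ms: int) -> str:
    """Round an epoch time in milliseconds down to its 30-minute bucket (UTC)."""
    # Same arithmetic as (epochtime div 1800000) * 1800 in the query.
    bucket_start_s = (epoch_ms // HALF_HOUR_MS) * 1800
    bucket = datetime.fromtimestamp(bucket_start_s, tz=timezone.utc)
    return bucket.strftime("%Y-%m-%d-%H:%M")


# Usage: build sample epoch values from known UTC times.
for hour, minute in [(19, 15), (19, 25), (19, 31), (19, 50)]:
    ms = int(datetime(2021, 1, 1, hour, minute, tzinfo=timezone.utc).timestamp() * 1000)
    print(f"{hour}:{minute:02d} -> {floor_to_30min(ms)}")
```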

Streamsets renew JWT token to call api

Many JWT tokens expire hourly and must be renewed before they can be passed in an API call. StreamSets' automatic renewal of JWT tokens may not work, so here is another way to renew them.

STEPS:
PIPELINE-1: A separate, continuously running pipeline periodically renews the JWT token and stores it in a text file.
PIPELINE-2: The … Continue reading Streamsets renew JWT token to call api
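
The "renew and store in a text file" idea can be sketched outside of StreamSets in a few lines of Python. This is not the pipeline configuration from the post, only an illustration of the pattern. All names here (store_token, renew_once, read_token) are hypothetical, fetch_token stands in for whatever call your token endpoint needs, and the reader side is an assumption about how the stored file would be used.

```python
import os
import tempfile
from typing import Callable


def store_token(token: str, path: str) -> None:
    """Write the token atomically so a reader never sees a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(token)
        os.replace(tmp_path, path)  # atomic swap within the same directory
    except BaseException:
        os.unlink(tmp_path)  # clean up the temporary file on failure
        raise


def renew_once(fetch_token: Callable[[], str], path: str) -> None:
    """Fetch a fresh token with the caller-supplied function and store it."""
    store_token(fetch_token().strip(), path)


def read_token(path: str) -> str:
    """Read the most recently stored token (assumed consumer side)."""
    with open(path) as f:
        return f.read().strip()
```

A scheduler would call renew_once more often than the token lifetime (for hourly tokens, every 45 minutes would be one possible choice), and the consumer would call read_token just before each API request.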

Connect DBeaver SQL Tool to Cloudera Hive/Impala with Kerberos

DBeaver (https://dbeaver.io/) is a powerful, free, open-source SQL editor that can connect to 80+ different databases. The procedure below enables DBeaver to connect to Cloudera Hive/Impala using Kerberos. I initially tried the Cloudera JDBC connection, but it kept giving a Kerberos error: [Cloudera]ImpalaJDBCDriver Error initialized or created transport for authentication: [Cloudera]ImpalaJDBCDriver Unable … Continue reading Connect DBeaver SQL Tool to Cloudera Hive/Impala with Kerberos

Use Beeline to query Hive table

An example of how to query with Beeline. Log in with your user ID on the Linux server:

    [userxyz]$ beeline
    beeline> !connect jdbc:hive2://hive-server-hostname:10000/default;principal=hive/_HOST@XYZREALM.COM
    Error: Could not open client transport with JDBC Uri: : GSS initiate failed (state=08S01,code=0)

This error occurs because kinit has not been run, so there is no Kerberos ticket. Run kinit first and then reconnect:

    $ kinit userxyz
    beeline> !connect jdbc:hive2://hive-server-hostname:10000/default;principal=hive/_HOST@XYZREALM.COM
    Connected to: Apache Hive (version 1.1.0-cdh5.16.1)

… Continue reading Use Beeline to query Hive table
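
If you script this, it can help to check for a Kerberos ticket before starting Beeline. Below is a minimal Python sketch under a few assumptions: the MIT Kerberos klist command is on the PATH (its -s option runs silently and reports ticket status through the exit code), and the function names build_hive_jdbc_url and has_kerberos_ticket are made up for this example.

```python
import subprocess


def build_hive_jdbc_url(host: str, realm: str, port: int = 10000,
                        database: str = "default") -> str:
    """Build a HiveServer2 JDBC URL with a Kerberos principal, as used with !connect."""
    return f"jdbc:hive2://{host}:{port}/{database};principal=hive/_HOST@{realm}"


def has_kerberos_ticket() -> bool:
    """Return True if `klist -s` reports a readable, unexpired credentials cache."""
    try:
        return subprocess.run(["klist", "-s"]).returncode == 0
    except OSError:
        # klist is not installed or cannot be executed
        return False


url = build_hive_jdbc_url("hive-server-hostname", "XYZREALM.COM")
if has_kerberos_ticket():
    print("Ticket found, connect with:", url)
else:
    print("No valid ticket, run kinit first")
```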

Transfer parquet Hive table from one Hadoop cluster to another

EXAMPLE: HOW TO TRANSFER PARQUET HIVE TABLE FROM ONE CLUSTER TO ANOTHER CLUSTER First, use CTAS (CREATE TABLE AS SELECT) to create a new table that combines the Hive table's multiple Parquet files into a single Parquet file, which makes the transfer from one cluster to another easier. On the source cluster, create the new table: CREATE TABLE default.mynewtable stored as PARQUET AS … Continue reading Transfer parquet Hive table from one Hadoop cluster to another

Connect Excel to Cloudera Hive/Impala

The procedure below will help you connect Microsoft Excel to Cloudera Impala or Hive using the ODBC driver. First, download and install the MIT Kerberos client for Windows from "Kerberos for Windows Release 4.1 - current release". Make sure you get the Kerberos user ID and password from the Cloudera administrator and that you are able to log in and get a … Continue reading Connect Excel to Cloudera Hive/Impala

Top Highest Interest US Savings Accounts

A good way to find the US savings accounts that pay the highest interest is to go to https://www.bankrate.com/banking/savings/rates/ and sort by "All Products APY". However, some of these banks sometimes use a bait-and-switch technique: they raise interest rates to attract customers' deposits and a few months later drop the rates, knowing that customers … Continue reading Top Highest Interest US Savings Accounts

Tableau Server connect to Cloudera Hive with MIT Kerberos

We know Tableau Desktop works with MIT Kerberos on Windows to connect to Cloudera Hive/Impala. But the Tableau support sites are confusing about whether Tableau Server can work with MIT Kerberos in a Windows environment. One note says that Kerberos delegation requires Active Directory and that MIT Kerberos is not supported. But let … Continue reading Tableau Server connect to Cloudera Hive with MIT Kerberos