Useful SQL query examples

Example SQL queries that may be helpful. This works in Impala SQL to truncate a unix epoch time to 30-minute intervals: for example, 19:15 and 19:25 will show as 19:00, while 19:31 and 19:50 will show as 19:30.

SELECT from_timestamp(cast((epochtime div 1800000)*1800 as timestamp) + interval (epochtime % 1000) milliseconds, 'yyyy-MM-dd-HH:mm') as timeat30mininterval, …
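As a fuller sketch of the same bucketing expression, assuming epochtime is a BIGINT of milliseconds since the epoch and event_log is a hypothetical table, the interval can be used to group rows into 30-minute buckets:

    SELECT
      -- 1800000 ms = 30 minutes; integer division floors to the bucket start,
      -- and cast(bigint as timestamp) interprets the value as seconds
      from_timestamp(cast((epochtime div 1800000) * 1800 as timestamp),
                     'yyyy-MM-dd-HH:mm') as timeat30mininterval,
      count(*) as row_count
    FROM event_log
    GROUP BY from_timestamp(cast((epochtime div 1800000) * 1800 as timestamp),
                            'yyyy-MM-dd-HH:mm')
    ORDER BY timeat30mininterval;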

StreamSets: renew JWT token to call API

Many JWT tokens expire hourly and must be renewed before being passed in an API call. StreamSets' automatic renewal of JWT tokens may not work, so here is another way to renew them.

STEPS:
PIPELINE-1: A continuously running separate pipeline periodically renews the JWT token and stores it in a text file.
PIPELINE-2: The …
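A minimal sketch of what PIPELINE-1's renewal step might run, assuming a hypothetical https://auth.example.com/token endpoint that returns JSON with an access_token field (the endpoint, credentials, and file path are all placeholders):

    # fetch a fresh JWT and store it where the second pipeline can read it
    curl -s -X POST https://auth.example.com/token \
         -d 'client_id=myclient&client_secret=mysecret' \
      | jq -r '.access_token' > /data/jwt/token.txt

PIPELINE-2 would then read /data/jwt/token.txt and place the token in an Authorization: Bearer header on each API call.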

Connect DBeaver SQL Tool to Cloudera Hive/Impala with Kerberos

DBeaver (https://dbeaver.io/) is a powerful, free, open-source SQL editor that can connect to 80+ different databases. The procedures below enable DBeaver to connect to Cloudera Hive/Impala using Kerberos. I initially tried the Cloudera JDBC connection, but it kept giving a Kerberos error:

[Cloudera]ImpalaJDBCDriver Error initialized or created transport for authentication: [Cloudera]ImpalaJDBCDriver Unable …
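For reference, the Kerberos pieces usually involved are JVM flags in dbeaver.ini and a JDBC URL that names the service principal; a sketch, assuming a valid krb5.conf and ticket cache already exist (host name and realm below are placeholders):

    # appended after the -vmargs line in dbeaver.ini
    -Djavax.security.auth.useSubjectCredsOnly=false
    -Djava.security.krb5.conf=/etc/krb5.conf

    # JDBC URL for the Hive connection
    jdbc:hive2://hive-server-hostname:10000/default;principal=hive/_HOST@XYZREALM.COM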

Use Beeline to query Hive table

Example of how to query with Beeline. Log in with your userid on the Linux server:

    [userxyz]$ beeline
    beeline> !connect jdbc:hive2://hive-server-hostname:10000/default;principal=hive/_HOST@XYZREALM.COM
    Error: Could not open client transport with JDBC Uri: : GSS initiate failed (state=08S01,code=0)

This error occurs because kinit has not been done, so obtain a ticket first:

    $ kinit userxyz
    beeline> !connect jdbc:hive2://hive-server-hostname:10000/default;principal=hive/_HOST@XYZREALM.COM
    Connected to: Apache Hive (version 1.1.0-cdh5.16.1) …
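Once connected, queries run as usual at the beeline> prompt. Beeline can also be used non-interactively with its -u and -e options, as in this sketch (default.sample_table is a hypothetical table):

    $ kinit userxyz
    $ beeline -u "jdbc:hive2://hive-server-hostname:10000/default;principal=hive/_HOST@XYZREALM.COM" \
        -e "SELECT count(*) FROM default.sample_table;"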

Transfer parquet Hive table from one Hadoop cluster to another

Example: how to transfer a Parquet Hive table from one cluster to another. First, create a new table with CTAS to combine the Hive table's multiple parquet files into a single parquet file for ease of transfer between clusters. In the source cluster, create the new table:

    CREATE TABLE default.mynewtable STORED AS PARQUET AS …
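A sketch of how the full sequence might look, assuming the source table is default.mysourcetable and that the HDFS paths and column list below are placeholders. On the source cluster:

    CREATE TABLE default.mynewtable STORED AS PARQUET AS
    SELECT * FROM default.mysourcetable;

Then copy the table's files across with DistCp, run from a node that can reach both NameNodes:

    hadoop distcp \
      hdfs://source-nn:8020/user/hive/warehouse/mynewtable \
      hdfs://target-nn:8020/user/hive/warehouse/mynewtable

Finally, on the target cluster, define a table over the copied files; the column list must match the source schema (col1/col2 are stand-ins):

    CREATE EXTERNAL TABLE default.mynewtable (col1 STRING, col2 INT)
    STORED AS PARQUET
    LOCATION '/user/hive/warehouse/mynewtable';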