StreamSets: renew a JWT token to call an API

Many JWT tokens expire hourly and must be renewed before they can be passed in an API call. StreamSets' automatic renewal of JWT tokens may not work, so here is another way to renew them.
STEPS:
PIPELINE-1: A separate, continuously running pipeline periodically renews the JWT token and stores it in a text file.
PIPELINE-2: The …
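The renew-and-store idea behind PIPELINE-1 can be sketched outside StreamSets in plain Python. This is only an illustration under stated assumptions, not the pipeline configuration from the post: `fetch_token` is a hypothetical callable standing in for whatever request your identity provider needs, the file path is a placeholder, and the five-minute safety margin is an arbitrary choice. The token's signature is not verified here; the sketch only reads the standard `exp` claim to decide whether a renewal is due.

```python
import base64
import json
import os
import tempfile
import time


def jwt_expiry(token):
    """Return the `exp` claim (seconds since the epoch) of a JWT, or None if absent.

    The signature is NOT verified; this only reads the expiry time.
    """
    payload_b64 = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return payload.get("exp")


def needs_renewal(token, now, margin_seconds=300):
    """True if the token has no expiry or expires within the safety margin."""
    exp = jwt_expiry(token)
    return exp is None or now >= exp - margin_seconds


def store_token(path, token):
    """Write the token via a temp file and rename, so a reader never sees a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(token)
    os.replace(tmp_path, path)


def renew_if_needed(path, fetch_token, now=None):
    """Renew the stored token when it is missing or close to expiry.

    `fetch_token` is a hypothetical callable that returns a fresh JWT string.
    Returns True if a new token was written, False otherwise.
    """
    now = time.time() if now is None else now
    current = None
    if os.path.exists(path):
        with open(path) as handle:
            current = handle.read().strip()
    if not current or needs_renewal(current, now):
        store_token(path, fetch_token())
        return True
    return False
```

A scheduler or a simple loop could call `renew_if_needed("/path/to/jwt.txt", fetch_token)` every few minutes, while the consuming side reads the file just before each API call. Writing through a temporary file and renaming it is a design choice so that the reader never picks up a half-written token.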

Using StreamSets for ETL to/from Hadoop

[Blog in progress - incomplete] This blog will show some examples of doing ETL to or from Hadoop.
USE CASE #1: Use Sqoop commands inside StreamSets to copy data from an RDBMS to Hadoop
https://www.youtube.com/watch?v=k8VbTR77l8M
https://streamsets.com/tutorials/
USE CASE #2: Use Kafka with Kerberos and Sentry with StreamSets on Cloudera
The following steps need to be completed …

StreamSets: install the Oracle JDBC driver in the External Library for CDH

This blog shows how to install the Oracle JDBC driver in the StreamSets External Library on a Cloudera Hadoop system.
Environment: Cloudera CDH 5.12, StreamSets 3.1.2
TASK: Update the Oracle JDBC driver inside StreamSets
https://streamsets.com/documentation/datacollector/latest/help/#datacollector/UserGuide/Configuration/ExternalLibs.html#concept_pdv_qlw_ft
Step 1. Set Up an External Directory
Setting Up for Cloudera Manager
In Cloudera Manager, select the StreamSets service and then …