TALEND ETL Examples

Consuming REST APIs with Talend Open Studio 6.2.1: medium.com/@stevenbeeckman/consuming-rest-apis-with-talend-open-studio-6-2-1-147d0de15c35


Using Streamsets for ETL to/from Hadoop

[Blog in progress, incomplete] This blog shows some examples of doing ETL to or from Hadoop.

USE CASE #1: Use Sqoop commands inside StreamSets to copy data from an RDBMS to Hadoop
https://www.youtube.com/watch?v=k8VbTR77l8M
https://streamsets.com/tutorials/

USE CASE #2: Use Kafka with Kerberos and Sentry with StreamSets on Cloudera
The following steps need to be completed …

Streamsets install Oracle JDBC driver in External Library for CDH

This blog shows how to install the Oracle JDBC driver in the StreamSets external library on a Cloudera Hadoop system.

Environment: Cloudera CDH 5.12, StreamSets 3.1.2

TASK: Update the Oracle JDBC driver inside StreamSets
https://streamsets.com/documentation/datacollector/latest/help/#datacollector/UserGuide/Configuration/ExternalLibs.html#concept_pdv_qlw_ft

Step 1. Set Up an External Directory
Setting Up for Cloudera Manager
In Cloudera Manager, select the StreamSets service and then …

Streamsets install using Cloudera Manager

Environment: CDH 5.12, STREAMSETS-3.1.2.0.jar

Follow the install instructions in the link below:
https://streamsets.com/documentation/datacollector/latest/help/index.html#datacollector/UserGuide/Installation/CMInstall-Overview.html#concept_nb5_c3m_25

Installation with Cloudera Manager
To install Data Collector through Cloudera Manager, perform the following steps:
1. Install the StreamSets custom service descriptor (CSD).
2. (Optional.) Manually install the parcel and checksum files. Typically only needed when the Cloudera Manager Server does not have internet access.
3. Download, …