TALEND ETL Examples

Consuming REST APIs with Talend Open Studio 6.2.1 medium.com/@stevenbeeckman/consuming-rest-apis-with-talend-open-studio-6-2-1-147d0de15c35

Using Streamsets for ETL to/from Hadoop

[blog in progress - incomplete] This blog will show some examples of doing ETL to or from Hadoop.

USE CASE #1: Use Sqoop commands inside Streamsets to copy data from an RDBMS to Hadoop
https://www.youtube.com/watch?v=k8VbTR77l8M

REFERENCES:
https://streamsets.com/tutorials/
https://github.com/streamsets/tutorials
http://www.treselle.com/blog/import-and-ingest-data-into-hdfs-using-kafka-in-streamsets/
https://streamsets.com/blog/transform-data-streamsets-data-collector/
https://streamsets.com/blog/blogreplicating-relational-databases-with-streamsets-data-collector/
https://github.com/streamsets/tutorials/blob/master/tutorial-hivedrift/readme.md
http://blog.cloudera.com/blog/2016/02/how-to-build-a-real-time-search-system-using-streamsets-apache-kafka-and-cloudera-search/
https://www.youtube.com/watch?v=Gnvl30OJNao
https://www.youtube.com/watch?v=qAyFvC4c2n4
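For USE CASE #1, it can help to see what a Sqoop import command looks like before wiring it into a pipeline. The sketch below only assembles the argument list for `sqoop import`; it does not run Sqoop, and how the command gets invoked from inside Streamsets is not covered here. The function name `build_sqoop_import` and every connection value are hypothetical placeholders, not taken from the video.

```python
from typing import List


def build_sqoop_import(jdbc_url: str, username: str, password_file: str,
                       table: str, target_dir: str, mappers: int = 1) -> List[str]:
    """Return the argv for a `sqoop import` of one RDBMS table into HDFS."""
    if mappers < 1:
        raise ValueError("mappers must be >= 1")
    return [
        "sqoop", "import",
        "--connect", jdbc_url,             # JDBC URL of the source database
        "--username", username,
        "--password-file", password_file,  # file holding the password
        "--table", table,                  # source table to copy
        "--target-dir", target_dir,        # HDFS directory to write into
        "--num-mappers", str(mappers),     # number of parallel map tasks
    ]


if __name__ == "__main__":
    # Every value here is a made-up placeholder for illustration.
    cmd = build_sqoop_import(
        jdbc_url="jdbc:oracle:thin:@//dbhost:1521/ORCL",
        username="etl_user",
        password_file="/user/etl/.sqoop_password",
        table="EMPLOYEES",
        target_dir="/data/raw/employees",
        mappers=4,
    )
    print(" ".join(cmd))
```

Building the command as a list rather than one string keeps each argument separate, so it can be handed to a process launcher without shell quoting problems.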

Streamsets install Oracle JDBC driver in External Library for CDH

This blog will show how to install the Oracle JDBC driver into the Streamsets External Library in a Cloudera Hadoop system.

Environment: Cloudera CDH 5.12, Streamsets 3.1.2

TASK: Update the Oracle JDBC driver inside Streamsets
https://streamsets.com/documentation/datacollector/latest/help/#datacollector/UserGuide/Configuration/ExternalLibs.html#concept_pdv_qlw_ft

Step 1. Set Up an External Directory

Setting Up for Cloudera Manager
In Cloudera Manager, select the StreamSets service and then …
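Once an external directory exists, the driver jar has to land in the right subdirectory. The sketch below assumes the layout `<external dir>/<stage library name>/lib/` with `streamsets-datacollector-jdbc-lib` as the JDBC stage library name; that layout comes from my reading of the StreamSets external libraries documentation linked above, not from this excerpt, so verify it against your version. The function name `install_driver` and all paths are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def install_driver(jar: Path, extras_dir: Path,
                   stage_lib: str = "streamsets-datacollector-jdbc-lib") -> Path:
    """Copy a JDBC driver jar into <extras_dir>/<stage_lib>/lib/ and return its new path."""
    lib_dir = extras_dir / stage_lib / "lib"   # assumed external library layout
    lib_dir.mkdir(parents=True, exist_ok=True)
    target = lib_dir / jar.name
    shutil.copy2(jar, target)
    return target


if __name__ == "__main__":
    # Demo against a temporary directory so nothing on the real system is touched.
    with tempfile.TemporaryDirectory() as tmp:
        fake_jar = Path(tmp) / "ojdbc-placeholder.jar"   # placeholder file name
        fake_jar.write_bytes(b"not a real jar")
        print(install_driver(fake_jar, Path(tmp) / "sdc-extras"))
```

On a real host the Data Collector user also needs read access to the copied file, and Data Collector has to be restarted before it picks up the driver.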

Streamsets install using Cloudera Manager

Environment: CDH 5.12, STREAMSETS-3.1.2.0.jar

Follow the install instructions in the link below:
https://streamsets.com/documentation/datacollector/latest/help/index.html#datacollector/UserGuide/Installation/CMInstall-Overview.html#concept_nb5_c3m_25

Installation with Cloudera Manager
To install Data Collector through Cloudera Manager, perform the following steps:
1. Install the StreamSets custom service descriptor (CSD).
2. (Optional.) Manually install the parcel and checksum files. Typically only needed when the Cloudera Manager Server does not have internet access.
3. Download, …
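When the parcel and checksum files are installed by hand, it is worth confirming that the parcel was not corrupted in transit before placing it in the parcel repository. The sketch below assumes the checksum file holds a hex SHA-1 digest as its first whitespace-separated token; that format is an assumption, so check what your download actually contains. The function names `sha1_of` and `parcel_matches_checksum` and the file names in the demo are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha1_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha1()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parcel_matches_checksum(parcel: Path, checksum_file: Path) -> bool:
    """Compare the parcel's SHA-1 with the first token of the checksum file."""
    tokens = checksum_file.read_text().split()
    if not tokens:
        return False  # empty checksum file: treat as a mismatch
    return sha1_of(parcel) == tokens[0].lower()


if __name__ == "__main__":
    # Demo with stand-in files; a real run would point at the downloaded parcel.
    with tempfile.TemporaryDirectory() as tmp:
        parcel = Path(tmp) / "example.parcel"
        parcel.write_bytes(b"parcel contents")
        sha = Path(tmp) / "example.parcel.sha"
        sha.write_text(sha1_of(parcel) + "\n")
        print(parcel_matches_checksum(parcel, sha))
```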