StreamSets: install the Oracle JDBC driver in the External Library on CDH

This post shows how to install the Oracle JDBC driver into the StreamSets external library on a Cloudera Hadoop (CDH) system.


Environment: Cloudera CDH 5.12, StreamSets 3.1.2

TASK: Update the Oracle JDBC driver inside Streamsets

Step 1. Set Up an External Directory

 Setting Up for Cloudera Manager
    1. In Cloudera Manager, select the StreamSets service and then click Configuration.
    2. On the Configuration page, in the Data Collector Advanced Configuration Snippet (Safety Valve) for sdc-env.sh field, add the STREAMSETS_LIBRARIES_EXTRA_DIR environment variable and point it to the external directory. For example:
       export STREAMSETS_LIBRARIES_EXTRA_DIR="/opt/sdc-extras/"
    3. Create the /opt/sdc-extras/ directory on every node that runs Data Collector.
    4. If you use the default system user and group named sdc to run Data Collector as a service, use the following command to change the owner of the external directory and all files in the directory to sdc:sdc:
       chown -R sdc:sdc /opt/sdc-extras
    5. When using the Java Security Manager, which is enabled by default, update the Data Collector Advanced Configuration Snippet (Safety Valve) for sdc-security.policy property to include the external directory as follows:

// user-defined external directory
grant codebase "file:///opt/sdc-extras/-" {
  permission java.security.AllPermission;
};

    6. Restart the StreamSets Data Collector service from the Cloudera Manager Status page.
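
The directory creation and ownership steps above can be scripted so that they run the same way on every node. The following is only a minimal sketch: `setup_extras_dir` is a hypothetical helper name (not part of StreamSets or Cloudera Manager), and it assumes you run it as a user allowed to create the directory and change its owner.

```shell
# Sketch: prepare the external library directory on one Data Collector node.
# setup_extras_dir is a hypothetical helper, not a StreamSets command.
setup_extras_dir() {
    local dir="$1"          # external directory, e.g. /opt/sdc-extras
    local owner="${2:-}"    # owner:group, e.g. sdc:sdc; empty skips chown

    # Create the directory (and parents) if it does not exist yet.
    mkdir -p "$dir" || return 1

    # Hand the directory and everything in it to the Data Collector user.
    if [ -n "$owner" ]; then
        chown -R "$owner" "$dir" || return 1
    fi

    # Print owner, group and mode so the result can be checked by eye.
    ls -ld "$dir"
}
```

With the values used in this post, the call on each node would be `setup_extras_dir /opt/sdc-extras sdc:sdc`, run as root.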

Step 2. Install External Libraries

After you’ve set up the external directory, use the Package Manager within Data Collector to install external libraries.

First download the Oracle JDBC driver to your laptop; you will upload it from there in a later step:

Oracle Database 12c Release 2 JDBC Driver & UCP Downloads

ojdbc8.jar (4,036,257 bytes)
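
Before uploading, it can be worth confirming that the download is complete by comparing the file size with the byte count listed above. A minimal sketch, assuming a POSIX shell with `wc` and `tr` available; `check_jar_size` is a hypothetical helper name and the download path in the usage note is an assumption:

```shell
# Sketch: compare a file's size in bytes with an expected value.
# check_jar_size is a hypothetical helper, not an Oracle or StreamSets tool.
check_jar_size() {
    local file="$1"        # path to the downloaded jar
    local expected="$2"    # expected size in bytes, without commas
    local actual

    [ -f "$file" ] || { echo "missing: $file" >&2; return 2; }

    # wc -c counts bytes; tr strips the padding some platforms add.
    actual="$(wc -c < "$file" | tr -d '[:space:]')"

    if [ "$actual" -eq "$expected" ]; then
        echo "OK: $file is $actual bytes"
    else
        echo "MISMATCH: $file is $actual bytes, expected $expected" >&2
        return 1
    fi
}
```

For the driver used here, that would be something like `check_jar_size ~/Downloads/ojdbc8.jar 4036257`. A size match does not prove integrity the way a checksum would, but it catches truncated downloads.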

    1. In Data Collector, in the top right toolbar, click the Package Manager icon.
    2. In the navigation panel, click External Libraries.
       Data Collector lists any currently installed external libraries.
    3. Immediately under the top right toolbar, click the Install External Libraries icon.
    4. In the Install External Libraries dialog box, select the stage library that needs to access the external library.
       For example, if you are installing a JDBC driver for the JDBC Multitable Consumer origin, select the JDBC stage library. If you are installing an external Java library for the Groovy Evaluator processor, select the Groovy stage library.
    5. Select ojdbc8.jar on your laptop and upload it.
    6. After a successful upload, Data Collector shows the message below. Do not click Restart Data Collector, because the service must be restarted from Cloudera Manager:

Successfully installed external library. Restart the Data Collector for the changes to take effect. If you started the Data Collector manually, click Restart Data Collector. If you started the Data Collector as a service, click Cancel and then run the following command: sdc service restart. 

    7. Check that the JDBC driver was stored in the external directory on the Linux host. The file listing should look like this:
-rw-r--r-- 1 sdc sdc 4036257 Apr 12 13:45 ojdbc8.jar
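
One way to run this check without knowing the exact subdirectory is to search the whole external directory for the jar. This is a sketch only: `find_driver` is a hypothetical helper name, and it assumes the external directory is the one set in STREAMSETS_LIBRARIES_EXTRA_DIR earlier.

```shell
# Sketch: locate a driver jar anywhere under the external directory.
# find_driver is a hypothetical helper, not a StreamSets command.
find_driver() {
    local dir="$1"                  # external directory, e.g. /opt/sdc-extras
    local name="${2:-ojdbc8.jar}"   # jar file name to look for
    local matches

    matches="$(find "$dir" -type f -name "$name" 2>/dev/null)"
    if [ -z "$matches" ]; then
        echo "not found: $name under $dir" >&2
        return 1
    fi

    # Show permissions, owner and size for each copy that was found.
    printf '%s\n' "$matches" | while IFS= read -r f; do
        ls -l "$f"
    done
}
```

Running `find_driver /opt/sdc-extras` on a Data Collector node should print a line like the listing above, with the owner and size available for comparison.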

    8. Restart the StreamSets service from the Cloudera Manager Status page.

The Oracle JDBC driver is now available to StreamSets stages such as JDBC Query Consumer.

