Install JupyterHub

JupyterHub prerequisites: Before installing JupyterHub, you will need a Linux/Unix-based system with over 10GB of free space, and Python 3.4 or greater. An understanding of using pip or conda for installing Python packages is helpful.
Installation using conda: Check whether the Anaconda package is already installed:
$ dpkg -l | grep conda
$ rpm -ql conda        # if using RHEL/CentOS
If Anaconda …
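
The prerequisites listed above (Python 3.4 or greater, more than 10GB of free space, and possibly an existing conda install) can be checked before starting. Below is a minimal sketch in Python using only the standard library. The function names, and the choice to measure free space on the root filesystem, are illustrative assumptions and not part of the original post.

```python
import shutil
import sys

# Thresholds taken from the prerequisites in the post.
MIN_PYTHON = (3, 4)
MIN_FREE_GB = 10


def python_ok(version=None):
    """Return True if the (major, minor) version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def free_gb(path="/"):
    """Return free space in GiB on the filesystem containing `path`."""
    return shutil.disk_usage(path).free / 1024 ** 3


def conda_path():
    """Return the full path of a `conda` executable on PATH, or None."""
    return shutil.which("conda")


def check_prerequisites(path="/"):
    """Collect the checks into a dict (hypothetical helper)."""
    return {
        "python_ok": python_ok(),
        "disk_ok": free_gb(path) > MIN_FREE_GB,
        "conda_found": conda_path() is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(name, "ok" if ok else "not satisfied")
```

A missing conda is not an error here: it only indicates that Anaconda still needs to be installed, or that it is installed but not on the PATH.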

Some helpful links

Cloudera/Hadoop:
tiny.cloudera.com/hw-reqs
tiny.cloudera.com/aws-ra
http://docs.aws.amazon.com/quickstart/latest/cloudera/welcome.html
http://docplayer.net/25124019-Hadoop-security-authors-ben-spivey-and-joey-echeverria-provide-in-depth-information-about-the-security-features-available-in-hadoop-and-organize-them.html
http://blog.cloudera.com/blog/2015/03/how-to-quickly-configure-kerberos-for-your-apache-hadoop-cluster/
http://wpcertification.blogspot.com/
https://henning.kropponline.de/
https://blogs.msdn.microsoft.com/pliu/2016/01/02/integrating-cloudera-cluster-with-active-directory-part-13/

Jupyter:
https://blog.insightdatascience.com/using-jupyter-on-apache-spark-step-by-step-with-a-terabyte-of-reddit-data-ef4d6c13959a

Docker:
https://www.dataquest.io/blog/docker-data-science/

Miscellaneous:
https://blog.daftcode.pl/hype-driven-development-3469fc2e9b22
https://github.com/parth8891/NYC_Taxi_Data_Analysis
https://keshif.me/demo/VisTools
http://blog.thedigitalgroup.com/dattatrayap/high-speed-ingestion-into-solr-with-custom-talend-component-developed-by-tdg/
http://www.bigendiandata.com/
https://requestbin.com/

Let’s Encrypt free SSL certificate authority (CA)

Let’s Encrypt is a free, automated, and open certificate authority (CA), run for the public’s benefit. It is a service provided by the Internet Security Research Group (ISRG). The key principles behind Let’s Encrypt are:
Free: Anyone who owns a domain name can use Let’s Encrypt to obtain a trusted certificate at zero cost.
Automatic: Software running on …

Finding Cheaper Pharmacy Drug prices

It is easy to get overcharged by big pharmacies like CVS, Walgreens, and Walmart for common drugs, which can inexplicably double in price. Searching online for cheaper pharmacy options turns up websites like GoodRx. In this website's listings, the same drug may be priced almost 50% lower than at CVS Pharmacy or others. They provide you …

Install Jupyter notebook with Livy for Spark on Cloudera Hadoop

Environment:
Cloudera CDH 5.12.x running Livy and Spark (see the other blog on this website to install Livy)
Anaconda parcel installed using Cloudera Manager (see the other blog on this website to install the Anaconda parcel on CDH)
Non-Kerberos cluster. A Kerberos-based Hadoop cluster needs a different setup, and these instructions won't work for it.
We will first install Anaconda and …

Enable Linux subsystem on Windows

We need to enable Windows Subsystem for Linux (Beta), which is basically Ubuntu Linux on Windows.
1. Before installing the Linux Subsystem, you need:
Windows 10 (Anniversary Update or later)
A 64-bit installation (it can’t run on 32-bit systems)
2. Go to Windows Settings and click the button to enable Developer mode. It may …

Install Ansible on Windows 10 WSL-Ubuntu

Steps to install Ansible on Windows 10. First we need to enable Windows Subsystem for Linux (Beta), which is Ubuntu Linux on Windows. Complete the steps in this blog: https://plenium.wordpress.com/2017/11/21/install-linux-on-windows/
Next steps: Install Ansible. Latest Releases Via Apt (Ubuntu): To configure the PPA on your machine and install Ansible, run these commands:
$ sudo …

Install Anaconda Python package on Cloudera CDH.

This blog will show how to install the Anaconda parcel in CDH to enable Pandas and other Python libraries in the Hue PySpark notebook. http://docs.anaconda.com/anaconda/user-guide/tasks/integration/cloudera/
There are two methods of using Anaconda on an existing cluster with Cloudera CDH, Cloudera’s distribution including Apache Hadoop: Use the Anaconda parcel for Cloudera CDH. The following procedure describes how to install …

Install Hue Spark Notebook with Livy on Cloudera

This blog will show simple steps to install and configure the Hue Spark notebook to run interactive PySpark scripts using Livy. Environment used: CDH 5.12.x, Cloudera Manager, Hue 4.0, Livy 0.3.0, Spark 1.6.0 on RHEL Linux. Sentry was installed in unsecure mode. Kerberos was not used in the Hadoop cluster. Kerberos will need additional steps …