Kerberos setup in Cloudera Hadoop

Reference:
http://blog.cloudera.com/blog/2015/03/how-to-quickly-configure-kerberos-for-your-apache-hadoop-cluster/
Cloudera Security manual (PDF), CDH 5.15, on the Cloudera Documentation website
Environment: Cloudera CDH 5.15 on CentOS 7, MIT KDC Kerberos
Setting up Kerberos in Cloudera CDH is somewhat tricky. The blog post above is a good step-by-step guide to the setup. Also refer to the official Cloudera Security PDF document on …

VirtualBox VM setups.

VirtualBox network setup for internet access: The following table gives the general connectivity for the different VirtualBox network adapters. Sometimes the Bridged adapter does not get an IPv4 address and the VM cannot connect to the internet. In that case, we have to set up both a NAT adapter and a Host-only adapter. The requirement is: Host is Windows 10, …
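The NAT plus Host-only combination described above could be applied from the command line roughly as follows. This is only a sketch: the VM name `centos7-vm` and the host-only interface name are assumptions (the name shown is the usual default on a Windows host; `VBoxManage list hostonlyifs` shows the real one), `DRY_RUN` is a hypothetical switch added for illustration, and the VM must be powered off before `modifyvm` is run.

```shell
#!/usr/bin/env bash
# Sketch: give a VM a NAT adapter (internet access) and a Host-only adapter
# (host <-> guest access). VM_NAME and HOSTONLY_IF are assumed values.
set -eu

VM_NAME="${VM_NAME:-centos7-vm}"
HOSTONLY_IF="${HOSTONLY_IF:-VirtualBox Host-Only Ethernet Adapter}"
DRY_RUN="${DRY_RUN:-1}"   # hypothetical switch: 1 = print only, 0 = execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_nat_and_hostonly() {
    # Adapter 1: NAT. Adapter 2: Host-only, bound to the named host interface.
    run VBoxManage modifyvm "$VM_NAME" \
        --nic1 nat \
        --nic2 hostonly --hostonlyadapter2 "$HOSTONLY_IF"
}

setup_nat_and_hostonly
```

By default the script only prints the command it would run; set `DRY_RUN=0` to apply it.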

Upgrade MySQL 5.6 to 5.7 on CentOS 7

Reference: http://linuxresolved.com/upgrade-mysql-5-6-mysql-5-7-centos/
1. Stop MySQL:
   # service mysql stop
2. Important: back up your databases before the upgrade. Create a backup of the original MySQL data:
   mv /var/lib/mysql /var/lib/mysql.original
3. Download the MySQL 5.7 RPM:
   wget http://repo.mysql.com/mysql57-community-release-el7.rpm -P /tmp/
4. Remove the MySQL-Community RPM that contains MySQL 5.6:
   yum remove mysql-community-release
5. Install the MySQL 5.7 RPM:
   rpm -ivh /tmp/mysql57-community-release-el7.rpm
…
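The steps listed so far could be collected into one script, sketched below. It covers only the commands quoted above, not the rest of the upgrade. The `run` helper and the `DRY_RUN` switch are hypothetical additions for illustration: by default the script only prints each command, and it needs root and `DRY_RUN=0` to actually change anything.

```shell
#!/usr/bin/env bash
# Sketch of the first upgrade steps, using the same commands as the list above.
# DRY_RUN is a hypothetical switch: 1 = print only (default), 0 = execute.
set -eu

DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

upgrade_mysql_repo() {
    run service mysql stop                              # stop the running server
    run mv /var/lib/mysql /var/lib/mysql.original       # keep the original data
    run wget http://repo.mysql.com/mysql57-community-release-el7.rpm -P /tmp/
    run yum remove mysql-community-release              # drop the 5.6 repo package
    run rpm -ivh /tmp/mysql57-community-release-el7.rpm # install the 5.7 repo package
}

upgrade_mysql_repo
```

Because of `set -e`, a real run stops at the first failing command, so the repository packages are not touched if the data directory could not be moved aside.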

PostgreSQL notes

In Ubuntu, the server runs as a service called postgresql (configured in /etc/init.d/postgresql). The postgresql service is started automatically at boot. As with any other service, you can:
$ sudo service postgresql stop      # Stop the service
$ sudo service postgresql start     # Start the service
$ sudo service postgresql restart   # Stop and restart the service
$ sudo service postgresql reload …

Enable SSH login to Ubuntu by root user

In Ubuntu 16.04, root login over SSH is blocked by default for security reasons. To let root log in over SSH, we have to update the SSH configuration. NOTE: This makes your server less secure against attackers, so enable root login only temporarily, when needed.
/etc/ssh# ssh root
Permission denied (publickey).
Next, edit the file:
/etc/ssh# vi sshd_config
Change the following …

MicroStrategy Desktop connect to Impala

Environment: MicroStrategy Desktop 10.11, Cloudera CDH 5.12, Impala 2.x
Steps to connect MicroStrategy Desktop to Cloudera Impala:
The best thing about MicroStrategy Desktop, unlike Tableau Desktop, is that it is free to download and use, and it is a powerful BI visualization/query tool. Tableau Public Desktop is free, but it has only a few connectors and cannot connect to Hadoop …

ESRI-GIS Tools for Hadoop

The ESRI GIS Tools for Hadoop are a collection of GIS tools for spatial analysis of big data.
References: https://github.com/Esri/gis-tools-for-hadoop/tree/master/samples/point-in-polygon-aggregation-hive
Aggregation Sample for Hive: point-in-polygon-aggregation-hive
The following steps are taken from the above reference.
Step-1: Make a folder anywhere on the local server where Hive is installed:
$ mkdir /tmp/esri-git
Step-2: Bring down the git repository: …

Elasticsearch notes

Elasticsearch vs RDBMS concepts: You can (roughly) think of an Elasticsearch index as an RDBMS database.
MySQL => Databases => Tables => Rows => Columns
Elasticsearch => Indices (databases) => Types (tables) => Documents (rows) with Properties (columns)
An Elasticsearch cluster can contain multiple Indices (databases), which in turn contain multiple Types (tables). These types hold multiple Documents (rows), and each document has Properties (columns). An ES mapping …
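To make the analogy above concrete, the sketch below shows how one "row" would be addressed in each system. All names (`shop`, `customers`, `42`) and both helper functions are made up for illustration. The index/type/id path applies to Elasticsearch versions that still have mapping types; newer releases have dropped types.

```shell
#!/usr/bin/env bash
# Sketch: where a single "row" lives in each system.
# shop, customers and 42 are made-up example names.
set -eu

# RDBMS: database -> table -> row, addressed with SQL.
sql_lookup() {
    # $1 = database, $2 = table, $3 = primary key value
    echo "SELECT * FROM $1.$2 WHERE id = $3;"
}

# Elasticsearch (versions with mapping types): index -> type -> document,
# addressed with a REST path.
es_doc_path() {
    # $1 = index, $2 = type, $3 = document id
    echo "/$1/$2/$3"
}

sql_lookup shop customers 42
es_doc_path shop customers 42
```

Assuming a node listening on the default port 9200, the second path could be fetched with something like `curl -XGET "localhost:9200/shop/customers/42"`.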