Anaconda Python notes

Some notes on Anaconda python package manager: Reference: Conda is the main installer for the Anaconda packagesConda can be used to create multiple environments with different python or other package versions.The Anaconda packages are installed under /<some path>/Anaconda3/pkgs and other sub-directoriesInside a new Conda installation, the root environment is activated by default, so you … Continue reading Anaconda Python notes


TLS and SSL notes

Following are some concepts used in TLS/SSL configurations: Private Key: The key that is not shared with other connections but just used to decrypt the payload encrypted with public certificate.Public Certificate: The certificate that is provided to the remote connection as part of the SSL/TLS negotiations used to encrypt the message.CSR: certificate signing request is … Continue reading TLS and SSL notes

Use Pandas in Jupyter PySpark3 kernel to query Hive table

Following python code will read a Hive table and convert to Pandas dataframe so you can use Pandas to process the rows. NOTE: Be careful when copy/paste the below code the double quotes need to be retyped as they get changed and gives syntax error. -------------------------------------------------------------------------------------------------------------- import pandas as pd from pyspark import SparkConf, SparkContext … Continue reading Use Pandas in Jupyter PySpark3 kernel to query Hive table

Cloudera Hadoop Data Encryption at rest Notes

In Cloudera Hadoop there are few components that are used to implemented Data Encryption at rest: The Key Management Server (KMS) uses the Key Trustee Server as the enderlying keystore instead of the file-based Java KeyStore(JKS) used by the default Hadoop KMS. Cloudera Navigator Key Trustee Server is the actual keystore for the encryption keys … Continue reading Cloudera Hadoop Data Encryption at rest Notes

Manually Connecting an SSSD Client to an Active Directory Domain

Following is a good article which worked successfully to connect Centos7 to Active Directory for users in AD to be able to login to Centos. Manually Connecting an SSSD Client to an Active Directory Domain   Another useful testing procedures blog: QA:Testcase Active Directory Setup   Realmd and SSSD Active Directory Authentication reading Manually Connecting an SSSD Client to an Active Directory Domain

Wireshark commands

Some Wireshark filter fields match against multiple protocol fields. For example, "ip.addr" matches against both the IP source and destination addresses in the IP header. The same is true for "tcp.port", "udp.port", "eth.addr", and others. It's important to note that ip.addr == is equivalent to ip.src == or ip.dst == This can be counterintuitive in some cases. … Continue reading Wireshark commands

Tableau Desktop connect to Cloudera Hadoop using Kerberos

Reference: The basic steps to connect Tableau to Cloudera Hive or Impala with Kerberos authentication involves the following steps: Download and Install the MIT Kerberos Client for Window Set the C:\ProgramData\MIT\Kerberos5\krb5.ini with  the Kerberos realm and server details (Optional) KRB5CCNAME system environment variable may need to be set at times to a temporary value: FILE:C:\temp\kerberos\krb5cache … Continue reading Tableau Desktop connect to Cloudera Hadoop using Kerberos

Kerberos commands

Common Kerberos commands: 1.Change password of a principal(user) $ kadmin.local kadmin.local: cpw <principalname> Enter password for princal "principalname@REALM.COM": 2.initialize a kerberos ticket $ kinit <principalname> To get detailed verbose info use below options: $ KRB5_TRACE=/dev/stdout kinit -V 3. Destroy the current ticket: $ kdestroy 4. Check the status of Kerberos KDC $ systemctl status kadmin $ systemctl … Continue reading Kerberos commands

Access webhdfs using Kerberos from laptop client

The following blog shows how to access a kerberized hadoop cluster from a Chrome browser in laptop. This will work mostly except change the below: 3. network.negotiate-auth.gsslib = C:\Program Files\MIT\Kerberos\bin\gssapi64.dll instead of the gssapi32.dll  since we mostly use 64-bit Firefox which doesnt work with the 32bit dll.