Use Pandas in Jupyter PySpark3 kernel to query Hive table

Following python code will read a Hive table and convert to Pandas dataframe so you can use Pandas to process the rows. NOTE: Be careful when copy/paste the below code the double quotes need to be retyped as they get changed and gives syntax error. -------------------------------------------------------------------------------------------------------------- import pandas as pd from pyspark import SparkConf, SparkContext … Continue reading Use Pandas in Jupyter PySpark3 kernel to query Hive table

Advertisements