Apache Kudu is a columnar storage engine in the Hadoop ecosystem that adds row-level insert, update, and delete capabilities to Impala tables. It stores data outside of HDFS, in tablet files on the local disks of its tablet servers, which typically run on the Hadoop DataNodes. It is useful for fast IoT data storage because rows can be queried as soon as they are inserted into the table, unlike HDFS-backed Hive tables, where a file needs to be closed before you can query it, which causes delays. As in a data warehouse, you can also update data. Nice!!
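As a rough illustration of the insert, update, and delete behaviour described above, the sketch below writes a few Impala statements to a file. The table name mydb.mytable is the one used in the ksck example in these notes, but the columns (id, reading), the partition count, the output path, and the impalad hostname in the comment are all hypothetical. The mydb database is assumed to exist already.

```shell
#!/bin/sh
# Sketch only: column names, partition count and output path are assumptions.
SQL_FILE="${SQL_FILE:-/tmp/kudu_demo.sql}"

cat > "$SQL_FILE" <<'EOF'
-- Kudu tables need a primary key.
CREATE TABLE IF NOT EXISTS mydb.mytable (
  id BIGINT,
  reading DOUBLE,
  PRIMARY KEY (id)
)
PARTITION BY HASH (id) PARTITIONS 4
STORED AS KUDU;

INSERT INTO mydb.mytable VALUES (1, 20.5);
UPDATE mydb.mytable SET reading = 21.0 WHERE id = 1;
DELETE FROM mydb.mytable WHERE id = 1;
EOF

# To run it against a cluster (hostname is a placeholder):
#   impala-shell -i impalad-hostname -f "$SQL_FILE"
```

The script only generates the SQL file; nothing touches the cluster until you run the impala-shell line yourself.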
- Kudu parcels are already installed in the CDH cluster, so just run Add Service from Cloudera Manager and select the Kudu service to enable it.
Kudu admin commands:
[root]# sudo -u kudu kudu cluster ksck master1hostname,master2hostname,master3hostname
[root]# sudo -u kudu kudu cluster ksck master1hostname,master2hostname,master3hostname -tables=impala::mydb.mytable
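The master list has to reach ksck as one comma-separated argument with no spaces, which is easy to get wrong when typing it by hand. Here is a minimal sketch of two helper functions that build the command line; the function names and the KSCK_TABLE variable are made up for this example and are not part of Kudu. The helper prints the command rather than running it, so you can check it first.

```shell
#!/bin/sh
# Join hostnames with commas and no spaces; runs in a subshell so the
# caller's IFS is left untouched.
join_masters() (
  IFS=,
  printf '%s\n' "$*"
)

# Print (not run) the ksck command for the given masters.
# KSCK_TABLE is a hypothetical environment variable for this sketch;
# when set, the check is limited to that table.
ksck_command() {
  masters=$(join_masters "$@")
  if [ -n "${KSCK_TABLE:-}" ]; then
    printf 'sudo -u kudu kudu cluster ksck %s -tables=%s\n' "$masters" "$KSCK_TABLE"
  else
    printf 'sudo -u kudu kudu cluster ksck %s\n' "$masters"
  fi
}
```

For example, `KSCK_TABLE=impala::mydb.mytable ksck_command master1hostname master2hostname master3hostname` prints the second command shown above.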