2nd December 2020, Wednesday
Comparison Hadoop Cloudera HP Vertica
 Version 5.9.x --
 Name Cloudera --
 Drawbacks -- --
 Advantages -- --
 Languages Supported Java, Python --
 Website www.cloudera.com --
 XML Support no --
 JSON Support yes --
 Brief description Cloudera is a hybrid open-source packaged distribution, primarily of Apache Hadoop, Spark, and Kafka. The distribution is called CDH (Cloudera Distribution Including Apache Hadoop). It is targeted at enterprise-class deployments of the Hadoop platform. --
 Database Model Hadoop Distributed File System (HDFS) --
 Technical Documentation https://www.cloudera.com/documentation.html --
 License Commercial --
 Cloud-based / SaaS Altus is the cloud offering of Cloudera --
 Implementation Language NA --
 Operating System Supported Linux, Windows --
 Options for Integration / Access API RESTful HTTP --
 Consistency NA --
 Foreign Keys No, but you can join two files using Hive and Impala --
 Streaming Support Yes --
 Analytics Support Using MLlib in Apache Spark --
 Data Storage Schema Hadoop Distributed File System (HDFS) --
 Notable Users Dun & Bradstreet, AoL --
 Key Differentiator Cloudera Navigator provides lineage of the various jobs and data points. --
 Concurrency Yes --
 Partitioning Yes --
 Replication Yes --
 Secondary Indexes Yes, in HBase. A data warehouse on Cloudera is generally built using HBase, which has secondary indexes --
 SchemaLess Yes --
 SQL Query No. HiveQL, which is similar to SQL, can be used with Hive --
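
The "RESTful HTTP" access option listed in the table can be illustrated with WebHDFS, the REST interface that Hadoop exposes for HDFS. The table does not say which REST API is meant, so treating it as WebHDFS is an assumption. The sketch below only builds request URLs and does not contact a cluster. The helper name `webhdfs_url`, the host `namenode.example.com`, and the default port 50070 (the usual NameNode HTTP port in Hadoop 2.x) are illustrative assumptions, not values from the comparison.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, path, op, port=50070, params=None):
    """Build a WebHDFS URL: http://<host>:<port>/webhdfs/v1/<path>?op=<OP>.

    host, port and params are placeholders for a real cluster's settings
    (hypothetical helper, not part of any Cloudera or Hadoop library).
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute, e.g. /user/demo")
    # WebHDFS selects the operation through the 'op' query parameter.
    query = {"op": op.upper()}
    if params:
        query.update(params)
    # quote() keeps '/' intact and percent-encodes spaces and similar characters.
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{urlencode(query)}"


if __name__ == "__main__":
    # List a directory, then open a file as a given user (placeholder values).
    print(webhdfs_url("namenode.example.com", "/user/demo", "liststatus"))
    print(webhdfs_url("namenode.example.com", "/user/demo/data.csv", "open",
                      params={"user.name": "demo"}))
```

A URL built this way could be passed to any HTTP client; on a secured cluster, authentication (for example Kerberos) would also be required, which this sketch leaves out.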