Hadoop 3.0 Cloudera

Whether Cloudera will ship Hadoop 3.0 (as CDH 6.x) is an obvious question once you have seen the list of new features and enhancements in Hadoop 3.0. At the time of writing this blog, the latest release was CDH 5.14, which is based on Hadoop 2.6.0. Hadoop 3.0 brings a lot of changes, and if you want to try it in standalone mode before a vendor distribution ships, the Apache release is already available for installation. Because Hadoop 3.0 changes so much compared with Hadoop 2.0, it will take time for companies like Cloudera or Hortonworks to build a production-ready bundle around it.

Hadoop 3.0 GA was released on 14 Dec 2017. If you follow the Hadoop 3.0 architecture, you will find substantial changes, so it will certainly take the bundlers a few months to deliver their next major release.

Most organizations now run an enterprise data lake on their own premises, and storage cost reduction is one of the most promising features of Hadoop 3.0, so Hadoop 3.0 Cloudera (CDH 6.x) needs to arrive as soon as possible.
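To put a rough number on the storage cost claim, here is a minimal back-of-the-envelope sketch. It assumes the Reed-Solomon layout with 6 data units and 3 parity units, which Hadoop 3.0 ships as a built-in erasure coding policy, and compares it with the classic 3x replication. The function names and the 1000 TB data lake size are made up for illustration.

```python
"""Back-of-the-envelope comparison: 3x replication vs. RS(6,3) erasure coding."""


def replication_overhead(replicas: int) -> float:
    """Raw bytes stored per byte of user data under N-way replication."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return float(replicas)


def erasure_coding_overhead(data_units: int, parity_units: int) -> float:
    """Raw bytes stored per byte of user data under a (data, parity) striped layout."""
    if data_units < 1 or parity_units < 0:
        raise ValueError("need at least 1 data unit and no negative parity units")
    return (data_units + parity_units) / data_units


def raw_storage_tb(logical_tb: float, overhead: float) -> float:
    """Raw capacity needed to hold `logical_tb` of user data."""
    return logical_tb * overhead


if __name__ == "__main__":
    logical_tb = 1000.0  # hypothetical data lake size, not taken from the article

    replicated = raw_storage_tb(logical_tb, replication_overhead(3))
    erasure_coded = raw_storage_tb(logical_tb, erasure_coding_overhead(6, 3))
    saving = 1 - erasure_coded / replicated

    print(f"3x replication   : {replicated:.0f} TB raw")
    print(f"RS(6,3) coding   : {erasure_coded:.0f} TB raw")
    print(f"Raw storage saved: {saving:.0%}")
```

Under these assumptions the overhead drops from 3.0x to 1.5x, so the same data needs half the raw capacity. This covers only disk space; erasure-coded reads and writes cost more CPU and network than replicated ones, which the sketch does not model.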

Hadoop 3.0 Cloudera Vs CDH 5.14

The vendor approach is to provide a new bundle for a minor version only when necessary, to ensure the interoperability of the Apache project components. The following are the CDH components, with their package versions, that are included in CDH 5.14.0. The list of components supported in Hadoop 3.0 Cloudera is not yet available.

Component Package Version Release Note Link
Apache Avro avro-1.7.6 https://archive.cloudera.com/cdh5/cdh/5/avro-1.7.6-cdh5.14.0.releasenotes.html
Apache Crunch crunch-0.11.0 https://archive.cloudera.com/cdh5/cdh/5/crunch-0.11.0-cdh5.14.0.releasenotes.html
Datafu pig-udf-datafu-1.1.0 https://archive.cloudera.com/cdh5/cdh/5/datafu-1.1.0-cdh5.14.0.releasenotes.html
Flume-ng flume-ng-1.6.0 https://archive.cloudera.com/cdh5/cdh/5/flume-ng-1.6.0-cdh5.14.0.releasenotes.html
Apache Hadoop hadoop-2.6.0 https://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.14.0.releasenotes.html
Hadoop MRv1 hadoop-0.20-mapreduce-2.6.0 None
HBase hbase-1.2.0 https://archive.cloudera.com/cdh5/cdh/5/hbase-1.2.0-cdh5.14.0.releasenotes.html
Hbase-solr hbase-solr-1.5 https://archive.cloudera.com/cdh5/cdh/5/hbase-solr-1.5-cdh5.14.0.releasenotes.html
Apache Hive hive-1.1.0 https://archive.cloudera.com/cdh5/cdh/5/hive-1.1.0-cdh5.14.0.releasenotes.html
Hue hue-3.9.0 https://archive.cloudera.com/cdh5/cdh/5/hue-3.9.0-cdh5.14.0.releasenotes.html
Apache Impala impala-2.11.0 https://archive.cloudera.com/cdh5/cdh/5/impala-2.11.0-cdh5.14.0.releasenotes.html
Kite SDK kite-1.0.0 https://archive.cloudera.com/cdh5/cdh/5/kite-1.0.0-cdh5.14.0.releasenotes.html
Apache Kudu kudu-1.6.0 http://archive.cloudera.com/cdh5/cdh/5/kudu-1.6.0-cdh5.14.0.releasenotes.html
Llama llama-1.0.0 https://archive.cloudera.com/cdh5/cdh/5/llama-1.0.0-cdh5.14.0.releasenotes.html
Apache Mahout mahout-0.9 https://archive.cloudera.com/cdh5/cdh/5/mahout-0.9-cdh5.14.0.releasenotes.html
Apache Oozie oozie-4.1.0 https://archive.cloudera.com/cdh5/cdh/5/oozie-4.1.0-cdh5.14.0.releasenotes.html
Apache Parquet parquet-1.5.0 https://archive.cloudera.com/cdh5/cdh/5/parquet-1.5.0-cdh5.14.0.releasenotes.html
Parquet-format parquet-format-2.1.0 https://archive.cloudera.com/cdh5/cdh/5/parquet-format-2.1.0-cdh5.14.0.releasenotes.html
Apache Pig pig-0.12.0 https://archive.cloudera.com/cdh5/cdh/5/pig-0.12.0-cdh5.14.0.releasenotes.html
Cloudera Search search-1.0.0 https://archive.cloudera.com/cdh5/cdh/5/search-1.0.0-cdh5.14.0.releasenotes.html
Apache Sentry sentry-1.5.1 https://archive.cloudera.com/cdh5/cdh/5/sentry-1.5.1-cdh5.14.0.releasenotes.html
Apache Solr solr-4.10.3 https://archive.cloudera.com/cdh5/cdh/5/solr-4.10.3-cdh5.14.0.releasenotes.html
Apache Spark spark-1.6.0 https://archive.cloudera.com/cdh5/cdh/5/spark-1.6.0-cdh5.14.0.releasenotes.html
Apache Sqoop sqoop-1.4.6 https://archive.cloudera.com/cdh5/cdh/5/sqoop-1.4.6-cdh5.14.0.releasenotes.html
Apache Sqoop2 sqoop2-1.99.5 https://archive.cloudera.com/cdh5/cdh/5/sqoop2-1.99.5-cdh5.14.0.releasenotes.html
Apache Whirr whirr-0.9.0 https://archive.cloudera.com/cdh5/cdh/5/whirr-0.9.0-cdh5.14.0.releasenotes.html
Zookeeper zookeeper-3.4.5 https://archive.cloudera.com/cdh5/cdh/5/zookeeper-3.4.5-cdh5.14.0.releasenotes.html

Why Hadoop 3.0 Cloudera

Why not Hadoop 3.0 Cloudera (or CDH 6.x)? The storage cost reduction, Java 8 (JDK 1.8) support, the improved YARN Timeline Service and many more features will make life easier for developers, administrators and the business. Hadoop 3.0 Cloudera should make data engineering projects more efficient and considerably cheaper. When Hadoop 3.0 Cloudera arrives, users will see the following upgrades:

  1. Java 8 (JDK 1.8) as the minimum runtime for Hadoop 3.0
  2. Erasure Coding to reduce storage cost
  3. YARN Timeline Service v.2 (YARN-2928)
  4. New Default Ports for Several Services
  5. Intra-DataNode Balancer
  6. Shell Script Rewrite (HADOOP-9902)
  7. Shaded Client Jars
  8. Support for Opportunistic Containers
  9. MapReduce Task-Level Native Optimization
  10. Support for More than 2 NameNodes
  11. Support for Additional Filesystem Connectors (Microsoft Azure Data Lake, Aliyun OSS)
  12. Reworked Daemon and Task Heap Management
  13. Improved Fault-tolerance with Quorum Journal Manager
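Of the items above, the new default ports are the change most likely to break existing scripts, firewall rules and bookmarks. The sketch below records the HDFS port moves as I recall them from the Apache Hadoop 3.0.0 release notes, so treat the table as an assumption and verify it against the release you actually install. In particular, the NameNode RPC port is reported to have moved back to 8020 in later 3.x releases. The names `HDFS_DEFAULT_PORT_CHANGES` and `new_default_port` are made up for illustration.

```python
"""Lookup of HDFS default ports: Hadoop 2.x value -> Hadoop 3.0.0 value."""
from typing import Dict

# Assumed from the Apache Hadoop 3.0.0 release notes; verify for your release.
HDFS_DEFAULT_PORT_CHANGES: Dict[str, Dict[int, int]] = {
    "NameNode": {
        50470: 9871,  # HTTPS web UI
        50070: 9870,  # HTTP web UI
        8020: 9820,   # RPC (reportedly reverted to 8020 in later 3.x releases)
    },
    "Secondary NameNode": {
        50091: 9869,  # HTTPS
        50090: 9868,  # HTTP
    },
    "DataNode": {
        50020: 9867,  # IPC
        50010: 9866,  # data transfer
        50475: 9865,  # HTTPS
        50075: 9864,  # HTTP
    },
}


def new_default_port(service: str, old_port: int) -> int:
    """Return the Hadoop 3.0.0 default for a 2.x default port.

    Ports or services that are not in the table are returned unchanged.
    """
    return HDFS_DEFAULT_PORT_CHANGES.get(service, {}).get(old_port, old_port)


if __name__ == "__main__":
    for service, ports in HDFS_DEFAULT_PORT_CHANGES.items():
        for old, new in ports.items():
            print(f"{service:<18} {old:>5} -> {new}")
```

A lookup like this is handy when auditing configuration files or monitoring checks for hard-coded 2.x ports before an upgrade. The ports moved because the old defaults sat in the Linux ephemeral port range and could collide with other applications at startup.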

Downstream Compatibility with Other Apache Projects

The following version compatibility matrix lists the versions of different Apache projects and their compile, unit test and basic functional test status against Hadoop 3.0. This testing was done as part of the Hadoop 3.0 Beta 1 release in October 2017.

Apache Project Version Compiles Unit Testing Status Basic Functional Testing
HBase 2.0.0
Spark 2.0
Hive 2.1.0
Oozie 5.0
Pig 0.16
Solr 6.x
Kafka 0.10

More on Hadoop 3.0 Related Topics

  1. Hadoop 3.0 features and enhancement: all the newly added features and enhancements in Hadoop 3.0
  2. Hadoop 3.0 vs Hadoop 2.0: detailed comparison between Hadoop 3.0 and Hadoop 2.0 and what benefit it brings to the developer
  3. Hadoop 3.0 Installation
  4. Hadoop 3.0 Release Date
  5. Hadoop 3.0 Security by Ben and Joey: Hadoop 3.0 security book
  6. Hadoop 3.0 Architecture: demystifying the Hadoop 3.0 architecture and its components
  7. Hadoop 3.0 Hortonworks: Hadoop 3.0 and Hortonworks support for it in the HDP 3.0 release