Hadoop 3.0 Hortonworks

Hortonworks support for Hadoop 3.0 is an obvious question once you have seen the list of new features and enhancements in Hadoop 3.0. At the time of writing, the latest HDP release was 2.6.4, which bundles Hadoop 2.7.3. Hadoop 3.0 brings a lot of changes, and if you want to try it in standalone mode before a vendor bundle ships, it is already available for installation. Because Hadoop 3.0 changes so much compared to Hadoop 2.0, it will take time for companies like Hortonworks or Cloudera to build a production-ready bundle for it.

Hadoop 3.0 GA was released on 14 Dec 2017, and if you follow the Hadoop 3.0 architecture, you will find some very interesting changes that will certainly take bundlers a few months to incorporate into their next major release.

Most organizations now run an enterprise data lake on their own premises, and storage cost reduction is one of the most promising features of Hadoop 3.0, so Hortonworks support for Hadoop 3.0 (HDP 3.0) needs to arrive as soon as possible.

Hadoop 3.0 Hortonworks Vs HDP 2.6.4

Hortonworks' approach is to provide a new bundle for a minor version only when necessary, to ensure the interoperability of the Apache project components. The following components, with their package versions, are included in HDP 2.6.4. The list of components Hortonworks will support alongside Hadoop 3.0 is not yet available.

  1. Apache Hadoop 2.7.3
  2. Apache Accumulo 1.7.0
  3. Apache Atlas 0.8.0
  4. Apache Calcite 1.2.0
  5. Apache DataFu 1.3.0
  6. Apache Falcon 0.10.0
  7. Apache Flume 1.5.2
  8. Apache HBase 1.1.2
  9. Apache Hive 1.2.1
  10. Apache Hive 2.1.0
  11. Apache Kafka 0.10.1
  12. Apache Knox 0.12.0
  13. Apache Mahout 0.9.0+
  14. Apache Oozie 4.2.0
  15. Apache Phoenix 4.7.0
  16. Apache Pig 0.16.0
  17. Apache Ranger 0.7.0
  18. Apache Slider 0.92.0
  19. Apache Spark 1.6.3
  20. Apache Spark 2.2.0
  21. Apache Sqoop 1.4.6
  22. Apache Storm 1.1.0
  23. Apache Tez 0.7.0
  24. Apache Zeppelin 0.7.3
  25. Apache ZooKeeper 3.4.6

Why Hadoop 3.0 Hortonworks

Why not Hadoop 3.0 Hortonworks (or HDP 3.0)? The cost reduction, Java 1.8 support, improved YARN timeline service, and many more cool features will make life much easier for developers, administrators, and the business. Hadoop 3.0 from Hortonworks should make overall data engineering projects more efficient and considerably cheaper. Once Hadoop 3.0 arrives in HDP, users will see the following upgrades:

  1. Java 8 (JDK 1.8) as the runtime for Hadoop 3.0
  2. Erasure coding to reduce storage cost
  3. YARN Timeline Service v.2 (YARN-2928)
  4. New Default Ports for Several Services
  5. Intra-DataNode Balancer
  6. Shell Script Rewrite (HADOOP-9902)
  7. Shaded Client Jars
  8. Support for Opportunistic Containers
  9. MapReduce Task-Level Native Optimization
  10. Support for More than 2 NameNodes
  11. Support for Filesystem Connector
  12. Reworked Daemon and Task Heap Management
  13. Improved Fault-tolerance with Quorum Journal Manager
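
To put the storage cost item in the list above into numbers, here is a rough sketch comparing the raw disk needed under 3-way replication with the raw disk needed under erasure coding. It assumes a Reed-Solomon layout with 6 data blocks and 3 parity blocks; the 100 TB data size and the function names are made up for illustration and are not part of any Hadoop API.

```python
def replication_overhead(replicas: int = 3) -> float:
    """Raw bytes stored per byte of user data under N-way replication."""
    return float(replicas)


def erasure_coding_overhead(data_units: int = 6, parity_units: int = 3) -> float:
    """Raw bytes stored per byte of user data under Reed-Solomon (data, parity)."""
    return (data_units + parity_units) / data_units


def raw_storage_tb(logical_tb: float, overhead: float) -> float:
    """Raw disk capacity (TB) needed to hold logical_tb of user data."""
    return logical_tb * overhead


if __name__ == "__main__":
    logical_tb = 100.0  # hypothetical data lake size, an assumption for this sketch

    replicated = raw_storage_tb(logical_tb, replication_overhead(3))
    erasure_coded = raw_storage_tb(logical_tb, erasure_coding_overhead(6, 3))
    saving = 1 - erasure_coded / replicated

    print(f"3x replication : {replicated:.0f} TB raw")
    print(f"RS(6,3) coding : {erasure_coded:.0f} TB raw")
    print(f"Raw storage saved: {saving:.0%}")
```

Under these assumptions the erasure-coded layout needs half the raw disk of the replicated one while still tolerating the loss of any three blocks in a stripe. Real savings depend on file sizes and on how much data is actually moved to an erasure coding policy.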

Downstream Compatibility with Other Apache Projects

The following compatibility matrix lists the versions of different Apache projects that were checked against Hadoop 3.0, along with their compile, unit test, and basic functional testing status. This testing was done as part of the Hadoop 3.0 Beta 1 release in Oct 2017.

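| Apache Project | Version | Compiles | Unit Testing Status | Basic Functional Testing |
|---|---|---|---|---|
| HBase | 2.0.0 | | | |
| Spark | 2.0 | | | |
| Hive | 2.1.0 | | | |
| Oozie | 5.0 | | | |
| Pig | 0.16 | | | |
| Solr | 6.x | | | |
| Kafka | 0.10 | | | |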

More on Hadoop 3.0 Hortonworks Related Topics

| # | Other Articles | Link |
|---|---|---|
| 1 | All the newly added features and enhancements in Hadoop 3.0 | Hadoop 3.0 features and enhancement |
| 2 | Detailed comparison between Hadoop 3.0 vs Hadoop 2.0 and what benefit it brings to the developer | Hadoop 3.0 vs Hadoop 2.0 |
| 3 | Hadoop 3.0 Installation | Hadoop 3.0 Installation |
| 4 | Hadoop 3.0 Release Date | Hadoop 3.0 Release Date |
| 5 | Hadoop 3.0 Security Book | Hadoop 3.0 Security by Ben and Joey |
| 6 | Demystify the Hadoop 3.0 Architecture and its components | Hadoop 3.0 Architecture |
| 7 | Hadoop 3.0 & Hortonworks Support for it in HDP 3.0 Release | Hadoop 3.0 Hortonworks |