Hadoop 3.0 Hortonworks
Whether Hortonworks supports Hadoop 3.0 is an obvious question once you have seen the Hadoop 3.0 feature and enhancement list. At the time of writing, the latest HDP release was 2.6.4, which ships Hadoop 2.7.3. Hadoop 3.0 brings a lot of changes, and if you want to try it in standalone mode before a vendor distribution becomes available, the Apache release can be installed directly. Because Hadoop 3.0 changes so much compared to Hadoop 2.x, it will take time for companies like Hortonworks or Cloudera to build a production-ready bundle around it.
Hadoop 3.0 GA was released on 14 Dec 2017, and if you follow the Hadoop 3.0 architecture you will find some very interesting changes, which will certainly take the distribution vendors a few months to fold into their next major release.
Most organizations now run an enterprise data lake on premises, and storage cost reduction is one of the most promising features, so Hadoop 3.0 support from Hortonworks (HDP 3.0) should arrive as soon as possible.
Hadoop 3.0 Hortonworks Vs HDP 2.6.4
Hortonworks' approach is to provide new bundling for minor versions only when necessary, to ensure the interoperability of the Apache project components. The following components, with their package versions, are included in HDP 2.6.4. The component list for a Hadoop 3.0-based HDP release is not yet available.
- Apache Hadoop 2.7.3
- Apache Accumulo 1.7.0
- Apache Atlas 0.8.0
- Apache Calcite 1.2.0
- Apache DataFu 1.3.0
- Apache Falcon 0.10.0
- Apache Flume 1.5.2
- Apache HBase 1.1.2
- Apache Hive 1.2.1
- Apache Hive 2.1.0
- Apache Kafka 0.10.1
- Apache Knox 0.12.0
- Apache Mahout 0.9.0+
- Apache Oozie 4.2.0
- Apache Phoenix 4.7.0
- Apache Pig 0.16.0
- Apache Ranger 0.7.0
- Apache Slider 0.92.0
- Apache Spark 1.6.3
- Apache Spark 2.2.0
- Apache Sqoop 1.4.6
- Apache Storm 1.1.0
- Apache Tez 0.7.0
- Apache Zeppelin 0.7.3
- Apache ZooKeeper 3.4.6
Why Hadoop 3.0 Hortonworks
Why not Hadoop 3.0 Hortonworks (or HDP 3.0)? The storage cost reduction, the Java 8 runtime, the improved YARN Timeline Service and many more features will make life easier for developers, administrators and the business. Hadoop 3.0 on Hortonworks should make data engineering projects more efficient and considerably cheaper. Once Hadoop 3.0 arrives on Hortonworks, users will see the following upgrades:
- Java 8 (JDK 1.8) as the Minimum Runtime for Hadoop 3.0
- Erasure Coding to Reduce Storage Cost
- YARN Timeline Service v.2 (YARN-2928)
- New Default Ports for Several Services
- Intra-DataNode Balancer
- Shell Script Rewrite (HADOOP-9902)
- Shaded Client Jars
- Support for Opportunistic Containers
- MapReduce Task-Level Native Optimization
- Support for More than 2 NameNodes
- Support for Microsoft Azure Data Lake and Aliyun Object Storage System Filesystem Connectors
- Reworked Daemon and Task Heap Management
- Improved Fault-tolerance with Quorum Journal Manager
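To put the storage cost claim from the erasure coding item in numbers, here is a small sketch in Python. It assumes a Reed-Solomon layout with 6 data blocks and 3 parity blocks, which is one of the erasure coding policies built into Hadoop 3.0, and compares it with the usual 3x replication. The function names and the 100 TB figure are made up for illustration and are not part of any Hadoop API.

```python
def storage_overhead(data_units: int, parity_units: int) -> float:
    """Extra raw storage needed per unit of user data, as a fraction."""
    return parity_units / data_units


def raw_storage_tb(user_data_tb: float, data_units: int, parity_units: int) -> float:
    """Raw disk capacity (TB) consumed to store the given amount of user data."""
    return user_data_tb * (data_units + parity_units) / data_units


# Hypothetical data lake holding 100 TB of user data.
user_data_tb = 100

# 3x replication: every block is stored once plus two extra copies.
replicated = raw_storage_tb(user_data_tb, data_units=1, parity_units=2)

# Reed-Solomon (6 data + 3 parity): assumed erasure coding layout.
erasure_coded = raw_storage_tb(user_data_tb, data_units=6, parity_units=3)

print(f"3x replication : {replicated:.0f} TB raw, "
      f"{storage_overhead(1, 2):.0%} overhead")
print(f"RS(6,3) coding : {erasure_coded:.0f} TB raw, "
      f"{storage_overhead(6, 3):.0%} overhead")
```

Under these assumptions the same 100 TB needs 300 TB of raw disk with replication but only 150 TB with erasure coding, while still tolerating the loss of any three blocks in a stripe. The trade-off is extra CPU and network work when data has to be reconstructed, so it tends to suit colder data best.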
Downstream Compatibility with Other Apache Projects
The version compatibility matrix lists the versions of different Apache projects together with their unit test status, including basic functionality testing. It was compiled as part of the Hadoop 3.0 Beta 1 release in Oct 2017.