Hadoop 3.0 New Features And Enhancement

Why do you need Hadoop 3.0 new features and enhancement and what is driving this change. The key driving force behind Hadoop 3.0 that there are lot of content in trunk that did not make in 2.x. And another roadblock that JDK 1.7 is not supported by Oracle. Look at the below comparison table where it clearly says what Hadoop 3.0 is promising.

Hadoop 3 vs Hadoop 2 side by side

The key Hadoop 3.0 new features and enhancement are as follows

Hadoop-3.0

  1. Java 8 (jdk 1.8) as runtime for Hadoop 3.0
  2. Erasure Encoding for to reduce storage cost
  3. YARN Timeline Service v.2 (YARN-2928)
  4. New Default Ports for Several Services
  5. Intra-DataNode Balancer
  6. Shell Script Rewrite (HADOOP-9902)
  7. Shaded Client Jars
  8. Support for Opportunistic Containers
  9. MapReduce Task-Level Native Optimization
  10. Support for More than 2 NameNodes
  11. Support for Filesystem Connector
  12. Reworked Daemon and Task Heap Management
  13. Improved Fault-tolerance with Quorum Journal Manager

Java 8 Minimum Runtime Version

Hadoop 3.0 and Java 8 Runtime

Oracle JDK 7 is EoL (End of Life) at April 2015 so in Hadoop 3.0, all JARs are compiled targeting a runtime version of Java 8. Hadoop 3.0 new features and enhancement is using lamda expression, Steam API, security enhancement and performance enhancements for HashMaps & IO/NIO.

Hadoop’s evaluation with JDK upgrade

  • Hadoop 2.6.x - JDK 6,7,8 or later
  • Hadoop 2.7.x/2.8.x/2.9.s - JDK 7,8 or later
  • Hadoop 3.0.x - JDK 8 or later

Erasure Coding Support in Hadoop 3.0

With the speed data is growing and many application are being developed to crunch them, organisation is looking to optimise the storage and HDFC erasure coding is an important feature to server this purpose.

Erasure Coding Hadoop 3.0

Hadoop 2.x has three replica by default. 1st replica on local node, local rack or random node. 2nd and 3rd replicas on the same remote rack. This gives reliability tolerance 2 failure. this gives good data locality and local shortcut. Multiple copies and parallel IO for parallel compute. very fast block recovery and node recovery. 3x storage has 200% overhead and to solve this storage overhead, erasure coding storage is implemented. It is cost effective, durable, save IO bandwidth however expensive recovery.

YARN Timeline Service v.2

Hadoop 3.0 introduced a major revision of YARN Timeline Service i.e. v.2. YARN Timeline Service. It is developed to address two major challenges:

  • Improving scalability and reliability of Timeline Service
  • Enhancing usability by introducing flows and aggregation

YARN Timeline Service v.2: Scalability

YARN version 1.x is limited to a single instance of writer/reader and does not scale well beyond small clusters. Version 2.x uses a more scalable distributed writer architecture and a scalable backend storage. It separates the collection (writes) of data from serving (reads) of data. It uses distributed collectors, essentially one collector for each YARN application. The readers are separate instances that are dedicated to serving queries via REST API.

YARN Timeline Service v.2 chooses Apache HBase as the primary backing storage, as Apache HBase scales well to a large size while maintaining good response times for reads and writes.

YARN Timeline Service v.2: Usability Improvements

Now talking about usability improvements, users are interested in the information at the level of flows or logical groups of YARN applications. It is much more common to launch a set or series of YARN applications to complete a logical application. Timeline Service v.2 supports the notion of flows explicitly. In addition, it supports aggregating metrics at the flow level as you can see in the below diagram.

More details are available in theYARN Timeline Service v.2docs.

Shell Script Rewrite

The Hadoop 3.0 shell scripts have been rewritten to fix bugs, correct compatibility issues and some of the existing installation problem. It also incorporates some new features. So I will list some of the important ones:

  1. All Hadoop shell script subsystems now execute hadoop-env.sh, which allows for all of the environment variables to be in one location.
  2. Daemonization has been moved from *-daemon.sh to the bin commands via the daemon option. In Hadoop 3 we can simply use daemon start to start a daemon, daemon stop to stop a daemon, and daemon status to set $? to the daemons status. For example, hdfs daemon start namenode.
  3. Operations which trigger ssh connections can now use pdsh if installed.
  4. ${HADOOP\_CONF\_DIR} is now properly honored everywhere, without requiring symlinking and other such tricks.
  5. Scripts now test and report better error messages for various states of the log and pid dirs on daemon startup. Before, unprotected shell errors would be displayed to the user.

There are many more features you will know when Hadoop 3.0 will be in the beta phase. Now let us discuss the shaded client jar and know their benefits.

Classpath Client Side Isolation

Classpath client side isolation solve the problem with application’s code dependency and its conflict with Hadoop’s dependencies. so this feature separating server side jar and client side jar like hbase-client dependencies are shared.

New Default Ports for Several Services

Prior to Hadoop 3.0, the default ports for multiple Hadoop services were in the Linux ephemeral port range(short-lived transport protocolport) (32768-61000) and can conflict with other apps running in the same node . So these default ports for NameNode, DataNode, Secondary NameNode and KMS have been moved out of the Linux ephemeral port range to avoid any bind errors on startup. This feature has been introduced to enhance the reliability of rolling restarts on large hadoop clusters.

See the release notes forHDFS-9427andHADOOP-12811for a full list of port changes

MapReduce Task-Level Native Optimization

In Hadoop 3.0, a native Java implementation has been added in MapReduce for the map output collector. For shuffle-intensive jobs, this improves the performance by 30% or more.

They added a native implementation of the map output collector. For shuffle-intensive jobs, this may provide speed-ups of 30% or more. They are working on native optimization for MapTask based on JNI. The basic idea is to add a NativeMapOutputCollector to handle key value pairs emitted by the mapper, therefore sort, spill, IFile serialization can all be done in native code. They are still working on the Merge code.

Support for More than 2 NameNodes

In Hadoop 2.x, HDFS NameNode high-availability architecture has a single active NameNode and a single Standby NameNode. By replicating edits to a quorum of three JournalNodes, this architecture is able to tolerate the failure of any one NameNode.

However, business critical deployments require higher degrees of fault-tolerance. So, in Hadoop 3 allows users to run multiple standby NameNodes. For instance, by configuring three NameNodes (1 active and 2 passive) and five JournalNodes, the cluster can tolerate the failure of two nodes.

Support for Filesystem Connector

Hadoop now supports integration with Microsoft Azure Data Lake and Aliyun Object Storage System. It can be used as an alternative Hadoop-compatible filesystem. First Microsoft Azure Data Lake was added and then they added Aliyun Object Storage System as well. You might expect some more.

Intra-DataNode Balancer

A single DataNode manages multiple disks. During a normal write operation, data is divided evenly and thus, disks are filled up evenly. But adding or replacing disks leads to skew within a DataNode. This situation was earlier not handled by the existing HDFS balancer. This concerns intra-DataNode skew.

Now Hadoop 3 handles this situation by the new intra-DataNode balancing functionality, which is invoked via the hdfs diskbalancer CLI.

See the disk balancer section in theHDFS Commands Guidefor more information.

Reworked Daemon and Task Heap Management

A series of changes have been made to heap management for Hadoop daemons as well as MapReduce tasks.

  • New methods for configuring daemon heap sizes. Notably, auto-tuning is now possible based on the memory size of the host, and the HADOOP_HEAPSIZE variable has been deprecated.In its place, HADOOP\_HEAPSIZE\_MAX and HADOOP\_HEAPSIZE\_MIN have been introduced to set Xmx and Xms, respectively. All global and daemon-specific heap size variables now support units. If the variable is only a number, the size is assumed to be in megabytes.
  • Simplification of the configuration of map and reduce task heap sizes, so the desired heap size no longer needs to be specified in both the task configuration and as a Java option. Existing configs that already specify both are not affected by this change.

Hadoop 3.0 Downstream Compatibility

Following are the version compatibility matrix sheet indication the version of different Apache projects and their unit test status including basic functionality testing. This was done as part of Hadoop 3.0 Beta 1 release in Oct 2017.

More on Hadoop 3.0 Related Topics