Hadoop 3.0 Roadmap

Hadoop 3.0 is already out, and there is a lot of excitement about its features. Hadoop 3.0, released in Dec 2017, is the major release that follows Hadoop 2.9. The Hadoop 3.0 roadmap groups its key enhancements into 3 areas: YARN features, HDFS features and Common features.

The YARN features are the most significant group, with around 11 key items on the roadmap. Features such as “Container-executor rewrite for better security, extensibility and portability” and “Support Pausing/Freezing of opportunistic containers” have been removed. On the other side, the HDFS features contain just 1 item after the removal of “HDFS Storage Policy Satisfier” and “Ozone: Object store for HDFS”. The Common features include only 1 item.

The latest details and JIRA numbers can be found at this link.


Hadoop 3.0 Roadmap & Upcoming Releases

Hadoop 3.0 was released on 13 Dec 2017 and 3.0.1 on 25 Mar 2018. The upcoming releases are Hadoop 3.0.2 and Hadoop 3.0.3.

Hadoop 3.0.1 fixed the following important blockers:

  1. Backport HADOOP-15039 to branch-2 and branch-3
  2. Delete copy-on-truncate block along with the original block, when deleting a file being truncated
  3. create-release site build outputs dummy shaded jars due to skipShade
  4. Several javadoc errors
  5. Lock down version of doxia-module-markdown plugin
  6. Skip validating priority acls while recovering applications
  7. Concurrent task progress updates causing NPE in Application Master
  8. ATSv2: GSSException: No valid credentials provided - Failed to find any Kerberos tgt thrown by Timelinev2Client & HBaseClient in NM
  9. Revert YARN-6078
  10. ResourceManager UI cluster/app/<app-id> page fails to render
  11. Update the release year to 2018
  12. Enable user re-mapping for Docker containers in yarn-default.xml
  13. Change default NameNode RPC port back to 8020
  14. When NN is not able to identify DN for replication, reason behind it can be logged
  15. INode.getFullPathName may throw ArrayIndexOutOfBoundsException lead to NameNode exit
  16. "yarn logs" command fails to get logs for running containers if UI authentication is enabled.
  17. TestBalancerRPCDelay#testBalancerRPCDelay fails very frequently
  18. Sync rbw dir on the first hsync() to avoid file lost on power failure
  19. StripedBlockReader#createBlockReader leaks socket on IOException
  20. NM heartbeat stuck when responseId overflows MAX_INT
  21. Improve Capacity Scheduler Async Scheduling to better handle node failures
  22. Map outputs implicitly rely on permissive umask for shuffle
  23. Handle IllegalArgumentException when GETSERVERDEFAULTS is not implemented in webhdfs.
  24. AmFilterInitializer should addFilter after fill all parameters
  25. refreshNamenodes does not support adding a new standby to a running DN
  26. Token expiration edits may cause log corruption or deadlock
  27. ATSv2: NPE while starting hbase co-processor when HBase authorization is enabled.
  28. RBF: Support erasure coding methods in RouterRpcServer
  29. Some YARN container events have timestamp of -1
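Item 20 in the list above involves a counter running past the largest 32-bit integer. As a rough illustration of that class of problem, and not the actual YARN code or its fix, the Java sketch below shows how a plain increment wraps to a negative number at `Integer.MAX_VALUE`, and one way to keep such a counter non-negative. The class name `ResponseIdSketch` and both method names are hypothetical.

```java
// Illustrative sketch only; names are hypothetical and this is not Hadoop code.
final class ResponseIdSketch {

    // Plain increment: at Integer.MAX_VALUE the result wraps to Integer.MIN_VALUE,
    // so any logic that expects ids to keep growing stops matching.
    static int naiveNext(int responseId) {
        return responseId + 1;
    }

    // Wrap-safe increment: masking with Integer.MAX_VALUE clears the sign bit,
    // so the id after Integer.MAX_VALUE is 0 and the sequence stays non-negative.
    static int wrappingNext(int responseId) {
        return (responseId + 1) & Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        int last = Integer.MAX_VALUE;
        System.out.println("naive next:    " + naiveNext(last));
        System.out.println("wrapping next: " + wrappingNext(last));
    }
}
```

Both sides of a heartbeat exchange would need to apply the same wrap rule for their ids to stay in step after the rollover.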

Hadoop 3.0.2 is essentially an amendment to the Apache Hadoop 3.0.1 release that fixes the shaded jars in the Apache Maven repository.

Hadoop 3.1 Planned Features

The following YARN features are planned for the Hadoop 3.1 release:

  1. YARN native services
  2. Dynamic scheduler queue configuration
  3. Add absolute resource configuration to CapacityScheduler
  4. GPU Isolation
  5. Resource Profile and multiple resource types
  6. Overcommitment
  7. Support rich placement constraints in YARN
  8. Capacity Scheduler: Support Auto Creation of Leaf Queues While Doing Queue Mapping
  9. FPGA support

The following HDFS feature is planned for the Hadoop 3.1 release:

  1. HDFS tiered storage

The following Common feature is planned for the Hadoop 3.1 release:

  1. Add S3Guard committer for zero-rename commits to S3 endpoints
