Hadoop 3.0 Installation On Windows

Apache Hadoop 3.0 Installation on Windows is a short and practical guide for big data engineers to get their hands dirty. Since Hadoop 3.0 is not yet available with Cloudera CDH 6.x or Hortonworks HDP 3.x, this guide navigates you through the installation steps without Cygwin.

Since Hadoop 3.0 and its new features are built on Java 1.8, you need the following preparation before you start the Hadoop 3.0 installation.

Java 1.8 for Hadoop 3.0 Installation on Windows

You must have administrative privileges to install JDK 1.8 on your Windows machine. You can download the installer from the Oracle website.

Once it is installed (or if it was already installed), run the java -version command at the Windows command prompt to validate the installation.

[Image: Java 1.8 Installation]

If you have trouble running this command, check the Windows PATH and the JAVA_HOME variable.
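The same check can be scripted. The sketch below assumes Python 3 is available on the machine, which the guide itself does not require; the function names are illustrative. It runs java -version and extracts the major version, so that a banner such as 1.8.0_131 is reported as 8.

```python
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x (for example 1.8.0_131).
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def installed_java_major():
    """Run `java -version` from PATH and return its major version, or None."""
    java = shutil.which("java")
    if java is None:
        return None  # java is not on PATH
    result = subprocess.run(
        [java, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # The JVM prints its version banner to stderr, not stdout.
    return parse_java_major(result.stderr + result.stdout)
```

If installed_java_major() returns None, java was not found on PATH or its banner could not be parsed, which points back to the PATH and JAVA_HOME settings.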

Download & Extract Hadoop 3.0 Binaries

Download the latest Hadoop 3.0 binary bundle from its official website. General availability (GA) marks a point of quality and stability for the release series that indicates it is ready for production use. You can also download the source code for this release, which is around 25 MB in size. The Hadoop 3.0 binaries are around 250 MB in size.

[Image: Apache Hadoop 3.0 Release]

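Since Hadoop 3.0 is built on Java, there is no separate distribution for Unix or Windows; the Java code is bytecode that runs on any platform. Note, however, that the native Windows helpers Hadoop relies on (winutils.exe and hadoop.dll) are not part of the Apache binary bundle and usually have to be built or obtained separately.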

After a successful download, validate the size of the Hadoop 3.0 bundle.

[Image: Hadoop 3.0 Installation Extract Windows]
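One way to validate the download, sketched in Python on the assumption that Python 3 is installed. The expected size of roughly 250 MB comes from this guide; the 20% tolerance is an arbitrary illustrative choice. The second helper computes a SHA-256 digest that can be compared against a published checksum, if one is available for your release.

```python
import hashlib
import os


def size_looks_right(path, expected_mb=250, tolerance=0.2):
    """True if the file size is within `tolerance` of `expected_mb` megabytes."""
    size_mb = os.path.getsize(path) / (1024 * 1024)
    return abs(size_mb - expected_mb) <= expected_mb * tolerance


def sha256_of(path, chunk_size=1024 * 1024):
    """Hex SHA-256 digest of a file, read in chunks to keep memory use low."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A download that was cut short will usually fail the size check; a corrupted one will only show up in the digest comparison.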

While extracting the tar file, you may run into an extraction error like the one shown below. To avoid this, I extracted the archive on a UNIX machine and transferred the extracted folder to the Windows machine.

[Image: Hadoop 3.0 Extract Error on Windows]
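The guide does not say what caused the error. One common cause on Windows is symbolic links inside the archive, which cannot be created without extra privileges. Assuming that is the case here, and assuming Python 3 is available, the following sketch extracts everything except link entries and reports what it skipped. The function name is illustrative, and this is an alternative to the UNIX round trip, not what the author did.

```python
import os
import tarfile


def extract_skipping_links(archive_path, dest_dir):
    """Extract files and folders from a tar archive, skipping link entries.

    Returns the names of the skipped symbolic and hard links.
    """
    skipped = []
    dest_root = os.path.realpath(dest_dir)
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar:
            if member.issym() or member.islnk():
                skipped.append(member.name)
                continue
            # Refuse entries that would land outside the destination folder.
            target = os.path.realpath(os.path.join(dest_root, member.name))
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError("unsafe path in archive: " + member.name)
            tar.extract(member, dest_root)
    return skipped
```

Skipped links would leave gaps only where the archive relied on them, so review the returned list before using the extracted tree.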

Once extracted, the folder on Windows looks like this:

[Image: Hadoop 3.0 Extract Folders]

Windows Path Setup for Hadoop 3.0

Now we need to check and set up the JAVA_HOME and HADOOP_HOME paths.

[Image: HADOOP_HOME for Hadoop 3.0]

In the same way, set up JAVA_HOME, add the java\bin folder to the PATH variable, and verify those variables from the command prompt.

[Image: Hadoop and Java path for Hadoop 3.0]
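The verification can also be automated. The sketch below assumes Python 3 is available and takes the environment as a plain mapping so it is easy to test. It checks that each variable is set, that it points at a folder with a bin subfolder, and that this bin folder is on PATH. Requiring Hadoop's bin folder on PATH is an assumption made here so that commands such as hdfs can be run from any prompt; the function name is illustrative.

```python
import os


def check_home_vars(env, names=("JAVA_HOME", "HADOOP_HOME")):
    """Return a list of problems found with the given *_HOME variables."""
    problems = []
    path_entries = [
        os.path.normcase(os.path.normpath(entry))
        for entry in env.get("PATH", "").split(os.pathsep)
        if entry
    ]
    for name in names:
        home = env.get(name)
        if not home:
            problems.append(name + " is not set")
            continue
        bin_dir = os.path.join(home, "bin")
        if not os.path.isdir(bin_dir):
            problems.append(name + " has no bin folder: " + bin_dir)
            continue
        # Assumption: both bin folders are expected on PATH.
        if os.path.normcase(os.path.normpath(bin_dir)) not in path_entries:
            problems.append(bin_dir + " is not on PATH")
    return problems
```

Calling check_home_vars(os.environ) from a freshly opened prompt returns an empty list when everything is in place.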

Configuration: HDFS 3.0 Installation on Windows

Edit the file C:/hadoop_3_x/hadoop-3.0.0/etc/hadoop/core-site.xml, paste the XML below, and save the file.

<configuration>
   <property>
       <name>fs.defaultFS</name>
       <value>hdfs://localhost:9000</value>
   </property>
</configuration>
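A pasted configuration is easy to break with a missing tag. As a rough check, assuming Python 3 is available, the sketch below parses the contents of a site file into a dictionary and fails loudly if the XML is malformed. The function name is illustrative and not part of Hadoop.

```python
import xml.etree.ElementTree as ET


def read_site_properties(xml_text):
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)  # raises ET.ParseError on malformed XML
    if root.tag != "configuration":
        raise ValueError("root element must be <configuration>")
    properties = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is None:
            raise ValueError("<property> without <name>")
        properties[name.strip()] = (prop.findtext("value") or "").strip()
    return properties
```

For the core-site.xml above, the result should contain a single entry mapping fs.defaultFS to hdfs://localhost:9000.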

If etc/hadoop contains only “mapred-site.xml.template”, rename it to “mapred-site.xml”. Then edit the file C:/hadoop_3_x/hadoop-3.0.0/etc/hadoop/mapred-site.xml, paste the XML below, and save the file.

<configuration>
   <property>
       <name>mapreduce.framework.name</name>
       <value>yarn</value>
   </property>
</configuration>

Create a folder named “data” under “C:/hadoop_3_x/hadoop-3.0.0/”.

Edit the file C:/hadoop_3_x/hadoop-3.0.0/etc/hadoop/hdfs-site.xml, paste the XML below, and save the file.

<configuration>
   <property>
       <name>dfs.replication</name>
       <value>1</value>
   </property>
   <property>
       <name>dfs.namenode.name.dir</name>
       <value>C:/hadoop_3_x/hadoop-3.0.0/data/namenode</value>
   </property>
   <property>
       <name>dfs.datanode.data.dir</name>
       <value>C:/hadoop_3_x/hadoop-3.0.0/data/datanode</value>
   </property>
</configuration>
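The folders named in hdfs-site.xml can be created in one go. This is a small sketch assuming Python 3 is available. The guide only asks for the data folder; creating the namenode and datanode subfolders up front is a convenience assumed here, and the function name is illustrative.

```python
import os


def create_data_dirs(hadoop_home):
    """Create data/namenode and data/datanode under hadoop_home; return their paths."""
    created = []
    for name in ("namenode", "datanode"):
        path = os.path.join(hadoop_home, "data", name)
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
        created.append(path)
    return created
```

With the layout used in this guide, the call would be create_data_dirs("C:/hadoop_3_x/hadoop-3.0.0").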

Edit the file C:/hadoop_3_x/hadoop-3.0.0/etc/hadoop/yarn-site.xml, paste the XML below, and save the file.

<configuration>
   <property>
       <name>yarn.nodemanager.aux-services</name>
       <value>mapreduce_shuffle</value>
   </property>
   <property>
       <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
       <value>org.apache.hadoop.mapred.ShuffleHandler</value>
   </property>
</configuration>
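All four site files share the same shape, so they can also be generated instead of pasted. The following sketch assumes Python 3 is available and builds the XML text from a dictionary of property names and values. The function name is illustrative, and the output is not pretty-printed, which should not matter to an XML parser.

```python
import xml.etree.ElementTree as ET


def render_site_xml(properties):
    """Build the text of a Hadoop *-site.xml file from a {name: value} dict."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")
```

For example, render_site_xml({"yarn.nodemanager.aux-services": "mapreduce_shuffle"}) produces a single property entry wrapped in a configuration element, ready to be written to the target file.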

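Edit the file C:/hadoop_3_x/hadoop-3.0.0/etc/hadoop/hadoop-env.cmd and replace the line “set JAVA_HOME=%JAVA_HOME%” with an explicit path such as “set JAVA_HOME=C:\Java” (where C:\Java is the folder that contains JDK 1.8.0). If the JDK path contains spaces, as in the example below, Hadoop's scripts may report that JAVA_HOME is incorrectly set; the 8.3 short form (C:\Progra~1 instead of C:\Program Files) is a common workaround.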

@rem JAVA_HOME is required
set JAVA_HOME=C:\Program Files\Java\jdk1.8.0_131

[Image: Hadoop 3.0 Environment Command Windows]
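The edit to hadoop-env.cmd can be applied programmatically as well. This is a minimal sketch assuming Python 3 is available; it works on the file's text and swaps the first set JAVA_HOME= line for one that points at the given folder. The function name and the sample path in the usage note are illustrative.

```python
import re


def set_java_home(cmd_text, java_home):
    """Return cmd_text with its first `set JAVA_HOME=...` line pointing at java_home."""
    new_line = "set JAVA_HOME=" + java_home
    # [^\r\n]* stops before the line ending so CRLF files keep their endings.
    pattern = re.compile(r"^[ \t]*set[ \t]+JAVA_HOME=[^\r\n]*", re.IGNORECASE | re.MULTILINE)
    # A function replacement keeps backslashes in the path from being read as escapes.
    patched, count = pattern.subn(lambda _match: new_line, cmd_text, count=1)
    if count == 0:
        raise ValueError("no 'set JAVA_HOME=' line found")
    return patched
```

Read the file, pass its text together with a path such as C:\Java\jdk1.8.0_131, and write the result back. Keeping a backup copy of the original file first is a sensible precaution.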

Open cmd and run the command “hdfs namenode -format”. You will see output like this:

[Image: Hadoop 3.0 Data Format]
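To confirm the result without reading the console output, one can look at the NameNode folder itself. The sketch below assumes Python 3 is available and assumes that a successful format leaves a current\VERSION file inside the folder configured as dfs.namenode.name.dir. The function name is illustrative.

```python
import os


def is_namenode_formatted(name_dir):
    """True if name_dir holds the current/VERSION file written by a format."""
    return os.path.isfile(os.path.join(name_dir, "current", "VERSION"))
```

With the paths used in this guide, name_dir would be C:/hadoop_3_x/hadoop-3.0.0/data/namenode.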

Hadoop 3.0 Installation Health Check

In Hadoop 2.x the NameNode web UI listened on port 50070; in Hadoop 3.0 it has moved to 9870. Once the HDFS daemons are running (for example, started with sbin\start-dfs.cmd), the UI can be reached at localhost:9870.

[Image: Hadoop 3.0 Health Check Web UI]
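A quick scripted health check, assuming Python 3 is available, is to test whether anything accepts TCP connections on the web UI port. The function name is illustrative. Note that this only shows that some process is listening on the port; it does not prove that the process is the NameNode.

```python
import socket


def is_port_open(host="localhost", port=9870, timeout=2.0):
    """True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Passing port=50070 instead would perform the same check against a Hadoop 2.x installation.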

Hadoop 3.0 Downstream Compatibility

The following version compatibility matrix indicates the versions of different Apache projects and their unit test status, including basic functionality testing. This was done as part of the Hadoop 3.0 Beta 1 release in Oct 2017.

[Table: Hadoop 3.0 downstream compatibility matrix]
