<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- Generated by Apache Maven Doxia at 2014-02-11 -->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Apache Hadoop 2.3.0 - Hadoop MapReduce Next Generation 2.3.0 - Setting up a Single Node Cluster.</title>
<style type="text/css" media="all">
@import url("./css/maven-base.css");
@import url("./css/maven-theme.css");
@import url("./css/site.css");
</style>
<link rel="stylesheet" href="./css/print.css" type="text/css" media="print" />
<meta name="Date-Revision-yyyymmdd" content="20140211" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body class="composite">
<div id="banner">
<a href="http://hadoop.apache.org/" id="bannerLeft">
<img src="http://hadoop.apache.org/images/hadoop-logo.jpg" alt="" />
</a>
<a href="http://www.apache.org/" id="bannerRight">
<img src="http://www.apache.org/images/asf_logo_wide.png" alt="" />
</a>
<div class="clear">
<hr/>
</div>
</div>
<div id="breadcrumbs">
<div class="xleft">
<a href="http://www.apache.org/" class="externalLink">Apache</a>
&gt;
<a href="http://hadoop.apache.org/" class="externalLink">Hadoop</a>
&gt;
<a href="../">Apache Hadoop Project Dist POM</a>
&gt;
Apache Hadoop 2.3.0
</div>
<div class="xright"> <a href="http://wiki.apache.org/hadoop" class="externalLink">Wiki</a>
|
<a href="https://svn.apache.org/repos/asf/hadoop/" class="externalLink">SVN</a>
|
<a href="http://hadoop.apache.org/" class="externalLink">Apache Hadoop</a>
&nbsp;| Last Published: 2014-02-11
&nbsp;| Version: 2.3.0
</div>
<div class="clear">
<hr/>
</div>
</div>
<div id="leftColumn">
<div id="navcolumn">
<h5>General</h5>
<ul>
<li class="none">
<a href="../../index.html">Overview</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/SingleCluster.html">Single Node Setup</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/ClusterSetup.html">Cluster Setup</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/CommandsManual.html">Hadoop Commands Reference</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/FileSystemShell.html">File System Shell</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/Compatibility.html">Hadoop Compatibility</a>
</li>
</ul>
<h5>Common</h5>
<ul>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/CLIMiniCluster.html">CLI Mini Cluster</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/NativeLibraries.html">Native Libraries</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/Superusers.html">Superusers</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/SecureMode.html">Secure Mode</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/ServiceLevelAuth.html">Service Level Authorization</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/HttpAuthentication.html">HTTP Authentication</a>
</li>
</ul>
<h5>HDFS</h5>
<ul>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html">HDFS User Guide</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html">High Availability With QJM</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithNFS.html">High Availability With NFS</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/Federation.html">Federation</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html">HDFS Snapshots</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsDesign.html">HDFS Architecture</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsEditsViewer.html">Edits Viewer</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsImageViewer.html">Image Viewer</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html">Permissions and HDFS</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsQuotaAdminGuide.html">Quotas and HDFS</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/Hftp.html">HFTP</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/LibHdfs.html">C API libhdfs</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/WebHDFS.html">WebHDFS REST API</a>
</li>
<li class="none">
<a href="../../hadoop-hdfs-httpfs/index.html">HttpFS Gateway</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html">Short Circuit Local Reads</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html">Centralized Cache Management</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html">HDFS NFS Gateway</a>
</li>
</ul>
<h5>MapReduce</h5>
<ul>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html">Compatibilty between Hadoop 1.x and Hadoop 2.x</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/EncryptedShuffle.html">Encrypted Shuffle</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html">Pluggable Shuffle/Sort</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/DistributedCacheDeploy.html">Distributed Cache Deploy</a>
</li>
</ul>
<h5>YARN</h5>
<ul>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/YARN.html">YARN Architecture</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html">Writing YARN Applications</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html">Capacity Scheduler</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/FairScheduler.html">Fair Scheduler</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/WebApplicationProxy.html">Web Application Proxy</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/YarnCommands.html">YARN Commands</a>
</li>
<li class="none">
<a href="../../hadoop-sls/SchedulerLoadSimulator.html">Scheduler Load Simulator</a>
</li>
</ul>
<h5>YARN REST APIs</h5>
<ul>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/WebServicesIntro.html">Introduction</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html">Resource Manager</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html">Node Manager</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/MapredAppMasterRest.html">MR Application Master</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/HistoryServerRest.html">History Server</a>
</li>
</ul>
<h5>Auth</h5>
<ul>
<li class="none">
<a href="../../hadoop-auth/index.html">Overview</a>
</li>
<li class="none">
<a href="../../hadoop-auth/Examples.html">Examples</a>
</li>
<li class="none">
<a href="../../hadoop-auth/Configuration.html">Configuration</a>
</li>
<li class="none">
<a href="../../hadoop-auth/BuildingIt.html">Building</a>
</li>
</ul>
<h5>Reference</h5>
<ul>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/releasenotes.html">Release Notes</a>
</li>
<li class="none">
<a href="../../api/index.html">API docs</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/CHANGES.txt">Common CHANGES.txt</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/CHANGES.txt">HDFS CHANGES.txt</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-mapreduce/CHANGES.txt">MapReduce CHANGES.txt</a>
</li>
</ul>
<h5>Configuration</h5>
<ul>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/core-default.xml">core-default.xml</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/hdfs-default.xml">hdfs-default.xml</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml">mapred-default.xml</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-common/yarn-default.xml">yarn-default.xml</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/DeprecatedProperties.html">Deprecated Properties</a>
</li>
</ul>
<a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy">
<img alt="Built by Maven" src="./images/logos/maven-feather.png"/>
</a>
</div>
</div>
<div id="bodyColumn">
<div id="contentBox">
<!-- Licensed under the Apache License, Version 2.0 (the "License"); --><!-- you may not use this file except in compliance with the License. --><!-- You may obtain a copy of the License at --><!-- --><!-- http://www.apache.org/licenses/LICENSE-2.0 --><!-- --><!-- Unless required by applicable law or agreed to in writing, software --><!-- distributed under the License is distributed on an "AS IS" BASIS, --><!-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. --><!-- See the License for the specific language governing permissions and --><!-- limitations under the License. See accompanying LICENSE file. --><div class="section">
<h2>Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.<a name="Hadoop_MapReduce_Next_Generation_-_Setting_up_a_Single_Node_Cluster."></a></h2>
<ul>
<li><a href="#Hadoop_MapReduce_Next_Generation_-_Setting_up_a_Single_Node_Cluster.">Hadoop MapReduce Next Generation - Setting up a Single Node Cluster.</a>
<ul>
<li><a href="#Mapreduce_Tarball">Mapreduce Tarball</a></li>
<li><a href="#Setting_up_the_environment.">Setting up the environment.</a></li>
<li><a href="#Setting_up_Configuration.">Setting up Configuration.</a>
<ul>
<li><a href="#Setting_up_mapred-site.xml">Setting up mapred-site.xml</a></li>
<li><a href="#Setting_up_yarn-site.xml">Setting up yarn-site.xml</a></li></ul></li></ul></li></ul>
<div class="section">
<h3>MapReduce Tarball<a name="Mapreduce_Tarball"></a></h3>
<p>You should be able to obtain the MapReduce tarball from the release. If not, you can build a tarball from the source:</p>
<div class="source">
<pre>$ mvn clean install -DskipTests
$ cd hadoop-mapreduce-project
$ mvn clean install assembly:assembly -Pnative</pre></div>
<p><b>NOTE:</b> You will need <a class="externalLink" href="http://code.google.com/p/protobuf">protoc 2.5.0</a> installed.</p>
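<p>You can verify the installed version with the command below (the output shown is illustrative):</p>
<div class="source">
<pre>$ protoc --version
libprotoc 2.5.0</pre></div>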
<p>To skip the native build of MapReduce, omit the <tt>-Pnative</tt> argument when invoking Maven. The tarball will be available in the <tt>target/</tt> directory.</p></div>
<div class="section">
<h3>Setting up the environment.<a name="Setting_up_the_environment."></a></h3>
<p>Assuming you have installed hadoop-common/hadoop-hdfs and exported <b>$HADOOP_COMMON_HOME</b>/<b>$HADOOP_HDFS_HOME</b>, untar the Hadoop MapReduce tarball and set the environment variable <b>$HADOOP_MAPRED_HOME</b> to the untarred directory. Set <b>$HADOOP_YARN_HOME</b> to the same value as <b>$HADOOP_MAPRED_HOME</b>. </p>
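<p>As a minimal sketch (the install paths below are only placeholders; substitute your own locations):</p>
<div class="source">
<pre># Illustrative paths -- adjust to wherever you installed/untarred each component.
$ export HADOOP_COMMON_HOME=/opt/hadoop-common
$ export HADOOP_HDFS_HOME=/opt/hadoop-hdfs
$ export HADOOP_MAPRED_HOME=/opt/hadoop-mapreduce-2.3.0
$ export HADOOP_YARN_HOME=$HADOOP_MAPRED_HOME
$ export HADOOP_CONF_DIR=/opt/hadoop-conf</pre></div>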
<p><b>NOTE:</b> The following instructions assume you have HDFS running.</p></div>
<div class="section">
<h3>Setting up Configuration.<a name="Setting_up_Configuration."></a></h3>
<p>To start the ResourceManager and NodeManager, you will have to update the configs. Assuming that $HADOOP_CONF_DIR is the configuration directory and already contains the installed configs for HDFS and <tt>core-site.xml</tt>, there are two configuration files you will have to set up: <tt>mapred-site.xml</tt> and <tt>yarn-site.xml</tt>.</p>
<div class="section">
<h4>Setting up <tt>mapred-site.xml</tt><a name="Setting_up_mapred-site.xml"></a></h4>
<p>Add the following configs to your <tt>mapred-site.xml</tt>.</p>
<div class="source">
<pre> &lt;property&gt;
&lt;name&gt;mapreduce.cluster.temp.dir&lt;/name&gt;
&lt;value&gt;&lt;/value&gt;
&lt;description&gt;No description&lt;/description&gt;
&lt;final&gt;true&lt;/final&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;mapreduce.cluster.local.dir&lt;/name&gt;
&lt;value&gt;&lt;/value&gt;
&lt;description&gt;No description&lt;/description&gt;
&lt;final&gt;true&lt;/final&gt;
&lt;/property&gt;</pre></div></div>
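<p>As with all Hadoop site files, these <tt>&lt;property&gt;</tt> elements belong inside the top-level <tt>&lt;configuration&gt;</tt> element. A minimal skeleton (the directory values are only an illustration) might look like this:</p>
<div class="source">
<pre>&lt;?xml version="1.0"?&gt;
&lt;configuration&gt;
  &lt;property&gt;
    &lt;name&gt;mapreduce.cluster.temp.dir&lt;/name&gt;
    &lt;value&gt;/tmp/mapred/temp&lt;/value&gt;
    &lt;final&gt;true&lt;/final&gt;
  &lt;/property&gt;
  &lt;property&gt;
    &lt;name&gt;mapreduce.cluster.local.dir&lt;/name&gt;
    &lt;value&gt;/tmp/mapred/local&lt;/value&gt;
    &lt;final&gt;true&lt;/final&gt;
  &lt;/property&gt;
&lt;/configuration&gt;</pre></div>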
<div class="section">
<h4>Setting up <tt>yarn-site.xml</tt><a name="Setting_up_yarn-site.xml"></a></h4>
<p>Add the following configs to your <tt>yarn-site.xml</tt>.</p>
<div class="source">
<pre> &lt;property&gt;
&lt;name&gt;yarn.resourcemanager.resource-tracker.address&lt;/name&gt;
&lt;value&gt;host:port&lt;/value&gt;
&lt;description&gt;host is the hostname of the resource manager and
port is the port on which the NodeManagers contact the Resource Manager.
&lt;/description&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;yarn.resourcemanager.scheduler.address&lt;/name&gt;
&lt;value&gt;host:port&lt;/value&gt;
&lt;description&gt;host is the hostname of the resourcemanager and port is the port
on which the Applications in the cluster talk to the Resource Manager.
&lt;/description&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;yarn.resourcemanager.scheduler.class&lt;/name&gt;
&lt;value&gt;org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler&lt;/value&gt;
&lt;description&gt;In case you do not want to use the default scheduler&lt;/description&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;yarn.resourcemanager.address&lt;/name&gt;
&lt;value&gt;host:port&lt;/value&gt;
&lt;description&gt;the host is the hostname of the ResourceManager and the port is the port on
which the clients can talk to the Resource Manager. &lt;/description&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;yarn.nodemanager.local-dirs&lt;/name&gt;
&lt;value&gt;&lt;/value&gt;
&lt;description&gt;the local directories used by the nodemanager&lt;/description&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;yarn.nodemanager.address&lt;/name&gt;
&lt;value&gt;0.0.0.0:port&lt;/value&gt;
&lt;description&gt;the nodemanagers bind to this port&lt;/description&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;yarn.nodemanager.resource.memory-mb&lt;/name&gt;
&lt;value&gt;10240&lt;/value&gt;
&lt;description&gt;the amount of memory on the NodeManager in MB&lt;/description&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;yarn.nodemanager.remote-app-log-dir&lt;/name&gt;
&lt;value&gt;/app-logs&lt;/value&gt;
&lt;description&gt;directory on hdfs where the application logs are moved to &lt;/description&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;yarn.nodemanager.log-dirs&lt;/name&gt;
&lt;value&gt;&lt;/value&gt;
&lt;description&gt;the directories used by Nodemanagers as log directories&lt;/description&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;yarn.nodemanager.aux-services&lt;/name&gt;
&lt;value&gt;mapreduce_shuffle&lt;/value&gt;
&lt;description&gt;shuffle service that needs to be set for Map Reduce to run &lt;/description&gt;
&lt;/property&gt;</pre></div></div>
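<p>For a single-node setup you would typically point all three ResourceManager addresses at the local host. The snippet below is only an illustration: the port numbers shown are the usual Hadoop 2.x defaults, and the directory paths are placeholders.</p>
<div class="source">
<pre>&lt;property&gt;
  &lt;name&gt;yarn.resourcemanager.resource-tracker.address&lt;/name&gt;
  &lt;value&gt;localhost:8031&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
  &lt;name&gt;yarn.resourcemanager.scheduler.address&lt;/name&gt;
  &lt;value&gt;localhost:8030&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
  &lt;name&gt;yarn.resourcemanager.address&lt;/name&gt;
  &lt;value&gt;localhost:8032&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
  &lt;name&gt;yarn.nodemanager.local-dirs&lt;/name&gt;
  &lt;value&gt;/tmp/yarn/local&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
  &lt;name&gt;yarn.nodemanager.log-dirs&lt;/name&gt;
  &lt;value&gt;/tmp/yarn/logs&lt;/value&gt;
&lt;/property&gt;</pre></div>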
<div class="section">
<h4>Setting up <tt>capacity-scheduler.xml</tt><a name="Setting_up_capacity-scheduler.xml"></a></h4>
<p>Make sure you populate the root queues in <tt>capacity-scheduler.xml</tt>.</p>
<div class="source">
<pre> &lt;property&gt;
&lt;name&gt;yarn.scheduler.capacity.root.queues&lt;/name&gt;
&lt;value&gt;unfunded,default&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;yarn.scheduler.capacity.root.capacity&lt;/name&gt;
&lt;value&gt;100&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;yarn.scheduler.capacity.root.unfunded.capacity&lt;/name&gt;
&lt;value&gt;50&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;yarn.scheduler.capacity.root.default.capacity&lt;/name&gt;
&lt;value&gt;50&lt;/value&gt;
&lt;/property&gt;</pre></div></div>
<div class="section">
<h3>Running daemons.<a name="Running_daemons."></a></h3>
<p>Assuming that the environment variables <b>$HADOOP_COMMON_HOME</b>, <b>$HADOOP_HDFS_HOME</b>, <b>$HADOO_MAPRED_HOME</b>, <b>$HADOOP_YARN_HOME</b>, <b>$JAVA_HOME</b> and <b>$HADOOP_CONF_DIR</b> have been set appropriately. Set $<b>$YARN_CONF_DIR</b> the same as $<b>HADOOP_CONF_DIR</b></p>
<p>Run ResourceManager and NodeManager as:</p>
<div class="source">
<pre>$ cd $HADOOP_MAPRED_HOME
$ sbin/yarn-daemon.sh start resourcemanager
$ sbin/yarn-daemon.sh start nodemanager</pre></div>
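<p>To confirm that both daemons came up, you can check with <tt>jps</tt> (the process IDs below are illustrative):</p>
<div class="source">
<pre>$ jps
12345 ResourceManager
12346 NodeManager</pre></div>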
<p>You should be up and running. You can run randomwriter as:</p>
<div class="source">
<pre>$ $HADOOP_COMMON_HOME/bin/hadoop jar hadoop-examples.jar randomwriter out</pre></div></div></div>
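<p>When the job finishes, its output should appear under the <tt>out</tt> directory in your HDFS home directory. A quick check (assuming HDFS is running and your environment is set up as above):</p>
<div class="source">
<pre>$ $HADOOP_COMMON_HOME/bin/hadoop fs -ls out</pre></div>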
<div class="section">
<h2>Good luck.<a name="Good_luck."></a></h2></div>
</div>
</div>
<div class="clear">
<hr/>
</div>
<div id="footer">
<div class="xright">&#169; 2014
Apache Software Foundation
- <a href="http://maven.apache.org/privacy-policy.html">Privacy Policy</a></div>
<div class="clear">
<hr/>
</div>
</div>
</body>
</html>
