<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- Generated by Apache Maven Doxia at 2014-02-11 -->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Apache Hadoop 2.3.0 - Hadoop Distributed File System-2.3.0 - Federation</title>
<style type="text/css" media="all">
@import url("./css/maven-base.css");
@import url("./css/maven-theme.css");
@import url("./css/site.css");
</style>
<link rel="stylesheet" href="./css/print.css" type="text/css" media="print" />
<meta name="Date-Revision-yyyymmdd" content="20140211" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body class="composite">
<div id="banner">
<a href="http://hadoop.apache.org/" id="bannerLeft">
<img src="http://hadoop.apache.org/images/hadoop-logo.jpg" alt="" />
</a>
<a href="http://www.apache.org/" id="bannerRight">
<img src="http://www.apache.org/images/asf_logo_wide.png" alt="" />
</a>
<div class="clear">
<hr/>
</div>
</div>
<div id="breadcrumbs">
<div class="xleft">
<a href="http://www.apache.org/" class="externalLink">Apache</a>
&gt;
<a href="http://hadoop.apache.org/" class="externalLink">Hadoop</a>
&gt;
<a href="../">Apache Hadoop Project Dist POM</a>
&gt;
Apache Hadoop 2.3.0
</div>
<div class="xright"> <a href="http://wiki.apache.org/hadoop" class="externalLink">Wiki</a>
|
<a href="https://svn.apache.org/repos/asf/hadoop/" class="externalLink">SVN</a>
|
<a href="http://hadoop.apache.org/" class="externalLink">Apache Hadoop</a>
&nbsp;| Last Published: 2014-02-11
&nbsp;| Version: 2.3.0
</div>
<div class="clear">
<hr/>
</div>
</div>
<div id="leftColumn">
<div id="navcolumn">
<h5>General</h5>
<ul>
<li class="none">
<a href="../../index.html">Overview</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/SingleCluster.html">Single Node Setup</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/ClusterSetup.html">Cluster Setup</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/CommandsManual.html">Hadoop Commands Reference</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/FileSystemShell.html">File System Shell</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/Compatibility.html">Hadoop Compatibility</a>
</li>
</ul>
<h5>Common</h5>
<ul>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/CLIMiniCluster.html">CLI Mini Cluster</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/NativeLibraries.html">Native Libraries</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/Superusers.html">Superusers</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/SecureMode.html">Secure Mode</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/ServiceLevelAuth.html">Service Level Authorization</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/HttpAuthentication.html">HTTP Authentication</a>
</li>
</ul>
<h5>HDFS</h5>
<ul>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html">HDFS User Guide</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html">High Availability With QJM</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithNFS.html">High Availability With NFS</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/Federation.html">Federation</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html">HDFS Snapshots</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsDesign.html">HDFS Architecture</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsEditsViewer.html">Edits Viewer</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsImageViewer.html">Image Viewer</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html">Permissions and HDFS</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsQuotaAdminGuide.html">Quotas and HDFS</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/Hftp.html">HFTP</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/LibHdfs.html">C API libhdfs</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/WebHDFS.html">WebHDFS REST API</a>
</li>
<li class="none">
<a href="../../hadoop-hdfs-httpfs/index.html">HttpFS Gateway</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html">Short Circuit Local Reads</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html">Centralized Cache Management</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html">HDFS NFS Gateway</a>
</li>
</ul>
<h5>MapReduce</h5>
<ul>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html">Compatibilty between Hadoop 1.x and Hadoop 2.x</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/EncryptedShuffle.html">Encrypted Shuffle</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html">Pluggable Shuffle/Sort</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/DistributedCacheDeploy.html">Distributed Cache Deploy</a>
</li>
</ul>
<h5>YARN</h5>
<ul>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/YARN.html">YARN Architecture</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html">Writing YARN Applications</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html">Capacity Scheduler</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/FairScheduler.html">Fair Scheduler</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/WebApplicationProxy.html">Web Application Proxy</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/YarnCommands.html">YARN Commands</a>
</li>
<li class="none">
<a href="../../hadoop-sls/SchedulerLoadSimulator.html">Scheduler Load Simulator</a>
</li>
</ul>
<h5>YARN REST APIs</h5>
<ul>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/WebServicesIntro.html">Introduction</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html">Resource Manager</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html">Node Manager</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/MapredAppMasterRest.html">MR Application Master</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/HistoryServerRest.html">History Server</a>
</li>
</ul>
<h5>Auth</h5>
<ul>
<li class="none">
<a href="../../hadoop-auth/index.html">Overview</a>
</li>
<li class="none">
<a href="../../hadoop-auth/Examples.html">Examples</a>
</li>
<li class="none">
<a href="../../hadoop-auth/Configuration.html">Configuration</a>
</li>
<li class="none">
<a href="../../hadoop-auth/BuildingIt.html">Building</a>
</li>
</ul>
<h5>Reference</h5>
<ul>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/releasenotes.html">Release Notes</a>
</li>
<li class="none">
<a href="../../api/index.html">API docs</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/CHANGES.txt">Common CHANGES.txt</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/CHANGES.txt">HDFS CHANGES.txt</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-mapreduce/CHANGES.txt">MapReduce CHANGES.txt</a>
</li>
</ul>
<h5>Configuration</h5>
<ul>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/core-default.xml">core-default.xml</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/hdfs-default.xml">hdfs-default.xml</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml">mapred-default.xml</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-common/yarn-default.xml">yarn-default.xml</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/DeprecatedProperties.html">Deprecated Properties</a>
</li>
</ul>
<a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy">
<img alt="Built by Maven" src="./images/logos/maven-feather.png"/>
</a>
</div>
</div>
<div id="bodyColumn">
<div id="contentBox">
<!-- Licensed under the Apache License, Version 2.0 (the "License"); --><!-- you may not use this file except in compliance with the License. --><!-- You may obtain a copy of the License at --><!-- --><!-- http://www.apache.org/licenses/LICENSE-2.0 --><!-- --><!-- Unless required by applicable law or agreed to in writing, software --><!-- distributed under the License is distributed on an "AS IS" BASIS, --><!-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. --><!-- See the License for the specific language governing permissions and --><!-- limitations under the License. See accompanying LICENSE file. --><div class="section">
<h2>HDFS Federation<a name="HDFS_Federation"></a></h2>
<ul>
<li><a href="#HDFS_Federation">HDFS Federation</a>
<ul>
<li><a href="#Background">Background</a></li>
<li><a href="#Multiple_NamenodesNamespaces">Multiple Namenodes/Namespaces</a>
<ul>
<li><a href="#Key_Benefits">Key Benefits</a></li></ul></li>
<li><a href="#Federation_Configuration">Federation Configuration</a>
<ul>
<li><a href="#Configuration:">Configuration:</a></li>
<li><a href="#Formatting_Namenodes">Formatting Namenodes</a></li>
<li><a href="#Upgrading_from_older_release_to_0.23_and_configuring_federation">Upgrading from older release to 0.23 and configuring federation</a></li>
<li><a href="#Adding_a_new_Namenode_to_an_existing_HDFS_cluster">Adding a new Namenode to an existing HDFS cluster</a></li></ul></li>
<li><a href="#Managing_the_cluster">Managing the cluster</a>
<ul>
<li><a href="#Starting_and_stopping_cluster">Starting and stopping cluster</a></li>
<li><a href="#Balancer">Balancer</a></li>
<li><a href="#Decommissioning">Decommissioning</a></li>
<li><a href="#Cluster_Web_Console">Cluster Web Console</a></li></ul></li></ul></li></ul>
<p>This guide provides an overview of the HDFS Federation feature and how to configure and manage the federated cluster.</p>
<div class="section">
<h3><a name="Background">Background</a></h3><img src="./images/federation-background.gif" alt="HDFS Layers" />
<p>HDFS has two main layers:</p>
<ul>
<li><b>Namespace</b>
<ul>
<li>Consists of directories, files and blocks</li>
<li>Supports all the namespace-related file system operations such as create, delete, modify, and list files and directories.</li></ul></li>
<li><b>Block Storage Service</b>, which has two parts:
<ul>
<li>Block Management (performed in the Namenode)
<ul>
<li>Provides datanode cluster membership by handling registrations and periodic heartbeats.</li>
<li>Processes block reports and maintains the locations of blocks.</li>
<li>Supports block-related operations such as create, delete, modify, and get block location.</li>
<li>Manages replica placement, replicates under-replicated blocks, and deletes blocks that are over-replicated.</li></ul></li>
<li>Storage is provided by datanodes, which store blocks on the local file system and allow read/write access.</li></ul>
<p>The prior HDFS architecture allows only a single namespace for the entire cluster, managed by a single Namenode. HDFS Federation addresses this limitation by adding support for multiple Namenodes/namespaces to the HDFS file system.</p></li></ul></div>
<div class="section">
<h3><a name="Multiple_NamenodesNamespaces">Multiple Namenodes/Namespaces</a></h3>
<p>In order to scale the name service horizontally, federation uses multiple independent Namenodes/namespaces. The Namenodes are federated; that is, the Namenodes are independent and don&#x2019;t require coordination with each other. The datanodes are used as common storage for blocks by all the Namenodes. Each datanode registers with all the Namenodes in the cluster. Datanodes send periodic heartbeats and block reports, and handle commands from the Namenodes.</p><img src="./images/federation.gif" alt="HDFS Federation Architecture" />
<p><b>Block Pool</b></p>
<p>A Block Pool is a set of blocks that belong to a single namespace. Datanodes store blocks for all the block pools in the cluster. Each block pool is managed independently of the others. This allows a namespace to generate Block IDs for new blocks without coordinating with the other namespaces. The failure of one Namenode does not prevent the datanodes from serving the other Namenodes in the cluster.</p>
<p>A namespace and its block pool together are called a Namespace Volume. It is a self-contained unit of management. When a Namenode/namespace is deleted, the corresponding block pool at the datanodes is deleted. Each namespace volume is upgraded as a unit during a cluster upgrade.</p>
<p><b>ClusterID</b></p>
<p>A new identifier, <b>ClusterID</b>, is added to identify all the nodes in the cluster. When a Namenode is formatted, this identifier is either provided or auto-generated. The same ID must then be used to format the other Namenodes into the cluster (see Formatting Namenodes below).</p>
<div class="section">
<h4>Key Benefits<a name="Key_Benefits"></a></h4>
<ul>
<li>Namespace Scalability - HDFS cluster storage scales horizontally, but the namespace does not. Large deployments, or deployments with a large number of small files, benefit from scaling the namespace by adding more Namenodes to the cluster.</li>
<li>Performance - File system operation throughput is limited by the single Namenode in the prior architecture. Adding more Namenodes to the cluster scales file system read/write throughput.</li>
<li>Isolation - A single Namenode offers no isolation in a multi-user environment. An experimental application can overload the Namenode and slow down production-critical applications. With multiple Namenodes, different categories of applications and users can be isolated to different namespaces.</li></ul></div></div>
<div class="section">
<h3><a name="Federation_Configuration">Federation Configuration</a></h3>
<p>Federation configuration is <b>backward compatible</b> and allows existing single-Namenode configurations to work without any change. The new configuration is designed such that all the nodes in the cluster have the same configuration, without needing to deploy a different configuration based on the type of the node in the cluster.</p>
<p>Federation adds a new abstraction called <tt>NameServiceID</tt>. A Namenode and its corresponding secondary/backup/checkpointer nodes all belong to one NameServiceID. To support a single configuration file, the Namenode and secondary/backup/checkpointer configuration parameters are suffixed with the <tt>NameServiceID</tt> and added to the same configuration file. For instance, with a NameServiceID of <tt>ns1</tt>, the generic parameter <tt>dfs.namenode.rpc-address</tt> becomes <tt>dfs.namenode.rpc-address.ns1</tt>, as the full example below shows.</p>
<div class="section">
<h4>Configuration:<a name="Configuration:"></a></h4>
<p><b>Step 1</b>: Add the following parameter to your configuration: <tt>dfs.nameservices</tt>. Configure it with a comma-separated list of NameServiceIDs. This is used by the datanodes to determine all the Namenodes in the cluster.</p>
<p><b>Step 2</b>: For each Namenode and Secondary Namenode/BackupNode/Checkpointer, add the following configuration parameters, suffixed with the corresponding <tt>NameServiceID</tt>, to the common configuration file.</p>
<table border="1" class="bodyTable">
<tr class="a">
<th align="left">Daemon</th>
<th align="left">Configuration Parameter</th></tr>
<tr class="b">
<td align="left">Namenode</td>
<td align="left"><tt>dfs.namenode.rpc-address</tt> <tt>dfs.namenode.servicerpc-address</tt> <tt>dfs.namenode.http-address</tt> <tt>dfs.namenode.https-address</tt> <tt>dfs.namenode.keytab.file</tt> <tt>dfs.namenode.name.dir</tt> <tt>dfs.namenode.edits.dir</tt> <tt>dfs.namenode.checkpoint.dir</tt> <tt>dfs.namenode.checkpoint.edits.dir</tt></td></tr>
<tr class="a">
<td align="left">Secondary Namenode</td>
<td align="left"><tt>dfs.namenode.secondary.http-address</tt> <tt>dfs.secondary.namenode.keytab.file</tt></td></tr>
<tr class="b">
<td align="left">BackupNode</td>
<td align="left"><tt>dfs.namenode.backup.address</tt> <tt>dfs.secondary.namenode.keytab.file</tt></td></tr></table>
<p>Here is an example configuration with two namenodes:</p>
<div>
<pre>&lt;configuration&gt;
&lt;property&gt;
&lt;name&gt;dfs.nameservices&lt;/name&gt;
&lt;value&gt;ns1,ns2&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;dfs.namenode.rpc-address.ns1&lt;/name&gt;
&lt;value&gt;nn-host1:rpc-port&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;dfs.namenode.http-address.ns1&lt;/name&gt;
&lt;value&gt;nn-host1:http-port&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;dfs.namenode.secondary.http-address.ns1&lt;/name&gt;
&lt;value&gt;snn-host1:http-port&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;dfs.namenode.rpc-address.ns2&lt;/name&gt;
&lt;value&gt;nn-host2:rpc-port&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;dfs.namenode.http-address.ns2&lt;/name&gt;
&lt;value&gt;nn-host2:http-port&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;dfs.namenode.secondary.http-address.ns2&lt;/name&gt;
&lt;value&gt;snn-host2:http-port&lt;/value&gt;
&lt;/property&gt;
.... Other common configuration ...
&lt;/configuration&gt;</pre></div></div>
<div class="section">
<h4>Formatting Namenodes<a name="Formatting_Namenodes"></a></h4>
<p><b>Step 1</b>: Format a namenode using the following command:</p>
<div>
<pre>&gt; $HADOOP_PREFIX/bin/hdfs namenode -format [-clusterId &lt;cluster_id&gt;]</pre></div>
<p>Choose a unique cluster_id that will not conflict with other clusters in your environment. If one is not provided, a unique ClusterID is auto-generated.</p>
<p><b>Step 2</b>: Format additional Namenodes using the following command:</p>
<div>
<pre>&gt; $HADOOP_PREFIX/bin/hdfs namenode -format -clusterId &lt;cluster_id&gt;</pre></div>
<p>Note that the cluster_id in step 2 must be the same as the cluster_id in step 1. If they are different, the additional Namenodes will not be part of the federated cluster.</p></div>
<div class="section">
<h4>Upgrading from older release to 0.23 and configuring federation<a name="Upgrading_from_older_release_to_0.23_and_configuring_federation"></a></h4>
<p>Older releases supported a single Namenode. Here are the steps to enable federation:</p>
<p>Step 1: Upgrade the cluster to the newer release. During the upgrade you can provide a ClusterID as follows:</p>
<div>
<pre>&gt; $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR start namenode -upgrade -clusterId &lt;cluster_ID&gt;</pre></div>
<p>If a ClusterID is not provided, it is auto-generated.</p></div>
<div class="section">
<h4>Adding a new Namenode to an existing HDFS cluster<a name="Adding_a_new_Namenode_to_an_existing_HDFS_cluster"></a></h4>
<p>Perform the following steps:</p>
<ul>
<li>Add the configuration parameter <tt>dfs.nameservices</tt> to the configuration.</li>
<li>Update the configuration with the NameServiceID suffix. Configuration key names changed after release 0.20; you must use the new configuration parameter names for federation. A sketch of this change appears after this list.</li>
<li>Add the new Namenode-related configuration to the configuration files.</li>
<li>Propagate the configuration file to all the nodes in the cluster.</li>
<li>Start the new Namenode and its Secondary/Backup node.</li>
<li>Refresh the datanodes to pick up the newly added Namenode by running the following command:
<div>
<pre>&gt; $HADOOP_PREFIX/bin/hdfs dfsadmin -refreshNamenodes &lt;datanode_host_name&gt;:&lt;datanode_rpc_port&gt;</pre></div></li>
<li>The above command must be run against all the datanodes in the cluster.</li></ul>
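<p>As a minimal sketch of the configuration update (the nameservice <tt>ns3</tt>, the host <tt>nn-host3</tt>, and the ports are placeholders, not values prescribed by this guide), extending the earlier two-Namenode example to a third Namenode would look like this:</p>
<div>
<pre>&lt;property&gt;
&lt;name&gt;dfs.nameservices&lt;/name&gt;
&lt;value&gt;ns1,ns2,ns3&lt;/value&gt;
&lt;/property&gt;
&lt;!-- New parameters, suffixed with the added NameServiceID --&gt;
&lt;property&gt;
&lt;name&gt;dfs.namenode.rpc-address.ns3&lt;/name&gt;
&lt;value&gt;nn-host3:rpc-port&lt;/value&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;dfs.namenode.http-address.ns3&lt;/name&gt;
&lt;value&gt;nn-host3:http-port&lt;/value&gt;
&lt;/property&gt;</pre></div></div></div>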
<div class="section">
<h3><a name="Managing_the_cluster">Managing the cluster</a></h3>
<div class="section">
<h4>Starting and stopping cluster<a name="Starting_and_stopping_cluster"></a></h4>
<p>To start the cluster run the following command:</p>
<div>
<pre>&gt; $HADOOP_PREFIX/sbin/start-dfs.sh</pre></div>
<p>To stop the cluster run the following command:</p>
<div>
<pre>&gt; $HADOOP_PREFIX/sbin/stop-dfs.sh</pre></div>
<p>These commands can be run from any node where the HDFS configuration is available. The command uses the configuration to determine the Namenodes in the cluster and starts the Namenode process on those nodes. The datanodes are started on the nodes listed in the <tt>slaves</tt> file. The script can be used as a reference for building your own scripts to start and stop the cluster.</p>
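<p>For reference, the <tt>slaves</tt> file is a plain text list of datanode hosts, one per line. A minimal sketch (the host names are placeholders):</p>
<div>
<pre>dn-host1
dn-host2
dn-host3</pre></div></div>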
<div class="section">
<h4>Balancer<a name="Balancer"></a></h4>
<p>The Balancer has been changed to work with multiple Namenodes in the cluster. The Balancer can be run using the command:</p>
<div>
<pre>&quot;$HADOOP_PREFIX&quot;/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script &quot;$HADOOP_PREFIX&quot;/bin/hdfs start balancer [-policy &lt;policy&gt;]</pre></div>
<p>Policy could be:</p>
<ul>
<li><tt>node</tt> - this is the <i>default</i> policy. It balances storage at the datanode level, similar to the balancing policy of prior releases.</li>
<li><tt>blockpool</tt> - this balances storage at the block pool level. Balancing at the block pool level also balances storage at the datanode level.
<p>Note that the Balancer only balances the data; it does not balance the namespace. A usage sketch follows this list.</p></li></ul>
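<p>For example, a minimal sketch of running the Balancer with the block pool policy (using the same variables as the command above):</p>
<div>
<pre>&quot;$HADOOP_PREFIX&quot;/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script &quot;$HADOOP_PREFIX&quot;/bin/hdfs start balancer -policy blockpool</pre></div></div>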
<div class="section">
<h4>Decommissioning<a name="Decommissioning"></a></h4>
<p>Decommissioning is similar to prior releases. The nodes that need to be decommissioned are added to the exclude file at all of the Namenodes. Each Namenode decommissions its Block Pool. When all the Namenodes finish decommissioning a datanode, the datanode is considered decommissioned.</p>
<p><b>Step 1</b>: To distribute an exclude file to all the Namenodes, use the following command:</p>
<div>
<pre>&quot;$HADOOP_PREFIX&quot;/sbin/distribute-exclude.sh &lt;exclude_file&gt;</pre></div>
<p><b>Step 2</b>: Refresh all the Namenodes to pick up the new exclude file.</p>
<div>
<pre>&quot;$HADOOP_PREFIX&quot;/sbin/refresh-namenodes.sh</pre></div>
<p>The above command uses HDFS configuration to determine the Namenodes configured in the cluster and refreshes all the Namenodes to pick up the new exclude file.</p></div>
<div class="section">
<h4>Cluster Web Console<a name="Cluster_Web_Console"></a></h4>
<p>Similar to the Namenode status web page, federation adds a Cluster Web Console to monitor the federated cluster at <tt>http://&lt;any_nn_host:port&gt;/dfsclusterhealth.jsp</tt>. Any Namenode in the cluster can be used to access this web page.</p>
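<p>As a minimal sketch (assuming the default Namenode HTTP port of 50070 from <tt>hdfs-default.xml</tt>, and a placeholder host name), the console can be fetched with any HTTP client:</p>
<div>
<pre>&gt; curl http://nn-host1:50070/dfsclusterhealth.jsp</pre></div>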
<p>The web page provides the following information:</p>
<ul>
<li>A cluster summary showing the number of files, the number of blocks, the total configured storage capacity, and the available and used storage for the entire cluster.</li>
<li>A list of Namenodes, with a summary for each that includes the number of files, blocks, and missing blocks, and the number of live and dead datanodes. It also provides a link to conveniently access each Namenode's web UI.</li>
<li>The decommissioning status of datanodes.</li></ul></div></div></div>
</div>
</div>
<div class="clear">
<hr/>
</div>
<div id="footer">
<div class="xright">&#169; 2014
Apache Software Foundation
- <a href="http://maven.apache.org/privacy-policy.html">Privacy Policy</a></div>
<div class="clear">
<hr/>
</div>
</div>
</body>
</html>
