</div>
</div>
<divid="bodyColumn">
<divid="contentBox">
<!-- Licensed under the Apache License, Version 2.0 (the "License"); --><!-- you may not use this file except in compliance with the License. --><!-- You may obtain a copy of the License at --><!-- --><!-- http://www.apache.org/licenses/LICENSE-2.0 --><!-- --><!-- Unless required by applicable law or agreed to in writing, software --><!-- distributed under the License is distributed on an "AS IS" BASIS, --><!-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. --><!-- See the License for the specific language governing permissions and --><!-- limitations under the License. See accompanying LICENSE file. --><divclass="section">
<p>This document describes how to configure and manage Service Level Authorization for Hadoop.</p></div>
<divclass="section">
<h3>Prerequisites<aname="Prerequisites"></a></h3>
<p>Make sure Hadoop is installed, configured, and set up correctly. For more information, see:</p>
<ul>
<li><ahref="./SingleCluster.html">Single Node Setup</a> for first-time users.</li>
<li><ahref="./ClusterSetup.html">Cluster Setup</a> for large, distributed clusters.</li></ul></div>
<divclass="section">
<h3>Overview<aname="Overview"></a></h3>
<p>Service Level Authorization is the initial authorization mechanism that ensures clients connecting to a particular Hadoop service have the necessary, pre-configured permissions and are authorized to access that service. For example, a MapReduce cluster can use this mechanism to allow only a configured list of users and groups to submit jobs.</p>
<p>The <tt>$<aname="HADOOP_CONF_DIR">HADOOP_CONF_DIR</a>/hadoop-policy.xml</tt> configuration file is used to define the access control lists for various Hadoop services.</p>
<p>Service Level Authorization is performed well before other access control checks, such as file-permission checks and access control on job queues.</p></div>
<divclass="section">
<h3>Configuration<aname="Configuration"></a></h3>
<p>This section describes how to configure service-level authorization via the configuration file <tt>$<aname="HADOOP_CONF_DIR">HADOOP_CONF_DIR</a>/hadoop-policy.xml</tt>.</p>
<divclass="section">
<h4>Enable Service Level Authorization<aname="Enable_Service_Level_Authorization"></a></h4>
<p>By default, service-level authorization is disabled for Hadoop. To enable it, set the configuration property <tt>hadoop.security.authorization</tt> to <tt>true</tt> in <tt>$<aname="HADOOP_CONF_DIR">HADOOP_CONF_DIR</a>/core-site.xml</tt>.</p></div>
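<p>As a minimal sketch, the setting described above could look like the following fragment of <tt>core-site.xml</tt>. It assumes the standard Hadoop <tt>&lt;configuration&gt;</tt>/<tt>&lt;property&gt;</tt> layout; any other properties already in the file are omitted here.</p>

```xml
<configuration>
  <!-- Turn on service-level authorization (disabled by default). -->
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
</configuration>
```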
<divclass="section">
<h4>Hadoop Services and Configuration Properties<aname="Hadoop_Services_and_Configuration_Properties"></a></h4>
<p>This section lists the Hadoop services and the configuration property that controls access to each:</p>
<tableborder="1"class="bodyTable">
<trclass="a">
<thalign="left">Property</th>
<thalign="left">Service</th></tr>
<trclass="b">
<tdalign="left">security.client.protocol.acl</td>
<tdalign="left">ACL for ClientProtocol, which is used by user code via the DistributedFileSystem.</td></tr>
<trclass="a">
<tdalign="left">security.job.submission.protocol.acl</td>
<tdalign="left">ACL for JobSubmissionProtocol, used by job clients to communicate with the jobtracker for job submission, querying job status, etc.</td></tr>
<trclass="b">
<tdalign="left">security.refresh.policy.protocol.acl</td>
<tdalign="left">ACL for RefreshAuthorizationPolicyProtocol, used by the dfsadmin and mradmin commands to refresh the security policy in effect.</td></tr>
<trclass="a">
<tdalign="left">security.ha.service.protocol.acl</td>
<tdalign="left">ACL for the HAService protocol, used by HAAdmin to manage the active and standby states of the namenode.</td></tr></table></div>
<divclass="section">
<h4>Access Control Lists<aname="Access_Control_Lists"></a></h4>
<p><tt>$<aname="HADOOP_CONF_DIR">HADOOP_CONF_DIR</a>/hadoop-policy.xml</tt> defines an access control list for each Hadoop service. Every access control list has the same simple format: a comma-separated list of user names, a single space, and then a comma-separated list of group names, for example <tt>user1,user2 group1,group2</tt>.</p>
<p>To specify only groups, begin the value with a space. A comma-separated list of users followed by a space, or by nothing at all, specifies only those users.</p>
<p>A special value of <tt>*</tt> implies that all users are allowed to access the service.</p></div>
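<p>The forms described above can be sketched as entries in <tt>hadoop-policy.xml</tt>. This is an illustration only: the user names <tt>alice</tt> and <tt>bob</tt> and the group name <tt>admins</tt> are placeholders, and which services get which lists is an assumption, not a recommendation.</p>

```xml
<configuration>
  <!-- Users only: alice and bob, with no groups listed. -->
  <property>
    <name>security.client.protocol.acl</name>
    <value>alice,bob</value>
  </property>

  <!-- Groups only: the value begins with a space.
       "admins" is a hypothetical group name. -->
  <property>
    <name>security.refresh.policy.protocol.acl</name>
    <value> admins</value>
  </property>

  <!-- To open a service to every user, the value would be a single
       asterisk instead: <value>*</value> -->
</configuration>
```

<p>Because the leading space is significant, take care that an editor or formatter does not trim whitespace inside <tt>&lt;value&gt;</tt>.</p>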
<divclass="section">
<h4>Refreshing Service Level Authorization Configuration<aname="Refreshing_Service_Level_Authorization_Configuration"></a></h4>
<p>The service-level authorization configuration for the NameNode and JobTracker can be changed without restarting either of the Hadoop master daemons. The cluster administrator can change <tt>$<aname="HADOOP_CONF_DIR">HADOOP_CONF_DIR</a>/hadoop-policy.xml</tt> on the master nodes and then instruct the NameNode and JobTracker to reload their respective configurations by passing the <tt>-refreshServiceAcl</tt> switch to the <tt>dfsadmin</tt> and <tt>mradmin</tt> commands, respectively.</p>
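<p>To refresh the service-level authorization configuration for the NameNode, run the <tt>dfsadmin</tt> command with the <tt>-refreshServiceAcl</tt> switch; for the JobTracker, run the <tt>mradmin</tt> command with the same switch.</p>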
<p>Use the <tt>security.refresh.policy.protocol.acl</tt> property in <tt>$<aname="HADOOP_CONF_DIR">HADOOP_CONF_DIR</a>/hadoop-policy.xml</tt> to restrict which users and groups may refresh the service-level authorization configuration.</p></div>
<divclass="section">
<h4>Examples<aname="Examples"></a></h4>
<p>Allow only users <tt>alice</tt>, <tt>bob</tt> and users in the <tt>mapreduce</tt> group to submit jobs to the MapReduce cluster:</p>
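<p>A sketch of how this could be expressed in <tt>hadoop-policy.xml</tt>, assuming that job submission is governed by the <tt>security.job.submission.protocol.acl</tt> property (the ACL for JobSubmissionProtocol):</p>

```xml
<configuration>
  <!-- Users alice and bob, then a space, then the mapreduce group. -->
  <property>
    <name>security.job.submission.protocol.acl</name>
    <value>alice,bob mapreduce</value>
  </property>
</configuration>
```

<p>The change takes effect only once the JobTracker reloads its policy, either on restart or through a refresh with the <tt>-refreshServiceAcl</tt> switch.</p>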