</div>
</div>
<div id="bodyColumn">
<div id="contentBox">
<!-- Licensed under the Apache License, Version 2.0 (the "License"); --><!-- you may not use this file except in compliance with the License. --><!-- You may obtain a copy of the License at --><!-- --><!-- http://www.apache.org/licenses/LICENSE-2.0 --><!-- --><!-- Unless required by applicable law or agreed to in writing, software --><!-- distributed under the License is distributed on an "AS IS" BASIS, --><!-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. --><!-- See the License for the specific language governing permissions and --><!-- limitations under the License. See accompanying LICENSE file. --><div class="section">
<p>This guide describes the native hadoop library and includes a small discussion about native shared libraries.</p>
<p>Note: Depending on your environment, the term "native libraries" could refer to all *.so's you need to compile; and, the term "native compression" could refer to all *.so's you need to compile that are specifically related to compression. Currently, however, this document only addresses the native hadoop library (<tt>libhadoop.so</tt>).</p></div>
<p>Hadoop has native implementations of certain components, either for performance reasons or because no Java implementation is available. These components are available in a single, dynamically-linked native library called the native hadoop library. On the *nix platforms the library is named <tt>libhadoop.so</tt>.</p></div>
<div class="section">
<h3>Usage<a name="Usage"></a></h3>
<p>It is fairly easy to use the native hadoop library:</p>
<ol style="list-style-type: decimal">
<li>Review the components.</li>
<li>Review the supported platforms.</li>
<li>Either download a hadoop release, which will include a pre-built version of the native hadoop library, or build your own version of the native hadoop library. Whether you download or build, the name for the library is the same: libhadoop.so</li>
<li>Install the compression codec development packages (&gt;zlib-1.2, &gt;gzip-1.2):
<ul>
<li>If you download the library, install one or more development packages - whichever compression codecs you want to use with your deployment.</li>
<li>If you build the library, it is mandatory to install both development packages.</li></ul></li>
<li>Check the runtime log files.</li></ol></div>
<div class="section">
<h3>Components<a name="Components"></a></h3>
<p>The native hadoop library includes two components, the zlib and gzip compression codecs:</p>
<ul>
<li>zlib</li>
<li>gzip</li></ul>
<p>The native hadoop library is required for gzip to work.</p></div>
<p>The native hadoop library is supported on *nix platforms only. The library does not work with Cygwin or the Mac OS X platform.</p>
<p>The native hadoop library is mainly used on the GNU/Linux platform and has been tested on these distributions:</p>
<ul>
<li>RHEL4/Fedora</li>
<li>Ubuntu</li>
<li>Gentoo</li></ul>
<p>On all the above distributions, a 32-bit or 64-bit native hadoop library will work with the corresponding 32-bit or 64-bit JVM.</p></div>
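<p>To choose between the 32-bit and 64-bit library, it can help to ask the JVM which it is. The following is a minimal sketch, not part of Hadoop: the class name <tt>JvmBitness</tt> is hypothetical, the <tt>sun.arch.data.model</tt> property is specific to HotSpot-style JVMs and may be absent elsewhere, and the fallback on <tt>os.arch</tt> is only a heuristic.</p>

```java
public class JvmBitness {

    // Returns 32 or 64. Prefers the HotSpot-specific data model property and
    // falls back to a heuristic on os.arch (an assumption, not a guarantee).
    static int bits(String dataModel, String osArch) {
        if ("32".equals(dataModel) || "64".equals(dataModel)) {
            return Integer.parseInt(dataModel);
        }
        return (osArch != null && osArch.contains("64")) ? 64 : 32;
    }

    public static void main(String[] args) {
        int bits = bits(System.getProperty("sun.arch.data.model"),
                        System.getProperty("os.arch"));
        System.out.println("This JVM appears to be " + bits
                + "-bit; use a native hadoop library built for the same width.");
    }
}
```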
<div class="section">
<h3>Download<a name="Download"></a></h3>
<p>The pre-built 32-bit i386-Linux native hadoop library is available as part of the hadoop distribution and is located in the <tt>lib/native</tt> directory. You can download the hadoop distribution from Hadoop Common Releases.</p>
<p>Be sure to install the zlib and/or gzip development packages - whichever compression codecs you want to use with your deployment.</p></div>
<div class="section">
<h3>Build<a name="Build"></a></h3>
<p>The native hadoop library is written in ANSI C and is built using the GNU autotools-chain (autoconf, autoheader, automake, autoscan, libtool). This means it should be straightforward to build the library on any platform with a standards-compliant C compiler and the GNU autotools-chain (see the supported platforms).</p>
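<p>The packages you need to install on the target platform are:</p>
<ul>
<li>A standards-compliant C compiler</li>
<li>The GNU autotools-chain (autoconf, automake, libtool)</li>
<li>zlib-development package (stable version &gt;= 1.2.0)</li></ul>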
<p>Once you have installed the prerequisite packages, use the standard hadoop pom.xml file and pass along the native flag to build the native hadoop library.</p>
<ul>
<li>It is mandatory to install both the zlib and gzip development packages on the target platform in order to build the native hadoop library; however, for deployment it is sufficient to install just one package if you wish to use only one codec.</li>
<li>It is necessary to have the correct 32-bit or 64-bit libraries for zlib, matching the 32-bit or 64-bit JVM on the target platform, in order to build and deploy the native hadoop library.</li></ul></div>
<div class="section">
<h3>Runtime<a name="Runtime"></a></h3>
<p>The bin/hadoop script ensures that the native hadoop library is on the library path via the system property: <tt>-Djava.library.path=&lt;path&gt;</tt></p>
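<p>When the library does not load, one quick check is whether any directory on <tt>java.library.path</tt> actually contains it. The sketch below uses only the Java standard library; the class name <tt>LibraryPathCheck</tt> and its helper method are hypothetical and are not part of Hadoop. It lists each location where the JVM would look for the library named <tt>hadoop</tt> and reports whether a file exists there.</p>

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class LibraryPathCheck {

    // Builds the candidate file location for a library name in every
    // non-empty directory of a path list such as java.library.path.
    static List<Path> candidates(String libraryPath, String libName) {
        // Platform-specific file name, e.g. "libhadoop.so" on Linux.
        String fileName = System.mapLibraryName(libName);
        List<Path> result = new ArrayList<>();
        for (String dir : libraryPath.split(File.pathSeparator)) {
            if (!dir.isEmpty()) {
                result.add(Paths.get(dir).resolve(fileName));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String libraryPath = System.getProperty("java.library.path", "");
        for (Path candidate : candidates(libraryPath, "hadoop")) {
            String status = Files.isRegularFile(candidate) ? "found" : "missing";
            System.out.println(candidate + "  [" + status + "]");
        }
    }
}
```

<p>Run it with the same <tt>-Djava.library.path</tt> value that your hadoop process receives; if every entry is reported as missing, the JVM has no way to find the library.</p>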
<p>During runtime, check the hadoop log files for your MapReduce tasks.</p>
<ul>
<li>If everything is all right, then: <tt>DEBUG util.NativeCodeLoader - Trying to load the custom-built native-hadoop library...</tt> <tt>INFO util.NativeCodeLoader - Loaded the native-hadoop library</tt></li>
<li>If something goes wrong, then: <tt>INFO util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable</tt></li></ul></div>
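<p>The log check can also be scripted. The following is a minimal sketch that classifies a log line by the two message fragments quoted above; the class name <tt>NativeLoadLogCheck</tt> and the <tt>Status</tt> enum are hypothetical, and the sketch assumes the message wording in your Hadoop version matches those fragments.</p>

```java
public class NativeLoadLogCheck {

    enum Status { LOADED, FALLBACK, UNKNOWN }

    // Classifies one log line by the NativeCodeLoader messages quoted in this guide.
    static Status classify(String logLine) {
        if (logLine.contains("Loaded the native-hadoop library")) {
            return Status.LOADED;
        }
        if (logLine.contains("Unable to load native-hadoop library")) {
            return Status.FALLBACK;
        }
        return Status.UNKNOWN;
    }

    public static void main(String[] args) {
        String[] sample = {
            "INFO util.NativeCodeLoader - Loaded the native-hadoop library",
            "INFO util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable"
        };
        for (String line : sample) {
            System.out.println(classify(line) + " <- " + line);
        }
    }
}
```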
<p>You can load any native shared library using DistributedCache for distributing and symlinking the library files.</p>
<p>This example shows you how to distribute a shared library, mylib.so, and load it from a MapReduce task.</p>
<ol style="list-style-type: decimal">
<li>First copy the library to HDFS: <tt>bin/hadoop fs -copyFromLocal mylib.so.1 /libraries/mylib.so.1</tt></li>
<li>The job launching program should contain the following: <tt>DistributedCache.createSymlink(conf);</tt> <tt>DistributedCache.addCacheFile(new URI("hdfs://host:port/libraries/mylib.so.1#mylib.so"), conf);</tt></li>
<li>The MapReduce task can contain: <tt>System.loadLibrary("mylib.so");</tt></li></ol>
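<p>How the task loads the symlinked file depends on the JVM's library naming rules. <tt>System.loadLibrary</tt> maps its argument to a platform-specific file name (on Linux JVMs it typically adds a <tt>lib</tt> prefix and a <tt>.so</tt> suffix), whereas <tt>System.load</tt> takes an absolute path to the file itself. The sketch below, using only the Java standard library, prints the mapped name and loads the symlink by absolute path if it exists. The class name <tt>SymlinkedLibraryLoader</tt> is hypothetical, and the sketch assumes the symlink <tt>mylib.so</tt> is created in the task's current working directory.</p>

```java
import java.io.File;

public class SymlinkedLibraryLoader {

    // Resolves a symlink name against a directory to the absolute path
    // that System.load requires.
    static String absolutePathFor(File workingDir, String linkName) {
        return new File(workingDir, linkName).getAbsolutePath();
    }

    public static void main(String[] args) {
        // File name System.loadLibrary("mylib") would search for on this platform.
        System.out.println("loadLibrary(\"mylib\") searches for: "
                + System.mapLibraryName("mylib"));

        // Assumption: the symlink lives in the current working directory.
        File workingDir = new File(System.getProperty("user.dir"));
        String path = absolutePathFor(workingDir, "mylib.so");
        if (new File(path).exists()) {
            System.load(path); // loads the exact file, no name mapping applied
            System.out.println("Loaded " + path);
        } else {
            System.out.println("No file at " + path + "; nothing loaded.");
        }
    }
}
```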
<p>Note: If you downloaded or built the native hadoop library, you don’t need to use DistributedCache to make the library available to your MapReduce tasks.</p></div></div>