<ahref="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html">Compatibilty between Hadoop 1.x and Hadoop 2.x</a>
<ahref="http://maven.apache.org/"title="Built by Maven"class="poweredBy">
<imgalt="Built by Maven"src="./images/logos/maven-feather.png"/>
</a>
</div>
</div>
<divid="bodyColumn">
<divid="contentBox">
<div class="section">
<h3>Overview<a name="Overview"></a></h3>
<p>libhdfs is a JNI-based C API for Hadoop's Distributed File System (HDFS). It provides C APIs to a subset of the HDFS APIs to manipulate HDFS files and the filesystem. libhdfs is part of the Hadoop distribution and comes pre-compiled in <tt>$HADOOP_PREFIX/libhdfs/libhdfs.so</tt>.</p></div>
<divclass="section">
<h3>The APIs<aname="The_APIs"></a></h3>
<p>The libhdfs APIs are a subset of the <a href="#hadoop_fs_APIs">Hadoop FileSystem APIs</a>.</p>
<p>The header file for libhdfs describes each API in detail and is available in <tt>$HADOOP_PREFIX/src/c++/libhdfs/hdfs.h</tt>.</p>
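<p>To give a feel for the shape of these calls before the full sample below, here is a minimal sketch; the <tt>"default"</tt> URI and the <tt>/tmp</tt> path are placeholder assumptions, not part of the original docs:</p>
<div class="source"><pre>#include <stdio.h>
#include "hdfs.h"

int main(void) {
    /* "default" resolves the namenode from the configuration on the CLASSPATH */
    hdfsFS fs = hdfsConnect("default", 0);
    if (!fs) {
        fprintf(stderr, "hdfsConnect failed\n");
        return 1;
    }

    /* stat a path and print a little of its metadata */
    hdfsFileInfo *info = hdfsGetPathInfo(fs, "/tmp");
    if (info) {
        printf("%s: %lld bytes, kind '%c'\n",
               info->mName, (long long)info->mSize, (char)info->mKind);
        hdfsFreeFileInfo(info, 1);
    }

    hdfsDisconnect(fs);
    return 0;
}</pre></div>
</div>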
fprintf(stderr, "Failed to 'flush' %s\n", writePath);
exit(-1);
}
hdfsCloseFile(fs, writeFile);
}</pre></div></div>
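<p>The write path above has a natural read-side counterpart. The sketch below is an addition, not part of the original sample, and assumes the same <tt>/tmp/testfile.txt</tt> path:</p>
<div class="source"><pre>#include <fcntl.h>   /* O_RDONLY */
#include <stdio.h>
#include "hdfs.h"

int main(void) {
    char buffer[32];
    const char* readPath = "/tmp/testfile.txt";

    hdfsFS fs = hdfsConnect("default", 0);
    hdfsFile readFile = hdfsOpenFile(fs, readPath, O_RDONLY, 0, 0, 0);
    if (!readFile) {
        fprintf(stderr, "Failed to open %s for reading!\n", readPath);
        return 1;
    }

    /* the sample wrote strlen("Hello, World!")+1 bytes, including the NUL */
    tSize num_read = hdfsRead(fs, readFile, (void*)buffer, sizeof(buffer));
    if (num_read > 0)
        printf("Read %d bytes: %s\n", (int)num_read, buffer);

    hdfsCloseFile(fs, readFile);
    hdfsDisconnect(fs);
    return 0;
}</pre></div>
</div>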
<divclass="section">
<h3>How To Link With The Library<aname="How_To_Link_With_The_Library"></a></h3>
<p>See the Makefile for <tt>hdfs_test.c</tt> in the libhdfs source directory (<tt>$HADOOP_PREFIX/src/c++/libhdfs/Makefile</tt>), or use something like: <tt>gcc above_sample.c -I$HADOOP_PREFIX/src/c++/libhdfs -L$HADOOP_PREFIX/libhdfs -lhdfs -o above_sample</tt></p>
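<p>Spelled out as a full build-and-run session, that might look like the sketch below. The <tt>libjvm.so</tt> directory varies by platform and JVM, so treat these paths as assumptions to adjust:</p>
<div class="source"><pre># compile against the libhdfs header and library (paths from the text above)
gcc above_sample.c -I$HADOOP_PREFIX/src/c++/libhdfs -L$HADOOP_PREFIX/libhdfs -lhdfs -o above_sample

# libhdfs embeds a JVM through JNI, so libjvm.so must be on the runtime library path
export LD_LIBRARY_PATH=$HADOOP_PREFIX/libhdfs:$JAVA_HOME/jre/lib/amd64/server
./above_sample</pre></div>
</div>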
<div class="section">
<h3>Common Problems<a name="Common_Problems"></a></h3>
<p>The most common problem is that the <tt>CLASSPATH</tt> is not set properly when calling a program that uses libhdfs. Make sure you set it to all the Hadoop jars needed to run Hadoop itself. Currently, there is no way to programmatically generate the classpath, but a good bet is to include all the jar files in <tt>$HADOOP_PREFIX</tt> and <tt>$HADOOP_PREFIX/lib</tt>, as well as the right configuration directory containing <tt>hdfs-site.xml</tt>.</p>
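<p>For example, a small wrapper script along these lines can assemble such a classpath. The <tt>conf</tt> directory name is an assumption; substitute whichever directory holds your <tt>hdfs-site.xml</tt>:</p>
<div class="source"><pre># collect every jar in $HADOOP_PREFIX and $HADOOP_PREFIX/lib, plus the config directory
CLASSPATH=$HADOOP_PREFIX/conf
for jar in $HADOOP_PREFIX/*.jar $HADOOP_PREFIX/lib/*.jar; do
    CLASSPATH=$CLASSPATH:$jar
done
export CLASSPATH</pre></div>
</div>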
<divclass="section">
<h3>Thread Safe<aname="Thread_Safe"></a></h3>
<p>libdhfs is thread safe.</p>
<ul>
<li>Concurrency and Hadoop FS "handles"
<p>The Hadoop FS implementation includes an FS handle cache that caches handles based on the URI of the namenode and the connecting user. So, all calls to <tt>hdfsConnect</tt> will return the same handle, but calls to <tt>hdfsConnectAsUser</tt> with different users will return different handles. Since HDFS client handles are completely thread safe, however, this has no bearing on concurrency (see the sketch after this list).</p></li>
<li>Concurrency and libhdfs/JNI
<p>The libhdfs calls to JNI always use thread-local storage, so (in theory) libhdfs should be as thread safe as the underlying calls to the Hadoop FS.</p></li></ul>
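<p>As an illustration of that thread-safety claim, the sketch below, an addition to the original docs with placeholder paths, shares one connected handle across POSIX threads:</p>
<div class="source"><pre>#include <pthread.h>
#include <stdio.h>
#include "hdfs.h"

/* One shared handle, as described in the handle-cache item above. */
static hdfsFS fs;

static void *stat_path(void *arg) {
    const char *path = (const char *)arg;
    /* hdfsExists returns 0 if the path exists; the handle is safe to share */
    printf("%s %s\n", path, hdfsExists(fs, path) == 0 ? "exists" : "is missing");
    return NULL;
}

int main(void) {
    pthread_t threads[2];
    const char *paths[2] = { "/tmp", "/tmp/testfile.txt" };
    int i;

    fs = hdfsConnect("default", 0);
    if (!fs) {
        fprintf(stderr, "hdfsConnect failed\n");
        return 1;
    }

    for (i = 0; i < 2; i++)
        pthread_create(&threads[i], NULL, stat_path, (void *)paths[i]);
    for (i = 0; i < 2; i++)
        pthread_join(threads[i], NULL);

    hdfsDisconnect(fs);
    return 0;
}</pre></div>
</div>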