~~ Licensed under the Apache License, Version 2.0 (the "License");
~~ you may not use this file except in compliance with the License.
~~ You may obtain a copy of the License at
~~
~~ http://www.apache.org/licenses/LICENSE-2.0
~~
~~ Unless required by applicable law or agreed to in writing, software
~~ distributed under the License is distributed on an "AS IS" BASIS,
~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
~~ See the License for the specific language governing permissions and
~~ limitations under the License. See accompanying LICENSE file.
---
C API libhdfs
---
---
${maven.build.timestamp}

C API libhdfs

%{toc|section=1|fromDepth=0}
* Overview
libhdfs is a JNI-based C API for Hadoop's Distributed File System
(HDFS). It provides C APIs to a subset of the HDFS APIs to manipulate
HDFS files and the filesystem. libhdfs is part of the Hadoop
distribution and comes pre-compiled in
<<<${HADOOP_PREFIX}/libhdfs/libhdfs.so>>>.
* The APIs
The libhdfs APIs are a subset of: {{{hadoop fs APIs}}}.
The header file for libhdfs describes each API in detail and is
available in <<<${HADOOP_PREFIX}/src/c++/libhdfs/hdfs.h>>>.
* A Sample Program
----
\#include "hdfs.h"

\#include <fcntl.h>   /* O_WRONLY, O_CREAT */
\#include <stdio.h>   /* fprintf, stderr */
\#include <stdlib.h>  /* exit */
\#include <string.h>  /* strlen */

int main(int argc, char **argv)
{
    hdfsFS fs = hdfsConnect("default", 0);
    if (!fs) {
        fprintf(stderr, "Failed to connect to hdfs!\n");
        exit(-1);
    }

    const char* writePath = "/tmp/testfile.txt";
    hdfsFile writeFile = hdfsOpenFile(fs, writePath, O_WRONLY|O_CREAT, 0, 0, 0);
    if (!writeFile) {
        fprintf(stderr, "Failed to open %s for writing!\n", writePath);
        exit(-1);
    }

    const char* buffer = "Hello, World!";
    tSize num_written_bytes =
        hdfsWrite(fs, writeFile, (void*)buffer, strlen(buffer)+1);
    if (num_written_bytes < 0) {
        fprintf(stderr, "Failed to write to %s!\n", writePath);
        exit(-1);
    }

    if (hdfsFlush(fs, writeFile)) {
        fprintf(stderr, "Failed to 'flush' %s\n", writePath);
        exit(-1);
    }

    hdfsCloseFile(fs, writeFile);
    hdfsDisconnect(fs);
    return 0;
}
----
* How To Link With The Library
See the Makefile for <<<hdfs_test.c>>> in the libhdfs source directory
(<<<${HADOOP_PREFIX}/src/c++/libhdfs/Makefile>>>) or something like:
<<<gcc above_sample.c -I${HADOOP_PREFIX}/src/c++/libhdfs -L${HADOOP_PREFIX}/libhdfs -lhdfs -o above_sample>>>
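The same flags can also be collected into a small Makefile. This is only a sketch under the layout assumed above (the sample saved as <<<above_sample.c>>>, <<<HADOOP_PREFIX>>> pointing at the Hadoop install); it is not the Makefile shipped with <<<hdfs_test.c>>>:

----
# Sketch only: paths mirror the gcc line above; adjust to your install.
HADOOP_PREFIX ?= /opt/hadoop

CFLAGS  += -I$(HADOOP_PREFIX)/src/c++/libhdfs
LDFLAGS += -L$(HADOOP_PREFIX)/libhdfs
LDLIBS  += -lhdfs

above_sample: above_sample.c
	$(CC) $(CFLAGS) above_sample.c $(LDFLAGS) $(LDLIBS) -o above_sample
----

Note that linking is only half the story: at run time the binary still needs <<<CLASSPATH>>> set (see Common Problems below) and <<<LD_LIBRARY_PATH>>> pointing at <<<libhdfs.so>>> and the JVM's shared libraries.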
* Common Problems
The most common problem is that <<<CLASSPATH>>> is not set properly when
calling a program that uses libhdfs. Make sure you set it to all the
Hadoop jars needed to run Hadoop itself. Currently, there is no way to
programmatically generate the classpath, but a good bet is to include
all the jar files in <<<${HADOOP_PREFIX}>>> and <<<${HADOOP_PREFIX}/lib>>> as well
as the right configuration directory containing <<<hdfs-site.xml>>>.
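One way to assemble such a classpath is to walk those directories in a small shell snippet. This is a sketch only: the install prefix and the <<<conf>>> directory are assumptions, so point them at your actual install.

----
# Sketch only: HADOOP_PREFIX and the conf directory are assumptions.
HADOOP_PREFIX=${HADOOP_PREFIX:-/opt/hadoop}

# Start with the configuration directory, then append every jar found
# directly under the prefix and under lib/.
CLASSPATH="${HADOOP_PREFIX}/conf"
for jar in "${HADOOP_PREFIX}"/*.jar "${HADOOP_PREFIX}"/lib/*.jar; do
    if [ -e "$jar" ]; then
        CLASSPATH="${CLASSPATH}:${jar}"
    fi
done
export CLASSPATH
----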
* Thread Safe
libhdfs is thread safe.
* Concurrency and Hadoop FS "handles"
The Hadoop FS implementation includes a FS handle cache which
caches based on the URI of the namenode along with the user
connecting. So, all calls to <<<hdfsConnect>>> will return the same
handle, but calls to <<<hdfsConnectAsUser>>> with different users will
return different handles. Since HDFS client handles are completely
thread safe, this has no bearing on concurrency.
* Concurrency and libhdfs/JNI
The libhdfs calls to JNI should always create thread-local
storage, so (in theory) libhdfs should be as thread safe as the
underlying calls to the Hadoop FS.