<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- Generated by Apache Maven Doxia at 2014-02-11 -->
<html
xmlns=
"http://www.w3.org/1999/xhtml"
>
<head>
<title>
Apache Hadoop Distributed Copy -
DistCp
</title>
<style
type=
"text/css"
media=
"all"
>
@import
url
(
"./css/maven-base.css"
)
;
@import
url
(
"./css/maven-theme.css"
)
;
@import
url
(
"./css/site.css"
)
;
</style>
<link
rel=
"stylesheet"
href=
"./css/print.css"
type=
"text/css"
media=
"print"
/>
<meta
name=
"Date-Revision-yyyymmdd"
content=
"20140211"
/>
<meta
http-equiv=
"Content-Type"
content=
"text/html; charset=UTF-8"
/>
</head>
<body
class=
"composite"
>
<div
id=
"banner"
>
<a
href=
"http://hadoop.apache.org/"
id=
"bannerLeft"
>
<img
src=
"http://hadoop.apache.org/images/hadoop-logo.jpg"
alt=
""
/>
</a>
<a
href=
"http://www.apache.org/"
id=
"bannerRight"
>
<img
src=
"http://www.apache.org/images/asf_logo_wide.png"
alt=
""
/>
</a>
<div
class=
"clear"
>
<hr/>
</div>
</div>
<div
id=
"breadcrumbs"
>
<div
class=
"xleft"
>
<a
href=
"http://www.apache.org/"
class=
"externalLink"
>
Apache
</a>
>
<a
href=
"http://hadoop.apache.org/"
class=
"externalLink"
>
Hadoop
</a>
>
Apache Hadoop Distributed Copy
</div>
<div
class=
"xright"
>
<a
href=
"http://wiki.apache.org/hadoop"
class=
"externalLink"
>
Wiki
</a>
|
<a
href=
"https://svn.apache.org/repos/asf/hadoop/"
class=
"externalLink"
>
SVN
</a>
| Last Published: 2014-02-11
| Version: 2.3.0
</div>
<div
class=
"clear"
>
<hr/>
</div>
</div>
<div
id=
"leftColumn"
>
<div
id=
"navcolumn"
>
<h5>
General
</h5>
<ul>
<li
class=
"none"
>
<a
href=
"../index.html"
>
Overview
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/SingleCluster.html"
>
Single Node Setup
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/ClusterSetup.html"
>
Cluster Setup
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/CommandsManual.html"
>
Hadoop Commands Reference
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/FileSystemShell.html"
>
File System Shell
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/Compatibility.html"
>
Hadoop Compatibility
</a>
</li>
</ul>
<h5>
Common
</h5>
<ul>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/CLIMiniCluster.html"
>
CLI Mini Cluster
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/NativeLibraries.html"
>
Native Libraries
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/Superusers.html"
>
Superusers
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/SecureMode.html"
>
Secure Mode
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/ServiceLevelAuth.html"
>
Service Level Authorization
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/HttpAuthentication.html"
>
HTTP Authentication
</a>
</li>
</ul>
<h5>
HDFS
</h5>
<ul>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html"
>
HDFS User Guide
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html"
>
High Availability With QJM
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithNFS.html"
>
High Availability With NFS
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/Federation.html"
>
Federation
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html"
>
HDFS Snapshots
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/HdfsDesign.html"
>
HDFS Architecture
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/HdfsEditsViewer.html"
>
Edits Viewer
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/HdfsImageViewer.html"
>
Image Viewer
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html"
>
Permissions and HDFS
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/HdfsQuotaAdminGuide.html"
>
Quotas and HDFS
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/Hftp.html"
>
HFTP
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/LibHdfs.html"
>
C API libhdfs
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/WebHDFS.html"
>
WebHDFS REST API
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-hdfs-httpfs/index.html"
>
HttpFS Gateway
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html"
>
Short Circuit Local Reads
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html"
>
Centralized Cache Management
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html"
>
HDFS NFS Gateway
</a>
</li>
</ul>
<h5>
MapReduce
</h5>
<ul>
<li
class=
"none"
>
<a
href=
"../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html"
>
Compatibility between Hadoop 1.x and Hadoop 2.x
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-mapreduce-client/hadoop-mapreduce-client-core/EncryptedShuffle.html"
>
Encrypted Shuffle
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html"
>
Pluggable Shuffle/Sort
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-mapreduce-client/hadoop-mapreduce-client-core/DistributedCacheDeploy.html"
>
Distributed Cache Deploy
</a>
</li>
</ul>
<h5>
YARN
</h5>
<ul>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/YARN.html"
>
YARN Architecture
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html"
>
Writing YARN Applications
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html"
>
Capacity Scheduler
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/FairScheduler.html"
>
Fair Scheduler
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/WebApplicationProxy.html"
>
Web Application Proxy
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/YarnCommands.html"
>
YARN Commands
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-sls/SchedulerLoadSimulator.html"
>
Scheduler Load Simulator
</a>
</li>
</ul>
<h5>
YARN REST APIs
</h5>
<ul>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/WebServicesIntro.html"
>
Introduction
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html"
>
Resource Manager
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html"
>
Node Manager
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/MapredAppMasterRest.html"
>
MR Application Master
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/HistoryServerRest.html"
>
History Server
</a>
</li>
</ul>
<h5>
Auth
</h5>
<ul>
<li
class=
"none"
>
<a
href=
"../hadoop-auth/index.html"
>
Overview
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-auth/Examples.html"
>
Examples
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-auth/Configuration.html"
>
Configuration
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-auth/BuildingIt.html"
>
Building
</a>
</li>
</ul>
<h5>
Reference
</h5>
<ul>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/releasenotes.html"
>
Release Notes
</a>
</li>
<li
class=
"none"
>
<a
href=
"../api/index.html"
>
API docs
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/CHANGES.txt"
>
Common CHANGES.txt
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/CHANGES.txt"
>
HDFS CHANGES.txt
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-mapreduce/CHANGES.txt"
>
MapReduce CHANGES.txt
</a>
</li>
</ul>
<h5>
Configuration
</h5>
<ul>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/core-default.xml"
>
core-default.xml
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/hdfs-default.xml"
>
hdfs-default.xml
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml"
>
mapred-default.xml
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-common/yarn-default.xml"
>
yarn-default.xml
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/DeprecatedProperties.html"
>
Deprecated Properties
</a>
</li>
</ul>
<a
href=
"http://maven.apache.org/"
title=
"Built by Maven"
class=
"poweredBy"
>
<img
alt=
"Built by Maven"
src=
"./images/logos/maven-feather.png"
/>
</a>
</div>
</div>
<div
id=
"bodyColumn"
>
<div
id=
"contentBox"
>
<!-- Copyright 2002-2004 The Apache Software Foundation
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. -->
<div
class=
"section"
>
<h2>
Overview
<a
name=
"Overview"
></a></h2>
<p>
DistCp (distributed copy) is a tool for large inter-cluster and
intra-cluster copying. It uses MapReduce to distribute the copy, to
handle and recover from errors, and to report results. It expands a list
of files and directories into input for map tasks, each of which copies
a partition of the files specified in the source list.
</p>
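<p>
As a rough illustration of that partitioning step, the following Python
sketch assigns a hypothetical source listing to a fixed number of map tasks
so that each task receives a similar number of bytes. It is not DistCp's
implementation: the function name partition_by_size, the sample paths, and
the greedy balancing strategy are all assumptions made for this example.
</p>

```python
import heapq


def partition_by_size(files, num_maps):
    """Assign (path, size) pairs to num_maps buckets, balancing total bytes."""
    # Min-heap of (bytes_assigned, bucket_index): the lightest bucket is on top.
    heap = [(0, i) for i in range(num_maps)]
    buckets = [[] for _ in range(num_maps)]
    # Place the largest files first so smaller files can even out the totals.
    for path, size in sorted(files, key=lambda f: f[1], reverse=True):
        assigned, idx = heapq.heappop(heap)
        buckets[idx].append(path)
        heapq.heappush(heap, (assigned + size, idx))
    return buckets


# Hypothetical source listing: (path, size in bytes).
listing = [("/src/a", 400), ("/src/b", 300), ("/src/c", 200), ("/src/d", 100)]
for task_id, paths in enumerate(partition_by_size(listing, 2)):
    print(task_id, paths)
```

<p>
Each inner list stands for the files one map task would copy. With the
sample listing, task 0 receives /src/a and /src/d and task 1 receives
/src/b and /src/c, which is 500 bytes per task.
</p>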
<p>
The legacy implementation of DistCp has its share of quirks and
drawbacks in usage, extensibility, and performance. The DistCp refactor
was intended to fix these shortcomings and to allow DistCp to be used and
extended programmatically. New paradigms have been introduced to improve
runtime and setup performance, while retaining the legacy behaviour as
the default.
</p>
<p>
This document describes the design of the new DistCp, its new features,
their optimal use, and any deviations from the legacy implementation.
</p>
</div>
</div>
</div>
<div
class=
"clear"
>
<hr/>
</div>
<div
id=
"footer"
>
<div
class=
"xright"
>
©
2014
Apache Software Foundation
-
<a
href=
"http://maven.apache.org/privacy-policy.html"
>
Privacy Policy
</a></div>
<div
class=
"clear"
>
<hr/>
</div>
</div>
</body>
</html>