<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- Generated by Apache Maven Doxia at 2014-02-11 -->
<html
xmlns=
"http://www.w3.org/1999/xhtml"
>
<head>
<title>
Apache Hadoop Distributed Copy -
Command Line Options
</title>
<style
type=
"text/css"
media=
"all"
>
@import
url
(
"./css/maven-base.css"
)
;
@import
url
(
"./css/maven-theme.css"
)
;
@import
url
(
"./css/site.css"
)
;
</style>
<link
rel=
"stylesheet"
href=
"./css/print.css"
type=
"text/css"
media=
"print"
/>
<meta
name=
"Date-Revision-yyyymmdd"
content=
"20140211"
/>
<meta
http-equiv=
"Content-Type"
content=
"text/html; charset=UTF-8"
/>
</head>
<body
class=
"composite"
>
<div
id=
"banner"
>
<a
href=
"http://hadoop.apache.org/"
id=
"bannerLeft"
>
<img
src=
"http://hadoop.apache.org/images/hadoop-logo.jpg"
alt=
""
/>
</a>
<a
href=
"http://www.apache.org/"
id=
"bannerRight"
>
<img
src=
"http://www.apache.org/images/asf_logo_wide.png"
alt=
""
/>
</a>
<div
class=
"clear"
>
<hr/>
</div>
</div>
<div
id=
"breadcrumbs"
>
<div
class=
"xleft"
>
<a
href=
"http://www.apache.org/"
class=
"externalLink"
>
Apache
</a>
&gt;
<a
href=
"http://hadoop.apache.org/"
class=
"externalLink"
>
Hadoop
</a>
&gt;
Apache Hadoop Distributed Copy
</div>
<div
class=
"xright"
>
<a
href=
"http://wiki.apache.org/hadoop"
class=
"externalLink"
>
Wiki
</a>
|
<a
href=
"https://svn.apache.org/repos/asf/hadoop/"
class=
"externalLink"
>
SVN
</a>
| Last Published: 2014-02-11
| Version: 2.3.0
</div>
<div
class=
"clear"
>
<hr/>
</div>
</div>
<div
id=
"leftColumn"
>
<div
id=
"navcolumn"
>
<h5>
General
</h5>
<ul>
<li
class=
"none"
>
<a
href=
"../index.html"
>
Overview
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/SingleCluster.html"
>
Single Node Setup
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/ClusterSetup.html"
>
Cluster Setup
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/CommandsManual.html"
>
Hadoop Commands Reference
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/FileSystemShell.html"
>
File System Shell
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/Compatibility.html"
>
Hadoop Compatibility
</a>
</li>
</ul>
<h5>
Common
</h5>
<ul>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/CLIMiniCluster.html"
>
CLI Mini Cluster
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/NativeLibraries.html"
>
Native Libraries
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/Superusers.html"
>
Superusers
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/SecureMode.html"
>
Secure Mode
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/ServiceLevelAuth.html"
>
Service Level Authorization
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/HttpAuthentication.html"
>
HTTP Authentication
</a>
</li>
</ul>
<h5>
HDFS
</h5>
<ul>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html"
>
HDFS User Guide
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html"
>
High Availability With QJM
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithNFS.html"
>
High Availability With NFS
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/Federation.html"
>
Federation
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html"
>
HDFS Snapshots
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/HdfsDesign.html"
>
HDFS Architecture
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/HdfsEditsViewer.html"
>
Edits Viewer
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/HdfsImageViewer.html"
>
Image Viewer
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html"
>
Permissions and HDFS
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/HdfsQuotaAdminGuide.html"
>
Quotas and HDFS
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/Hftp.html"
>
HFTP
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/LibHdfs.html"
>
C API libhdfs
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/WebHDFS.html"
>
WebHDFS REST API
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-hdfs-httpfs/index.html"
>
HttpFS Gateway
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html"
>
Short Circuit Local Reads
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html"
>
Centralized Cache Management
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html"
>
HDFS NFS Gateway
</a>
</li>
</ul>
<h5>
MapReduce
</h5>
<ul>
<li
class=
"none"
>
<a
href=
"../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html"
>
Compatibility between Hadoop 1.x and Hadoop 2.x
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-mapreduce-client/hadoop-mapreduce-client-core/EncryptedShuffle.html"
>
Encrypted Shuffle
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html"
>
Pluggable Shuffle/Sort
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-mapreduce-client/hadoop-mapreduce-client-core/DistributedCacheDeploy.html"
>
Distributed Cache Deploy
</a>
</li>
</ul>
<h5>
YARN
</h5>
<ul>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/YARN.html"
>
YARN Architecture
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html"
>
Writing YARN Applications
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html"
>
Capacity Scheduler
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/FairScheduler.html"
>
Fair Scheduler
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/WebApplicationProxy.html"
>
Web Application Proxy
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/YarnCommands.html"
>
YARN Commands
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-sls/SchedulerLoadSimulator.html"
>
Scheduler Load Simulator
</a>
</li>
</ul>
<h5>
YARN REST APIs
</h5>
<ul>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/WebServicesIntro.html"
>
Introduction
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html"
>
Resource Manager
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html"
>
Node Manager
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/MapredAppMasterRest.html"
>
MR Application Master
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-site/HistoryServerRest.html"
>
History Server
</a>
</li>
</ul>
<h5>
Auth
</h5>
<ul>
<li
class=
"none"
>
<a
href=
"../hadoop-auth/index.html"
>
Overview
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-auth/Examples.html"
>
Examples
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-auth/Configuration.html"
>
Configuration
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-auth/BuildingIt.html"
>
Building
</a>
</li>
</ul>
<h5>
Reference
</h5>
<ul>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/releasenotes.html"
>
Release Notes
</a>
</li>
<li
class=
"none"
>
<a
href=
"../api/index.html"
>
API docs
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/CHANGES.txt"
>
Common CHANGES.txt
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/CHANGES.txt"
>
HDFS CHANGES.txt
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-mapreduce/CHANGES.txt"
>
MapReduce CHANGES.txt
</a>
</li>
</ul>
<h5>
Configuration
</h5>
<ul>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/core-default.xml"
>
core-default.xml
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-hdfs/hdfs-default.xml"
>
hdfs-default.xml
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml"
>
mapred-default.xml
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-yarn/hadoop-yarn-common/yarn-default.xml"
>
yarn-default.xml
</a>
</li>
<li
class=
"none"
>
<a
href=
"../hadoop-project-dist/hadoop-common/DeprecatedProperties.html"
>
Deprecated Properties
</a>
</li>
</ul>
<a
href=
"http://maven.apache.org/"
title=
"Built by Maven"
class=
"poweredBy"
>
<img
alt=
"Built by Maven"
src=
"./images/logos/maven-feather.png"
/>
</a>
</div>
</div>
<div
id=
"bodyColumn"
>
<div
id=
"contentBox"
>
<!-- Copyright 2002-2004 The Apache Software Foundation
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. -->
<div
class=
"section"
>
<h2>
Options Index
<a
name=
"Options_Index"
></a></h2>
<table
border=
"0"
class=
"bodyTable"
>
<tr
class=
"a"
>
<th>
Flag
</th>
<th>
Description
</th>
<th>
Notes
</th></tr>
<tr
class=
"b"
>
<td><tt>
-p[rbugp]
</tt></td>
<td>
Preserve
<br
/>
r: replication number
<br
/>
b: block size
<br
/>
u: user
<br
/>
g: group
<br
/>
p: permission
<br
/></td>
<td>
Modification times are not preserved. Also, when
<tt>
-update
</tt>
is specified, status updates will
<b>
not
</b>
be synchronized unless the file sizes
also differ (i.e. unless the file is re-created).
</td></tr>
<tr
class=
"a"
>
<td><tt>
-i
</tt></td>
<td>
Ignore failures
</td>
<td>
As explained in the Appendix, this option
will keep more accurate statistics about the copy than the
default case. It also preserves logs from failed copies, which
can be valuable for debugging. Finally, a failing map will not
cause the job to fail before all splits are attempted.
</td></tr>
<tr
class=
"b"
>
<td><tt>
-log
&lt;logdir&gt;
</tt></td>
<td>
Write logs to
&lt;logdir&gt;
</td>
<td>
DistCp keeps logs of each file it attempts to copy as map
output. If a map fails and is re-executed, the log output of
the failed attempt is not retained.
</td></tr>
<tr
class=
"a"
>
<td><tt>
-m
&lt;num_maps&gt;
</tt></td>
<td>
Maximum number of simultaneous copies
</td>
<td>
Specify the number of maps to copy data. Note that more maps
may not necessarily improve throughput.
</td></tr>
<tr
class=
"b"
>
<td><tt>
-overwrite
</tt></td>
<td>
Overwrite destination
</td>
<td>
If a map fails and
<tt>
-i
</tt>
is not specified, all the
files in the split, not only those that failed, will be recopied.
As discussed in the Usage documentation, it also changes
the semantics for generating destination paths, so users should
use this carefully.
</td></tr>
<tr
class=
"a"
>
<td><tt>
-update
</tt></td>
<td>
Overwrite if src size differs from dst size
</td>
<td>
As noted above, this is not a
"
sync
"
operation. The only criterion examined is the source and
destination file sizes; if they differ, the source file
replaces the destination file. As discussed in the
Usage documentation, it also changes the semantics for
generating destination paths, so users should use this carefully.
</td></tr>
<tr
class=
"b"
>
<td><tt>
-f
&lt;urilist_uri&gt;
</tt></td>
<td>
Use list at
&lt;urilist_uri&gt;
as src list
</td>
<td>
This is equivalent to listing each source on the command
line. The
<tt>
urilist_uri
</tt>
list should be a fully
qualified URI.
</td></tr>
<tr
class=
"a"
>
<td><tt>
-filelimit
&lt;n&gt;
</tt></td>
<td>
Limit the total number of files to be
&lt;= n
</td>
<td><b>
Deprecated!
</b>
Ignored in the new DistCp.
</td></tr>
<tr
class=
"b"
>
<td><tt>
-sizelimit
&lt;n&gt;
</tt></td>
<td>
Limit the total size to be
&lt;= n bytes
</td>
<td><b>
Deprecated!
</b>
Ignored in the new DistCp.
</td></tr>
<tr
class=
"a"
>
<td><tt>
-delete
</tt></td>
<td>
Delete the files existing in the dst but not in src
</td>
<td>
The deletion is done by FS Shell, so the trash will be used
if it is enabled.
</td></tr>
<tr
class=
"b"
>
<td><tt>
-strategy {dynamic|uniformsize}
</tt></td>
<td>
Choose the copy-strategy to be used in DistCp.
</td>
<td>
By default, uniformsize is used (i.e. maps are balanced on the
total size of files copied by each map, similar to the legacy DistCp).
If
"
dynamic
"
is specified,
<tt>
DynamicInputFormat
</tt>
is
used instead. (This is described in the Architecture section,
under InputFormats.)
</td></tr>
<tr
class=
"a"
>
<td><tt>
-bandwidth
</tt></td>
<td>
Specify bandwidth per map, in MB/second.
</td>
<td>
Each map will be restricted to consume only the specified
bandwidth. This is not always exact. The map throttles back
its bandwidth consumption during a copy, such that the
<b>
net
</b>
bandwidth used tends towards the
specified value.
</td></tr>
<tr
class=
"b"
>
<td><tt>
-atomic {-tmp
&lt;tmp_dir&gt;
}
</tt></td>
<td>
Specify atomic commit, with optional tmp directory.
</td>
<td><tt>
-atomic
</tt>
instructs DistCp to copy the source
data to a temporary target location, and then move the
temporary target to the final location atomically. Data will
either be available at the final target in a complete and consistent
form, or not at all.
Optionally,
<tt>
-tmp
</tt>
may be used to specify the
location of the temporary target. If not specified, a default is
chosen.
<b>
Note:
</b>
tmp_dir must be on the final
target cluster.
</td></tr>
<tr
class=
"a"
>
<td><tt>
-mapredSslConf
&lt;ssl_conf_file&gt;
</tt></td>
<td>
Specify SSL Config file, to be used with HSFTP source
</td>
<td>
When using the hsftp protocol with a source, the security-related
properties may be specified in a config file and
passed to DistCp.
&lt;ssl_conf_file&gt;
needs to be in
the classpath.
</td></tr>
<tr
class=
"b"
>
<td><tt>
-async
</tt></td>
<td>
Run DistCp asynchronously. Quits as soon as the Hadoop
Job is launched.
</td>
<td>
The Hadoop Job-id is logged, for tracking.
</td></tr>
</table>
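<p>
The size-only criterion described for
<tt>-update</tt>
can be illustrated with a small sketch. This is not DistCp code: the function name
<tt>needs_copy_with_update</tt>
and the use of
<tt>None</tt>
for a missing destination file are assumptions made for illustration. It only shows why a file whose contents changed but whose length did not is left untouched.
</p>

```python
def needs_copy_with_update(src_size, dst_size):
    """Model the size-only check described for -update.

    Hypothetical helper, not part of DistCp. dst_size is None when
    the destination file does not exist yet (an assumption of this sketch).
    """
    if dst_size is None:
        # Nothing at the destination, so the source file is copied.
        return True
    # Only the lengths are compared; contents and timestamps are ignored.
    return src_size != dst_size


# Same length on both sides: treated as up to date, even if contents differ.
print(needs_copy_with_update(1024, 1024))
# Different lengths: the source file replaces the destination file.
print(needs_copy_with_update(1024, 2048))
```

<p>
Because equal sizes are treated as "up to date", a run with
<tt>-update</tt>
should not be relied on as a full synchronization of contents or file status.
</p>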
</div>
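<p>
The note on
<tt>-bandwidth</tt>
says that each map throttles back so that its net bandwidth tends towards the specified value. One way to picture this is a pause computed from the bytes copied so far. The sketch below is an assumption about how such throttling could work, not DistCp's implementation; the name
<tt>throttle_delay</tt>
and the choice of 1 MB = 1024 * 1024 bytes are hypothetical.
</p>

```python
BYTES_PER_MB = 1024 * 1024  # assumption of this sketch: MB means 2**20 bytes


def throttle_delay(bytes_copied, elapsed_seconds, limit_mb_per_second):
    """Return how many seconds to pause so the average rate drops to the limit.

    Hypothetical helper, not part of DistCp. The pause is the difference
    between the time the copy should have taken at the limit and the time
    it actually took, and never negative.
    """
    target_seconds = bytes_copied / (limit_mb_per_second * BYTES_PER_MB)
    return max(0.0, target_seconds - elapsed_seconds)


# 10 MB copied in 1 second against a 5 MB/second limit: pause to catch up.
print(throttle_delay(10 * BYTES_PER_MB, 1.0, 5))
# 10 MB copied in 4 seconds is already under the limit: no pause needed.
print(throttle_delay(10 * BYTES_PER_MB, 4.0, 5))
```

<p>
Since the pause is applied after data has already been read, short bursts can exceed the limit, which is consistent with the remark that the restriction is not always exact.
</p>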
</div>
</div>
<div
class=
"clear"
>
<hr/>
</div>
<div
id=
"footer"
>
<div
class=
"xright"
>
©
2014
Apache Software Foundation
-
<a
href=
"http://maven.apache.org/privacy-policy.html"
>
Privacy Policy
</a></div>
<div
class=
"clear"
>
<hr/>
</div>
</div>
</body>
</html>