<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!-- Generated by Apache Maven Doxia at 2014-02-11 -->
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Apache Hadoop 2.3.0 -
HDFS Snapshots</title>
<style type="text/css" media="all">
@import url("./css/maven-base.css");
@import url("./css/maven-theme.css");
@import url("./css/site.css");
</style>
<link rel="stylesheet" href="./css/print.css" type="text/css" media="print" />
<meta name="Date-Revision-yyyymmdd" content="20140211" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body class="composite">
<div id="banner">
<a href="http://hadoop.apache.org/" id="bannerLeft">
<img src="http://hadoop.apache.org/images/hadoop-logo.jpg" alt="" />
</a>
<a href="http://www.apache.org/" id="bannerRight">
<img src="http://www.apache.org/images/asf_logo_wide.png" alt="" />
</a>
<div class="clear">
<hr/>
</div>
</div>
<div id="breadcrumbs">
<div class="xleft">
<a href="http://www.apache.org/" class="externalLink">Apache</a>
&gt;
<a href="http://hadoop.apache.org/" class="externalLink">Hadoop</a>
&gt;
<a href="../">Apache Hadoop Project Dist POM</a>
&gt;
Apache Hadoop 2.3.0
</div>
<div class="xright"> <a href="http://wiki.apache.org/hadoop" class="externalLink">Wiki</a>
|
<a href="https://svn.apache.org/repos/asf/hadoop/" class="externalLink">SVN</a>
|
<a href="http://hadoop.apache.org/" class="externalLink">Apache Hadoop</a>
&nbsp;| Last Published: 2014-02-11
&nbsp;| Version: 2.3.0
</div>
<div class="clear">
<hr/>
</div>
</div>
<div id="leftColumn">
<div id="navcolumn">
<h5>General</h5>
<ul>
<li class="none">
<a href="../../index.html">Overview</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/SingleCluster.html">Single Node Setup</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/ClusterSetup.html">Cluster Setup</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/CommandsManual.html">Hadoop Commands Reference</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/FileSystemShell.html">File System Shell</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/Compatibility.html">Hadoop Compatibility</a>
</li>
</ul>
<h5>Common</h5>
<ul>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/CLIMiniCluster.html">CLI Mini Cluster</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/NativeLibraries.html">Native Libraries</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/Superusers.html">Superusers</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/SecureMode.html">Secure Mode</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/ServiceLevelAuth.html">Service Level Authorization</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/HttpAuthentication.html">HTTP Authentication</a>
</li>
</ul>
<h5>HDFS</h5>
<ul>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html">HDFS User Guide</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html">High Availability With QJM</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithNFS.html">High Availability With NFS</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/Federation.html">Federation</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html">HDFS Snapshots</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsDesign.html">HDFS Architecture</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsEditsViewer.html">Edits Viewer</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsImageViewer.html">Image Viewer</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsPermissionsGuide.html">Permissions and HDFS</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsQuotaAdminGuide.html">Quotas and HDFS</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/Hftp.html">HFTP</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/LibHdfs.html">C API libhdfs</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/WebHDFS.html">WebHDFS REST API</a>
</li>
<li class="none">
<a href="../../hadoop-hdfs-httpfs/index.html">HttpFS Gateway</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/ShortCircuitLocalReads.html">Short Circuit Local Reads</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/CentralizedCacheManagement.html">Centralized Cache Management</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/HdfsNfsGateway.html">HDFS NFS Gateway</a>
</li>
</ul>
<h5>MapReduce</h5>
<ul>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html">Compatibility between Hadoop 1.x and Hadoop 2.x</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/EncryptedShuffle.html">Encrypted Shuffle</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/PluggableShuffleAndPluggableSort.html">Pluggable Shuffle/Sort</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/DistributedCacheDeploy.html">Distributed Cache Deploy</a>
</li>
</ul>
<h5>YARN</h5>
<ul>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/YARN.html">YARN Architecture</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html">Writing YARN Applications</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html">Capacity Scheduler</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/FairScheduler.html">Fair Scheduler</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/WebApplicationProxy.html">Web Application Proxy</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/YarnCommands.html">YARN Commands</a>
</li>
<li class="none">
<a href="../../hadoop-sls/SchedulerLoadSimulator.html">Scheduler Load Simulator</a>
</li>
</ul>
<h5>YARN REST APIs</h5>
<ul>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/WebServicesIntro.html">Introduction</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/ResourceManagerRest.html">Resource Manager</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/NodeManagerRest.html">Node Manager</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/MapredAppMasterRest.html">MR Application Master</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-site/HistoryServerRest.html">History Server</a>
</li>
</ul>
<h5>Auth</h5>
<ul>
<li class="none">
<a href="../../hadoop-auth/index.html">Overview</a>
</li>
<li class="none">
<a href="../../hadoop-auth/Examples.html">Examples</a>
</li>
<li class="none">
<a href="../../hadoop-auth/Configuration.html">Configuration</a>
</li>
<li class="none">
<a href="../../hadoop-auth/BuildingIt.html">Building</a>
</li>
</ul>
<h5>Reference</h5>
<ul>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/releasenotes.html">Release Notes</a>
</li>
<li class="none">
<a href="../../api/index.html">API docs</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/CHANGES.txt">Common CHANGES.txt</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/CHANGES.txt">HDFS CHANGES.txt</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-mapreduce/CHANGES.txt">MapReduce CHANGES.txt</a>
</li>
</ul>
<h5>Configuration</h5>
<ul>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/core-default.xml">core-default.xml</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-hdfs/hdfs-default.xml">hdfs-default.xml</a>
</li>
<li class="none">
<a href="../../hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml">mapred-default.xml</a>
</li>
<li class="none">
<a href="../../hadoop-yarn/hadoop-yarn-common/yarn-default.xml">yarn-default.xml</a>
</li>
<li class="none">
<a href="../../hadoop-project-dist/hadoop-common/DeprecatedProperties.html">Deprecated Properties</a>
</li>
</ul>
<a href="http://maven.apache.org/" title="Built by Maven" class="poweredBy">
<img alt="Built by Maven" src="./images/logos/maven-feather.png"/>
</a>
</div>
</div>
<div id="bodyColumn">
<div id="contentBox">
<!-- Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. -->
<h1>HDFS Snapshots</h1>
<ul>
<li><a href="#Overview">Overview</a>
<ul>
<li><a href="#Snapshottable_Directories">Snapshottable Directories</a></li>
<li><a href="#Snapshot_Paths">Snapshot Paths</a></li></ul></li>
<li><a href="#Snapshot_Operations">Snapshot Operations</a>
<ul>
<li><a href="#Administrator_Operations">Administrator Operations</a>
<ul>
<li><a href="#Allow_Snapshots">Allow Snapshots</a></li>
<li><a href="#Disallow_Snapshots">Disallow Snapshots</a></li></ul></li>
<li><a href="#User_Operations">User Operations</a>
<ul>
<li><a href="#Create_Snapshots">Create Snapshots</a></li>
<li><a href="#Delete_Snapshots">Delete Snapshots</a></li>
<li><a href="#Rename_Snapshots">Rename Snapshots</a></li>
<li><a href="#Get_Snapshottable_Directory_Listing">Get Snapshottable Directory Listing</a></li>
<li><a href="#Get_Snapshots_Difference_Report">Get Snapshots Difference Report</a></li></ul></li></ul></li></ul>
<a name="Overview"></a>
<div class="section" id="Overview">
<h2>Overview<a name="Overview"></a></h2>
<p>
HDFS Snapshots are read-only point-in-time copies of the file system.
Snapshots can be taken on a subtree of the file system or the entire file system.
Some common use cases of snapshots are data backup, protection against user errors,
and disaster recovery.
</p>
<p>
The implementation of HDFS Snapshots is efficient:
</p>
<ul>
<li>Snapshot creation is instantaneous:
the cost is <i>O(1)</i> excluding the inode lookup time.</li>
<li>Additional memory is used only when modifications are made relative to a snapshot:
memory usage is <i>O(M)</i>,
where <i>M</i> is the number of modified files/directories.</li>
<li>Blocks in datanodes are not copied:
the snapshot files record the block list and the file size.
There is no data copying.</li>
<li>Snapshots do not adversely affect regular HDFS operations:
modifications are recorded in reverse chronological order
so that the current data can be accessed directly.
The snapshot data is computed by subtracting the modifications
from the current data.</li>
</ul>
<a name="SnapshottableDirectories"></a>
<div class="section" id="SnapshottableDirectories">
<h3>Snapshottable Directories<a name="Snapshottable_Directories"></a></h3>
<p>
Snapshots can be taken on any directory once the directory has been set as
<i>snapshottable</i>.
A snapshottable directory is able to accommodate 65,536 simultaneous snapshots.
There is no limit on the number of snapshottable directories.
Administrators may set any directory to be snapshottable.
If there are snapshots in a snapshottable directory,
the directory can be neither deleted nor renamed
before all the snapshots are deleted.
</p>
<p>
Nested snapshottable directories are currently not allowed.
In other words, a directory cannot be set to snapshottable
if one of its ancestors/descendants is a snapshottable directory.
</p>
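<p>
The nesting rule above can be sketched as a local check in shell.
This is only an illustration under stated assumptions: the NameNode enforces the real rule,
the function names <tt>is_strictly_under</tt> and <tt>can_allow_snapshot</tt> are hypothetical helpers
rather than HDFS commands, and the caller is assumed to supply absolute paths
together with the list of directories that are already snapshottable.
</p>

```shell
# is_strictly_under CHILD PARENT
# Succeeds when CHILD lies beneath PARENT (and is not PARENT itself).
is_strictly_under() {
  child="${1%/}"     # drop one trailing slash so "/a/" behaves like "/a"
  parent="${2%/}"
  if [ "$child" = "$parent" ]; then
    return 1
  fi
  case "$child/" in
    "$parent/"*) return 0 ;;   # CHILD starts with PARENT plus a path separator
    *) return 1 ;;
  esac
}

# can_allow_snapshot CANDIDATE [EXISTING...]
# Succeeds when no EXISTING snapshottable directory is an ancestor
# or a descendant of CANDIDATE.
can_allow_snapshot() {
  candidate="$1"
  shift
  for existing in "$@"; do
    if is_strictly_under "$candidate" "$existing" || is_strictly_under "$existing" "$candidate"; then
      return 1
    fi
  done
  return 0
}
```

<p>
For example, <tt>can_allow_snapshot /data/a/x /data/a</tt> fails because <tt>/data/a</tt>
is an ancestor of <tt>/data/a/x</tt>, while <tt>can_allow_snapshot /data/b /data/a</tt> succeeds.
</p>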
</div>
<a name="SnapshotPaths"></a>
<div class="section" id="SnapshotPaths">
<h3>Snapshot Paths<a name="Snapshot_Paths"></a></h3>
<p>
For a snapshottable directory,
the path component <i>&quot;.snapshot&quot;</i> is used for accessing its snapshots.
Suppose <tt>/foo</tt> is a snapshottable directory,
<tt>/foo/bar</tt> is a file/directory in <tt>/foo</tt>,
and <tt>/foo</tt> has a snapshot <tt>s0</tt>.
Then, the path </p>
<div class="source">
<pre>/foo/.snapshot/s0/bar</pre></div>
<p>
refers to the snapshot copy of <tt>/foo/bar</tt>.
The usual API and CLI can work with the &quot;.snapshot&quot; paths.
The following are some examples.
</p>
<ul>
<li>Listing all the snapshots under a snapshottable directory:
<div class="source">
<pre>hdfs dfs -ls /foo/.snapshot</pre></div></li>
<li>Listing the files in snapshot <tt>s0</tt>:
<div class="source">
<pre>hdfs dfs -ls /foo/.snapshot/s0</pre></div></li>
<li>Copying a file from snapshot <tt>s0</tt>:
<div class="source">
<pre>hdfs dfs -cp /foo/.snapshot/s0/bar /tmp</pre></div></li>
</ul>
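<p>
When scripting against snapshots, the path layout shown above can be built by a small helper.
The following is a minimal sketch; <tt>snapshot_path</tt> is a hypothetical function name,
not part of the HDFS CLI, and it only assembles a string without contacting HDFS.
</p>

```shell
# snapshot_path DIR SNAPSHOT [RELATIVE_PATH]
# Print the path of SNAPSHOT under the snapshottable directory DIR,
# or of RELATIVE_PATH inside that snapshot when a third argument is given.
snapshot_path() {
  if [ -n "${3:-}" ]; then
    printf '%s/.snapshot/%s/%s\n' "${1%/}" "$2" "$3"
  else
    printf '%s/.snapshot/%s\n' "${1%/}" "$2"
  fi
}
```

<p>
A call such as <tt>hdfs dfs -ls "$(snapshot_path /foo s0)"</tt> would then list the files in snapshot <tt>s0</tt>.
</p>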
<p>
<b>Note</b> that the name &quot;.snapshot&quot; is now a reserved file name in HDFS,
so users cannot create a file/directory with &quot;.snapshot&quot; as the name.
If a file/directory named &quot;.snapshot&quot; exists in a previous version of HDFS, it must be renamed before the upgrade;
otherwise, the upgrade will fail.
</p>
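<p>
Before an upgrade, a list of existing paths could be screened for the reserved name.
The sketch below is one possible check; <tt>uses_reserved_name</tt> is a hypothetical helper
that inspects a path string only and does not query HDFS.
</p>

```shell
# uses_reserved_name PATH
# Succeeds when some component of PATH is exactly ".snapshot".
uses_reserved_name() {
  case "/$1/" in
    */.snapshot/*) return 0 ;;   # a whole component equals ".snapshot"
    *) return 1 ;;               # names such as ".snapshots" do not match
  esac
}
```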
</div>
</div>
<a name="SnapshotOperations"></a>
<div class="section" id="SnapshotOperations">
<h2>Snapshot Operations<a name="Snapshot_Operations"></a></h2>
<a name="AdministratorOperations"></a>
<div class="section" id="AdministratorOperations">
<h3>Administrator Operations<a name="Administrator_Operations"></a></h3>
<p>
The operations described in this section require superuser privilege.
</p>
<div class="section">
<h4>Allow Snapshots<a name="Allow_Snapshots"></a></h4>
<p>
Allow snapshots of a directory to be created.
If the operation completes successfully, the directory becomes snapshottable.
</p>
<ul>
<li>Command:
<div class="source">
<pre>hdfs dfsadmin -allowSnapshot &lt;path&gt;</pre></div></li>
<li>Arguments:
<table border="0" class="bodyTable">
<tr class="a">
<td>path</td>
<td>The path of the directory to be made snapshottable.</td></tr>
</table></li>
</ul>
<p>
See also the corresponding Java API
<tt>void allowSnapshot(Path path)</tt> in <tt>HdfsAdmin</tt>.
</p>
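<p>
As an illustration, enabling snapshots and taking a first snapshot can be combined in one shell function.
This is a sketch, assuming that <tt>hdfs</tt> is on the <tt>PATH</tt> and that the caller has superuser privilege;
<tt>allow_and_snapshot</tt> is a hypothetical name, while both commands it runs are the ones documented on this page.
</p>

```shell
# allow_and_snapshot DIR NAME
# Make DIR snapshottable, then take a first snapshot called NAME.
# Stops at the first failing step and reports the failure to the caller.
allow_and_snapshot() {
  hdfs dfsadmin -allowSnapshot "$1" || return 1
  hdfs dfs -createSnapshot "$1" "$2"
}
```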
</div>
<div class="section">
<h4>Disallow Snapshots<a name="Disallow_Snapshots"></a></h4>
<p>
Disallow snapshots of a directory from being created.
All snapshots of the directory must be deleted before disallowing snapshots.
</p>
<ul>
<li>Command:
<div class="source">
<pre>hdfs dfsadmin -disallowSnapshot &lt;path&gt;</pre></div></li>
<li>Arguments:
<table border="0" class="bodyTable">
<tr class="a">
<td>path</td>
<td>The path of the snapshottable directory.</td></tr>
</table></li>
</ul>
<p>
See also the corresponding Java API
<tt>void disallowSnapshot(Path path)</tt> in <tt>HdfsAdmin</tt>.
</p>
</div></div>
<a name="UserOperations"></a>
<div class="section" id="UserOperations">
<h3>User Operations<a name="User_Operations"></a></h3>
<p>
This section describes user operations.
Note that the HDFS superuser can perform all the operations
without satisfying the permission requirements of the individual operations.
</p>
<div class="section">
<h4>Create Snapshots<a name="Create_Snapshots"></a></h4>
<p>
Create a snapshot of a snapshottable directory.
This operation requires owner privilege of the snapshottable directory.
</p>
<ul>
<li>Command:
<div class="source">
<pre>hdfs dfs -createSnapshot &lt;path&gt; [&lt;snapshotName&gt;]</pre></div></li>
<li>Arguments:
<table border="0" class="bodyTable">
<tr class="a">
<td>path</td>
<td>The path of the snapshottable directory.</td></tr>
<tr class="b">
<td>snapshotName</td>
<td>
The snapshot name, which is an optional argument.
When it is omitted, a default name is generated using a timestamp with the format
<tt>&quot;'s'yyyyMMdd-HHmmss.SSS&quot;</tt>, e.g. &quot;s20130412-151029.033&quot;.
</td></tr>
</table></li>
</ul>
<p>
See also the corresponding Java API
<tt>Path createSnapshot(Path path)</tt> and
<tt>Path createSnapshot(Path path, String snapshotName)</tt>
in <a href="../../api/org/apache/hadoop/fs/FileSystem.html"><tt>FileSystem</tt></a>.
The snapshot path is returned in these methods.
</p>
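<p>
Scripts that manage snapshots may want to tell generated default names apart from names chosen by users.
The following sketch checks a name against the shape of the default format described above
(the letter <tt>s</tt>, eight digits, a hyphen, six digits, a dot, and three digits).
The helper name <tt>is_default_snapshot_name</tt> is hypothetical, and the check does not verify
that the digits form a valid date or time.
</p>

```shell
# is_default_snapshot_name NAME
# Succeeds when NAME has the shape of a generated default snapshot name,
# e.g. s20130412-151029.033.
is_default_snapshot_name() {
  case "$1" in
    s[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]-[0-9][0-9][0-9][0-9][0-9][0-9].[0-9][0-9][0-9])
      return 0 ;;
    *)
      return 1 ;;
  esac
}
```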
</div>
<div class="section">
<h4>Delete Snapshots<a name="Delete_Snapshots"></a></h4>
<p>
Delete a snapshot from a snapshottable directory.
This operation requires owner privilege of the snapshottable directory.
</p>
<ul>
<li>Command:
<div class="source">
<pre>hdfs dfs -deleteSnapshot &lt;path&gt; &lt;snapshotName&gt;</pre></div></li>
<li>Arguments:
<table border="0" class="bodyTable">
<tr class="a">
<td>path</td>
<td>The path of the snapshottable directory.</td></tr>
<tr class="b">
<td>snapshotName</td>
<td>The snapshot name.</td></tr>
</table></li>
</ul>
<p>
See also the corresponding Java API
<tt>void deleteSnapshot(Path path, String snapshotName)</tt>
in <a href="../../api/org/apache/hadoop/fs/FileSystem.html"><tt>FileSystem</tt></a>.
</p>
</div>
<div class="section">
<h4>Rename Snapshots<a name="Rename_Snapshots"></a></h4>
<p>
Rename a snapshot.
This operation requires owner privilege of the snapshottable directory.
</p>
<ul>
<li>Command:
<div class="source">
<pre>hdfs dfs -renameSnapshot &lt;path&gt; &lt;oldName&gt; &lt;newName&gt;</pre></div></li>
<li>Arguments:
<table border="0" class="bodyTable">
<tr class="a">
<td>path</td>
<td>The path of the snapshottable directory.</td></tr>
<tr class="b">
<td>oldName</td>
<td>The old snapshot name.</td></tr>
<tr class="a">
<td>newName</td>
<td>The new snapshot name.</td></tr>
</table></li>
</ul>
<p>
See also the corresponding Java API
<tt>void renameSnapshot(Path path, String oldName, String newName)</tt>
in <a href="../../api/org/apache/hadoop/fs/FileSystem.html"><tt>FileSystem</tt></a>.
</p>
</div>
<div class="section">
<h4>Get Snapshottable Directory Listing<a name="Get_Snapshottable_Directory_Listing"></a></h4>
<p>
Get all the snapshottable directories where the current user has permission to take snapshots.
</p>
<ul>
<li>Command:
<div class="source">
<pre>hdfs lsSnapshottableDir</pre></div></li>
<li>Arguments: none</li>
</ul>
<p>
See also the corresponding Java API
<tt>SnapshottableDirectoryStatus[] getSnapshottableDirectoryListing()</tt>
in <tt>DistributedFileSystem</tt>.
</p>
</div>
<div class="section">
<h4>Get Snapshots Difference Report<a name="Get_Snapshots_Difference_Report"></a></h4>
<p>
Get the differences between two snapshots.
This operation requires read access privilege for all files/directories in both snapshots.
</p>
<ul>
<li>Command:
<div class="source">
<pre>hdfs snapshotDiff &lt;path&gt; &lt;fromSnapshot&gt; &lt;toSnapshot&gt;</pre></div></li>
<li>Arguments:
<table border="0" class="bodyTable">
<tr class="a">
<td>path</td>
<td>The path of the snapshottable directory.</td></tr>
<tr class="b">
<td>fromSnapshot</td>
<td>The name of the starting snapshot.</td></tr>
<tr class="a">
<td>toSnapshot</td>
<td>The name of the ending snapshot.</td></tr>
</table></li>
</ul>
<p>
See also the corresponding Java API
<tt>SnapshotDiffReport getSnapshotDiffReport(Path path, String fromSnapshot, String toSnapshot)</tt>
in <tt>DistributedFileSystem</tt>.
</p>
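<p>
A common pattern is to take a new snapshot and then report what changed since an earlier one.
The sketch below combines the two documented commands for that purpose.
It assumes that <tt>hdfs</tt> is on the <tt>PATH</tt>, that the directory is already snapshottable,
and that the earlier snapshot exists; <tt>changes_since</tt> is a hypothetical function name.
</p>

```shell
# changes_since DIR FROM TO
# Take a new snapshot TO of DIR, then report the differences between the
# existing snapshot FROM and the new snapshot TO.
changes_since() {
  hdfs dfs -createSnapshot "$1" "$3" || return 1
  hdfs snapshotDiff "$1" "$2" "$3"
}
```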
</div></div>
</div>
</div>
</div>
<div class="clear">
<hr/>
</div>
<div id="footer">
<div class="xright">&#169; 2014
Apache Software Foundation
- <a href="http://maven.apache.org/privacy-policy.html">Privacy Policy</a></div>
<div class="clear">
<hr/>
</div>
</div>
</body>
</html>
