harden_directory.php: reduce footprint of directories by hardlinking them
Summary:
This is a small step toward a world of watirworld / sandcastle / integration
tests / async unit tests / web bisect. Do Sandcastle-style disk-size reduction
via hardlinks.
I've incorporated a few tricks:
- Detect the filesystem's per-inode hardlink limit and deal with it. I don't
remember what Sandcastle's solution was (presumably something similar), but
this should dodge the problem (first sketch below).
- If possible, use 'git ls-tree' instead of hashing files ourselves.
- If not, compute git-style hashes ourselves so both paths produce the same
keys (second sketch below).
- Deal with executable bits correctly.
- Deal correctly with git submodules.
- Lay out the workflow so that the "soft" directory can be used by other
processes and silently (and atomically) becomes the hard directory later.
Ideally, this lets us use the "soft" directory immediately but still benefit
from hardening it (third sketch below).
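
Roughly what I have in mind for the hardlink-limit fallback; this is a sketch
with illustrative names (link_or_copy is not the script's actual API), not
the exact implementation:

  <?php
  function link_or_copy($src, $dst) {
    if (@link($src, $dst)) {
      return $dst;
    }
    // link() fails once the source inode hits the filesystem's per-inode
    // link limit (e.g., around 32k on ext3). Fall back to a real copy;
    // the caller can treat the copy as a fresh link source so later
    // clones hardlink against it instead.
    if (!copy($src, $dst)) {
      throw new Exception("Unable to link or copy '{$src}' to '{$dst}'.");
    }
    chmod($dst, fileperms($src) & 0777); // Preserve the executable bit.
    return $dst;
  }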
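
For hashing, a sketch of the two paths; the helper name and key format are
illustrative, but both branches are meant to produce the same git-style blob
hash, with the mode carrying the executable bit:

  <?php
  function get_content_hashes($dir) {
    $hashes = array();
    if (is_dir($dir.'/.git')) {
      // Each line looks like "<mode> <type> <sha1>\t<path>"; the mode
      // (100755 vs 100644) encodes the executable bit for free.
      exec('cd '.escapeshellarg($dir).' && git ls-tree -r HEAD', $lines);
      foreach ($lines as $line) {
        list($meta, $path) = explode("\t", $line, 2);
        list($mode, $type, $hash) = explode(' ', $meta);
        $hashes[$path] = $mode.':'.$hash;
      }
    } else {
      $iter = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS));
      foreach ($iter as $file) {
        if (!$file->isFile()) {
          continue;
        }
        $path = substr($file->getPathname(), strlen($dir) + 1);
        $data = file_get_contents($file->getPathname());
        $mode = $file->isExecutable() ? '100755' : '100644';
        // Reproduce git's blob hash: sha1("blob <size>\0<data>").
        $hashes[$path] = $mode.':'.sha1('blob '.strlen($data)."\0".$data);
      }
    }
    return $hashes;
  }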
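
And a sketch of the soft-to-hard swap, again with illustrative names; the key
property is that rename() over an existing file is atomic, so processes
reading the soft directory never see a missing or partial file while it
hardens underneath them:

  <?php
  function harden_file($shared_src, $soft_path) {
    // Build the hardlink under a temporary name, then atomically rename
    // it over the soft copy.
    $tmp = $soft_path.'.harden-tmp';
    if (!@link($shared_src, $tmp)) {
      return false; // Out of links, etc.; just keep the soft copy.
    }
    if (!@rename($tmp, $soft_path)) {
      @unlink($tmp);
      return false;
    }
    return true;
  }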
@slawekbiel, am I missing anything obvious here?
Test Plan:
Created a sandpit with every revision of Phabricator in it. It consumes 37MB of
disk + 3.0MB of hardlinks (for ~650 copies), versus 63MB for a single git
repository.
Made an attempt to verify all of the special behaviors (no git, executable
bits, symlinks, hardlink limit).
I haven't profiled this, but it's fast (<<1 second) for incremental sandpits,
although the initial copy of 12,000 files took a little while.
Reviewed By: slawekbiel
Reviewers: slawekbiel, aran, jungejason, tuomaspelkonen
CC: aran, slawekbiel, epriestley
Differential Revision: 464