Detect copied code by own algorithm
Summary:
Required for D2321.
Deprecates D2320.
Uses algorithm described at D2320#16.
Complexity of this algorithm would be O(N) (N stands for number of lines) in most cases.
The worst case is O(A*F) (A stands for number of added lines, F for number of colliding lines) but it should be pretty rare. Real-world example is 100 modified files with moved license block (15 lines) in each. This will require 1500*100 comparisons because the algorithm will be trying to find the longest block in each file.
Test Plan:
arc diff --only on commit with copied code.
More tests on standalone algorithm.
Reviewers: epriestley
Reviewed By: epriestley
CC: aran, Koolvin
Differential Revision: https://secure.phabricator.com/D2333