Detect moves and copies with some unchanged lines as moves or copies

Authored by epriestley <git@epriestley.com> on Mar 24 2015, 21:12.

Description

Summary:
Ref T1266. Currently, we don't detect a move/copy unless we can find at least 3 contiguous changed lines.

However, you may move a block like:

Complicated Line A
Trivial Line B
Complicated Line C

...where "Trivial Line B" is something like a curly brace. If you move this block somewhere that previously happened to have a similar trivial curly brace line, that line shows up as unchanged, so we can't find 3 contiguous added lines and fail to detect the copy/move.

Instead, consider both changed and unchanged lines when trying to find contiguous blocks. This allows us to detect across gaps where lines were not actually changed.
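To illustrate the difference, here is a minimal Python sketch of the idea. It is not the code from this revision: the names `Line`, `detect_blocks`, `MIN_RUN`, and the `include_unchanged` flag are hypothetical, and the greedy longest-match search is an assumption made for brevity. Only the 3-line threshold comes from the summary above.

```python
from typing import List, NamedTuple, Tuple

MIN_RUN = 3  # threshold from the summary; everything else here is assumed


class Line(NamedTuple):
    text: str
    added: bool  # True if the diff marks this line as added at the destination


def _run_length(dest: List[Line], i: int, source: List[str], j: int,
                include_unchanged: bool) -> int:
    """Count how many lines match starting at dest[i] and source[j]."""
    k = 0
    while i + k < len(dest) and j + k < len(source):
        line = dest[i + k]
        if line.text != source[j + k]:
            break
        # Old behavior: an unchanged line ends the run.
        if not include_unchanged and not line.added:
            break
        k += 1
    return k


def detect_blocks(dest: List[Line], source: List[str],
                  include_unchanged: bool) -> List[Tuple[int, int, int]]:
    """Return (dest_index, source_index, length) for each detected block."""
    blocks = []
    i = 0
    while i < len(dest):
        best_len, best_j = 0, -1
        for j in range(len(source)):
            n = _run_length(dest, i, source, j, include_unchanged)
            if n > best_len:
                best_len, best_j = n, j
        run = dest[i:i + best_len]
        # Require at least one added line so untouched code is never reported.
        if best_len >= MIN_RUN and any(line.added for line in run):
            blocks.append((i, best_j, best_len))
            i += best_len
        else:
            i += 1
    return blocks


# The block from the summary, moved next to a pre-existing "}" line.
SOURCE = ["Complicated Line A", "}", "Complicated Line C"]
DEST = [
    Line("unrelated()", False),
    Line("Complicated Line A", True),
    Line("}", False),  # already present at the destination, so not "added"
    Line("Complicated Line C", True),
    Line("more()", False),
]

OLD_RESULT = detect_blocks(DEST, SOURCE, include_unchanged=False)
NEW_RESULT = detect_blocks(DEST, SOURCE, include_unchanged=True)
```

In this sketch, the added-only search finds nothing because the unchanged brace splits the block into two runs of length 1, while the search that also accepts unchanged lines finds the whole 3-line block. The same rule will also absorb unchanged lines that happen to match just before or after the changed lines, which is one way the result can be too liberal.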

This new algorithm may be too liberal (for example, we may incorrectly identify code before or after the changed lines as moved/copied, not just code between changed lines), but we can keep an eye on it and tweak it. The algorithm is also better factored and has better test coverage now.

Test Plan:

  • Added a unit test for this case.
  • Spot-checked a handful of diffs and generally saw behavior that made sense and looked better than before.

Reviewers: btrahan

Reviewed By: btrahan

Subscribers: epriestley

Maniphest Tasks: T1266

Differential Revision: https://secure.phabricator.com/D12146

Details

Committed
epriestley <git@epriestley.com> Mar 24 2015, 21:12
Pushed
aubort Jan 31 2017, 17:16
Parents
rPH373aaa643a51: Clean up copy detection code a bit
Branches
Unknown
Tags
Unknown

Event Timeline

epriestley <git@epriestley.com> committed rPHaa310230b6bc: Detect moves and copies with some unchanged lines as moves or copies (authored by epriestley <git@epriestley.com>). Mar 24 2015, 21:12