
Make PhutilProseDifferenceEngine degrade on large inputs instead of consuming…

Authored by epriestley <git@epriestley.com> on Oct 7 2016, 01:06.

Description

Make PhutilProseDifferenceEngine degrade on large inputs instead of consuming all RAM in the entire world

Summary:
Fixes T11743. For very large inputs which can't be simplified (e.g., dissimilar text at the beginning and end across a very large number of paragraphs) we currently try to build an edit distance matrix, but this is quadratic in space and time and slowness and how PHP-ish it is.

Instead, just give up for very large inputs. This will still prose-diff any corpuses with fewer than 128 paragraphs, which is the vast majority of documents.
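The "give up on large inputs" idea can be sketched roughly as follows. This is an illustrative Python sketch, not the PHP code from the patch: the function names, the paragraph splitting on blank lines, the per-side size check, and the "remove everything, then add everything" fallback are all assumptions made for the example. Only the 128-paragraph figure comes from the summary above. A full matrix over n old and m new paragraphs needs on the order of n × m cells, which is why it is skipped past the limit.

```python
# Hypothetical sketch; names and fallback behavior are assumptions,
# not the actual PhutilProseDifferenceEngine implementation.
MAX_PARAGRAPHS = 128  # figure taken from the commit summary


def split_paragraphs(text):
    """Split text on blank lines, dropping empty paragraphs."""
    return [p for p in text.split("\n\n") if p.strip()]


def lcs_diff(old, new):
    """Diff two lists using a full (len(old)+1) x (len(new)+1) LCS table."""
    rows, cols = len(old), len(new)
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    # Fill the table from the bottom-right corner backwards.
    for i in range(rows - 1, -1, -1):
        for j in range(cols - 1, -1, -1):
            if old[i] == new[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table forwards to emit keep/remove/add operations.
    ops, i, j = [], 0, 0
    while i < rows and j < cols:
        if old[i] == new[j]:
            ops.append(("=", old[i]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            ops.append(("-", old[i]))
            i += 1
        else:
            ops.append(("+", new[j]))
            j += 1
    ops.extend(("-", p) for p in old[i:])
    ops.extend(("+", p) for p in new[j:])
    return ops


def prose_diff(old_text, new_text, limit=MAX_PARAGRAPHS):
    """Paragraph-level diff that degrades instead of building a huge table."""
    old = split_paragraphs(old_text)
    new = split_paragraphs(new_text)
    if len(old) >= limit or len(new) >= limit:
        # Degraded result: correct but coarse, with no matching attempted.
        return [("-", p) for p in old] + [("+", p) for p in new]
    return lcs_diff(old, new)
```

With small inputs this produces a fine-grained diff; once either side reaches the limit, the result is still a valid description of the change, just not a minimal one, which matches the "worked great, although the diff wasn't perfect" outcome in the test plan.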

Test Plan:

  • Created a degenerate Paste similar to the one in T11743.
  • Viewed change details and ran bin/worker execute --id <id> --trace before and after patch.
    • Before: everything hung forever.
    • After: everything worked great, although the diff wasn't perfect.

Reviewers: chad

Reviewed By: chad

Subscribers: wizsrk, gregprice

Maniphest Tasks: T11743

Differential Revision: https://secure.phabricator.com/D16682

Details

Committed
epriestley <git@epriestley.com> Oct 7 2016, 15:22
Pushed
aubort Mar 17 2017, 12:03
Parents
rPHU48fb6fac0232: Add support for exporting RECURRENCE-ID in ICS events
Branches
Unknown
Tags
Unknown

Event Timeline

epriestley <git@epriestley.com> committed rPHU50cd143e07ca: Make PhutilProseDifferenceEngine degrade on large inputs instead of consuming… (authored by epriestley <git@epriestley.com>). Oct 7 2016, 15:22