Improve UTF8StringTruncator behavior for huge inputs
Summary:
Fixes T9632. Currently, truncating a very big input (like a huge paste) into a very small output (like a snippet of that paste) can take a long time, because the amount of work we do is proportional to the size of the input.
Reorganize some of the UTF8 code so we do less work: examine only about as much of the input as we could possibly need in order to generate the desired output.
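A minimal sketch of the idea, in Python rather than the project's own code, and not the actual UTF8StringTruncator implementation. The function name `truncate_utf8` and its parameters are hypothetical. It assumes the limit is counted in code points (it ignores combining characters) and relies on the fact that a UTF-8 code point is at most 4 bytes long, so only a bounded prefix of the input ever needs to be decoded:

```python
def truncate_utf8(data: bytes, max_glyphs: int, terminator: str = "...") -> str:
    """Truncate UTF-8 bytes to at most max_glyphs code points.

    Hypothetical helper for illustration only. Work is bounded by
    max_glyphs, not by len(data).
    """
    # A code point occupies at most 4 bytes, so max_glyphs + 1 code points
    # always fit inside this window. The extra one tells us whether the
    # input is longer than the limit.
    window = data[: 4 * (max_glyphs + 1)]

    # The window may end in the middle of a multi-byte sequence;
    # errors="ignore" drops that partial tail. (It also silently drops
    # invalid bytes, which is a simplification made for this sketch.)
    text = window.decode("utf-8", errors="ignore")

    # Short input that fit entirely in the window: nothing to truncate.
    if len(window) == len(data) and len(text) <= max_glyphs:
        return text

    # Otherwise keep room for the terminator and cut.
    keep = max(max_glyphs - len(terminator), 0)
    return text[:keep] + terminator
```

With this shape, truncating a multi-megabyte input to a short snippet only slices and decodes a few dozen bytes, which is the kind of behavior the change aims for.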
Test Plan:
- This code is well-covered by unit tests.
- Added a new unit test which ran in ~4s before the change and runs in ~2ms afterward on my machine (about 2000x faster).
- Created a huge paste, viewed from web UI.
Reviewers: chad
Reviewed By: chad
Maniphest Tasks: T9632
Differential Revision: https://secure.phabricator.com/D14339