Use "\1", not "~", to delimit remarkup tokens

Authored by epriestley <git@epriestley.com> on Mar 20 2012, 03:13.

Description

Summary:

  • In general, markup engines need to handle rule precedence and prevent the output of rules from being incorrectly modified by further rules.
  • For instance, without proper precedence rules, http://x/ might be marked up into <a href="http://x/">http://x/</a>, and the italics rule might then treat each "//" as a delimiter and mark that up into <a href="http:<em>x/">http:</em>x/</a>. From there, it's a short jump to XSS. (This particular case can't happen because our regexps are more careful, but it is a good example of the general problem.)
  • Remarkup handles rule precedence by doing token replacement. Once we've matched a block of text to a rule, we remove it from the corpus and replace it with a token that doesn't match any rules. Then we run other rules safely, and eventually go back and replace all the tokens with the stored text. See PhutilRemarkupBlockStorage for a description of this.
  • We currently use "~1Z", "~2Z", etc., as tokens. These don't match other rules and survive HTML encoding, so they are appropriate selections.
  • However, we want to introduce "~~xxx~~" for strikethrough, which conflicts with these tokens.
  • Use "\11Z", "\12Z", etc., as tokens instead (the byte "\1" followed by "1Z", "2Z", and so on). These have the same properties as the "~" tokens but free up "~" for use.
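
The token-replacement scheme described above can be sketched roughly as follows. This is an illustrative Python sketch, not the actual libphutil code: the BlockStorage class, the two simplified rules, and the render functions are hypothetical stand-ins, and the regexps are far cruder than the real ones. Stripping "\1" from the input and escaping before the rules run are assumptions made for the sketch.

```python
import html
import re

TOKEN_RE = re.compile("\x01(\\d+)Z")


class BlockStorage:
    """Hypothetical stand-in for the token store described in the summary."""

    def __init__(self):
        self._blocks = {}
        self._count = 0

    def store(self, text):
        # Swap finished output for a token such as "\x011Z" that no rule matches.
        self._count += 1
        token = "\x01{}Z".format(self._count)
        self._blocks[token] = text
        return token

    def restore(self, corpus):
        # Put the stored text back once every rule has run.
        return TOKEN_RE.sub(lambda m: self._blocks[m.group(0)], corpus)


def link_html(match):
    url = match.group(0)
    return '<a href="{0}">{0}</a>'.format(url)


URL_RE = re.compile(r"https?://[^\s<]+")     # simplified link rule (assumption)
ITALIC_RE = re.compile(r"//(.+?)//")         # simplified italics rule (assumption)


def render_naive(text):
    # No protection: the italics rule rewrites the link rule's output.
    text = URL_RE.sub(link_html, html.escape(text))
    return ITALIC_RE.sub(r"<em>\1</em>", text)


def render_with_tokens(text):
    storage = BlockStorage()
    # Sketch-only assumption: drop "\x01" from input so tokens cannot be forged.
    text = html.escape(text.replace("\x01", ""))
    text = URL_RE.sub(lambda m: storage.store(link_html(m)), text)
    text = ITALIC_RE.sub(r"<em>\1</em>", text)
    return storage.restore(text)


if __name__ == "__main__":
    sample = "see http://x/ and //really// go"
    print(render_naive(sample))
    print(render_with_tokens(sample))
```

With the naive renderer, the link's "//" sequences are picked up by the italics rule and the anchor tag is corrupted; with tokens, the italics rule only ever sees "\1" followed by "1Z" where the link was, so the link survives intact. The token also contains no characters that HTML encoding would alter, which is the property the summary relies on.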

Test Plan: Ran unit tests, which have reasonably extensive coverage of this case.

Reviewers: 20after4, jungejason, btrahan

Reviewed By: btrahan

CC: aran, epriestley

Differential Revision: https://secure.phabricator.com/D1942

Details

Committed
epriestley <git@epriestley.com> on Mar 20 2012, 03:13
Pushed
aubort on Mar 17 2017, 12:03
Parents
rPHU6bcccffaeb16: Move ob_get_level() script initialization check to libphutil from arcanist
Branches
Unknown
Tags
Unknown

Event Timeline

epriestley <git@epriestley.com> committed rPHU0765c8fe5db5: Use "\1", not "~", to delimit remarkup tokens (authored by epriestley <git@epriestley.com>) on Mar 20 2012, 03:13.