refextract: Increased the number of recognised author formats

Description

  • Added 'surname [and surname] et al' recognition (et al must be present)
  • Improved underscore author text validation (escapes all tags and all
    tagged content now, rather than just titles). Completely removes
    the chance that part of tagged text (or a tag itself) is seen as an
    author.
  • Improved author split/dump heuristics (will dump into misc if two
    author groups are found in a row, with minimal misc text between them)
  • Added some more test reference lines
  • Added comments to some methods (still need to complete this)
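
The 'surname [and surname] et al' format from the first item can be illustrated with a small, self-contained sketch. This is not the pattern used in refextract itself: the names SURNAME, ET_AL_AUTHORS and find_et_al_authors are hypothetical, and the definition of a surname (one capitalised word, optionally hyphenated) is an assumption made for the example. What it does show is the stated rule that the second surname is optional while "et al" is mandatory.

```python
import re

# Hypothetical surname definition (an assumption for this sketch):
# one capitalised word, optionally hyphenated, e.g. "Smith" or "Al-Khalili".
SURNAME = r"[A-Z][a-z]+(?:-[A-Z][a-z]+)*"

# 'surname [and surname] et al': the second surname is optional,
# but the trailing "et al" must be present for the pattern to match.
ET_AL_AUTHORS = re.compile(
    r"\b(?P<first>{s})(?:\s+and\s+(?P<second>{s}))?,?\s+et\.?\s+al\b\.?".format(
        s=SURNAME
    )
)


def find_et_al_authors(line):
    """Return the surnames that precede a mandatory 'et al', else an empty list."""
    match = ET_AL_AUTHORS.search(line)
    if match is None:
        return []
    # Drop the second surname when the optional 'and surname' part is absent.
    return [name for name in (match.group("first"), match.group("second")) if name]
```

A line such as "Smith and Jones et al., Phys. Rev. D 12 (1975) 345" yields both surnames, "Smith et al. Nucl. Phys." yields one, and "Smith and Jones, Phys. Rev." yields nothing because "et al" is missing.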

Event Timeline

Christopher Hayward <christopher.james.hayward@cern.ch> committed R3600:e139fdc53e34: refextract: Increased the number of recognised author formats (authored by Christopher Hayward <christopher.james.hayward@cern.ch>). Feb 3 2011, 17:25