WebSearch: word pair index for phrase searches

Authored by Ludmila Marian <ludmila.marian@gmail.com> on Nov 3 2009, 13:55.

Description

  • Uses word pair indexes for both partial and exact phrase matching in the fields listed in the internal variable CFG_WEBSEARCH_IDXPAIRS_FIELDS: notably the any field, title, abstract, and caption indices, where this is the expected search behaviour. Other indices, such as reportnumber, keep the old behaviour, since it may be important for matching there. The per-index behaviour will be made fully configurable later. (closes #137)
  • Creates new BibIndexTokenizer classes for all the methods that were doing phrase splitting (into words, into pairs, into phrases).
  • Moves all the washing functions from bibindex_engine into a new file, bibindex_engine_washer.py.
  • Amends Search Tips and Search Guide to explain that, for some indices, the search syntax makes no difference between single-quoted and double-quoted expressions.

Co-authored-by: Tibor Simko <tibor.simko@cern.ch>
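
To illustrate the idea behind a word pair index, here is a minimal Python sketch. It is not the code from this commit: the function names (tokenize_words, tokenize_pairs, build_pair_index, search_phrase) and the sample records are hypothetical, and the real BibIndexTokenizer classes and washing functions may behave differently. The sketch assumes that words are lowercased and split on non-word characters, and that a phrase query is answered by intersecting the record sets of its adjacent word pairs.

```python
import re
from collections import defaultdict

# Hypothetical helpers for illustration only; these are not the
# BibIndexTokenizer classes introduced by this commit.


def tokenize_words(phrase):
    """Lowercase the phrase and split it into alphanumeric words."""
    return re.findall(r"\w+", phrase.lower())


def tokenize_pairs(phrase):
    """Return adjacent word pairs, e.g. 'a b c' -> ['a b', 'b c']."""
    words = tokenize_words(phrase)
    return [" ".join(pair) for pair in zip(words, words[1:])]


def build_pair_index(records):
    """Map each word pair to the set of record IDs that contain it."""
    index = defaultdict(set)
    for recid, text in records.items():
        for pair in tokenize_pairs(text):
            index[pair].add(recid)
    return index


def search_phrase(index, phrase):
    """Return the records that contain every word pair of the phrase."""
    pairs = tokenize_pairs(phrase)
    if not pairs:
        # A single-word query has no pairs; it needs a word index instead.
        return set()
    hits = [index.get(pair, set()) for pair in pairs]
    return set.intersection(*hits)


# Made-up sample records, keyed by record ID.
records = {
    1: "Search for the Higgs boson",
    2: "Higgs boson search results",
    3: "The boson and the Higgs mechanism",
}
index = build_pair_index(records)
print(search_phrase(index, "higgs boson"))
```

With these sample records the query "higgs boson" matches records 1 and 2 but not record 3, where both words occur but are not adjacent. For phrases of three or more words, intersecting pairs only yields candidates, because the pairs may occur at different positions in a record; an exact phrase search would still need a verification step.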

Details

Committed
Tibor Simko <tibor.simko@cern.ch> on Sep 12 2012, 05:04
Parents
R3600:928826fed558: WebSubmit: bibdocfile-related build fix
Branches
Unknown
Tags
Unknown

Event Timeline

Tibor Simko <tibor.simko@cern.ch> committed R3600:d102ae65cf48: WebSearch: word pair index for phrase searches (authored by Ludmila Marian <ludmila.marian@gmail.com>) on Sep 12 2012, 05:04.