Use stemming in the MySQL fulltext search engine

Authored by epriestley <git@epriestley.com> on Nov 25 2016, 22:52.

Description

Summary:
Ref T6740. When we index a document, also save a copy of the stemmed version.

When querying, search the combined corpus for the terms.

(We may need to tune this a bit later since it's possible for literal, quoted terms to match in the stemmed section, but I think this will rarely cause issues in practice.)
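
As a rough illustration of the approach described above (not the code in this change), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `toy_stem` suffix rules, the in-memory dictionary standing in for the index, and the function names are hypothetical, and the real engine runs against MySQL FULLTEXT rather than Python sets.

```python
import re

# Toy suffix rules, assumed for illustration only. A real stemmer
# handles far more cases than this.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word):
    """Strip one common suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_document(text):
    """Store the raw tokens plus a stemmed copy, and the combined corpus."""
    raw = tokenize(text)
    stemmed = [toy_stem(token) for token in raw]
    return {"raw": raw, "stemmed": stemmed, "combined": set(raw) | set(stemmed)}


def matches(doc, query):
    """True if every query term, as typed or stemmed, is in the combined corpus."""
    return all(
        term in doc["combined"] or toy_stem(term) in doc["combined"]
        for term in tokenize(query)
    )


def matches_literal(doc, term):
    """Literal lookup against the combined corpus, with no query-side stemming."""
    return term.lower() in doc["combined"]


doc = index_document("Searching indexed documents")
print(matches(doc, "searches"))        # matched through the stem "search"
print(matches_literal(doc, "search"))  # literal term hits the stemmed section
```

The last line shows the caveat from the parenthetical: the document only contains "searching", but a literal lookup for "search" still succeeds because the stemmed copy lives in the same corpus.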

A downside here is that search sort of breaks if you upgrade into this and don't reindex. I wasn't able to find a way to issue the query that remained compatible with older indexes and didn't have awful performance, so my plan is:

  • Put this on secure.
  • Rebuild the index.
  • If things look good after a couple of days, add a setup warning that tells people they need to rebuild the search index.

We might get some reports between now and then, but if this is super awful we should know by the end of the weekend.

Test Plan:
WOW AMAZING

{F2021466}

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T6740

Differential Revision: https://secure.phabricator.com/D16947

Details

Committed
epriestley <git@epriestley.com> on Nov 26 2016, 00:30
Pushed
aubort on Jan 31 2017, 17:16
Parents
rPHd54c14c64444: If InnoDB FULLTEXT is available, use it for for fulltext indexes

Event Timeline

epriestley <git@epriestley.com> committed rPH7c5b5327c8cc: Use stemming in the MySQL fulltext search engine (authored by epriestley <git@epriestley.com>) on Nov 26 2016, 00:30.