
Improve Search architecture

Authored by epriestley <git@epriestley.com> on Dec 21 2012, 23:21.

Description

Summary:
The search indexing API has several problems right now:

  • Always runs in-process.
    • It would be nice to push this into the task queue for performance. However, the API currently passes an object all the way through (and some indexers depend on preloaded object attributes), so it can't be dumped into the task queue at any stage because we can't serialize it.
    • Being able to use the task queue will also make rebuilding indexes faster.
    • Instead, make the API phid-oriented.
  • No uniform indexing API.
    • Each "Editor" currently calls SomeCustomIndexer::indexThing(). This won't work with AbstractTransactions. The API is also just weird.
    • Instead, provide a uniform API.
  • No uniform CLI.
    • We have scripts/search/reindex_everything.php, but it doesn't actually index everything. Each new document type needs to be separately added to it, leading to stuff like D3839. Third-party applications can't provide indexers.
    • Instead, let indexers expose documents for indexing.
  • Not application-oriented.
    • All the indexers live in search/ right now, which isn't the right organization in an application-oriented view of the world.
    • Instead, move indexers to applications and load them with SymbolLoader.
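
For illustration only, here is a minimal Python sketch of the PHID-oriented flow described above. It is not the code in this change, and every name in it (Indexer, queue_index, queue_all, run_worker, the sample PHIDs, the in-memory TASKS and REVISIONS stores) is hypothetical. The subclass registry stands in for discovering indexers with SymbolLoader, and a deque of JSON strings stands in for the task queue. The point is that only a PHID string is enqueued, so nothing unserializable crosses the queue, and each indexer reloads its own object and exposes the documents it can index.

```python
import json
from collections import deque


class Indexer:
    """Hypothetical base class. Subclasses register themselves, which is a
    stand-in for discovering application-provided indexers by symbol."""

    registry = []
    phid_type = None  # e.g. "TASK" or "DREV"

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        Indexer.registry.append(cls)

    def all_phids(self):
        """Expose every document this indexer knows how to index."""
        raise NotImplementedError

    def load(self, phid):
        """Reload the object from storage; nothing is passed in preloaded."""
        raise NotImplementedError

    def build_document(self, obj):
        raise NotImplementedError


# Toy in-memory stores standing in for application databases (assumption).
TASKS = {"PHID-TASK-0001": {"title": "Fix the saboteur bug"}}
REVISIONS = {"PHID-DREV-0001": {"title": "Add saboteur detection"}}


class TaskIndexer(Indexer):
    phid_type = "TASK"

    def all_phids(self):
        return sorted(TASKS)

    def load(self, phid):
        return TASKS[phid]

    def build_document(self, obj):
        return {"type": "TASK", "text": obj["title"]}


class RevisionIndexer(Indexer):
    phid_type = "DREV"

    def all_phids(self):
        return sorted(REVISIONS)

    def load(self, phid):
        return REVISIONS[phid]

    def build_document(self, obj):
        return {"type": "DREV", "text": obj["title"]}


def indexer_for(phid):
    """Pick the indexer whose type matches the PHID's type segment."""
    kind = phid.split("-")[1]
    for cls in Indexer.registry:
        if cls.phid_type == kind:
            return cls()
    raise LookupError("no indexer for " + phid)


def queue_index(queue, phid):
    """Uniform entry point: enqueue only a serialized PHID."""
    queue.append(json.dumps({"phid": phid}))


def queue_all(queue, only_type=None):
    """Enqueue every exposed document, optionally limited to one type."""
    for cls in Indexer.registry:
        if only_type is None or cls.phid_type == only_type:
            for phid in cls().all_phids():
                queue_index(queue, phid)


def run_worker(queue, index):
    """Drain the queue, reloading each object by PHID before indexing it."""
    while queue:
        phid = json.loads(queue.popleft())["phid"]
        indexer = indexer_for(phid)
        index[phid] = indexer.build_document(indexer.load(phid))


def search(index, word):
    """Return the PHIDs whose document text contains the word."""
    word = word.lower()
    return sorted(p for p, doc in index.items() if word in doc["text"].lower())


if __name__ == "__main__":
    queue, index = deque(), {}
    queue_all(queue)
    run_worker(queue, index)
    print(search(index, "saboteur"))
```

In this sketch, a new document type only needs a new Indexer subclass; queue_all picks it up without a central script being edited, which is the property the summary asks for.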

Test Plan:

  • bin/search index
    • Indexed one revision, one task.
    • Indexed --type TASK, --type DREV, etc., for all types.
    • Indexed --all.
  • Added the word "saboteur" to a revision, task, wiki page, and question and then searched for it.
    • Creating users is a pain; searched for a user after indexing.
    • Creating commits is a pain; searched for a commit after indexing.
    • Mocks aren't currently loadable in the result view, so their indexing is moot.

Reviewers: btrahan, vrana

Reviewed By: btrahan

CC: 20after4, aran

Maniphest Tasks: T1991, T2104

Differential Revision: https://secure.phabricator.com/D4261

Details

Committed
epriestley <git@epriestley.com> on Dec 21 2012, 23:21
Pushed
aubort on Jan 31 2017, 17:16
Parents
rPHaae5f9efd3d3: Implement a more compact, general database-backed key-value cache
Branches
Unknown
Tags
Unknown

Event Timeline

epriestley <git@epriestley.com> committed rPHf6b196474008: Improve Search architecture (authored by epriestley <git@epriestley.com>) on Dec 21 2012, 23:21.