Improve construction of commit queries from blame lookups

Authored by epriestley <git@epriestley.com> on Jan 7 2016, 00:34.

Description

Summary:
Ref T2450. File blame output often lists the same commit many times, and we don't currently look those commits up efficiently.

In particular, for a file like __phutil_library_map__.php, we would issue a query with ~9,000 clauses like this:

(repositoryID = 1 AND commitIdentifier LIKE "XYZ%")

...but only a few hundred of those identifiers were unique. Instead, issue only one clause per unique identifier.
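
As a rough illustration of the idea (not the actual Phabricator change, which is PHP), deduplicating the identifiers before building the clause list might look like the Python sketch below. The function name `build_commit_clauses` and the `%s` placeholder style are assumptions made for this example; only the clause shape comes from the description above.

```python
def build_commit_clauses(repository_id, identifiers):
    """Build one prefix-match clause per *unique* commit identifier.

    Hypothetical helper: returns a WHERE fragment with placeholders plus
    the matching parameter list, rather than interpolating values.
    """
    # dict.fromkeys() drops duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(identifiers))

    clauses = []
    params = []
    for identifier in unique:
        clauses.append("(repositoryID = %s AND commitIdentifier LIKE %s)")
        # Commit hashes are hex, so the prefix needs no LIKE escaping here.
        params.extend([repository_id, identifier + "%"])

    return " OR ".join(clauses), params
```

With blame output that repeats a handful of commits thousands of times, the clause count then tracks the number of distinct commits rather than the number of lines in the file.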

MySQL also seems to do a little better on "commitIdentifier = X" if we have the full hash, so special case that slightly.
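
A minimal sketch of that special case, again in Python rather than the PHP the change was written in. It assumes a "full hash" means a 40-character SHA-1 hex string; the constant `FULL_HASH_LENGTH` and the function name `commit_clause` are hypothetical and chosen only for this example.

```python
# Assumption: full identifiers are 40-character SHA-1 hex hashes.
FULL_HASH_LENGTH = 40


def commit_clause(repository_id, identifier):
    """Return (clause, params) for a single commit identifier.

    Uses an exact match when the identifier is a full hash, and falls
    back to a prefix match for abbreviated identifiers.
    """
    if len(identifier) == FULL_HASH_LENGTH:
        clause = "(repositoryID = %s AND commitIdentifier = %s)"
        return clause, [repository_id, identifier]

    clause = "(repositoryID = %s AND commitIdentifier LIKE %s)"
    return clause, [repository_id, identifier + "%"]
```

The prefix match stays as the fallback so that abbreviated identifiers still resolve; only identifiers that are already complete take the exact-match path.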

Test Plan:

  • Issuing a query for only unique identifiers dropped the cost from 400ms to 100ms locally.
  • Swapping to = if we have the full hash dropped the cost from 100ms to 75ms locally.

Reviewers: chad

Reviewed By: chad

Maniphest Tasks: T2450

Differential Revision: https://secure.phabricator.com/D14962

Details

Committed
epriestley <git@epriestley.com> on Jan 7 2016, 03:43
Pushed
aubort on Jan 31 2017, 17:16
Parents
rPH741118a08f3d: Improve Diffusion behavior for directories with impressive numbers of files
Branches
Unknown
Tags
Unknown

Event Timeline

epriestley <git@epriestley.com> committed rPH0759b84d77c0: Improve construction of commit queries from blame lookups (authored by epriestley <git@epriestley.com>). Jan 7 2016, 03:43