Homec4science

Fix LiskMigrationIterator scaling in O(n^2)

Authored by Michael Schuett <schuettm@uberatc.com> on May 9 2016, 21:40.

Description

Fix LiskMigrationIterator scaling in O(n^2)

Summary:
When running migrations that insert more than 1000 rows at a time the PhutilChunkedIterator becomes exponentially slower at generating chunks. The code below was where I encountered this issue originally.

<?php

$message_table = new HarbormasterBuildUnitMessage();

$unit_test_table = new CIUnitTestHarbormasterBuildUnitMessage();
$conn_w = $unit_test_table->establishConnection('w');

$size = 5000;
$row_iter = id(new LiskMigrationIterator($message_table))->setPageSize($size);
$chunk_iter = new PhutilChunkedIterator($row_iter, $size);

// point 1
foreach ($chunk_iter as $chunk) {
  // point 2
  $sql = array();
  ...
}

where the time between the point 1 and point 2 comment exceed 1 second around the 5k mark and become unresponsive around 15k. PHP seems to do a very poor job at converting large arrays to boolean values as can be seen with this code sample. As I was testing this I noticed it's completely data dependent so it's possibly for arrays containg scalar values this is much more performant.

<?php

$root = dirname(__FILE__);
require_once $root.'/integrator/scripts/__init_script__.php';

$bigbig = $build_unit_messages = id(new HarbormasterBuildUnitMessage())
  ->loadAllWhere('id > %s LIMIT 1000000', 0);

$start = microtime(true);
(bool)$bigbig;
echo microtime(true) - $start;

This code on my computer causes 1.88 seconds to be output as opposed to 0.0000006 seconds when wrapping it in count.

My title claim is somewhat bold. In my case that was the issue but your millage may vary.

Test Plan: As seen above.

Reviewers: #blessed_reviewers, epriestley

Reviewed By: #blessed_reviewers, epriestley

Subscribers: epriestley

Differential Revision: https://secure.phabricator.com/D15869

Details

Committed
michaeljs1990 <autocommitter@example.com>May 9 2016, 21:40
Pushed
aubortMar 17 2017, 12:03
Parents
rPHU5d47fb1027f5: Allow the default Remarkup <p> tag to be inline-styled for mail
Branches
Unknown
Tags
Unknown

Event Timeline

michaeljs1990 <autocommitter@example.com> committed rPHU180f0999d094: Fix LiskMigrationIterator scaling in O(n^2) (authored by Michael Schuett <schuettm@uberatc.com>).May 9 2016, 21:40