Fix LiskMigrationIterator scaling in O(n^2)
Summary:
When running migrations that insert more than 1000 rows at a time the PhutilChunkedIterator becomes exponentially slower at generating chunks. The code below was where I encountered this issue originally.
<?php $message_table = new HarbormasterBuildUnitMessage(); $unit_test_table = new CIUnitTestHarbormasterBuildUnitMessage(); $conn_w = $unit_test_table->establishConnection('w'); $size = 5000; $row_iter = id(new LiskMigrationIterator($message_table))->setPageSize($size); $chunk_iter = new PhutilChunkedIterator($row_iter, $size); // point 1 foreach ($chunk_iter as $chunk) { // point 2 $sql = array(); ... }
where the time between the point 1 and point 2 comment exceed 1 second around the 5k mark and become unresponsive around 15k. PHP seems to do a very poor job at converting large arrays to boolean values as can be seen with this code sample. As I was testing this I noticed it's completely data dependent so it's possibly for arrays containg scalar values this is much more performant.
<?php $root = dirname(__FILE__); require_once $root.'/integrator/scripts/__init_script__.php'; $bigbig = $build_unit_messages = id(new HarbormasterBuildUnitMessage()) ->loadAllWhere('id > %s LIMIT 1000000', 0); $start = microtime(true); (bool)$bigbig; echo microtime(true) - $start;
This code on my computer causes 1.88 seconds to be output as opposed to 0.0000006 seconds when wrapping it in count.
My title claim is somewhat bold. In my case that was the issue but your millage may vary.
Test Plan: As seen above.
Reviewers: #blessed_reviewers, epriestley
Reviewed By: #blessed_reviewers, epriestley
Subscribers: epriestley
Differential Revision: https://secure.phabricator.com/D15869