Fix timeout interactions in ExecFuture
Summary:
See D1868. Future/ExecFuture has two timeouts:
- When you resolve() you can specify a timeout for that operation. If that time elapses, we return null.
- When you use ExecFuture, you can specify a timeout for the whole future. If that time elapses, we fail with some known error.
- Currently, we conflate these incorrectly. The future timeout is used as the operation timeout if no operation timeout is provided. This creates a bit of a race; if you lose, resolve() returns NULL and then the program hangs indefinitely while terminating, waiting for the child to exit (it SHOULD have been killed by the future timeout, but we returned too quickly when we lost the race).
- Instead, don't use the future timeout as an operational timeout. This fixes the race -- if the future timeout happens first we fail normally. If the operational timeout happens first we return null, as expected.
- The downside to this (and the reason we conflated them in the first place, I suspect) is that we now wait 1s before detecting the timeout, because we have a default 1s select() timeout. Instead, allow futures to set a default timeout separately (without conflating operational and future timeouts). In Futures(), choose the smallest timeout from all the futures.
Test Plan: Previously failing tests now pass on race-succeptible systems (Ubuntu). New tests now pass.
Reviewers: nh, btrahan
Reviewed By: nh
CC: aran, epriestley
Differential Revision: https://secure.phabricator.com/D1908