Homec4science

Fix timeout interactions in ExecFuture

Authored by epriestley <git@epriestley.com> on Mar 15 2012, 16:38.

Description

Fix timeout interactions in ExecFuture

Summary:
See D1868. Future/ExecFuture has two timeouts:

  • When you resolve() you can specify a timeout for that operation. If that time elapses, we return null.
  • When you use ExecFuture, you can specify a timeout for the whole future. If that time elapses, we fail with some known error.
  • Currently, we conflate these incorrectly. The future timeout is used as the operation timeout if no operation timeout is provided. This creates a bit of a race; if you lose, resolve() returns NULL and then the program hangs indefinitely while terminating, waiting for the child to exit (it SHOULD have been killed by the future timeout, but we returned too quickly when we lost the race).
  • Instead, don't use the future timeout as an operational timeout. This fixes the race -- if the future timeout happens first we fail normally. If the operational timeout happens first we return null, as expected.
  • The downside to this (and the reason we conflated them in the first place, I suspect) is that we now wait 1s before detecting the timeout, because we have a default 1s select() timeout. Instead, allow futures to set a default timeout separately (without conflating operational and future timeouts). In Futures(), choose the smallest timeout from all the futures.

Test Plan: Previously failing tests now pass on race-succeptible systems (Ubuntu). New tests now pass.

Reviewers: nh, btrahan

Reviewed By: nh

CC: aran, epriestley

Differential Revision: https://secure.phabricator.com/D1908

Details

Committed
epriestley <git@epriestley.com>Mar 15 2012, 16:38
Pushed
aubortMar 17 2017, 12:03
Parents
rPHU8bbf8a1cf258: Respect letter case in method name
Branches
Unknown
Tags
Unknown

Event Timeline

epriestley <git@epriestley.com> committed rPHU1ce62c546e0b: Fix timeout interactions in ExecFuture (authored by epriestley <git@epriestley.com>).Mar 15 2012, 16:38