Update with fastest version so far (2x speedup over CPU)
Former-commit-id: d8d45487cc5390135fd23dad809bbfb984addfdb