fix a bad bug with per-atom tallied properties.
ev_reduce_thr() must only be called when EVFLAG
is true. otherwise ev_setup_thr() has not been
run to enlarge the per-atom energy and virial
arrays, and we get a segfault when reducing
them. this could also give a small
performance improvement since now ev_reduce_thr()
is only called when energies are computed and tallied.
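
a minimal sketch of the guard described above, not the
actual code: ThrTally, setup_tally(), reduce_tally() and
compute() are hypothetical stand-ins for the per-thread
data, ev_setup_thr(), ev_reduce_thr() and a style's
compute() body. the point is only that setup and
reduction sit behind the same flag.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for per-thread tally storage (not the real class).
struct ThrTally {
  std::vector<double> eatom;  // per-atom energy, empty until set up
};

// Stand-in for ev_setup_thr(): size and zero the per-atom array.
void setup_tally(ThrTally &t, std::size_t natoms) {
  t.eatom.assign(natoms, 0.0);
}

// Stand-in for ev_reduce_thr(): sum the per-thread arrays into dst.
// Assumes every thread's array holds at least dst.size() entries,
// which is only true after setup_tally() has been called.
void reduce_tally(const std::vector<ThrTally> &thr, std::vector<double> &dst) {
  for (const ThrTally &t : thr)
    for (std::size_t i = 0; i < dst.size(); ++i)
      dst[i] += t.eatom[i];
}

// Stand-in for a compute() body: reduce only when tallying was set up.
bool compute(bool evflag, std::vector<ThrTally> &thr, std::vector<double> &eatom) {
  if (evflag) {
    for (ThrTally &t : thr) {
      setup_tally(t, eatom.size());
      for (double &e : t.eatom) e += 1.0;  // pretend each thread tallies 1.0 per atom
    }
    reduce_tally(thr, eatom);  // safe: arrays were sized just above
  }
  return evflag;  // with evflag false, nothing is sized and nothing is reduced
}
```

calling reduce_tally() outside the if block would index
into arrays that were never sized, which is the kind of
out-of-bounds access this commit removes.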