intbitset: code and performance improvements

Authored by Samuele Kaplun <samuele.kaplun@cern.ch> on Jul 23 2012, 13:08.

Description

  • Removes unused intBitSetCreateNoAllocate() C function.
  • Uses unsigned int when possible (in intBitSetResize(), intBitSetIsInElem(), intBitSetAddElem(), intBitSetDelElem()).
  • Removes two warnings about const assignment when compiling C code.
  • Improves intbitset.__init__() to recognize tuple-of-tuples results coming from SQLAlchemy (where the objects are not real tuples but proxies to them). This avoids a segmentation fault when providing intbitset with direct results from SQLAlchemy queries.
  • Fixes a couple of bugs in the new .__getitem__() implementation.
  • Makes .difference(), .difference_update(), .intersection(), .intersection_update(), .union(), .union_update(), .symmetric_difference(), .symmetric_difference_update() real aliases of, respectively, __sub__, __isub__, __and__, __iand__, __or__, __ior__, __xor__, __ixor__.
  • Rewrites several C loops in order to trigger automatic loop vectorization on CPUs with dedicated machine instructions, thus achieving performance improvements in intBitSetResize(), intBitSetSub() and intBitSetISub().
  • Improves intBitSetResetFromBuffer() to free memory and realloc only when the incoming buffer is larger.
  • Improves intBitSetGetTot() by using __builtin_popcountl() to count bits in a long word, thus improving speed of this function by an order of magnitude.
  • Adds new unit test for len().
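
As an illustration of the aliasing and bit-counting items above, here is a minimal Python sketch. It is not the intbitset implementation (which is C/Cython); the TinyBitSet class, its _from_bits() helper and the int-backed storage are hypothetical names and choices made only for this example. It shows named methods bound to the very same function objects as the operators, and a len() computed as a population count.

```python
class TinyBitSet:
    """Toy bitset backed by a Python int (hypothetical, NOT the real intbitset)."""

    def __init__(self, elems=()):
        self._bits = 0
        for e in elems:
            self._bits |= 1 << e  # set bit number e

    @classmethod
    def _from_bits(cls, bits):
        # Hypothetical helper: wrap an already computed bit pattern.
        obj = cls()
        obj._bits = bits
        return obj

    def __or__(self, other):
        return self._from_bits(self._bits | other._bits)

    def __and__(self, other):
        return self._from_bits(self._bits & other._bits)

    def __sub__(self, other):
        # Keep the bits of self that are not set in other.
        return self._from_bits(self._bits & ~other._bits)

    def __xor__(self, other):
        return self._from_bits(self._bits ^ other._bits)

    # Real aliases: the named methods are the same function objects
    # as the operator methods, not wrappers that call them.
    union = __or__
    intersection = __and__
    difference = __sub__
    symmetric_difference = __xor__

    def __len__(self):
        # Population count: number of set bits. The C code in this commit
        # uses __builtin_popcountl() per machine word for the same purpose.
        return bin(self._bits).count("1")

    def __eq__(self, other):
        return self._bits == other._bits
```

With this sketch, TinyBitSet([1, 2, 3]).union(TinyBitSet([3, 4])) and TinyBitSet([1, 2, 3]) | TinyBitSet([3, 4]) run exactly the same code, so the two spellings cannot drift apart in behavior.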

Details

Event Timeline

Samuele Kaplun <samuele.kaplun@cern.ch> committed R3600:e16df32c3ad4: intbitset: code and performance improvements (authored by Samuele Kaplun <samuele.kaplun@cern.ch>). Jul 23 2012, 14:15