BibAuthorID: major backend rewrite and speed ups

Authored by Samuele Carli <samuele.carli@cern.ch> on Jan 24 2012, 17:17.

Description

  • New database structure allows for faster and more reliable operations.
  • New BibAuthorID fast update and garbage collection algorithm: Rabbit. Taking advantage of the new database structure, the algorithm is much faster and more reliable than before. It can now be scheduled to run, ideally, several times per hour, so that BibAuthorID stays up to date with the rest of the system.
  • New full disambiguation algorithm: Tortoise. It enables periodic full disambiguation of authors at state-of-the-art quality. NB: While it can be used to start a system from scratch, the merging algorithm required for periodic updates is not yet fully production-ready.
  • Many bug fixes, optimizations and improvements; the backend code has been rewritten from scratch.

Co-authored-by: Nikola Yolov <nikola.yolov@cern.ch>

Details

Event Timeline

Tibor Simko <tibor.simko@cern.ch> committed R3600:bd39f76ec091: BibAuthorID: major backend rewrite and speed ups (authored by Samuele Carli <samuele.carli@cern.ch>). Apr 3 2012, 16:41