BibAuthorID: improvements, fixes and optimizations

Authored by Samuele Carli <samuele.carli@cern.ch> on Nov 6 2012, 19:35.

Description

  • Surname clusters can now be disambiguated without precomputing clusters for every surname, which makes testing and tweaking much faster
  • Disabled citation analysis for INSPIRE disambiguation: the memory footprint is too big for such a small increase in accuracy
  • Tweaking: added facilities to collect and analyze statistics about the final disambiguation results, allowing easy plotting of result features against thresholds and so on
  • Added lots of assertions and tests to allow early detection of insidious problems during disambiguation
  • Name comparison now translates Unicode to ASCII, thus ignoring accents. This improves disambiguation and search.
  • Outdated claims are now handled correctly by the interface, which presents message boxes to the user to make clear what is happening
  • Greatly improved debug prints to ease debugging
  • Comparison functions now behave properly and always return correct results in the correct range
  • Fixed bugs in the comparison function's handling of INSPIREid
  • Redistributed weights in the comparison function to achieve more significant results
  • Disambiguation computations redefined to limit overflow/underflow problems
  • Paper claiming code now safely handles all known stressful situations involving outdated data
  • The admin paper claiming interface can now manage external IDs for persons. The kinds and features of external IDs are configurable through the config file.
  • UI: style fixes
  • UI: fixed various bugs (mostly JavaScript-related)
  • UI: improved extensibility and reusability of templates through verbiage dictionaries
  • UI: disabled the option to 'forget decision' on rejected papers, which made no sense
  • Added a prototype for exporting personID information in XML format. Usable but unfinished; will probably be moved to bibexport.
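
The accent-insensitive name comparison mentioned above can be illustrated with a minimal sketch. This is not the BibAuthorID code: the helper names to_ascii and same_name are hypothetical, and the approach shown (Unicode NFKD decomposition followed by dropping non-ASCII characters) is only one common way to translate Unicode to ASCII, assumed here for illustration.

```python
import unicodedata


def to_ascii(name):
    """Hypothetical helper: strip accents from a name.

    NFKD splits an accented letter into a base letter plus combining
    marks; encoding to ASCII with errors="ignore" then drops the marks.
    Letters with no decomposition (for example "ø") are dropped
    entirely by this sketch, so real code may need a fuller mapping.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def same_name(a, b):
    """Hypothetical helper: compare two names ignoring accents and case."""
    return to_ascii(a).lower() == to_ascii(b).lower()
```

With these helpers, same_name("Müller", "Muller") evaluates to True, so records spelled with and without the accent can be treated as the same name string during comparison and search.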

Co-authored-by: Nedko Nedkov <nedko.stefanov.nedkov@cern.ch>

Details

Committed
Tibor Simko <tibor.simko@cern.ch> on Feb 12 2013, 14:27
Parents
R3600:e34c06150f18: Merge branch 'maint-1.0' into maint-1.1

Event Timeline

Tibor Simko <tibor.simko@cern.ch> committed R3600:bf328e50aa23: BibAuthorID: improvements, fixes and optimizations (authored by Samuele Carli <samuele.carli@cern.ch>) on Feb 12 2013, 14:27