bibupload: use CFG_OAI_ID_FIELD for deduping

Authored by Alexander Wagner <alexander.wagner@desy.de> on Feb 24 2015, 15:32.

Description

  • FIX When datasets are replicated between instances by means of OAI harvesting, external OAI-IDs are now looked up not only in the field specified by CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG but also in CFG_OAI_ID_FIELD. (closes #2812) (PR #2816)
  • Note: find_record_from_oaiid() always returned the first record with a given OAI-ID, assuming that the ID is unique. However, manual intervention may have accidentally produced duplicates. The function therefore now checks whether more than one record with the given OAI-ID exists. If so, it checks whether all of these records except one are deleted; in that case it returns the surviving record, since cataloguers have resolved the duplicates manually. If duplicates still remain, it falls back to the old behaviour for the sake of compatibility, but at least issues a warning. This is somewhat of a TODO: one should probably refuse the merge here and wait for manual curation (to be discussed).
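
The lookup and duplicate handling described in the two points above could be sketched roughly as follows. This is a self-contained illustration, not the code from the commit: the helper names `find_candidates` and `pick_record`, the in-memory `index` mapping, and the `deleted` set are all hypothetical stand-ins for the real database queries, and the tag strings in the usage example are placeholders for whatever CFG_BIBUPLOAD_EXTERNAL_OAIID_TAG and CFG_OAI_ID_FIELD are set to on a given instance.

```python
import warnings


def find_candidates(oaiid, index, tags):
    """Collect record IDs that carry `oaiid` in any of the given tags.

    `index` is a hypothetical mapping {(tag, oaiid): [recid, ...]} standing
    in for the real lookup against the bibliographic tables.
    """
    recids = set()
    for tag in tags:
        recids.update(index.get((tag, oaiid), ()))
    return sorted(recids)


def pick_record(oaiid, candidates, deleted):
    """Choose the record to update for `oaiid`, or None if none matches.

    `deleted` is a hypothetical set of record IDs flagged as deleted.
    """
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]
    # Several records share the OAI-ID: prefer the single survivor, if any.
    alive = [recid for recid in candidates if recid not in deleted]
    if len(alive) == 1:
        return alive[0]
    # Unresolved duplicates: keep the old "first match" behaviour, but warn.
    warnings.warn(
        "OAI-ID %s matches several records %r; using the first one"
        % (oaiid, candidates)
    )
    return candidates[0]
```

For example, with placeholder tags `"035__a"` and `"909COo"`, an index of `{("035__a", "oai:x:1"): [10], ("909COo", "oai:x:1"): [12, 10]}` yields the candidates `[10, 12]`; if record 10 is marked deleted, `pick_record` returns 12, and if neither is deleted it warns and returns 10. Raising an error instead of warning in the last branch would implement the "refuse merge" alternative mentioned in the note.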

Details

Committed
Tibor Simko <tibor.simko@cern.ch>, Nov 18 2016, 12:31
Parents
R3600:12cd78cbcf7e: urlutils: hashlib and md5
Branches
Unknown
Tags
Unknown

Event Timeline

Tibor Simko <tibor.simko@cern.ch> committed R3600:6cfa771e9292: bibupload: use CFG_OAI_ID_FIELD for deduping (authored by Alexander Wagner <alexander.wagner@desy.de>). Nov 18 2016, 12:31