refextract: improve affiliated author search
- Include delimiters when arranging affiliated authors.
- Preserve the realigned numeration when searching for authors.
- Reuse the 'around-comma' numeration swapping when looking for affiliated authors.
- Add another config variable for the replacement of affiliation terms. Rename the other, original affiliation config variable to include the word 'reduction'.
- Improve the regular expressions used to obtain numeration; only match numeration on lines which hold other content too.
- Collect numerated affiliation data together when searching.
- Show the list of affiliated authors per affiliation when searching for affiliated authors, controlled by the verbosity cli option.
- Change the flag associated with the extraction of affiliations from -f to -l, avoiding a conflict with the forthcoming fulltext api change to Refextract (-f, --fulltext for providing fulltext input).
- Fix how items are added to the list of affiliated author info: only append a new affiliated author item if authors actually exist for that item. This prevents a set of affiliated authors from being wrongly selected over a set of standard authors when no actual authors exist, just affiliation/strength data.
- Add cli verbosity-controlled messages describing the current status of the author extraction process.
- Repair the cli arguments used inside get_cli_opts.
- Change the document information returned by extract_top_document_information_from_fulltext. It now returns a list of dictionaries containing author data with possible affiliations, and a list of affiliation data.
- The returned data no longer includes the list of 'marked-up' author data, which is now assembled outside of this function call.
- Move the locating of a document's reference section into the functions concerned with extracting either references or authors/affiliations.
- Rename variables relating to lines holding either reference or top-section data, moving away from reference-specific names.