History Graph
Commit | Author | Details | Committed
---|---|---|---
9036d847ae65 | Tibor Simko | global: fix copyright notice years | Jan 13 2011
08160858adf2 | Tibor Simko | global: replace cdsware URLs by invenio URLs | Dec 15 2010
8f0027eea7ee | Tibor Simko | global: update copyright years | Nov 1 2010
d30e56fa001a | Tibor Simko | global: CDS Invenio becomes Invenio | Nov 1 2010
4f63bfe58dc9 | Christopher Hayward/Tibor Simko | refextract: improvements to author recognition | Oct 15 2010
ba106f2e4a97 | Christopher Hayward/Tibor Simko | refextract: identify author groups in citations | Oct 15 2010
ff97c0942af0 | Christopher Hayward/Tibor Simko | refextract: add DOI recognition functionality | Oct 15 2010
209ab90f8065 | Jerome Caffaro/Tibor Simko | Refextract: fixed "--raw-references" CLI option | Aug 2 2010
c3a322f0eaf6 | Benoit Thiell/Tibor Simko | textutils: centralized encode_for_xml() function | Mar 5 2009
7ab0dcd8ea47 | Samuele Kaplun/Tibor Simko | Fix usage of tabs in code. | Dec 2 2008
51bfbcb9fa42 | Tibor Simko | Removed $Id$ from file preamble comments. | Oct 27 2008
779d4ea76d86 | Tony Osborne | Added massaging for arxiv report numbers to change from e.g. arXiv-yymm-1234… | Jun 30 2008
67226ac65ffd | Tibor Simko | Fixed 175 cases of bad code indentation throughout the codebase. (Please set up… | Mar 25 2008
eceb995a1b50 | Tibor Simko | Updated copyright years. | Feb 4 2008
b4d00fcb0485 | Tibor Simko | Prettified the usage message a bit. | Jan 18 2008
29018687c982 | Tony Osborne | Renamed functions (get_first_reference_line_numeration_marker_patterns_via_brac… | Nov 2 2007
4f63c5c768fd | Tony Osborne | New functionality: | Oct 30 2007
47ce4d6ba85f | Tibor Simko | Made refextract runnable without having to have done the full Invenio… | Oct 3 2007
062e6266d46c | Nicholas Robinson | When stripping footers, introduced a check to ensure that the document body… | Jul 27 2007
017c8ee0885e | Nicholas Robinson | Commit on behalf of Tony: + Added function "limit_m_tags" - truncate extremely… | Jul 27 2007
9657e2ace771 | Samuele Kaplun | Moved from the obsolete sre module to re. | May 22 2007
1d6f7a5e95c5 | Nicholas Robinson | Corrected bugs in handling of URLs. URLs are now processed BEFORE the… | Apr 30 2007
103822f367ad | Nicholas Robinson | Modification of the regexp patterns used to identify the numeration that… | Apr 26 2007
b3f7f40fab8f | Nicholas Robinson | Altered the pattern used to recognise "IBID" instances that do not actually… | Apr 25 2007
aad0b2e03b4d | Nicholas Robinson | Removed the import of the now unused 'CFG_REFEXTRACT_MARKER_CLOSING_URL' config… | Apr 25 2007
2d59c329303d | Nicholas Robinson | Fixed a bug in the handling of URLs. Previously, URLs could be corrupted during… | Apr 25 2007
06e1f6c9bb10 | Nicholas Robinson | Corrected 2 bugs in the function "move_tagged_series_into_tagged_title": *… | Apr 24 2007
6744c6cbe48c | Nicholas Robinson | In certain papers, " bf " appears just before the volume of a cited item. It is… | Apr 24 2007
e38af54dcbbf | Nicholas Robinson | Removed tagging of series information as it conflicted with the KB in certain… | Apr 23 2007
20bdb2683d21 | Nicholas Robinson | When creating the knowledge base of Journal titles: * Enforced sorting of… | Apr 20 2007
54dcf3807ddf | Nicholas Robinson | + Fixed a bug when looking for the end of the references section. Previously… | Mar 29 2007
322424c25882 | Nicholas Robinson | Fixed a bug that occurred when matching report-numbers: the wrong length of the… | Mar 17 2007
2036e7a590dd | Nicholas Robinson | Added a dictionary to keep a count of all periodical titles found in reference… | Mar 15 2007
afd852f45332 | Nicholas Robinson | Code cleaning to fix pylint complaints about conventions, etc; | Mar 13 2007
5df99323fe9d | Nicholas Robinson | Made title-numeration patterns 'prettier' (split them over multiple lines and… | Mar 8 2007
8f875d9ae4b6 | Nicholas Robinson | Updated title-numeration patterns: when searching for 'volume', the word 'No'… | Mar 6 2007
52b3f74ac1ce | Nicholas Robinson | Added new pattern for recognition of numeration of titles: 'YEAR' (with… | Mar 6 2007
262996b58979 | Nicholas Robinson | When recognising numeration for titles, made parentheses optional for YEAR in… | Mar 5 2007
9506e327b701 | Nicholas Robinson | Reference line markers are now stripped before the line is processed for… | Mar 5 2007
580e6cf4e94a | Nicholas Robinson | Modification of title-numeration patterns: Now only recognising reasonable… | Mar 5 2007
a0701690a5da | Nicholas Robinson | Corrected handling of IBIDs: in certain cases, the series could be tagged onto… | Mar 2 2007
15785017b87f | Nicholas Robinson | Added optional comma before and after volume when searching for title… | Mar 1 2007
bb7d141760fa | Nicholas Robinson | Added checking for UnicodeErrors when reading in the preprint-report numbers… | Feb 28 2007
11ccb10bc99c | Nicholas Robinson | Fixed a problem with the rebuilding of the reference section: The function that… | Feb 27 2007
053a65f6c7b5 | Nicholas Robinson | Fixed IndexError (in reading-line) when rebuilding reference line | Feb 24 2007
296919302a28 | Nicholas Robinson | If parentheses were stripped from the end of a title, the were not accounted… | Feb 23 2007
78dc628b932e | Nicholas Robinson | Altered the pattern used for the recognition of numeration that immediately… | Feb 22 2007
920dcde17cae | Nicholas Robinson | Added newlines to verbose output. | Feb 22 2007
0425375d7c64 | Nicholas Robinson | cli_opts was accidentally declared inside the except clause of the try/except… | Feb 22 2007
a650af9a6f53 | Nicholas Robinson | Updates written by TonyO: New, improved behaviour of "verbose" mode; various… | Feb 22 2007
f27c33ed484f | Nicholas Robinson | Code cleaning after pylint warnings (etc); Added docstring to function… | Feb 15 2007
843eff28d0fb | Nicholas Robinson | Replaced use of 'cgi.escape' with 'encode_for_xml' when creating XML output… | Feb 15 2007
e1b72d8b4d1f | Nicholas Robinson | Fixed a bug in the recognition/markup of URLs: string index was out when… | Feb 14 2007
df75ba0b1feb | Tibor Simko | Updated copyright years (2007). | Feb 14 2007
6258c992576c | Nicholas Robinson | Fixed a bug in recognition of reference line 'marker'; Fixed a bug in counting… | Feb 14 2007
64eca59c42f9 | Nicholas Robinson | Complete rewrite of refextract. No more Object-Orientation, new treatment of… | Feb 14 2007
a80c2c037ce5 | Tibor Simko | Fixed free variable problems (UnicodedecodeError, curitem). | Dec 1 2006
5dda83af3d76 | Tibor Simko | When comparing to None, do not use "== None" or "!= None", but rather "is None"… | Nov 28 2006
bdaa3eafc82c | Tibor Simko | Added __revision__ for all Python files that did not have it. | Sep 14 2006
8242c27ba40b | Tibor Simko | Use uppercase CFG_REFEXTRACT_* module variables. | Sep 13 2006
499383b3c92c | Nicholas Robinson | Cleaning of newly updated nueration recognition patterns. | Aug 15 2006
3665339c3342 | Nicholas Robinson | Changes to recognition of preprint report numbers: in the user-defined… | Aug 15 2006
090c843e4f30 | Nicholas Robinson | Added 999C6a subfield containing information about status of extracted… | Aug 3 2006
18b100fe8575 | Tibor Simko | Implemented name change CDSware to CDS Invenio. Also, introduced new configure… | May 4 2006
d5b36a574e3c | Tibor Simko | Updated copyright years. | May 2 2006
6b1700acdd20 | Tibor Simko | Python imports are now done in an absolute way (from cdsware.foo import bar)… | Dec 20 2005
988d2fb16a34 | Tibor Simko | Getting rid of WML. | May 12 2005
5b5d4dcdce4e | Tibor Simko | Fixed errors with umlaut-like corrections. (Nick) | Apr 6 2005
bf6c17e2d07b | Tibor Simko | Pylint-related code cleanup by Nick. | Mar 17 2005
250e4736e5cb | Tibor Simko | Initial release of refextract, the reference extraction program. | Mar 15 2005