History Graph
Commit | Author | Details | Committed
---|---|---|---
db6864b1a785 | Jan Linder | Add the possibility to have more than one affiliation category per page | Oct 22 2020
120cd4c49b66 | Jan Linder | Insert new plot and boxing for tesseract to presentation | Oct 22 2020
0ad97826fa8a | Jan Linder | Added new issues in source to readme | Oct 22 2020
2e7f25ecd1bc | Jan Linder | Add presentation for Paula and Marlene, week6 | Oct 22 2020
5531d854d21a | Jan Linder | add generate plots | Oct 21 2020
62f11a926efb | Jan Linder | Minor bugfixes and improvements | Oct 20 2020
0b5560f27517 | Jan Linder | Explain Analyzer in readme | Oct 20 2020
3fd706cf5bd0 | Jan Linder | Rename cop1to5, included cop5 there | Oct 20 2020
97b00603c9f4 | Jan Linder | Improve Analyzers, especially cop7to8 | Oct 20 2020
67a5ee11403c | Jan Linder | Implement cop7to8_analyzer and affilition list extractor. | Oct 19 2020
883bce5e3184 | Jan Linder | Added party recognition for all cops | Oct 19 2020
8b9912f6d1fa | Jan Linder | First edition of affiliation category extraction. Only works for cop5+ yet | Oct 17 2020
fa3369344206 | Jan Linder | minor adaption op salutory addresses | Oct 14 2020
2186bf7609de | Jan Linder | Make pdfToText unstatic. | Oct 14 2020
7514566bb3e2 | Jan Linder | Add todo week5 | Oct 13 2020
dc91f390862e | Jan Linder | Improve pdftotext and its analysis. | Oct 12 2020
e22f3e528e6a | Jan Linder | Implement pdfToTxt manually quite correctly | Oct 12 2020
a0f6ce99d856 | Jan Linder | implement PDFPageDetailedAggregator to get the positions of LTContainers | Oct 10 2020
c66cb28f7eac | Jan Linder | bring copnewer_analyzer to work | Oct 6 2020
9dd74b6c1ab6 | Jan Linder | add todos week4 | Oct 6 2020
0f3b244e9c77 | Jan Linder | Remove old code files | Oct 6 2020
29e001a8ff57 | Jan Linder | IMPLEMENT PDF TO TXT CORRECTLY Now use pdfminer with the laparam argument to… | Oct 6 2020
66d3e48b2e2a | Jan Linder | the ocr now works correctly | Oct 5 2020
154caa57dcc3 | Jan Linder | blacklist | Oct 5 2020
2f69d7e02c12 | Jan Linder | whitelist ocr: not perfect | Oct 5 2020
9cc36860b206 | Jan Linder | Improve analysis for cop3 and cop4 | Oct 5 2020
0d36b15c21a8 | Jan Linder | Small corrections for OCR | Oct 5 2020
67d367f6243d | Jan Linder | finish modularization | Oct 5 2020
e2ddd912c5b8 | Jan Linder | File structure updated. (untested) | Oct 4 2020
26f03fd369c3 | Jan Linder | Begin with making a proper modularization | Oct 4 2020
2d93960b8990 | Jan Linder | inserted boxes for OCR and right parameters | Sep 30 2020
75edac7ca39e | Jan Linder | added todo week3 | Sep 29 2020
d37074c1d20f | Jan Linder | minor changes of process, added raw of cop3 | Sep 28 2020
5b6aa3b3ac8b | Jan Linder | progress on cop2-4 | Sep 28 2020
57cdc59e9210 | Jan Linder | Use of process_copX.py precised in README | Sep 27 2020
355f0c163915 | Jan Linder | Implemented processing of cop2-4. Works good for countries, but has major… | Sep 27 2020
a557a870c334 | Jan Linder | try with pypdf2 | Sep 27 2020
13691532eec5 | Jan Linder | implemented process cop for 5 - 25 with textract but there are major errors in… | Sep 26 2020
7f3f311f331c | Jan Linder | progress on the class and process script | Sep 23 2020
c747832a97fe | Jan Linder | Began the copx file | Sep 22 2020
0055a1d8e4a0 | Jan Linder | added todo | Sep 22 2020
aa92caa65e57 | Jan Linder | added raw txt for cop25 | Sep 22 2020
37b4e9f47446 | Jan Linder | added my testing files for OCR with cop1 | Sep 22 2020
9514a37465f4 | Jan Linder | data complete | Sep 17 2020
77b17d5e7bd6 | Jan Linder | data complete | Sep 17 2020
45310b6c42f1 | Jan Linder | first part of the lists | Sep 17 2020