Switched ExtractSiteLinks and InvertAnchorText over to WacArcInputFormat; link structure the same, anchor text results better due to switch from ISO-8859-1 to UTF-8 decoding of pages.
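The effect of the decoding change can be illustrated with a minimal sketch. It is not code from this commit: the sample string and the `decode_page` helper are hypothetical and exist only to show how the same UTF-8 bytes read under each charset.

```python
# Hypothetical illustration; not code from this repository.
def decode_page(raw: bytes, charset: str) -> str:
    """Decode raw page bytes with the given charset."""
    return raw.decode(charset)


# Anchor text containing a non-ASCII character, stored as UTF-8 bytes.
raw = "café".encode("utf-8")

as_latin1 = decode_page(raw, "iso-8859-1")  # each byte becomes one character
as_utf8 = decode_page(raw, "utf-8")         # the two-byte sequence is reassembled

print(repr(as_latin1))
print(repr(as_utf8))
```

Decoding as ISO-8859-1 never raises an error, because every byte value maps to some character, so mis-decoded anchor text tends to go unnoticed until the results are inspected.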
Description
Details
- Committed: lintool <jimmylin@umd.edu>, Aug 23 2014, 17:53
- Pushed: dportabella, Oct 19 2016, 16:29
- Parents: R1473:369ba2731f5b: Minor refactoring, revised counters.
- Branches: Unknown
- Tags