Fixed issues #86, #85, #74
Issue #86: Move link extraction code over to using new WacArcInputFormat
Issue #85: Writable container for ARCRecord
Issue #74: TableMapper wrapper for interoperability between HBase and processing (W)ARC data
lintool <jimmylin@umd.edu> | Aug 23 2014, 19:31
dportabella | Oct 19 2016, 16:29
Commit | Author | Details | Committed
---|---|---|---
9faed3817dcf | lintool | Refactored WacMapReduceHBaseWrapperDemo, now takes advantage of… | Aug 23 2014
999fa0af0280 | lintool | Merge branch 'working' into table-wrapper | Aug 23 2014
08530c1cbe52 | lintool | WacArcInputFormat now generates ArcRecordWritables. | Aug 23 2014
98d7150f5eec | lintool | Switched ExtractSiteLinks and InvertAnchorText over to WacArcInputFormat; link… | Aug 23 2014
369ba2731f5b | lintool | Minor refactoring, revised counters. | Aug 23 2014
74b023ed8b16 | lintool | Both the Jwat and Wac versions of ExtractLinks gives the same exact output. | Aug 23 2014
ccee8fd2204a | lintool | Implemented ArcRecordWritable. | Aug 23 2014
c5f4b9efa212 | lintool | Very rough prototype of wrapper that allows interoperability between HBase… | Aug 23 2014