Fixed issues #86, #85, #74

Authored by lintool <jimmylin@umd.edu> on Aug 23 2014, 19:31.

Description

Issue #86: Move link extraction code over to using new WacArcInputFormat
Issue #85: Writable container for ARCRecord
Issue #74: TableMapper wrapper for interoperability between HBase vs. processing (W)ARC data
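As a rough illustration of what a "Writable container" (issue #85) involves, the sketch below shows the general serialize/deserialize contract such a container follows. It is not the ArcRecordWritable from this commit: `SimpleWritable`, `BytesRecordWritable`, and `roundTrip` are hypothetical names, the interface is a local stand-in that only mirrors the two methods of Hadoop's `Writable` so the example compiles without Hadoop, and the URL-plus-bytes layout is an assumption made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch only; not the ArcRecordWritable from this commit.
final class RecordWritableSketch {

    // Local stand-in mirroring the two methods of Hadoop's
    // org.apache.hadoop.io.Writable, so the sketch needs no Hadoop jars.
    interface SimpleWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    // Assumed layout: a record URL plus the raw record bytes.
    static final class BytesRecordWritable implements SimpleWritable {
        private String url = "";
        private byte[] payload = new byte[0];

        // No-arg constructor so an empty instance can be filled by readFields.
        BytesRecordWritable() {}

        BytesRecordWritable(String url, byte[] payload) {
            this.url = url;
            this.payload = payload.clone();
        }

        String getUrl() { return url; }

        byte[] getPayload() { return payload.clone(); }

        @Override
        public void write(DataOutput out) throws IOException {
            out.writeUTF(url);
            out.writeInt(payload.length); // length prefix for the byte array
            out.write(payload);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            url = in.readUTF();
            payload = new byte[in.readInt()];
            in.readFully(payload);
        }
    }

    // Serializes a record to bytes and reads it back into a fresh instance.
    static BytesRecordWritable roundTrip(BytesRecordWritable record) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            record.write(new DataOutputStream(buffer));
            BytesRecordWritable copy = new BytesRecordWritable();
            copy.readFields(new DataInputStream(
                    new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `RecordWritableSketch.roundTrip(...)` on a record should return a copy with the same URL and payload. The length prefix is what lets `readFields` know how many bytes to read back.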

Event Timeline

lintool <jimmylin@umd.edu> committed R1473:fb07114090f5: Fixed issues #86, #85, #74 (authored by lintool <jimmylin@umd.edu>). Aug 23 2014, 19:31

Merged Changes

| Commit | Author | Details | Committed |
| --- | --- | --- | --- |
| 9faed3817dcf | lintool | Refactored WacMapReduceHBaseWrapperDemo, now takes advantage of… | Aug 23 2014 |
| 999fa0af0280 | lintool | Merge branch 'working' into table-wrapper | Aug 23 2014 |
| 08530c1cbe52 | lintool | WacArcInputFormat now generates ArcRecordWritables. | Aug 23 2014 |
| 98d7150f5eec | lintool | Switched ExtractSiteLinks and InvertAnchorText over to WacArcInputFormat; link… | Aug 23 2014 |
| 369ba2731f5b | lintool | Minor refactoring, revised counters. | Aug 23 2014 |
| 74b023ed8b16 | lintool | Both the Jwat and Wac versions of ExtractLinks gives the same exact output. | Aug 23 2014 |
| ccee8fd2204a | lintool | Implemented ArcRecordWritable. | Aug 23 2014 |
| c5f4b9efa212 | lintool | Very rough prototype of wrapper that allows interoperability between HBase… | Aug 23 2014 |