
Merge branch 'working'

Authored by lintool <jimmylin@umd.edu> on Nov 20 2013, 15:47.

Description

Merge branch 'working'

Details

Committed
lintool <jimmylin@umd.edu>, Nov 20 2013, 15:47
Pushed
dportabella, Oct 19 2016, 16:29
Parents
R1473:0aeaeb3a754b: Tweaked README.
R1473:79dee598b2df: Merge branch 'milad-master'
Branches
Unknown
Tags
Unknown

Event Timeline

lintool <jimmylin@umd.edu> committed R1473:0e4d1592c244: Merge branch 'working' (authored by lintool <jimmylin@umd.edu>). Nov 20 2013, 15:47

Merged Changes

Commit | Author | Details | Committed
0aeaeb3a754b | lintool | Tweaked README. | Nov 20 2013
bf51f6b75bd8 | milad621 | Code refactoring after pair coding. | Nov 18 2013
43d12364a70f | milad621 | Fixed java.lang.NegativeArraySizeException at org.apache.commons.io.output. | Nov 8 2013
c3e37d50e856 | milad621 | Created a new runnable to find a uri inside warc/arc files. | Nov 7 2013
33966c708354 | milad621 | Uses HTablePool instead of creating a new connection each time. | Nov 7 2013
64ef175db2fd | milad621 | Some code cleanup in servlet. DetectDuplicates fixed with new hbase table… | Nov 7 2013
bf1b3616e4c0 | milad621 | updated servlet. | Nov 7 2013
76949eacabd5 | milad621 | Added a separate class to manage HBase connection and addRecord | Nov 6 2013
01dff2a1f2a5 | milad621 | One runnable to process both arc and warc files in a folder. | Nov 5 2013
1354c68684d9 | milad621 | fixed some issues with the servlet. content and types might not follow each other | Oct 30 2013
e9e3d1c9d940 | milad621 | IngestWarcFiles fixed. Now it uses jwat-warc to ingest warcfiles to hbase. | Oct 30 2013
70c00addd99e | milad621 | Updated WarcBrowser to work with the new structure of hbase table (supports… | Oct 24 2013
e8f7af4176f8 | milad621 | IngestArc updated. | Oct 23 2013
735b77efcad4 | milad621 | IngestWarcFiles updated. Uses jwat and stores content type in htable | Oct 23 2013
b1bee7dc3a8e | milad621 | Arc Processing tools added. Not working with hbase yet. | Oct 17 2013
803daecd43ef | milad621 | No need for name arg | Oct 9 2013
b321544c8a1f | milad621 | url style fixed. Can capture table names from url and home page changed to http… | Oct 9 2013
c860aef8368c | milad621 | URL style changed | Oct 2 2013
4b7ec6fab139 | milad621 | URL style changed | Oct 2 2013
3770fab2cafe | milad621 | close button fixed | Sep 27 2013