Graph
master
History Graph
Commit | Author | Details | Committed
---|---|---|---
f7eefd5da751 | lintool | Removed outdated code; cleaned up WarcBrowserServlet. | Aug 16 2014
a57c6b83eaf2 | lintool | Refactoring: created ArcRecordUtils. | Aug 16 2014
d58afca206fc | lintool | Refactoring packge org.warcbase.demo; added Jwat prefix to Hadoop InputFormats… | Aug 16 2014
a00e413edff4 | lintool | Fixed issues #71, #70, #66, #63 and #62 (partially) | Aug 16 2014
377a3e3a429f | lintool | Better handling of ARC parse errors. | Aug 15 2014
2d16fd032060 | lintool | Bump max value size up to 10 MB, tweak WAL settings. | Aug 14 2014
bf606c07d453 | lintool | Refactoring UrlUtil and related classes. | Aug 14 2014
ff34480981dd | lintool | Restored some sanity to versions and transitive dependencies. | Aug 14 2014
55a7334ae0bb | lintool | Fixed issue #60: Merge in Clemens et al. contributions to Warcbase | Aug 14 2014
b6431bed6f8b | lintool | Minor tweaks. | Aug 14 2014
37cc53a7fd66 | lintool | Tweaked settings. | Aug 14 2014
72c1afbca77c | lintool | Merge branch 'master' into cneud-integration | Aug 14 2014
5a8dc8ab24e3 | lintool | Fixed: | Aug 14 2014
a77bdc3fac02 | lintool | Updated documentation. | Aug 14 2014
f29edf052e19 | lintool | Added command-line options. | Aug 14 2014
14fa0f329072 | lintool | More refactoring. | Aug 14 2014
6eeb0ea431f1 | lintool | Refactoring on local UrlMappingBuilder. | Aug 14 2014
5022874b83e5 | lintool | Lightweight refactoring. | Aug 14 2014
ae5581b740fc | lintool | Bumped up memory, fixed class renaming. | Aug 14 2014
0bc5d4876199 | lintool | Uri -> Url classes renaming. | Aug 14 2014
ab64384ff204 | lintool | Light refactoring | Aug 14 2014
bc47be0fea19 | Jeffyrao | resolve conflicts | Aug 13 2014
0623b52cd05c | Jeffyrao | reformat code | Aug 13 2014
607a42335808 | lintool | Updated documentation, issue #61 | Aug 13 2014
7f13ec11d968 | lintool | Updated documentation. | Aug 13 2014
83414e51b351 | lintool | Configuration for Wayback/Warcbase integration. | Aug 13 2014
910c2f03e6b1 | lintool | Removed extra command-line argument. | Aug 13 2014
a14cda2217fa | lintool | Commented out LibmagicJnaWrapper functionality because the jar isn't generally… | Aug 12 2014
93f6d42f4ff4 | lintool | Fixed compile and broken test issues. | Aug 12 2014
accd1978862d | lintool | Merge branch 'master' of github.com:cneud/warcbase into cneud-integration | Aug 12 2014
a0a594f92b94 | lintool | Prototype integration of Wayback/Warcbase via REST API on HBase. | Aug 12 2014
eb4893e7201c | lintool | Merge branch 'cleanup' into wayback-integration | Aug 12 2014
c665a23bf1f3 | lintool | Fixes issue #58: Wayback reads directly from REST API instead of writing and… | Aug 12 2014
ffec5b2caa3d | lintool | Fixed issue #59: Unable to fetch URLs from archive with '?' in them | Aug 12 2014
0f4f541c24d7 | lintool | Fixed issues with fetching URLs with spaces in them. | Aug 12 2014
b5ccb86d330e | lintool | Better handling of errors: when REST API is unavailable, when URL isn't found… | Aug 11 2014
515bab098cff | lintool | Code cleanup for browser code; removed unneeded files and associated web files. | Aug 11 2014
74f3dece6c62 | lintool | Merge branch 'rest-api-bug-fix' of github.com:lintool/warcbase into wayback… | Aug 11 2014
3075ca19d76d | lintool | Converted host/port/table information to bean settings. | Aug 11 2014
37e97073d57c | lintool | Simplified code. | Aug 11 2014
bca5b6944ac7 | lintool | Refactoring; mostly reformatting. | Aug 11 2014
9e8765b4e3f0 | lintool | Minor fix for NPE when capture isn't in HBase. | Aug 11 2014
4f387e8952a7 | lintool | Initial check-in of Warcbase integration points with Open Wayback. | Aug 11 2014
03d5c5365481 | lintool | /*/ query returns MIME type. | Aug 11 2014
65b6dc5b7138 | lintool | Refactor to confirm to /*/ of Wayback to fetch list of available versions. | Aug 10 2014
096878f5f932 | lintool | Switched over to 14 digit dates for URLs to align with Wayback. Further… | Aug 10 2014
7f9764b1a793 | lintool | Cleaned up servlet fetch code. | Aug 10 2014
b685d65eba7b | lintool | Fixed 14 digit date parsing issue (now uses ArchiveUtils); was an issue with… | Aug 10 2014
bc6aab1ffd21 | lintool | Fixed a few minor ingestion issues. | Aug 10 2014
b20ef84e5df9 | lintool | Refactoring; removing WARC ingestion for now. | Aug 10 2014
cfe508a831f0 | lintool | Tweaks to ingest code. | Aug 10 2014
180b57fa5dc1 | lintool | Janky, but seems to work: ingesting and serving up raw ARC records. | Aug 10 2014
8b5a6be9db61 | lintool | Quick and dirty switch over to webarchive-commons API; stores raw ARC records. | Aug 10 2014
65c6e54a948e | Jeffyrao | add UriMappingBuilder Mapreduce version | Aug 4 2014
59aa95aab857 | Jeffyrao | fix the bug of selecting webpage by date ineffective | Jul 25 2014
7175239e0751 | Jeffyrao | add Hadoop/HBase input choice for ExtractLinks and ExtractSiteLinks classes | Jul 22 2014
17f5787120af | lintool | Fixed issues #17, #25, #30, #35 | Jun 27 2014
60a9b96beebe | Milad Gholami | Merging with master. | Jun 26 2014
149ce6cf969f | Milad Gholami | Fixing git history. | Jun 26 2014
113332758f06 | lintool | Fixed issues #45, #46, #49, #50 | Jun 18 2014
3b8484a944a8 | lintool | More work on the admin interface. | Jun 18 2014
f3015cd7ba4b | lintool | issue #50 | Jun 18 2014
10d60bd28c75 | lintool | Fixed broken merge. | Jun 17 2014
6c452cbb6b5d | lintool | Merge branch 'master' into admin | Jun 17 2014
71d5a4e0803e | lintool | fixed issue #43 and issue #48 | Jun 17 2014
781fe4247b31 | lintool | Fixed issue #48 | Jun 17 2014
5f62c2f6fb9d | lintool | Added comment. | Jun 17 2014
d17dddca8deb | lintool | Minor refactoring. | Jun 17 2014
ee7f7749a30a | lintool | Initial working version of anchor text inversion program: issue #43 | Jun 17 2014
02c26d6d8ea4 | lintool | Started working on issue #46 cleanup of org.warcbase.data.Util | Jun 17 2014
c7a7247d5fa2 | lintool | Appears to have fixed issue #49, starting work on admin tool, issue #45. | Jun 17 2014
37f9b1fce90f | milad621 | openwayback upgraded to 2.0.0.BETA.2 | Jun 16 2014
59c50bb33254 | lintool | Fixed issue #42 and issue #44 | Jun 16 2014
a6be4375e0c7 | lintool | ExtractLinks using HBase appears to be working. | Jun 13 2014
21a07efb350e | lintool | Refactored HDFS extractor; HBase extractor still broken. | Jun 13 2014
bbc73ab64808 | lintool | Merge branch 'master' into refactoring | Jun 12 2014
6d50f37d6ada | lintool | Merge branch 'hbase_experiments' | Jun 12 2014
273e5969e943 | lintool | Light refactoring, pushed column family filter into scan. | Jun 12 2014
3283bb8512ef | lintool | Alternative implementation based on iterating over maps... slightly slower. | Jun 12 2014
dbfbcb0b3c7e | lintool | More light refactoring. | Jun 12 2014
34273fdb935a | Jeffyrao | add hbase option for ExtractLinks | Jun 12 2014
cfb89d2d3379 | Jeffyrao | reformat Jinfeng's code | Jun 12 2014
b02f33c9b43a | lintool | Refactoring, code cleanup. | Jun 11 2014
d4c29085ee12 | lintool | Fixed issue #39 | Jun 11 2014
c3a4348e1250 | lintool | Debugged HBase scan parameters so that they don't knock over region servers… | Jun 11 2014
0e61a3094550 | milad621 | Issue 38 fixed. Still need to add other URL Encoding characters. | Jun 6 2014
5425313990da | lintool | Fixed Issues #31, #32, #40, #41 | Jun 5 2014
20ef503dbfe5 | lintool | Moving ExtractLinks and ExtractSiteLinks into analysis.graph package, per Issue… | Jun 5 2014
dc58365a9579 | lintool | Extracts values at different timestamps. | Jun 5 2014
60b827d1c174 | lintool | Refactored getIdRange method signature, add more test cases to UriMapping. | Jun 5 2014
a6b17787ccd1 | lintool | Merge branch 'extract-links' of github.com:Jeffyrao/warcbase into refactoring | Jun 5 2014
7fc1b92d60a7 | Jeffyrao | fix issue 40 that UriMapping prefix search should return empty result when no… | Jun 5 2014
a8953616248c | lintool | Refactoring, added test case (currently broken). | Jun 4 2014
6518836b7ea0 | Jeffyrao | fix issue 32, update ExtractSiteLinks code | Jun 3 2014
a17e3c74954a | Jeffyrao | fix issue 30, add ExtractSiteLinks code | May 29 2014
3f36eff4f429 | Jeffyrao | fix issue 31 | May 27 2014
b4ab2ea0d499 | lintool | Initial MapReduce over HBase demo. | May 26 2014
0598dcd91073 | Jeffyrao | more edits | May 25 2014
cad5eddceb19 | Jeffyrao | remove javacsv dependency, add opencsv dependency | May 25 2014
f0e36b46b872 | Jinfeng Rao | Merge remote-tracking branch 'upstream/master' into extract-links | May 25 2014