History Graph
Commit | Author | Details | Committed
---|---|---|---
21397e4e4ff3 | lintool | upgraded to webarchive-commons 1.1.4. | Oct 18 2014
de4267bf28f1 | lintool | Updated documentation. | Oct 13 2014
457a71345d2b | lintool | Merge branch 'master' into warc | Sep 15 2014
8fae0c067d68 | lintool | Fixed issue #87 | Sep 14 2014
05db518ccd83 | lintool | Added timing info. | Sep 14 2014
157df31c0f15 | lintool | Cleanup. | Sep 14 2014
85c5b4a2ecc5 | lintool | Added debug output; Fixed deprecated HBase APIs. | Sep 14 2014
159596e9b378 | lintool | Fixed build issues in upgrade to CDH 5.1.2. | Sep 13 2014
4798a4314b6e | lintool | Figured out how to extract MIME type and date from WARC. | Aug 30 2014
f3516c7fd7f0 | lintool | Added test cases to try loading WARC records from a stream; back-ported same… | Aug 29 2014
28c5c007f4fd | lintool | Added simple test case. | Aug 28 2014
8e67d49d44b3 | lintool | WARC sample from https://archive.org/details/ExampleArcAndWarcFiles | Aug 28 2014
fb07114090f5 | lintool | Fixed issues #86, #85, #74 | Aug 23 2014
9faed3817dcf | lintool | Refactored WacMapReduceHBaseWrapperDemo, now takes advantage of… | Aug 23 2014
999fa0af0280 | lintool | Merge branch 'working' into table-wrapper | Aug 23 2014
08530c1cbe52 | lintool | WacArcInputFormat now generates ArcRecordWritables. | Aug 23 2014
98d7150f5eec | lintool | Switched ExtractSiteLinks and InvertAnchorText over to WacArcInputFormat; link… | Aug 23 2014
369ba2731f5b | lintool | Minor refactoring, revised counters. | Aug 23 2014
74b023ed8b16 | lintool | Both the Jwat and Wac versions of ExtractLinks gives the same exact output. | Aug 23 2014
ccee8fd2204a | lintool | Implemented ArcRecordWritable. | Aug 23 2014
1e1c2dcde9e2 | lintool | Fixed issues #84, #72 | Aug 23 2014
846ebb216100 | lintool | Minor refactoring. | Aug 23 2014
c5f4b9efa212 | lintool | Very rough prototype of wrapper that allows interoperability between HBase… | Aug 23 2014
47f3c46c099d | lintool | Added Hadoop bindings for webarchive-commons ARC readers, demo, test cases. | Aug 22 2014
b07376aba3fe | lintool | Added/refactored test cases for JWAT. | Aug 22 2014
4663558b1d1a | lintool | Added @Override annotations to appropriate methods. | Aug 22 2014
6e2e6628ea08 | lintool | Fixed issues #80 and #78 | Aug 19 2014
9cbd26b2e3bb | lintool | Minor tweaks. | Aug 19 2014
43e952bfdd99 | lintool | Merge branch 'selenium' into working | Aug 19 2014
3c7da3b094fd | lintool | Refactoring of Wayback/Warcbase integration points. | Aug 19 2014
afa516d702c9 | lintool | Minor tweaks. | Aug 17 2014
1622603fcfc8 | lintool | Fixed Issue #79: ExtractSiteLinks documentation is out of date in README.md | Aug 17 2014
c5b4973aba96 | lintool | Merge branch 'master' into selenium | Aug 17 2014
f9671b6db516 | lintool | Fixed issue #77: Make sure ExtractLinks and related classes still work | Aug 17 2014
4090318d2f99 | lintool | Cleaned up scripts. | Aug 17 2014
b02fd402870f | lintool | Cute Selenium browser to conduct a random walk through the archive. | Aug 17 2014
52cf7bb90940 | lintool | Cleaned up programs for manipulating graphs; removed scanning HBase option for… | Aug 17 2014
45895b659265 | lintool | Fixed issues #75, #73, #67 | Aug 16 2014
f7eefd5da751 | lintool | Removed outdated code; cleaned up WarcBrowserServlet. | Aug 16 2014
a57c6b83eaf2 | lintool | Refactoring: created ArcRecordUtils. | Aug 16 2014
d58afca206fc | lintool | Refactoring packge org.warcbase.demo; added Jwat prefix to Hadoop InputFormats… | Aug 16 2014
a00e413edff4 | lintool | Fixed issues #71, #70, #66, #63 and #62 (partially) | Aug 16 2014
377a3e3a429f | lintool | Better handling of ARC parse errors. | Aug 15 2014
2d16fd032060 | lintool | Bump max value size up to 10 MB, tweak WAL settings. | Aug 14 2014
bf606c07d453 | lintool | Refactoring UrlUtil and related classes. | Aug 14 2014
ff34480981dd | lintool | Restored some sanity to versions and transitive dependencies. | Aug 14 2014
55a7334ae0bb | lintool | Fixed issue #60: Merge in Clemens et al. contributions to Warcbase | Aug 14 2014
b6431bed6f8b | lintool | Minor tweaks. | Aug 14 2014
37cc53a7fd66 | lintool | Tweaked settings. | Aug 14 2014
72c1afbca77c | lintool | Merge branch 'master' into cneud-integration | Aug 14 2014
5a8dc8ab24e3 | lintool | Fixed: | Aug 14 2014
a77bdc3fac02 | lintool | Updated documentation. | Aug 14 2014
f29edf052e19 | lintool | Added command-line options. | Aug 14 2014
14fa0f329072 | lintool | More refactoring. | Aug 14 2014
6eeb0ea431f1 | lintool | Refactoring on local UrlMappingBuilder. | Aug 14 2014
5022874b83e5 | lintool | Lightweight refactoring. | Aug 14 2014
ae5581b740fc | lintool | Bumped up memory, fixed class renaming. | Aug 14 2014
0bc5d4876199 | lintool | Uri -> Url classes renaming. | Aug 14 2014
ab64384ff204 | lintool | Light refactoring | Aug 14 2014
bc47be0fea19 | Jeffyrao | resolve conflicts | Aug 13 2014
0623b52cd05c | Jeffyrao | reformat code | Aug 13 2014
607a42335808 | lintool | Updated documentation, issue #61 | Aug 13 2014
7f13ec11d968 | lintool | Updated documentation. | Aug 13 2014
83414e51b351 | lintool | Configuration for Wayback/Warcbase integration. | Aug 13 2014
910c2f03e6b1 | lintool | Removed extra command-line argument. | Aug 13 2014
a14cda2217fa | lintool | Commented out LibmagicJnaWrapper functionality because the jar isn't generally… | Aug 12 2014
93f6d42f4ff4 | lintool | Fixed compile and broken test issues. | Aug 12 2014
accd1978862d | lintool | Merge branch 'master' of github.com:cneud/warcbase into cneud-integration | Aug 12 2014
a0a594f92b94 | lintool | Prototype integration of Wayback/Warcbase via REST API on HBase. | Aug 12 2014
eb4893e7201c | lintool | Merge branch 'cleanup' into wayback-integration | Aug 12 2014
c665a23bf1f3 | lintool | Fixes issue #58: Wayback reads directly from REST API instead of writing and… | Aug 12 2014
ffec5b2caa3d | lintool | Fixed issue #59: Unable to fetch URLs from archive with '?' in them | Aug 12 2014
0f4f541c24d7 | lintool | Fixed issues with fetching URLs with spaces in them. | Aug 12 2014
b5ccb86d330e | lintool | Better handling of errors: when REST API is unavailable, when URL isn't found… | Aug 11 2014
515bab098cff | lintool | Code cleanup for browser code; removed unneeded files and associated web files. | Aug 11 2014
74f3dece6c62 | lintool | Merge branch 'rest-api-bug-fix' of github.com:lintool/warcbase into wayback… | Aug 11 2014
3075ca19d76d | lintool | Converted host/port/table information to bean settings. | Aug 11 2014
37e97073d57c | lintool | Simplified code. | Aug 11 2014
bca5b6944ac7 | lintool | Refactoring; mostly reformatting. | Aug 11 2014
9e8765b4e3f0 | lintool | Minor fix for NPE when capture isn't in HBase. | Aug 11 2014
4f387e8952a7 | lintool | Initial check-in of Warcbase integration points with Open Wayback. | Aug 11 2014
03d5c5365481 | lintool | /*/ query returns MIME type. | Aug 11 2014
65b6dc5b7138 | lintool | Refactor to confirm to /*/ of Wayback to fetch list of available versions. | Aug 10 2014
096878f5f932 | lintool | Switched over to 14 digit dates for URLs to align with Wayback. Further… | Aug 10 2014
7f9764b1a793 | lintool | Cleaned up servlet fetch code. | Aug 10 2014
b685d65eba7b | lintool | Fixed 14 digit date parsing issue (now uses ArchiveUtils); was an issue with… | Aug 10 2014
bc6aab1ffd21 | lintool | Fixed a few minor ingestion issues. | Aug 10 2014
b20ef84e5df9 | lintool | Refactoring; removing WARC ingestion for now. | Aug 10 2014
cfe508a831f0 | lintool | Tweaks to ingest code. | Aug 10 2014
180b57fa5dc1 | lintool | Janky, but seems to work: ingesting and serving up raw ARC records. | Aug 10 2014
8b5a6be9db61 | lintool | Quick and dirty switch over to webarchive-commons API; stores raw ARC records. | Aug 10 2014
65c6e54a948e | Jeffyrao | add UriMappingBuilder Mapreduce version | Aug 4 2014
59aa95aab857 | Jeffyrao | fix the bug of selecting webpage by date ineffective | Jul 25 2014
7175239e0751 | Jeffyrao | add Hadoop/HBase input choice for ExtractLinks and ExtractSiteLinks classes | Jul 22 2014
17f5787120af | lintool | Fixed issues #17, #25, #30, #35 | Jun 27 2014
60a9b96beebe | Milad Gholami | Merging with master. | Jun 26 2014
149ce6cf969f | Milad Gholami | Fixing git history. | Jun 26 2014
113332758f06 | lintool | Fixed issues #45, #46, #49, #50 | Jun 18 2014
3b8484a944a8 | lintool | More work on the admin interface. | Jun 18 2014
f3015cd7ba4b | lintool | issue #50 | Jun 18 2014