History Graph
Commit | Author | Details | Committed
---|---|---|---
f972206db516 | lintool | Created warcbase-core module. | Jun 16 2016
b5fd283dc492 | Jeremy Wiebe | Added keepContent() and discardContent() methods to RecordRDD | May 13 2016
b7a81a3afff4 | ianmilligan1 | changing case errors in ExtractCrawlDate, checking with TravisCI | Apr 21 2016
fc0d11495cf1 | ianmilligan1 | renaming ExtractTopLevelDomain to ExtractDomain | Mar 29 2016
8a4c55019413 | Jeremy Wiebe | Added discardUrlPatterns | Feb 16 2016
bf682d8e8efc | Jeremy Wiebe | Added keepUrlPatterns | Feb 16 2016
a7b0e0b07682 | Jeremy Wiebe | Merge branch 'format' for ExtractDate (#154) and TupleFormatter | Feb 13 2016
30a5e5d8d585 | Jeremy Wiebe | Added keepLanguages RDD filter | Feb 4 2016
4042cac0ad8b | Alice-Z | add method to filter date by component | Dec 25 2015
2adce498927d | Alice-Z | Refactor Record API (#189) | Dec 10 2015
2e88c1b19afb | lintool | Slapped Apache License boilerplate -- now we're a *real* open-source project :) | Nov 25 2015
14e521794754 | Alice-Z | Clean up, fix tests changed by new keepValidPages | Nov 21 2015
2fd98dbfdb67 | Alice-Z | Make keepValidPages smarter as in issue #163 | Nov 19 2015
a599db1f45b9 | Alice-Z | add documentation | Nov 11 2015
1ee0455efdf6 | Alice-Z | remove extract methods | Nov 11 2015
2c2607867ef1 | Alice-Z | Add keepValidPages transformation and layer for counting in Spark | Nov 11 2015
e1be481cd782 | Alice-Z | Clean up extracting code, use pattern matching | Nov 10 2015
3957b2cce7fb | Alice-Z | Clean up enum to function mapping | Nov 9 2015
8057c46945d0 | Alice-Z | Add Spark support | Nov 3 2015