History Graph
Commit | Author | Details | Committed
---|---|---|---
d58afca206fc | lintool | Refactoring packge org.warcbase.demo; added Jwat prefix to Hadoop InputFormats… | Aug 16 2014
b6431bed6f8b | lintool | Minor tweaks. | Aug 14 2014
a14cda2217fa | lintool | Commented out LibmagicJnaWrapper functionality because the jar isn't generally… | Aug 12 2014
db4ebe9f4826 | pmd | Added a null pointer check and a more Pig friendly return value from the UDFs | Dec 19 2013
ba201c27e210 | pmd | Refactored the configuration of the magic lib into the Pig script. | Dec 19 2013
5d312a3f80ec | pmd | Removed warnings | Dec 19 2013
b3fdd488fdcb | pmd | Refactored the DetectMimeType into two seperate methods: one for each detection… | Dec 19 2013
d6c0cec7efb4 | pmd | Added a TODO comment | Dec 11 2013
259f0057f400 | pmd | Add identification engine as a parameter to the DetectMimeType UDF | Dec 9 2013
b69563d53c83 | pmd | Enable the ArcLoader to load all types of files | Dec 9 2013
1a209cf5b04a | pmd | First version of a magic lib UDF | Dec 4 2013
ccb9ea44f90a | cneud | use tika for mime type detection | Dec 3 2013
fbf902fcd066 | cneud | use tika for language detection | Dec 2 2013
eb848477370a | lintool | Added WarcLoader | Dec 2 2013
1c57edb9e225 | lintool | Added ExtractRawText UDF, tweaked ExtractLinks. | Dec 2 2013
1fd881ddf836 | lintool | Loader now materializes actual text, added ExtractLinks UDF. | Dec 2 2013
02317746e1b2 | lintool | Added simple Pig Loader for Arc files, returns (url, time, mime) currently. | Dec 2 2013