bibdocfile: guess_format_from_url() improvements

Authored by Samuele Kaplun <samuele.kaplun@cern.ch> on Oct 18 2012, 10:23.

Description

  • guess_format_from_url() now always returns an extension (when the format is unknown, it returns '.bin').
  • guess_format_from_url() returns the '.txt' extension for simple text files when they are recognized by the magic library.
  • guess_format_from_url() now always considers any extension present in the filename: when provided with a local path (e.g. /tmp/example.foo), it returns '.foo' even if that is not a recognized extension.
  • Refactored the code using the magic library.
  • If downloading a remote URL is necessary in order to guess the extension, always delete the temporary file at the end.
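
The behaviour listed above can be illustrated with a minimal Python sketch. It is not the bibdocfile implementation: guess_extension(), sniff_mime(), fetch(), MIME_TO_EXT and REMOTE_SCHEMES are hypothetical names, the content check is a crude stand-in for the magic library, and the order of the checks is an assumption. It also skips any handling of an extension embedded in a remote URL.

```python
import os
import shutil
import tempfile
import urllib.request
from urllib.parse import urlparse

# Hypothetical lookup table; the real mapping used by bibdocfile is not shown here.
MIME_TO_EXT = {"text/plain": ".txt", "application/pdf": ".pdf"}
REMOTE_SCHEMES = ("http", "https", "ftp")


def sniff_mime(path):
    """Crude stand-in for a magic-library lookup, based on the first bytes."""
    with open(path, "rb") as handle:
        head = handle.read(4096)
    if head.startswith(b"%PDF-"):
        return "application/pdf"
    if head and b"\x00" not in head:
        return "text/plain"
    return "application/octet-stream"


def fetch(url, dest_path):
    """Download url into dest_path (hypothetical helper)."""
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)


def guess_extension(location, sniff=sniff_mime, download=fetch):
    """Always return an extension, falling back to '.bin'."""
    if urlparse(location).scheme not in REMOTE_SCHEMES:
        # Local path: an extension in the filename wins, recognized or not.
        ext = os.path.splitext(location)[1]
        if ext:
            return ext
        if os.path.isfile(location):
            return MIME_TO_EXT.get(sniff(location), ".bin")
        return ".bin"
    # Remote URL: download to a temporary file, inspect it, always clean up.
    fd, tmp_path = tempfile.mkstemp()
    os.close(fd)
    try:
        download(location, tmp_path)
        return MIME_TO_EXT.get(sniff(tmp_path), ".bin")
    finally:
        os.remove(tmp_path)
```

The sniff and download parameters are injectable so that the content check can be swapped for a real magic-library call and the remote branch can be exercised without network access. The try/finally block is what guarantees that the temporary file is removed even when the download or the content check fails.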

Details

Committed
Tibor Simko <tibor.simko@cern.ch> on Oct 18 2012, 13:38
Parents
R3600:3323a3fb39d8: WebSearch: fix for webcoll
Branches
Unknown
Tags
Unknown

Event Timeline

Tibor Simko <tibor.simko@cern.ch> committed R3600:68487a793681: bibdocfile: guess_format_from_url() improvements (authored by Samuele Kaplun <samuele.kaplun@cern.ch>) on Oct 18 2012, 13:38.