WebSubmit: re-implementation of pdf2hocr2pdf

Authored by Juliusz Sompolski <julsomp@gmail.com> on Mar 29 2011, 17:53.

Description

  • Re-implemented pdf2hocr2pdf, fixing several shortcomings.
  • Uses the pyPdf library to place the recognized text directly under the original pages, instead of assembling a new PDF from rasterized, rotated, and deskewed images. (closes #17)
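
The underlay approach described above can be illustrated with a minimal sketch. It is not the code from this commit: the function names `underlay_text` and `underlay_text_in_pdf` and the file-path parameters are hypothetical, and it assumes a text-only PDF (one page of recognized text per scanned page, with matching page sizes) has already been produced from the hOCR output. It relies on pyPdf's `mergePage`, which draws the argument page's content on top of the receiving page, so merging the original page onto the text page leaves the text underneath.

```python
def underlay_text(original_pages, text_pages):
    """Return the text pages with each original page drawn on top of them.

    Both arguments are sequences of pyPdf-style page objects, i.e. objects
    with a mergePage(other) method. (Hypothetical helper, for illustration.)
    """
    if len(original_pages) != len(text_pages):
        raise ValueError("original and text PDFs must have the same page count")
    merged = []
    for original, text in zip(original_pages, text_pages):
        # mergePage appends the argument's content stream to the receiver's,
        # so the original scan is painted after (over) the recognized text.
        text.mergePage(original)
        merged.append(text)
    return merged


def underlay_text_in_pdf(original_path, text_path, output_path):
    """File-level wrapper; assumes pyPdf is installed (imported lazily)."""
    from pyPdf import PdfFileReader, PdfFileWriter

    with open(original_path, "rb") as orig_f, open(text_path, "rb") as text_f:
        original = PdfFileReader(orig_f)
        text = PdfFileReader(text_f)
        pages = underlay_text(
            [original.getPage(i) for i in range(original.getNumPages())],
            [text.getPage(i) for i in range(text.getNumPages())],
        )
        writer = PdfFileWriter()
        for page in pages:
            writer.addPage(page)
        # Write while the inputs are still open: pyPdf reads page data lazily.
        with open(output_path, "wb") as out_f:
            writer.write(out_f)
```

Because the original page content is reused as-is, the pages are not re-rasterized, which is the point of the second bullet; the resulting page takes its size from the text page, hence the assumption that both PDFs share page dimensions.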

Event Timeline

Samuele Kaplun <samuele.kaplun@cern.ch> committed R3600:89d10ba4f102: WebSubmit: re-implementation of pdf2hocr2pdf (authored by Juliusz Sompolski <julsomp@gmail.com>) on Mar 31 2011, 16:03.