Autarchy of the Private Cave

Tiny bits of bioinformatics, [web-]programming etc

  • Related entries

    No related content found.

    • Archives

    • Recent comments

    ocrodjvu: increase accessibility of your DJVU books

    5th November 2009

    ocrodjvu = OCRopus (tesseract) + DJVU

    It is a small command-line tool to easily convert your image-only DJVU files into image+text DJVU files. In Debian testing, there are language packages for (in no specific order) German, English, French, Spanish, Vietnamese, Brasilian Portuguese, Dutch, and Italian. The original tesseract-ocr software includes training data & code, so it should be (at least in theory) easy to add more recognition languages.

    Share

    Leave a Reply

    XHTML: You can use these tags: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>