Autarchy of the Private Cave

Tiny bits of bioinformatics, [web-]programming etc

    Archive for November 5th, 2009

    ocrodjvu: increase accessibility of your DJVU books

    5th November 2009

    ocrodjvu = OCRopus (tesseract) + DJVU

    It is a small command-line tool that converts image-only DJVU files into image+text DJVU files. In Debian testing, language packages are available for (in no particular order) German, English, French, Spanish, Vietnamese, Brazilian Portuguese, Dutch, and Italian. The original tesseract-ocr software includes training data and code, so adding more recognition languages should be easy, at least in theory.
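To illustrate the conversion described above, here is a minimal Python sketch that builds and optionally runs a single ocrodjvu call. It rests on assumptions not stated in the post: the `--language` and `--save-bundled` options are taken from later ocrodjvu manual pages and may differ in your version (check `ocrodjvu --help`), the three-letter language codes such as `deu` are likewise assumed, and the file names and helper functions are hypothetical.

```python
import shutil
import subprocess


def build_ocrodjvu_command(src, dst, language="eng"):
    """Return the argument list for one ocrodjvu run (nothing is executed).

    Assumed options: --language selects the recognition language,
    --save-bundled names the output file that receives the text layer.
    """
    return ["ocrodjvu", "--language", language, "--save-bundled", dst, src]


def add_text_layer(src, dst, language="eng"):
    """Run ocrodjvu on src and write the image+text result to dst."""
    if shutil.which("ocrodjvu") is None:
        raise RuntimeError("ocrodjvu not found on PATH")
    # check=True raises CalledProcessError if ocrodjvu exits with an error
    subprocess.run(build_ocrodjvu_command(src, dst, language), check=True)


if __name__ == "__main__":
    # Hypothetical file names; only prints the command instead of running it
    print(" ".join(build_ocrodjvu_command("book.djvu", "book-ocr.djvu", "deu")))
```

Calling `add_text_layer("book.djvu", "book-ocr.djvu", language="deu")` would then perform the actual conversion, provided ocrodjvu and the matching language package are installed. The source file is left untouched because the result goes to a separate output file.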

    Posted in Links, Software, Technologies