ocrodjvu: increase accessibility of your DJVU books
5th November 2009
ocrodjvu = OCRopus (tesseract) + DJVU
It is a small command-line tool to easily convert your image-only DJVU files into image+text DJVU files. In Debian testing, there are language packages for (in no specific order) German, English, French, Spanish, Vietnamese, Brasilian Portuguese, Dutch, and Italian. The original tesseract-ocr software includes training data & code, so it should be (at least in theory) easy to add more recognition languages.