Mind Dump, Tech And Life Blog
written by Ivan Alenko
published under license Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)
posted in category Creativity / Documents
posted at 24. Aug '21

How To Pack A Series Of Images Into DjVu + OCR

Useful for scanned books. Use Tesseract 4 as the OCR engine; Tesseract 3 does not give good results.

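# Convert each JPEG scan to a bitonal PBM for cjb2
for i in *.jpg; do convert "$i" "$i.pbm"; done
# Encode each PBM as a single-page DjVu file
for i in *.pbm; do cjb2 -clean "$i" "$i.djvu"; done
# Bundle the pages into one document, in the order the glob expands
djvm -c secretbook.djvu *.pbm.djvu
# Run OCR and write the text layer back into the same file
ocrodjvu --engine=tesseract --in-place secretbook.djvu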
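
One thing to watch with the commands above: the shell expands *.jpg in lexical order, so page10.jpg sorts before page2.jpg and the pages can end up shuffled in the final document. Here is a minimal sketch of one way around that, assuming the scans are named with a trailing page number (page1.jpg, page2.jpg, ...). The naming scheme and the pad_pages helper are my own assumptions for illustration, not something the DjVu tools require.

```shell
# pad_pages: hypothetical helper, not part of any DjVu tool.
# Renames files like page2.jpg to page0002.jpg in the current directory,
# so that lexical glob order matches the numeric page order.
# Assumes each file name ends in digits right before ".jpg".
pad_pages() {
    for f in *[0-9].jpg; do
        [ -e "$f" ] || continue                  # glob matched nothing
        base=${f%.jpg}                           # name without extension
        # Trailing run of digits, e.g. "12" from "page12"
        digits=$(printf '%s\n' "$base" | sed 's/.*[^0-9]//')
        prefix=${base%"$digits"}                 # e.g. "page"
        # Drop leading zeros so printf does not read the number as octal
        n=$(printf '%s\n' "$digits" | sed 's/^0*\([0-9]\)/\1/')
        new=$(printf '%s%04d.jpg' "$prefix" "$n")
        # Skip files that are already padded or would overwrite another file
        [ -e "$new" ] || mv -- "$f" "$new"
    done
}
```

Define the function in the shell and run pad_pages once in the scan directory before the first conversion loop. Four digits is an arbitrary width that covers books of up to 9999 pages.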
