• ☆ Yσɠƚԋσʂ ☆OP
    1 month ago

    The accuracy depends on the quality of the source image, but it tends to do pretty well even with compressed ones. Doing OCR on a whole book might be a bit slow, but it could be worth running a few pages through to see what the output looks like. You could definitely use crush to make a script that would feed a PDF through deepseek-ocr and output formatted text. You'd probably have to stream it through a few pages at a time.
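
For the "few pages at a time" part, here is a rough sketch of the batching logic in Python. It only covers splitting the book into page ranges and collecting the text. The `ocr_range` callable is a hypothetical placeholder for however you end up invoking deepseek-ocr on a range of pages, and the names `page_batches` and `ocr_in_batches` are made up for illustration, not part of any tool.

```python
from typing import Callable, Iterator, Tuple


def page_batches(total_pages: int, batch_size: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive, 1-based (first, last) page ranges covering the document."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for first in range(1, total_pages + 1, batch_size):
        # The final batch may be shorter than batch_size.
        yield first, min(first + batch_size - 1, total_pages)


def ocr_in_batches(
    total_pages: int,
    batch_size: int,
    ocr_range: Callable[[int, int], str],
) -> Iterator[str]:
    """Run OCR one batch at a time and yield each batch's text as it finishes.

    ocr_range is an assumption: a function you supply that takes a
    (first, last) page range and returns the recognized text for it.
    """
    for first, last in page_batches(total_pages, batch_size):
        yield ocr_range(first, last)
```

Because `ocr_in_batches` is a generator, you can write each chunk of text to the output file as soon as it comes back instead of holding the whole book in memory, and you can stop after the first batch to check the quality before committing to the full run.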