Yes, it is inspired by the post about text in videos. It reminded me that I have been looking for such a tool for PDFs. Yandex advertises this; however, it has not been very helpful in my experience.

  • loathesome dongeater
    1 year ago

    If you want OCR, you can use tesseract-ocr. If you want to extract the actual text from a PDF, you can use something like pdftotext from the Poppler tools, but you will have to fix the formatting a lot.
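
For anyone who wants to script the two tools mentioned above, here is a rough Python sketch. It assumes the `pdftotext` (Poppler) and `tesseract` binaries are installed and on the PATH. The helper names `pdftotext_command`, `tesseract_command`, and `run_tool` are made up for illustration. Note that tesseract reads images rather than PDFs, so a scanned PDF would first need to be rendered to images (for example with Poppler's `pdftoppm`).

```python
import shutil
import subprocess


def pdftotext_command(pdf_path, keep_layout=True):
    """Build the argument list for Poppler's pdftotext.

    '-layout' asks pdftotext to preserve the physical page layout,
    and a final '-' sends the extracted text to stdout.
    """
    cmd = ["pdftotext"]
    if keep_layout:
        cmd.append("-layout")
    cmd += [str(pdf_path), "-"]
    return cmd


def tesseract_command(image_path, lang="eng"):
    """Build the argument list for tesseract on a single image.

    Using 'stdout' as the output base makes tesseract print the text
    instead of writing a .txt file. 'eng' is an assumed default language.
    """
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def run_tool(cmd):
    """Run a command and return its stdout, or None if the tool is missing."""
    if shutil.which(cmd[0]) is None:
        return None  # binary not installed or not on PATH
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # 'document.pdf' and 'page.png' are placeholder file names.
    print(pdftotext_command("document.pdf"))
    print(tesseract_command("page.png"))
```

Calling `run_tool(pdftotext_command("document.pdf"))` would return the extracted text as a string when Poppler is installed. The output will likely still need manual cleanup, as noted above.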

    • @Lemmy_Mouse (OP)
      1 year ago

      Gotcha, thanks. I’m going to look into this.

    • Makan ☭ CPUSA
      1 year ago

      Anything with good formatting is fine in my book.

      Or at least one that gets the words right.

    • Makan ☭ CPUSA
      1 year ago

      That’s what I’m thinking as well; there are lots of browser add-ons and extensions out there too.