I think the free PDF XChange Viewer (and perhaps Editor, though I haven't tried it) solves a lot of the headaches of trying to OCR PDFs that haven't been OCR'd. Here's the link: http://www.tracker-software.com/prod…xchange-viewer. You have to search around to find the Viewer because it is being replaced by Editor. Editor might be fine, but all I know is that Viewer works (and I need Viewer rather than Editor).
A while back someone was asking how to deal with PDFs that don't let you highlight text for copying, pasting, etc.
As was explained, the problem with such PDFs is that they are scanned images (pictures of pages) that have never had optical character recognition (OCR) applied, so there is no actual text in the PDF to select.
When you run OCR apps on such PDFs, their accuracy at interpreting pictures of text is fairly good, but you usually have to watch out for a significant number of errors. Also, many OCR utilities generate a separate, new PDF or text file that usually distorts the layout of the original a little.
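If you want a quick way to guess whether a PDF is one of these picture-only files before bothering with OCR, here is a rough Python sketch. It is only a heuristic of my own, not anything PDF XChange does: it assumes that a file which embeds images but never mentions a font probably has no real text. The function name `looks_image_only` is made up for illustration, and the check can miss things in newer PDFs that compress their internal structure.

```python
import re
import sys


def looks_image_only(pdf_bytes: bytes) -> bool:
    """Guess whether a PDF is a scan with no text layer.

    Heuristic (an assumption, not a rule of the PDF format): real text
    needs a font, so a file that embeds images but never mentions /Font
    is probably picture-only.
    """
    has_font = b"/Font" in pdf_bytes
    has_image = re.search(rb"/Subtype\s*/Image", pdf_bytes) is not None
    return has_image and not has_font


# Optional command-line use: python check_pdf.py some_file.pdf
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        if looks_image_only(f.read()):
            print("probably needs OCR")
        else:
            print("no obvious sign that it needs OCR")
```

The surest test is still the one from the original question: open the PDF and see whether you can highlight any text.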
Well, I've only used it once, but the free program PDF XChange Viewer has an OCR button (I stumbled on it today). I clicked it, and it asked me whether I wanted a medium or high level of accuracy. I selected high. It took a few minutes to do the OCR, and then voilà: still working in the original PDF, I was able to highlight text, right-click, and save.
The developers are replacing PDF XChange Viewer with PDF XChange Editor. I believe Editor is still free, and it's an upgrade to Viewer. You can still find the Viewer if you look carefully for it on one of the pages of the website. I mention this because if you are a Docear user (Docear is a program that reads your XChange Viewer highlights, annotations, etc. and extracts them onto a mind map for you), you will want to use PDF XChange Viewer and not Editor. Also, I don't know a thing about Editor: I don't know whether it has the OCR button, though I would imagine it does.
Fingers crossed that the OCR is truly as accurate as it seems to be. That would be gravy!