OCR Solution: PDF XChange Viewer (and maybe Editor, too?)

Short version: 

I think the free PDF XChange Viewer (and perhaps Editor; I haven't tried it) solves a lot of the headaches of adding OCR to PDFs that lack it. Here's the link: http://www.tracker-software.com/prod…xchange-viewer. You have to search around to find the Viewer because it is being replaced by Editor. Editor might be fine, but I only know that Viewer works (and I need Viewer rather than Editor).


Detailed version:

A while back someone was asking how to deal with PDFs that don't let you highlight text for copying, pasting, etc.

As was explained, the problem with such PDFs is that they are scanned images (pictures) of pages with no character recognition (i.e., OCR) applied, so the file contains no actual text to select.

When you run OCR apps on such PDFs, their accuracy at interpreting pictures of text is fairly good, but you usually have to watch out for a significant number of errors. Also, many OCR utilities generate a separate, new PDF or text file that slightly distorts the layout of the original.

Well, I've only used it once, but the free program PDF XChange Viewer has an OCR button (I noticed it by accident today). I clicked it. It asked me whether I wanted a medium or high level of accuracy. I selected high. It took a few minutes to run the OCR, and then voilà: still working in the original PDF, I was able to highlight text, right-click, and save.

Link: http://www.tracker-software.com/prod…xchange-viewer

The developers are replacing PDF XChange Viewer with PDF XChange Editor. I believe Editor is still free, and it is meant as an upgrade to Viewer. You can still find the Viewer if you look carefully on one of the pages of the website. I mention this because if you are a Docear user (Docear is a program that reads your XChange Viewer highlights, annotations, etc. and extracts them onto a mind map for you), you will want to use PDF XChange Viewer and not Editor. Also, I don't know a thing about Editor: I don't know whether it has the OCR button, though I would imagine it does.

Fingers crossed that the OCR is truly as accurate as it seems to be. That would be gravy!
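
For anyone who would rather not rely on crossed fingers, here is a rough sketch of one way to spot-check accuracy: type out a short passage from the scan by hand, copy the same passage from the OCR'd PDF, and compare the two. This is not anything PDF XChange does itself. It only uses Python's standard difflib module, and the function name and sample strings below are made up for illustration.

```python
import difflib


def ocr_similarity(reference: str, ocr_text: str) -> float:
    """Return a 0..1 similarity ratio between hand-typed text and OCR output."""
    # Collapse runs of whitespace so that line-wrapping differences
    # in the copied text are not counted as OCR errors.
    ref = " ".join(reference.split())
    ocr = " ".join(ocr_text.split())
    return difflib.SequenceMatcher(None, ref, ocr).ratio()


# Hypothetical sample: the OCR misread the "i" in "quick" as the digit "1".
reference = "The quick brown fox jumps over the lazy dog."
ocr_text = "The qu1ck brown fox jumps\nover the lazy dog."

score = ocr_similarity(reference, ocr_text)
print(f"similarity: {score:.3f}")
```

A ratio close to 1.0 on a few sample passages suggests the OCR layer can probably be trusted for copying and searching; a noticeably lower ratio means the text should be proofread before you rely on it.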




One thought on “OCR Solution: PDF XChange Viewer (and maybe Editor, too?)”

  1. Thanks for the recommendation. PDF XChange Viewer is the best free PDF viewer/editor I've used so far; it has replaced Foxit PDF as my default reader.

    One issue I have is with using the OCR'd PDFs in Mendeley: I can't select text properly. Whenever I try to select text, it selects the complete row. I've tried it with two scanned and OCR'd PDFs and got the same result.

    I'm reading through your blog and trying to replicate your workflow. Could you write an updated article about your workflow? I'm especially interested in your use of Docear and OneNote. You've mentioned OneNote, but I couldn't find details on how you're using it.

    Thanks for your great site.

