1 Reply Latest reply: Jun 28, 2014 7:55 PM by CtDave

    When I OCR two versions of the same document and then compare the documents in Acrobat Pro XI, I usually get the message that there are no changes to mark. However, I know there are quite a few changes. I raised this question more than a year ago

    Jerry_Legal Community Member

      When I OCR two versions of the same document and then compare the documents in Acrobat Pro XI, I usually get the message that there are no changes to mark. However, I know there are quite a few changes. I raised this question more than a year ago, and the response I received had to do with the quality of the OCR and the scans of the documents. However, if I use Acrobat Pro XI to save the same documents as Word files and then run a comparison in Word, all of the changes are marked. When a PDF is saved as a Word document in Acrobat Pro XI, is a different OCR module being used than the one Acrobat Pro XI uses for text recognition?

        • 1. Re: When I OCR two versions of the same document and then compare the documents in Acrobat Pro XI, I usually get the message that there are no changes to mark. However, I know there are quite a few changes. I raised this question more than a year
          CtDave CommunityMVP

          OCR is only for recognition of the image / picture of text provided by a scanner.
          Content typed into a Word file that is converted to a PDF is (in Word and in PDF) *not* an image or picture of text; it is the digital text. So, no OCR is involved.

          When the "digital" (renderable) text of a PDF's page content is exported to Word, no OCR is involved.

          When a PDF's content comes from the image output of a scanner, it is a picture of text, and that is where OCR comes into play.

          If this content is exported to Word before doing OCR, then it is the image that is exported to the Word file.

          Once OCR is performed, it is the OCR output that is exported.
          OCR output is (and always will be) affected by "the quality of the OCR and the scans of the documents".
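
To illustrate why scan and OCR quality matter for any later comparison, here is a minimal Python sketch. It says nothing about how Acrobat works internally; the two sample strings and the `similarity` helper are made up for illustration. It only shows that a few typical OCR misreads are enough to make identical source text look different to a comparison tool.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0.0 and 1.0 for two strings."""
    return SequenceMatcher(None, a, b).ratio()


# The sentence as it was typed (hypothetical sample text).
clean = "The tenant shall pay rent on the first day of each month."

# Hypothetical OCR output of the same sentence with common misreads
# ("ll" read as "11", "i" read as "l", "m" read as "rn").
noisy = "The tenant sha11 pay rent on the flrst day of each rnonth."

# Identical text scores 1.0; the OCR'd copy scores lower even though
# nothing in the underlying document changed.
print(f"clean vs clean: {similarity(clean, clean):.2f}")
print(f"clean vs noisy: {similarity(clean, noisy):.2f}")
```

Two scans of the same page can each pick up different misreads, so noise of this kind ends up mixed in with the real edits.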

          Regardless, "Compare" is based on a Word file output to PDF1, then edits to the Word file, followed by an output to PDF2. You then use Acrobat Pro to compare PDF1 and PDF2.
          Paper 1 scanned to image 1, image 1 placed in PDF1, which gets OCR 1, and
          Paper 2 scanned to image 2, image 2 placed in PDF2, which gets OCR 2,
          being processed with Acrobat Pro's Compare can certainly be done.
          But, well, you've described what can be observed.
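
For anyone who wants to check the OCR text of two scans outside of Acrobat, a rough sketch of a word-level comparison in Python follows. It assumes the text of each OCR'd PDF has already been extracted into a plain string (that step is not shown), and `normalize` and `changed_words` are hypothetical helper names, not part of any Adobe tool. Whitespace and case are normalized first so that line-wrap differences between the two scans are not reported as changes.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into words on any whitespace."""
    return text.lower().split()


def changed_words(old: str, new: str) -> list[str]:
    """Return '-word' for words only in old and '+word' for words only in new."""
    a, b = normalize(old), normalize(new)
    changes = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(None, a, b).get_opcodes():
        if tag in ("delete", "replace"):
            changes += ["-" + w for w in a[i1:i2]]
        if tag in ("insert", "replace"):
            changes += ["+" + w for w in b[j1:j2]]
    return changes


# Hypothetical OCR text from two scans; spacing and line breaks differ.
ocr1 = "Payment is due within 30 days  of the invoice date."
ocr2 = "Payment is due within 45 days of the\ninvoice date."

print(changed_words(ocr1, ocr2))
```

Any OCR misread in either scan will also show up in this list, which is why the scan quality still matters.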


          Be well...