Goading Acrobat Pro DC (Mac) into OCR in Edit mode

New Here, Dec 23, 2016

  Acrobat Pro DC for Mac 2015 Release (15.006.30244) frequently fails to start OCR when I enter Edit PDF mode.  Nothing happens, and the scanned image is not processed, even though the text in the image is quite clear.

   I've discovered that if I select the image, flip it upside down and then back again, and then click Add Text followed by Edit, Acrobat suddenly starts processing the image and (accurately) performing OCR on it.

   This appears to be an obvious bug.

   I can supply a test file that displays this behavior reproducibly.  I'm running El Capitan (OS X 10.11.6) on a mid-2015 Retina MacBook Pro.

TOPICS
Edit and convert PDFs

Community Expert, Dec 23, 2016

Please report the bug here: Feature Request/Bug Report Form

The Acrobat Forums are mainly used by users of Adobe's applications, so it's not the best way to make Adobe aware of bugs. I would upload the test file to the Document Cloud, and then share the link to the file in the bug report. Here are some instructions about sharing a file via DC: Share Documents via Adobe's Document Cloud - KHKonsulting LLC

New Here, Dec 23, 2016

Dear Karl,

  Thanks for your helpful suggestions.  I submitted a forum post because the Adobe web site said that this was my only support option.  Also, I expect that other users might benefit from my little workaround.
   The file link is: Shared Files - Acrobat.com

Tom

Community Expert, Dec 29, 2016

Tom,

this file contains "renderable text". You can see that when you try to start OCR via the Tools>Enhance Scans>Recognize Text function. You will end up with this error message:

2016-12-29_15-34-49.png

"Renderable text" means normal text, text that is not part of an image. Acrobat for whatever reason does not want to OCR a page that already contains "normal" text. You can see this "text" when you bring up the "Contents" pane on the left side:

2016-12-29_15-36-41.png

In your case, when you expand this text element, you will see that it's actually empty. This means that you can easily remove this text element without changing the document. You do this by highlighting the "Text" element, and then using the delete key, or by right-clicking and selecting to delete from the menu. Once you do that, the OCR function should be executed without a problem when you try to edit the document (at least it was here, several times in a row).
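For anyone who wants to check a file for this kind of "ghost" text element without clicking through the Contents pane, here is a rough Python sketch. It assumes that the empty "Text" element corresponds to a `BT ... ET` text object in the page's content stream that contains no text-showing operator (`Tj`, `TJ`, `'` or `"`); that is my reading of the PDF format, not something Adobe has confirmed for this file. The function name `empty_text_objects` is made up for illustration, and you would need a PDF library of your choice to get the decoded content stream bytes in the first place.

```python
import re

# Operators that actually paint text, per the PDF specification.
SHOW_OPS = {b"Tj", b"TJ", b"'", b'"'}

# Very small tokenizer: string literals are matched as single tokens so that
# a "BT" or "ET" appearing inside shown text is not mistaken for an operator.
# Simplifications (assumptions): no nested parentheses in strings, and no
# inline images (BI ... ID ... EI) with binary data in the stream.
TOKEN = re.compile(
    rb"\((?:\\.|[^\\()])*\)"   # literal string, e.g. (Hello)
    rb"|<[0-9A-Fa-f\s]*>"      # hex string, e.g. <48656C6C6F>
    rb"|[^\s()<>\[\]]+",       # any other token (numbers, names, operators)
    re.DOTALL,
)


def empty_text_objects(content: bytes) -> int:
    """Count BT ... ET blocks that contain no text-showing operator."""
    empty = 0
    in_text = False
    shows_text = False
    for match in TOKEN.finditer(content):
        token = match.group()
        if token == b"BT":
            in_text, shows_text = True, False
        elif token == b"ET" and in_text:
            if not shows_text:
                empty += 1
            in_text = False
        elif in_text and token in SHOW_OPS:
            shows_text = True
    return empty


if __name__ == "__main__":
    # A scanned page: one image plus a text object that shows nothing.
    scanned = b"q 612 0 0 792 0 0 cm /Im0 Do Q BT /F1 12 Tf ET"
    print(empty_text_objects(scanned))
```

A result greater than zero on a scanned page would suggest the same situation as in Tom's file: a text object that Acrobat counts as renderable text even though it draws nothing.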

New Here, Dec 29, 2016

Dear Karl,

  Thanks for your detailed and insightful analysis.  First, it's curious (or, rather, problematic) that entering Edit PDF mode does not engender the "This page contains renderable text" error, as one would expect; instead OCR just stalls out without any message.  (Yes, we shouldn't enter into the absurdity that an OCR program would choke when a page contains - horrors! - actual text!)  Second, one wonders why Acrobat has created this useless and debilitating empty text element while scanning.  Third, my silly (but effective) workaround apparently deletes this ghost text element when flipping the selected image (you can watch it disappear from the Content panel in real time).  Fourth, doing so does NOT trigger another OCR attempt as it does when actually deleting the text element (following your advice); I'm forced to click Add Text and then Edit.

  In short, Acrobat is clearly buggy in a number of respects.  I submitted a bug report.

  Thanks again for all your help!

  Happy New Year,

Tom

Community Expert, Dec 30, 2016

Tom, only Adobe can explain why you cannot OCR a page when it already contains renderable text.

Here is how I see Acrobat's OCR functionality: It's a great way to OCR simple documents, the results are pretty good (and have improved over the years), and it's built right into an application that does a ton of other stuff. However, for more challenging OCR jobs, I keep a copy of ABBYY's FineReader around (e.g. more than one language, languages not supported by Acrobat, and of course pages that already contain renderable text).
