bublitzd206

OCR not recognizing text. Can I add manually?

Jun 26, 2012 4:25 PM

Tags: #acrobat #ocr #scanning #ocr_not_working

I work for an academic library and we are in the process of digitizing past issues of our alumni magazine, which are then uploaded to our Digital Commons repository. We want these to be searchable PDFs, but are running into some issues with the OCR in Acrobat. We’ve been going through all sorts of forums and Google searches trying to fix these issues, but so far have not had much luck, so I’m posting here in the hope that someone will have some tips.

 

  • Sometimes the OCR doesn’t recognize text at all, even when there is no obvious reason why it shouldn’t. I understand when it doesn’t recognize text that is set over a background image, but it often happens when it’s just black text on a white background. For example, there are usually several images on a page which have captions. Sometimes the OCR will capture these, but often it will completely ignore a caption that is right next to another one which has been identified as text.
    • Is there any way to manually enter OCR information into Acrobat? Even though the program isn’t automatically recognizing the text, we’d settle for being able to enter it ourselves so it comes up in a search. The failures happen in the same spots over and over, even if we rescan the page and start from scratch.

 

 

We’re scanning and running the OCR on high-resolution images, so it’s most likely not the resolution that is getting in the way. I’ve tried all three OCR settings (Searchable Image, Searchable Image (Exact), ClearScan) and get the same results with each.

 

Anyone have any suggestions?

 

Thanks!

Dana

 

Running Acrobat X Pro on Windows 7

Process: Scan to TIFF, edit TIFF in Photoshop, compile TIFFs into one PDF in Acrobat, run OCR/touch up reading order.

 
Replies
  • Jul 2, 2012 11:34 AM   in reply to bublitzd206

    Do you have OCR turned off under Edit > Preferences > Convert to PDF > TIFF when you select the Edit Settings... button? Also, which compression settings do you have turned on under this same preference?

     
  • Jul 8, 2012 6:14 AM   in reply to LoriAUC

    I've been battling this exact same problem for 2 years. I typically print HTML to PDF and am lucky if 50% of those have OCR applied. Adobe never has anything more than robotic, completely unhelpful suggestions. It's just poorly written, buggy software. I'm considering making a move to Nuance's OCR and PDF software. If you value your time you'll do the same. Trust me - I've wasted countless hours on the phone/chatting w/ Adobe support.

     
  • Jul 9, 2012 4:38 AM   in reply to bublitzd206

    At the end of this tutorial, titled "Scanning and OCR: Beyond the basics with Acrobat 9", there is a suggestion for adding a new text layer to a scanned PDF, if you're interested. It applies to both Acrobat Pro 9 and X.
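
For anyone curious what a hidden, searchable text layer amounts to inside the file: PDF has a text rendering mode 3 ("neither fill nor stroke") that draws nothing on the page but still leaves the text in the content stream. The Python sketch below only builds the operators for one invisible string. It is not the tutorial's method and not a description of how Acrobat does it. The function names, the font resource name `F1`, the coordinates, and the caption text are all made up for illustration, and it only handles plain Latin text. Getting these operators into a real PDF would still need a PDF library or Acrobat itself.

```python
def escape_pdf_string(text: str) -> str:
    """Escape backslashes and parentheses for a PDF literal string."""
    return (
        text.replace("\\", "\\\\")
        .replace("(", "\\(")
        .replace(")", "\\)")
    )


def invisible_text_ops(text: str, x: float, y: float,
                       size: float = 10.0, font: str = "F1") -> str:
    """Return content-stream operators that place `text` invisibly at (x, y).

    Coordinates are in PDF points (1/72 inch) from the bottom-left corner.
    `font` is a hypothetical resource name the page would have to define.
    """
    return "\n".join([
        "BT",                               # begin text object
        f"/{font} {size:g} Tf",             # select font and size
        "3 Tr",                             # render mode 3: neither fill nor stroke
        f"1 0 0 1 {x:g} {y:g} Tm",          # position the text on the page
        f"({escape_pdf_string(text)}) Tj",  # show the (invisible) string
        "ET",                               # end text object
    ])


# Hypothetical caption that OCR missed, placed roughly under a photo.
ops = invisible_text_ops("Homecoming 1974 (photo caption)", 72, 540)
print(ops)
```

The point is just that the manual text does not have to be visible to be found by a search; it only has to sit in the page content at about the right position.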

     