
Replace or Repair OCR in scanned documents

New Here, Jan 18, 2018

We have about 2,000 documents (reports) that were scanned to PDF. They have a text (OCR) layer, but the OCR is very bad: most words are broken up and the text is completely misaligned, so the text layer is virtually useless. We have an immediate need to remove the OCR layer and re-create it with a better tool. I do not know which tool was used when the documents were scanned.

Can Acrobat perform this task? If so, can it do a batch process?

I have looked into this before and got stuck because the PDFs already have a text layer, so no new OCR is performed.

Thanks in advance for any insights.

Chuck

TOPICS
Scan documents and OCR

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Adobe Employee, Mar 06, 2018

Please run OCR again on all the documents; this creates a new layer of recognized text. Use the latest Acrobat DC, because it supports re-running OCR on documents that have already been OCRed instead of giving an error.

You can also OCR multiple files at once, either with the "In multiple files" option of Recognize Text or by creating an action.

Thanks.
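
For anyone who ends up scripting the batch step outside Acrobat, here is a minimal sketch of the same idea: walk a folder of scanned PDFs and prepare one re-OCR command per file. It assumes the open-source OCRmyPDF command-line tool, which is not mentioned anywhere in this thread, and its `--redo-ocr` and `--force-ocr` options. The function name `build_reocr_commands` and the folder names are hypothetical. The sketch only builds the commands and does not run them, so the list can be reviewed first.

```python
from pathlib import Path

# Assumption: OCRmyPDF is the replacement OCR tool (not part of Acrobat).
# --redo-ocr tries to strip an existing OCR text layer and recognize again;
# --force-ocr rasterizes each page and runs OCR regardless of existing text.
ALLOWED_MODES = {"--redo-ocr", "--force-ocr"}


def build_reocr_commands(src_dir, dst_dir, mode="--redo-ocr"):
    """Return one OCRmyPDF argument list per PDF found under src_dir.

    The folder layout under src_dir is mirrored under dst_dir.
    Nothing is executed here; the caller decides how to run the commands.
    """
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unsupported mode: {mode}")

    src = Path(src_dir)
    dst = Path(dst_dir)
    commands = []
    for path in sorted(src.rglob("*")):
        # Match .pdf and .PDF alike; skip directories and other file types.
        if path.is_file() and path.suffix.lower() == ".pdf":
            out = dst / path.relative_to(src)
            commands.append(["ocrmypdf", mode, str(path), str(out)])
    return commands
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` after creating the output folder with `Path(cmd[-1]).parent.mkdir(parents=True, exist_ok=True)`. It is worth trying this on a handful of files before committing all 2,000, and keeping the destination folder outside the source folder so outputs are not picked up as inputs.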

Explorer, Mar 28, 2018

Garg,

Thank you for the tip, but it doesn't seem to work in the latest Acrobat Pro DC for Mac (2018.011.20038). It still shows the error saying that the PDF already has recognized text.

Is there something I missed?

Adobe Employee, Apr 27, 2018

Hello Ibnabouna,

Sorry for the delayed response and the inconvenience caused. You may try sanitizing the current PDF file to see if that helps.

To sanitize the PDF, refer to the Adobe article "Removing sensitive content from PDFs in Adobe Acrobat DC".

You may also try printing the PDF through Print to Adobe PDF.

Is it possible to share the PDF file with us? To share it, please upload the file with the Adobe Send feature and share the link by private message only (see "How Do I Send Private Message").

Let us know how it goes and share your findings.

Regards,

Anand Sri.

Adobe Employee, Apr 30, 2018

It would be great if you could share one sample file.

Also, if you have not already done so, try the "Searchable Image (Exact)" output style for OCR:

Tools > Enhance Scans > Recognize Text > In This File > Settings (change Output style to "Searchable Image (Exact)") > Recognize Text

Thanks.
