We have about 2000 documents (reports) that were scanned to PDF. These have a text (OCR) layer, but the OCR is very bad, with breaks within most words and complete misalignment. The text layer is virtually useless. We have an immediate need to remove the OCR layer and re-create it with a better tool. I do not know which tool was used when these were scanned.
Can Acrobat perform this task? If so, can it do a batch process?
I have looked into this in the past, but struggled because the PDFs already have a text layer, so no new OCR is performed.
Thanks in advance for any insights.
Chuck
Please run OCR again on all the documents. It will create a new layer of recognized text. Please use the latest Acrobat DC for this, as it supports running OCR again on already-OCRed documents instead of giving an error.
You can also OCR multiple files at a time, either with the "In multiple files" option of Recognize Text or by creating an action.
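With around 2000 files, it may be easier to feed Acrobat the documents in smaller groups. Below is a minimal Python sketch for preparing those groups. It does not call Acrobat or perform any OCR; it only finds the PDFs under a folder and writes plain-text lists of file paths, one list per batch. The function names, the batch size of 100, and the `batch_001.txt` naming are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Iterator, List


def find_pdfs(root: Path) -> List[Path]:
    """Return every .pdf file under root (any letter case), sorted."""
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def chunk(paths: List[Path], size: int) -> Iterator[List[Path]]:
    """Yield consecutive groups of at most `size` paths."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(paths), size):
        yield paths[start:start + size]


def write_batch_lists(root: Path, out_dir: Path, size: int = 100) -> List[Path]:
    """Write one text file per batch, listing one PDF path per line.

    The batch size and file naming are illustrative assumptions,
    not anything Acrobat requires.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for number, group in enumerate(chunk(find_pdfs(root), size), start=1):
        list_file = out_dir / f"batch_{number:03d}.txt"
        list_file.write_text("\n".join(str(p) for p in group) + "\n")
        written.append(list_file)
    return written
```

For example, `write_batch_lists(Path("scans"), Path("batches"))` would list the PDFs under a hypothetical `scans` folder in groups of 100, so each group can be added to the "In multiple files" dialog or to an action in turn.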
Thanks.
Garg,
Thank you for the tip, but it doesn't seem to work on the latest Acrobat Pro DC for Mac (2018.011.20038). It still throws up the error about the PDF already having recognized text.
Is there something I missed?
Hello Ibnabouna,
Sorry for the delayed response and the inconvenience caused. You may try sanitizing the current PDF file and see if that helps.
To sanitize the PDF, you can refer to the Adobe article "Removing sensitive content from PDFs in Adobe Acrobat DC".
You may also try printing the PDF through Print to Adobe PDF.
Is it possible to share the PDF file with us? To share the file, please use the Adobe Send feature: upload the file and share the link via private message only (see "How Do I Send Private Message").
Let us know how it goes and share your findings.
Regards,
Anand Sri.
It would be great if you could share one sample file.
Also, if you have not already done so, try the "Searchable Image Exact" output style for OCR:
Tools > Enhance scan > Recognize Text > In this file > Settings (change Output style to "Searchable Image Exact") > Recognize Text
Thanks.