3 Replies Latest reply on Jun 7, 2012 11:06 AM by mbilldx

    Paper Capture Recognition Service

      I have a user who is using an HP Scanjet N8460 series scanner.
      We are trying to put our documents into PDF for storage, and so have a lot to do.
      She reported to me that out of 50 scans, 15 times, she got the following error:
      "Unable to process the page because the Paper Capture Recognition Service experienced an error"
      I did not see anything in the FAQs about it. Is this a known issue? Can someone give me some direction in resolving it?
        • 1. Re: Paper Capture Recognition Service
          (Aandi_Inston) Level 1
          This isn't using Adobe Reader. Does it use Adobe Acrobat?

          Aandi Inston
          • 2. Re: Paper Capture Recognition Service
            Level 1
            Sorry, thought I was in Acrobat area.
            My mistake.
            • 3. Re: Paper Capture Recognition Service

              First, here is a solution. It looks long, but to be honest it is simple; I just included an example in case someone did not understand the "steps" of the procedure. If you understand the 4 main steps, then there is no need to read the sub-steps or the examples:


              1.     Bookmark AND "print to PDF" ALL pages which throw the "Unable to process..." error message, found either:

                        By trial and error, or through a batch process (mentioned in sfxjhw2's post in the postscript).
                        (For trial and error, notice that Adobe Acrobat will show the page it last processed while running OCR.
                        It will be in the top left corner of the screen, but may revert to page 1 quickly after you click "ok" on the error message.)

              • For the Trial And Error approach: let's say, for example, that I notice that pg 12 gave me my first error.
                After clicking "ok" on the error message, I am back to square one, as Adobe discards all the work done once it hits this error.

                          A.     So, as a preliminary step, I would bookmark the page throwing the error (bookmark page 12) as "ERROR", then

                    • run OCR up to the page throwing the error, but not including the error page (e.g., OCR pages 1-11, but NOT 12).

                          B.     Then run OCR on the next ten or twenty pages AFTER the error page (e.g., OCR pages 13-23, but NOT 12)

                          C.     Bookmark the next page (in that group) which generates an error (if any)

                    • Repeat this process until the "whole" document is OCRed (of course pages throwing errors will not be OCRed)
                    • Full example: I ran OCR on the whole document, and pg 12 caused it to stop, and it discarded all the work it did.  So,
                      • I first bookmark pg 12 as "error", then
                      • go to "document">"OCR text recognition">"recognize text using OCR">"From page 1 to 11".  It successfully runs, then
                      • skip page 12 (it is already bookmarked as "error"), then
                      • go to "document">"OCR text recognition">"recognize text using OCR">"From page 13 to 23".  Page 20 throws an error, so next...
                      • Bookmark page 20 as "error"
                      • go to "document">"OCR text recognition">"recognize text using OCR">"From page 13 to 19".  It successfully runs, then
                      • skip page 20 (it is already bookmarked as "error"), then
                      • go to "document">"OCR text recognition">"recognize text using OCR">"From page 21 to 31".  It successfully runs, then
                      • go to "document">"OCR text recognition">"recognize text using OCR">"From page 32 to 42".  It successfully runs, then
                      • go to "document">"OCR text recognition">"recognize text using OCR">"From page 43 to 53".  Page 50 throws an error, so next...
                      • Bookmark page 50 as "error"
                      • go to "document">"OCR text recognition">"recognize text using OCR">"From page 43 to 49".  It successfully runs, then
                      • skip page 50 (it is already bookmarked as "error"), then
                      • go to "document">"OCR text recognition">"recognize text using OCR">"From page 51 to 61".  It successfully runs, then
                      • Finally, save the file  (btw you should probably be saving the file as you go)

                          D.     "Print to pdf" those pages (and only those pages) which you have bookmarked as "error"
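The trial and error bookkeeping above can be sketched in a few lines of Python. This is only an illustration, not anything Acrobat provides: the function name `ocr_ranges` is made up, and it assumes you already know (for example, from your "error" bookmarks) which pages throw the error. It just lists the "From page X to Y" ranges that skip those pages.

```python
def ocr_ranges(total_pages, error_pages):
    """Return inclusive (start, end) page ranges that skip the error pages.

    Hypothetical helper: page numbers are 1-based, as in Acrobat's
    "From page X to Y" dialog. Error pages outside 1..total_pages are ignored.
    """
    ranges = []
    start = 1
    for bad in sorted(set(error_pages)):
        if bad < 1 or bad > total_pages:
            continue
        if bad > start:
            # Everything from the current start up to the page before the error.
            ranges.append((start, bad - 1))
        # Resume on the page right after the error page.
        start = bad + 1
    if start <= total_pages:
        ranges.append((start, total_pages))
    return ranges


# The 61-page example above, with errors on pages 12, 20 and 50:
print(ocr_ranges(61, [12, 20, 50]))
# [(1, 11), (13, 19), (21, 49), (51, 61)]
```

Once the error pages are known, the ranges can be longer than the ten-page chunks used while hunting for them.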


              • For the Batch Method: if you have a document of more than a few hundred pages, there is a better way using a batch process.  Export your PDF (or print your document to PDF) with each page as an individual PDF file, and then batch process each PDF into an OCRed PDF.  You will then have to search through those pages to find out which ones have not been OCRed (I'm guessing by sorting by "date modified", or by saving the OCRed files with a new name, i.e. "filename - ocr.pdf").  See the quote below my PS, by "sfxjhw2".
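If you go the "save with a new name" route, finding the pages that were skipped can be sketched like this. It is a guess at a workflow, not an Acrobat feature: the helper name `pages_missing_ocr` is hypothetical, and it assumes every OCRed page was saved next to its original with " - ocr" added to the file name.

```python
def pages_missing_ocr(filenames, suffix=" - ocr"):
    """List original page PDFs that have no '<name><suffix>.pdf' counterpart.

    Hypothetical helper: assumes the naming scheme 'page.pdf' for the
    original and 'page - ocr.pdf' for the OCRed copy.
    """
    # Strip the ".pdf" extension so originals and OCRed copies can be paired.
    stems = {name[:-4] for name in filenames if name.lower().endswith(".pdf")}
    return sorted(
        stem + ".pdf"
        for stem in stems
        if not stem.endswith(suffix) and stem + suffix not in stems
    )


names = ["page_001.pdf", "page_001 - ocr.pdf", "page_002.pdf"]
print(pages_missing_ocr(names))
# ['page_002.pdf']
```

To run it against a real folder, pass it the folder listing, e.g. `pages_missing_ocr(os.listdir(folder))` after `import os`.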

              3.     Run OCR on those "error" pages

              4.     Drag and drop, or copy and paste, those pages into your original OCRed PDF document.


              Now for the complaint... I have seen this question posted many times in Adobe forums, just to point out a few:


              It irks me that

              1. this problem has persisted through many versions of Acrobat
              2. Adobe has failed to either fix it or post an obvious solution in the forums
              3. Adobe has not even done a better job with error handling (causing the whole program to stop the OCR instead of asking if you would like to skip the page, continue, and bookmark the page that has not been OCRed).


              Hope this helps.
              And, to Adobe: all in all a great product. I just wish the error handling was a little better, this problem was fixed, and there was better menu control to change the font on comments and the typewriter.



              Here is a great post by a person who addressed part of this problem:

              [post #10 in reply to http://forums.adobe.com/message/3765069]
              sfxjhw2, Community Member, Sep 16, 2011 2:47 PM


              Ok, I think I might have a workaround *IF* this error is only thrown on a few offending pages.  If you have only one page, OR if it is essential that all pages be Clearscanned, then this will probably not be a workaround for you.


              *FYI* this workaround also applies to the stupid "Unknown Error" that occurs in non-Clearscanned OCR.   ("Unknown Error" -- Ha, this is the type of error handling I use when I don't care about anyone using my programs, sounds like Adobe feels the same way).


              Since Acrobat spits out such a stupid error, and since it doesn't work in any other scan mode (non-clearscan in any DPI) for these pages, and since there is no log file to look at for the 'paper capture recognition service' to even diagnose the problem, and since they seem completely ok with their absolutely horrible error handling here, and since Adobe seems incapable of fixing this error for the last few versions, and since they don't care enough to offer any help in their KB regarding an issue affecting hundreds (maybe a few thousand) of users, here is a POSSIBLE workaround to clearscan your document if you get this error:


              (The following instructions are for Acrobat X; you might have to click 'show and hide panels' at the top-right of the TOOLS BOX if some of these options are not available.  Acrobat 9 has the same bug and workaround, but where you find things is different, of course.)

              1) Scan your pages OR save your pdf as INDIVIDUAL PAGES (there must be a way to export your pdf into individual pages, you might need to use the batch manager (tools->action wizard on Acrobat X))

              2)  Tools->Action Wizard->Save individual files as pdf: If you don't have individual pdf pages (you have jpegs or tiff) then convert them to PDF FIRST (yes, that's right: despite the seemingly obvious thought that a batch Clearscan would actually *gasp* batch Clearscan your files, it won't; it will only convert them to PDFs)

              3)  Tools->Recognize Text->In Multiple Files:  Now select to clearscan all your individual files

              4) You will get an error on offending pages where Clearscan would normally give you an 'error 6' BUT this method will get around the moronic error-THEN-exit design of Clearscan and allow you to actually Clearscan most of your document

              5) Create->Combine Files into Single PDF:  Once all pages have been Clearscanned you can reassemble the individual pages to the full document
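One practical snag when reassembling in step 5: if the individual pages were saved with unpadded numbers, a plain alphabetical sort puts "page_10.pdf" before "page_2.pdf". The sketch below shows one way to check the intended order before combining. The file names and the helper `natural_key` are assumptions for illustration; nothing here talks to Acrobat.

```python
import re


def natural_key(filename):
    """Sort key that compares runs of digits as numbers, not as text.

    Hypothetical helper: re.split with a capturing group alternates
    text and digit chunks, so 'page_10.pdf' becomes ['page_', 10, '.pdf'].
    """
    return [
        int(part) if part.isdigit() else part.lower()
        for part in re.split(r"(\d+)", filename)
    ]


pages = ["page_10.pdf", "page_2.pdf", "page_1.pdf"]
print(sorted(pages, key=natural_key))
# ['page_1.pdf', 'page_2.pdf', 'page_10.pdf']
```

Zero-padding the page numbers when you export (page_001, page_002, ...) avoids the problem entirely.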


              If you needed that particular page to be clearscanned, this method will not work for you.  If you just want 99% of your document to be clearscanned, then hopefully this will help.


              We have to admit that Clearscan is a one-of-a-kind technology.  The OCR scientists should be given a raise for this.  The software engineers and support staff, on the other hand, should go back to the community college they flunked out of and learn a bit more about programming.