4 Replies Latest reply on Sep 19, 2013 11:57 AM by Corgano

    Add Words / text to page

    Corgano

      I'm indexing a large batch of scanned documents using JavaScript. Usually:

      1     The documents are scanned

      2     The OCR is run

      3     I get the text using a JS.getPageNthWord() loop

      4     Run regex to find document number

      4b       If this fails, I prompt the user to input the document number

      5     The pages are extracted sorted by document number

       

      This works well. The problem is that when the document number is not found, the script asks the user for it. If I run the script on the same file again, it has to ask the user for the document number AGAIN, every single time the script is run. Is there a way to add words / text to the top of the page that JS.getPageNthWord() can read? Something to the effect of:

       

      1     The documents are scanned

      2     The OCR is run

      3     I get the text using a JS.getPageNthWord() loop

      4     Run regex to find document number

      4b       If this fails, I prompt the user to input the document number

      4c       Add the document number to the top of the page so next time JS.getPageNthWord() can read it

      5     The pages are extracted sorted by document number

       

      Any help would be appreciated.
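
For reference, steps 3 and 4 above can be sketched in Acrobat JavaScript roughly as follows. This is only an illustration: getPageNumWords and getPageNthWord are the Doc methods used for the word loop, but the function names and the six-digit pattern are assumptions, since the real document number format is not given in this thread.

```javascript
// Hypothetical pattern: a standalone six-digit number. Replace with the real format.
var DOC_NUMBER_PATTERN = /\b(\d{6})\b/;

// Step 3: collect every word on a page into one string.
// `doc` is an Acrobat Doc object (anything exposing getPageNumWords / getPageNthWord).
function getPageText(doc, nPage) {
  var words = [];
  var count = doc.getPageNumWords(nPage);
  for (var i = 0; i < count; i++) {
    // Third argument true asks Acrobat to strip punctuation and whitespace.
    words.push(doc.getPageNthWord(nPage, i, true));
  }
  return words.join(" ");
}

// Step 4: run the regex; null signals the caller to prompt the user (step 4b).
function findDocumentNumber(doc, nPage) {
  var match = DOC_NUMBER_PATTERN.exec(getPageText(doc, nPage));
  return match ? match[1] : null;
}
```

A null result is the case the question is about: the user is prompted, and the answer currently has nowhere to be stored in the file.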

        • 1. Re: Add Words / text to page
          George_Johnson MVP & Adobe Community Professional

          You cannot add text to a page directly, but you can add annotations (form fields [doc.addField], text comments [doc.addAnnot]) and then flatten them. Flattening converts the text in the annotations to regular page contents and can be done with the doc.flattenPages JavaScript method.
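
A minimal sketch of the annotate-then-flatten approach described above, written as Acrobat JavaScript. The function name stampDocumentNumber and the placement margins are assumptions for illustration. Note that in Acrobat JavaScript, addAnnot takes a single object literal of properties rather than positional arguments. The sketch also assumes an unrotated page, so the crop box from getPageBox can be used directly for the annotation rectangle.

```javascript
// Adds a FreeText annotation holding the document number near the top of the page,
// then flattens that page so the text becomes part of the page contents.
// `doc` is an Acrobat Doc object; the name and margins here are hypothetical.
function stampDocumentNumber(doc, nPage, docNumber) {
  // getPageBox returns [left, top, right, bottom].
  var box = doc.getPageBox("Crop", nPage);

  // Annotation rect is [lower-left x, lower-left y, upper-right x, upper-right y]:
  // a 25-point-high strip, 25 points below the top edge, with 25-point side margins.
  var rect = [box[0] + 25, box[1] - 50, box[2] - 25, box[1] - 25];

  doc.addAnnot({
    page: nPage,
    type: "FreeText",
    rect: rect,
    author: "Automated",
    contents: docNumber
  });

  // Flatten only this page.
  doc.flattenPages(nPage, nPage);
  return rect;
}
```

Inside Acrobat this could be called as stampDocumentNumber(this, 0, "123456"). Whether the flattened text is then returned by getPageNthWord should be checked on a test file.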

          • 2. Re: Add Words / text to page
            Corgano Level 1

            I've looked at the help file examples and converted this script, http://forums.adobe.com/thread/448868, but I cannot get the annotations to show up.

             

            Ideally, I'd want them to be invisible, but when I get all the text from the page, the annotation text is not in there either. Why is it failing to make the annot?

            • 3. Re: Add Words / text to page
              George_Johnson MVP & Adobe Community Professional

              Can you post the JavaScript that you're using? Are you doing this from an external program?

              • 4. Re: Add Words / text to page
                Corgano Level 1

                AutoIt, which is very similar to Visual Basic.

                 

                 

                $oAcro = ObjCreate("acroexch.app")
                $oPDF = ObjCreate("AcroExch.PDDoc")

                $File = "pathto\test.pdf"

                $oPDF.Open($File)
                $oJS = $oPDF.getJSObject
                $oJSpdf = $oJS.app.opendoc($File)

                $iPage = 0
                Dim $rectSize[4]

                ; Determine where to place the annotation
                $page = $oPDF.AcquirePage($iPage)
                $page = $page.GetSize()
                $rectSize[0] = 25
                $rectSize[1] = $page.y - 50
                $rectSize[2] = $page.x - 25
                $rectSize[3] = $page.y - 25

                ; Add the annotation
                $oJSpdf.addAnnot($iPage, 'FreeText', $rectSize, 'Automated', 'Find1234')

                ; Printing
                $printParams = $oJSpdf.getPrintParams()
                $printParams.interactive = -1
                $printParams.firstPage = 0
                $printParams.pageHandling = $printParams.constants.handling.fit
                ;$oJSpdf.print($printParams)

                ; I'm pretty sure at this point I should see the annot, but I don't

                $oJSpdf.flattenPages($iPage, $iPage)

                ; I know these funcs work; I use them in other parts of the script. Outputs everything but no annot
                ConsoleWrite(Page_GetText($oJSpdf, $File, $iPage, "Full") & @CRLF)