14 Replies Latest reply: Jan 8, 2013 8:56 AM by GalfromKalamazoo

    Use MS Access 2003 VBA (not JavaScript) to find text in pdf using Acrobat X, then bookmark it

    GalfromKalamazoo Community Member

      I originally posted this on AcrobatUsers.com, but was advised to post it here instead.

       

      Using Win XP Pro, Acrobat X, and Access 2003:

       

      This code runs through a few files, then fails to find the text to be bookmarked, even though the text is there.  If I run it again, the same thing happens with the next few files.

      There seem to be several issues here.  Watching Acrobat while the code runs, I see that the search sometimes selects completely unrelated text rather than the exact text specified (e.g., 'Landfilled waste').  Other times it finds the correct text but also selects several disconnected words nearby.  When Acrobat decides it hasn't found the text, the code just hangs, waiting for Acrobat to do something, and then a message appears saying that the application is busy and that I should switch to that app, retry, or cancel.  None of those options works.

       

      I have also noticed that when the code fails to find the requested text, a manual search of the same file fails too, even when I locate the text myself, highlight it, and copy it into the search box.  This code worked perfectly a year ago, before I upgraded Acrobat to version X.  Is this a bug?

      Private Sub cmdAddBookmark_Click()
      'accesses the pdf reports using the Acrobat object
      Dim BM As Object
      Dim AcApp As Object
      Dim ADoc As Object
      Dim PDoc As Object
      Dim bFileOpen As Boolean
      Dim btitle As Boolean
      Dim strfile As String
      Dim fso As New Scripting.FileSystemObject
      Dim fil As Scripting.File
      Dim fldr As Scripting.Folder
      Dim filn As String
      Dim strB As String

      'Create the Acrobat Object
          Set AcApp = CreateObject("AcroExch.App")
      'Create Doc objects
          Set PDoc = CreateObject("AcroExch.PDDoc", "")
          Set ADoc = CreateObject("AcroExch.AVDoc", "")

       

      'the following variables are used to set the directory where the files are to be processed
      If strYr = "" Then strYr = inputbox("Enter report data year", , Year(Now) - 2)
      If strtype = "" Then strtype = inputbox("Enter Corp or Indiv type of report", "Corp or Indiv", "Corp")
      If strReport = "" Then strReport = inputbox("Enter Energy, Env, or Combined for type of report", "Energy, Env, or Combined", "Energy")
      If strReport = "Energy" Then
          strB = "Appendix B - Fuel and Energy Use"
      Else
          strB = "Appendix B - Mill Environmental Data and Statistics"
      End If

      Set fso = CreateObject("Scripting.FileSystemObject")
      'Set path to pdf files
      Set fldr = fso.GetFolder(GetPath() & strYr & "\" & strReport & "\" & strtype & "\PDFRpts\")
      'check for pdfs in folder
      If Dir(fldr & "\*.pdf") <> "" Then
          For Each fil In fldr.Files
                   filn = fil.Name
                   strfile = fldr & "\" & filn
                   'code to load doc
         
                   bFileOpen = ADoc.Open(strfile, "")
                   Set PDoc = ADoc.GetPDDoc
                   'Create BookMark Object
                   Set BM = CreateObject("AcroExch.PDBookmark")
                   'Show the Application to be able to insert bookmark
                   AcApp.Show
                   'if bookmark hasn't already been added...
                   If Not BM.GetByTitle(PDoc, "Appendix B") Then
                
      'look for text to bookmark: case-sensitive, whole words only, starting from beginning
                                If ADoc.FindText(strB, True, True, True) Then
                                         Me.lblBM.Caption = "Adding bookmark to " & filn & "."
                                         Me.Repaint
                                         'execute the menu item
                                         AcApp.MenuItemExecute ("NewBookmark")
                                         'locate new bookmark
                                         btitle = BM.GetByTitle(PDoc, strB)
                                         'set bookmark title
                                         btitle = BM.SetTitle("Appendix B")

                                Else
                                         Me.lblBM.Caption = "Could not find text for Appendix B in file " & filn
                                         Me.Repaint
                                         ADoc.Close 0
                                         PDoc.Close
                                         GoTo exit_insert_bookmark
                                End If
                 Else
                            Me.lblBM.Caption = filn & " already has a bookmark for " & strB
                            Me.Repaint
                   End If
                   PDoc.Save 1, strfile
                   ADoc.Close 0
                   PDoc.Close
          Next fil
      End If
      Me.lblBM.Caption = ""
      MsgBox "Done!"

       

      exit_insert_bookmark:
          AcApp.CloseAllDocs
          AcApp.Exit
          Set AcApp = Nothing

       

      End Sub

        • 1. Re: Use MS Access 2003 VBA (not JavaScript) to find text in pdf using Acrobat X, then bookmark it
          ReinhardF Community Member

          Mmmh,

           

          if I remember correctly, FindText only works properly when searching for a single word, not for a phrase such as 'Landfilled waste'.

          Perhaps you can test that by searching only for "Landfilled".

           

          I would split the find step from the bookmark step.

          First find the text and store the bookmark info in a variable, then set the bookmarks based on that variable.

          That may be easier to write and maintain.

           

          br, Reinhard

          • 2. Re: Use MS Access 2003 VBA (not JavaScript) to find text in pdf using Acrobat X, then bookmark it
            GalfromKalamazoo Community Member

            Reinhard, thanks for the reply.  As to splitting finding and bookmarking, that is what I did.  I set a variable to the text to find, then set the bookmark based upon the variable sought, then named the bookmark.

             

            If it's true that FindText can only find single words, then I am indeed in trouble, because the text to be bookmarked can only be identified by searching for multiple words.

            • 3. Re: Use MS Access 2003 VBA (not JavaScript) to find text in pdf using Acrobat X, then bookmark it
              Test Screen Name CommunityMVP

              JavaScript can retrieve each word in turn from a page, so with some complication you can search for multiple words. The Acrobat SDK details the VB:JavaScript linkage.
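A minimal sketch of the word-by-word approach described above, assuming the document object exposes `numPages`, `getPageNumWords(nPage)` and `getPageNthWord(nPage, nWord, bStrip)` as the Acrobat JavaScript Doc object does. The helper names `normalizeWord`, `pageWords` and `findPhrasePage` are hypothetical, and how Acrobat splits a given page into words may differ from what this sketch expects, so treat it as a starting point rather than a tested solution.

```javascript
// Assumption: `doc` behaves like the Acrobat JavaScript Doc object, i.e. it has
// numPages, getPageNumWords(nPage) and getPageNthWord(nPage, nWord, bStrip).

// Drop punctuation so that "B," and "B" compare equal and a lone "-" vanishes.
function normalizeWord(word) {
  return String(word).replace(/[^A-Za-z0-9]/g, "");
}

// Collect the non-empty, normalized words of one page (0-based page index).
function pageWords(doc, pageIndex) {
  var words = [];
  var count = doc.getPageNumWords(pageIndex);
  for (var i = 0; i < count; i++) {
    var w = normalizeWord(doc.getPageNthWord(pageIndex, i, true));
    if (w.length > 0) words.push(w);
  }
  return words;
}

// Return the 0-based index of the first page containing the phrase as a run
// of consecutive words (case-sensitive), or -1 if no page contains it.
function findPhrasePage(doc, phrase) {
  var target = phrase.split(/\s+/).map(normalizeWord).filter(function (w) {
    return w.length > 0;
  });
  if (target.length === 0) return -1;
  for (var p = 0; p < doc.numPages; p++) {
    var words = pageWords(doc, p);
    for (var start = 0; start + target.length <= words.length; start++) {
      var matched = true;
      for (var k = 0; k < target.length; k++) {
        if (words[start + k] !== target[k]) { matched = false; break; }
      }
      if (matched) return p;
    }
  }
  return -1;
}
```

Inside Acrobat, the open document (`this` in a document-level script) would be passed as `doc`, for example `findPhrasePage(this, "Appendix B - Fuel and Energy Use")`. A phrase that straddles a page break is not matched by this sketch.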

              • 4. Re: Use MS Access 2003 VBA (not JavaScript) to find text in pdf using Acrobat X, then bookmark it
                GalfromKalamazoo Community Member

                Thanks for the reply, TSN.  As my title says, I didn't want to use JS as I am not familiar with it.  However, if that's the only way to do it, some code that would illustrate that would be helpful.  Would I run JS from within the VBA, or directly from Acrobat?

                 

                Addendum:  why did this work in Acrobat 9?

                 

                Message was edited by: GalfromKalamazoo

                • 5. Re: Use MS Access 2003 VBA (not JavaScript) to find text in pdf using Acrobat X, then bookmark it
                  ReinhardF Community Member

                  " Would I run JS from within the VBA ..."

                   

                  That's my preferred way. Have a look at:

                  http://forums.adobe.com/message/3718235

                   

                  br, Reinhard

                  • 6. Re: Use MS Access 2003 VBA (not JavaScript) to find text in pdf using Acrobat X, then bookmark it
                    GalfromKalamazoo Community Member

                    Can you please provide js code for finding text and making it a bookmark?

                    I do not want to install the Acrobat SDK, as it also requires Visual Studio, apparently, and I can see this opening an even larger can of worms.  I don't know if you'd call this bookmark (Appendix B) a root or a child; it seems to stand alone.  There is really only one bookmark per pdf, as what you see below is the result of two separate pdfs having been combined.  For some reason the image appears fuzzy, sorry.

                    [Image: acrobat bookmarks.jpg]

                    • 7. Re: Use MS Access 2003 VBA (not JavaScript) to find text in pdf using Acrobat X, then bookmark it
                      Test Screen Name CommunityMVP

                      The Acrobat SDK is thousands of pages of documentation which you WILL need to reference. You cannot learn it by examples, even if they exist.

                       

                      But there is an online "exploded" SDK. http://livedocs.adobe.com/acrobat_sdk/10/Acrobat10_HTMLHelp/wwhelp/wwhimpl/js/html/wwhelp.htm?&accessible=true

                      • 8. Re: Use MS Access 2003 VBA (not JavaScript) to find text in pdf using Acrobat X, then bookmark it
                        khkremer CommunityMVP

                        The Acrobat SDK does not require Visual Studio. It's a collection of documentation, sample code and a few more things. You can safely install it, even if you don't have VS installed. There are actually different APIs in the SDK, and not every one of them requires VS - if you want to develop only in Acrobat's JavaScript, there is no need for any outside development environment, you just would use the documentation and the sample code for JavaScript. For VBA, you would then in turn use the documentation and sample code for the IAC API.

                         

                        So, go ahead and download and install the SDK - you will need it.

                         

                         

                        Karl Heinz Kremer

                        PDF Acrobatics Without a Net

                         

                        khk@khk.net

                        http://www.khkonsulting.com

                         

                         

                         


                        • 10. Re: Use MS Access 2003 VBA (not JavaScript) to find text in pdf using Acrobat X, then bookmark it
                          GalfromKalamazoo Community Member

                          Reinhard, this is what I have put together in VBA to search for a text phrase in all pdfs in a folder, and bookmark it.

                          Most of it doesn't work, and I've inserted capitalized comments where there was success or failure.  I would very much appreciate feedback on what I am doing wrong, as this has become a huge frustration.

                           

                          VBA Code snippet:

                           

                          Dim jso As Object
                          Dim oBMx As Object, oBMb As Object
                          Dim oTXT As Object

                          'Create the Acrobat Object
                              Set AcApp = CreateObject("AcroExch.App", "")

                          'Create Doc objects
                              Set PDoc = CreateObject("AcroExch.PDDoc", "")
                              Set ADoc = CreateObject("AcroExch.AVDoc", "")

                          'following variables simply collect data that sets the path to the files to be processed
                          If strYr = "" Then strYr = inputbox("Enter report data year", , Year(Now) - 2)
                          If strtype = "" Then strtype = inputbox("Enter Corp or Indiv type of report", "Corp or Indiv", "Corp")
                          If strReport = "" Then strReport = inputbox("Enter Energy, Env, or Combined for type of report", "Energy, Env, or Combined", "Energy")

                          If strReport = "Energy" Then
                              strB = "Appendix B - Fuel and Energy Use"
                          Else
                              strB = "Appendix B - Mill Environmental Data and Statistics"
                          End If

                          Set fso = CreateObject("Scripting.FileSystemObject")
                          'Set path to pdf files
                          Set fldr = fso.GetFolder(GetPath() & strYr & "\" & strReport & "\" & strtype & "\PDFRpts\")

                          'check for pdfs in folder
                          If Dir(fldr & "\*.pdf") <> "" Then
                              For Each fil In fldr.Files
                                  filn = fil.Name
                                  strfile = fldr & "\" & filn
                                  'code to load doc

                                  bFileOpen = ADoc.Open(strfile, "")
                                  Set PDoc = ADoc.GetPDDoc
                                  jso = PDoc.GetJSObject
                                  'Activate the Application to be able to insert bookmark
                                  AcApp.Show

                          'SUCCESS: THE CORRECT PDF IS SHOWN

                                  'if bookmark hasn't already been added...
                                  jso.search.wordMatching = "bookmarks"
                                  Set oBMb = jso.search.Query("Appendix B", "ActiveDoc")
                              'alternate command that was tried & failed was Set oBMb = jso.FindBookmarkByName(jso.this.bookmarkroot, "Appendix B")
                                  If oBMb Is Nothing Then

                          'FAILURE: oBMb IS ALWAYS NOTHING, EVEN WHEN BOOKMARK EXISTS

                                          'look for text to bookmark: case-sensitive, not whole words only, starting from current location
                                          jso.search.wordMatching = "MatchPhrase"
                                          jso.search.MatchCase = True
                                          oTXT = jso.search.Query(strB, "ActiveDoc")
                                          If Not oTXT Is Nothing Then

                          'FAILURE: oTXT IS ALWAYS NOTHING, EVEN THOUGH IT EXISTS

                                              'execute the menu item
                                              AcApp.MenuItemExecute ("NewBookmark")
                                              'create an extra bookmark to move the focus off the first one
                                              AcApp.MenuItemExecute ("NewBookmark")

                          'SUCCESS: THESE TWO NEW BOOKMARKS ARE ADDED

                                              'locate new bookmark
                                              Set oBMb = jso.FindBookmarkByName(jso.this.bookmarkroot, "Untitled")
                                              If Not oBMb Is Nothing Then oBMb.Name = "Appendix B"

                          'FAILURE: BOOKMARK NOT FOUND THEREFORE NOT RENAMED

                                              'find extra bookmark and remove it
                                              Set oBMx = jso.FindBookmarkByName(jso.this.bookmarkroot, "Untitled")
                                              If Not oBMx Is Nothing Then oBMx.Destroy

                          'FAILURE: BOOKMARK NOT FOUND THEREFORE NOT DELETED

                                          End If
                                  End If
                                  PDoc.Save 1, strfile
                                  ADoc.Close
                                  PDoc.Close

                          'FAILURE: OPENED PDF IS STILL OPEN

                                  Set jso = Nothing
                                  Set oTXT = Nothing
                                  Set oBMb = Nothing
                                  Set oBMx = Nothing
                              Next fil
                          End If

                           

                          Message was edited by: GalfromKalamazoo

                          • 11. Re: Use MS Access 2003 VBA (not JavaScript) to find text in pdf using Acrobat X, then bookmark it
                            ReinhardF Community Member

                            Hi,

                             

                            to find text on a page:

                            ADoc.FindText does not seem to work reliably,

                            search.query opens an unwanted search box,

                            Ctrl+F (or the Search menu item) via SendKeys works, but it is not very dependable,

                            so I prefer to read the text page by page and let VBS do the search.

                            In hidden mode it runs faster than Ctrl+F; in visible mode it is only slightly slower.

                            An example is attached. Change it to suit your needs.

                             

                            find bookmarks

                            Didn't "BM.GetByTitle(PDoc, "Appendix B")" from your first script already work properly?

                            Otherwise I can give you a function that reads the bookmarks and gives you a correct answer using InStr via VBS.
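As an alternative to a VBS bookmark reader, here is a hedged sketch in Acrobat JavaScript of checking for an existing bookmark and adding one only if it is missing. It assumes the Bookmark object model from the Acrobat JavaScript reference (`bookmarkRoot`, `name`, `children`, `createChild(cName, cExpr)`). The names `findBookmarkByName` and `ensureBookmark` are hypothetical helpers written here, not built-in Acrobat methods, and the page index is assumed to come from a separate text search.

```javascript
// Assumption: bookmark objects expose `name`, `children` (null when empty)
// and createChild(cName, cExpr), like the Acrobat JavaScript Bookmark object.

// Depth-first search of the bookmark tree for an exact, case-sensitive title.
// Returns the matching bookmark, or null if none is found.
function findBookmarkByName(bm, name) {
  if (bm.name === name) return bm;
  var kids = bm.children;
  if (kids) {
    for (var i = 0; i < kids.length; i++) {
      var hit = findBookmarkByName(kids[i], name);
      if (hit) return hit;
    }
  }
  return null;
}

// Add a top-level bookmark that jumps to pageIndex (0-based) unless a
// bookmark with that title already exists. Returns true if one was added.
function ensureBookmark(doc, name, pageIndex) {
  if (findBookmarkByName(doc.bookmarkRoot, name)) return false;
  // The second argument is the JavaScript run when the bookmark is clicked.
  doc.bookmarkRoot.createChild(name, "this.pageNum = " + pageIndex + ";");
  return true;
}
```

Because the bookmark is created directly with its final title, this route does not depend on a text selection or on the NewBookmark menu item, and there is no "Untitled" bookmark to rename afterwards.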

                             

                            saving

                            If "PDoc.Save 1, strfile" doesn't work, Acrobat JavaScript can be used for that. That can be done later.

                             

                            br, Reinhard

                             

                            FindText.vbs

                            ---------------------------------------------------------------

                                '//Settings: Filename and Word to find
                            FileNM = "u:\temp\BalanceSheets.pdf"
                            WordTF = "Intercompany payables"

                            '// Check if file exist
                            set fs = CreateObject("Scripting.FileSystemObject")
                            if not fs.FileExists(FileNM) then
                                 MsgBox "Ups! " & FileNM & " doesn't exist? " & "Try new!", vbExclamation
                                 WScript.quit
                            end if
                                '//Start Acrobat and Open the File into View
                            Set gApp = CreateObject("AcroExch.App")
                            Set gAVDoc = CreateObject("AcroExch.AVDoc")
                            OK = gAVDoc.Open(FileNM, "")
                                    if  not OK Then if MsgBox("Error open Basic File") then Wscript.quit
                            '//comment both out to work hidden
                            gApp.show
                            gAVDoc.bringToFront()


                            readAndFindText()  '15 sec for 100 pages (10 sec hidden mode)
                            'findTextViaSendkeys() '14 sec for 100 pages (hidden not possible)

                             

                            function readAndFindText()
                            set gPdDoc = gAVDoc.GetPdDoc()
                            maxPages = gPdDoc.GetNumPages
                            foundOnPage = ""
                            Set gAVPageView = gAVDoc.GetAVPageView
                            for x = 0 to maxPages - 1
                                   gAVPageView.goto(x)
                                   Set PdfPage = gPDDoc.AcquirePage(x)
                                   Set PageHL = CreateObject("AcroExch.HiliteList")
                                   PageHL.Add 0,9000  '<<--SET in FILE! (Start,END[9000=All])
                                   Set PageSel = PdfPage.CreatePageHilite(PageHL)
                                   for i = 0 to PageSel.Getnumtext - 1
                                         pdfData = PDFData & PageSel.GetText(i)
                                   Next

                                   if instr(pdfData, WordTF) then foundOnPage = foundOnPage &x &","

                                   'msgBox("page: " &x &" / " &foundOnPage &vbLF &pdfDATA)

                                   pdfData = ""

                            next
                            msgbox("found on Page: " &foundOnPage)
                            end function

                             

                            function findTextViaSendkeys()
                            set WshShell = CreateObject ("Wscript.Shell")
                            gTitle = gAvdoc.getTitle()
                            set gPdDoc = gAVDoc.GetPdDoc()
                            maxPages = gPdDoc.GetNumPages
                            foundOnPage = ""

                            Set gAVPageView = gAVDoc.GetAVPageView
                            for x = 0 to maxPages - 1
                                   gAVPageView.goto(x) 'comment out to get all occurence on a Page
                                   if WshShell.AppActivate(gTitle) then
                                         WshShell.sendKeys "^f"
                                         wscript.sleep 500
                                         WshShell.sendKeys WordTf
                                         wscript.sleep 500
                                         WshShell.sendKeys "~"
                                         wscript.sleep 500
                                   end if
                                   if WshShell.AppActivate("Adobe Acrobat") then
                                         WshShell.sendKeys "~"
                                         msgbox "nothing found"
                                         exit for
                                   else
                                         pg = gAVPageView.GetPageNum()
                                         foundOnPage = foundOnPage &pg &","
                                         x = pg + 1
                                    end if
                            next
                            msgbox("found on Page: " &foundOnPage)
                            end function


                            Set gPdPage  = nothing
                            Set gAVPageView = Nothing
                            Set gAVDOC = Nothing
                            Set gAPP = Nothing

                            • 12. Re: Use MS Access 2003 VBA (not JavaScript) to find text in pdf using Acrobat X, then bookmark it
                              GalfromKalamazoo Community Member

                              Thanks, Reinhard.  What's peculiar is that all of those methods worked with Acrobat 9; I only ran into problems this year after I (mistakenly, apparently) upgraded to X.  Findtext found Appendix B correctly every time.  First I used findtext to find the bookmark Appendix A, executed it, and then the next find of Appendix B was the correct text to bookmark.  The file was saved correctly, closed, and the next one opened. 

                               

                              Now none of that works, and I've revised my code all over the place.  I am currently in the process of having a student manually enter in all the bookmarks for Appendix B, which is 262 pdf files!

                               

                              I'll look at what you sent me, and play with it, but for this effort this year, I've wasted enough time trying to do this with automation.  I even tried to write a javascript to run within Acrobat as an action, thinking it would be simplest, but utterly failed.

                               

                              Thanks to all for your help!

                              • 14. Re: Use MS Access 2003 VBA (not JavaScript) to find text in pdf using Acrobat X, then bookmark it
                                GalfromKalamazoo Community Member

                                No, because I didn't even know I was playing in one, and certainly don't know how to turn it on or off.  But tell me, why would that help and what is it?

                                If that's enhanced security, protected view is OFF.

                                 

                                Message was edited by: GalfromKalamazoo