3 Replies Latest reply on Jul 26, 2013 4:42 AM by try67

    How to access text selection attributes (bounding box) in a pdf

    GautierThomas57

      How can I access the attributes of a text selection in a PDF (specifically its bounding box) when the selection is made with the FindText method from VBA/JavaScript?

      I'm looking for this attribute or method because I need to position buttons on specific words in a PDF.

        • 1. Re: How to access text selection attributes (bounding box) in a pdf
          try67 MVP & Adobe Community Professional

          In JS this is done by using the getPageNthWordQuads method, which returns an array of quads that represents that specific word.
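          As a rough illustration of the reply above, here is a minimal sketch of turning the quads for one word into a single rectangle that could be passed to addField. The helper name quadsToRect is made up for this example, and it assumes each quad is eight numbers (four x,y corner pairs) on an unrotated page.

```javascript
// Hypothetical helper (not part of the Acrobat API): collapse the quads
// returned for one word into [left, top, right, bottom].
// Assumption: each quad holds 8 numbers, i.e. four (x, y) corner pairs.
function quadsToRect(quads) {
  var xs = [];
  var ys = [];
  for (var q = 0; q < quads.length; q++) {
    for (var i = 0; i < 8; i += 2) {
      xs.push(quads[q][i]);
      ys.push(quads[q][i + 1]);
    }
  }
  // PDF user space has its origin at the bottom-left of the page,
  // so the top edge is the largest y and the bottom edge the smallest.
  return [
    Math.min.apply(null, xs), // left
    Math.max.apply(null, ys), // top
    Math.max.apply(null, xs), // right
    Math.min.apply(null, ys)  // bottom
  ];
}

// Inside Acrobat only, usage might look like this (page 0, word 5):
//   var rect = quadsToRect(this.getPageNthWordQuads(0, 5));
//   this.addField("Btn0", "button", 0, rect);
```

          Taking the min and max over all corners means the result does not depend on the order in which the corners are listed, and a word split across two lines (two quads) still gets one enclosing box.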

          • 2. Re: How to access text selection attributes (bounding box) in a pdf
            GautierThomas57 Level 1

            Yes, thanks a lot for the info. It works fine!

            But with this method I can't locate my string using the FindText method, so I have to use getPageNthWord and go through the pages word by word, which is pretty time-consuming. Do you have any ideas for speeding up the scanning/parsing of my entire documents (~3000 pages)?

            Below is the code used to position a popup on my string String2Search:

             

            Sub PdfTextExtraction()

                Dim pddoc As New AcroPDDoc
                Dim jso As Object

                Dim txt As String
                Dim cntPages As Long
                Dim cntWords As Long
                Dim quad As Variant

                Dim ip As Long
                Dim iw As Long
                Dim RefPath As String
                Dim DocName As String
                Dim StringCurrent As String
                'Dim AcroRect As CAcroRect
                Dim counter01 As Integer
                Dim popupRect(0 To 3) As Integer
                Dim oVarDescr As Object
                Dim avobj As Acrobat.AcroAVDoc
                Set avobj = CreateObject("AcroExch.AVDoc")

                'Load the PDF File
                RefPath = "C:\tmp\"
                DocName = "filedoc"
                counter01 = 0

                pddoc.Open (RefPath & DocName & ".pdf")
                Set avobj = pddoc.OpenAVDoc(DocName)
                Set jso = pddoc.GetJSObject
                cntPages = pddoc.GetNumPages

                For ip = 0 To cntPages - 1
                    cntWords = jso.getPageNumWords(ip)
                    For iw = 0 To cntWords - 1
                        'get words
                        'StringCurrent = jso.getPageNthWord(ip, iw, True)
                        If jso.getPageNthWord(ip, iw, False) = "String2Search" Then
                            quad = jso.getPageNthWordQuads(ip, iw)(0)
                            'Set popup
                            popupRect(0) = CLng(quad(0)) 'Left
                            popupRect(1) = CLng(quad(1)) 'Top
                            popupRect(2) = CLng(quad(2)) 'Right
                            popupRect(3) = CLng(quad(5)) 'Bottom
                            Set oVarDescr = jso.AddField("PopUpID" & counter01, "button", ip, popupRect)
                            With oVarDescr
                                .UserName = "This cal is to determine..."
                            End With
                            counter01 = counter01 + 1
                        End If
                    Next iw
                Next ip

                Set jso = Nothing
                Set pddoc = Nothing

            End Sub

            • 3. Re: How to access text selection attributes (bounding box) in a pdf
              try67 MVP & Adobe Community Professional

              That's the only way of doing it with JS.
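              For reference, the same word-by-word scan can be written directly in JS. This is only a sketch: findWordRects is a hypothetical name, the doc parameter stands for the Acrobat document object, and bStrip is set to true here (unlike the VBA above) so that punctuation around a word does not block the match. Whether running the loop inside Acrobat instead of calling each method from VBA actually saves time on a 3000-page file is an assumption that has not been measured in this thread.

```javascript
// Hypothetical helper: scan every page of `doc` once and record where each
// word equal to `target` sits. `doc` is expected to expose the Acrobat JS
// members numPages, getPageNumWords, getPageNthWord and getPageNthWordQuads.
function findWordRects(doc, target) {
  var hits = [];
  for (var p = 0; p < doc.numPages; p++) {
    var n = doc.getPageNumWords(p);
    for (var w = 0; w < n; w++) {
      // Third argument (bStrip) = true: compare the word without
      // surrounding punctuation and white space.
      if (doc.getPageNthWord(p, w, true) !== target) continue;
      // Use the first quad only, with the same indices as the VBA code:
      // 0 = left, 1 = top, 2 = right, 5 = bottom.
      var quad = doc.getPageNthWordQuads(p, w)[0];
      hits.push({ page: p, rect: [quad[0], quad[1], quad[2], quad[5]] });
    }
  }
  return hits;
}

// Inside Acrobat only, the hits could then be turned into buttons:
//   var hits = findWordRects(this, "String2Search");
//   for (var i = 0; i < hits.length; i++) {
//     this.addField("PopUpID" + i, "button", hits[i].page, hits[i].rect);
//   }
```

              The scan still visits every word, so it is the same approach as before; the only difference is where the loop runs.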

               

               
