1 Reply Latest reply on Feb 19, 2010 5:33 AM by try67

    find string in pdf and return position

    dhs1223 Level 1

      I am trying to create a 'find' function that returns the coordinates of a word in a PDF. I have had mixed success so far. I have been doing this in VB, but would branch out to another language if needed to get this resolved.


      The problems I have hit so far are:

      - It is ungodly slow.

      - jso.getPageNthWord does not seem to treat hyphens, forward slashes and a number of other non-alphanumeric characters as parts of words. Instead, it appears to return each of these characters as a blank string rather than its actual content. So 123-456 would be returned as 3 words: '123', '', and '456'.


      Is there a proven way to find a word in the PDF and then be able to return its 'quads'? If nothing like that already exists, can you point me in the direction I need to go in order to have this ability? Something with the API, maybe? Thank you.


      Here is a code snippet. It works fine, but it is too slow and misses items containing - and /





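      For i = 0 To count

          word = jso.getPageNthWord(0, i)

          ' Skip anything that did not come back as a string
          If VarType(word) = vbString Then

              ' Text comparison (case-insensitive) against the part number
              result = StrComp(word, strPartNum, CompareMethod.Text)

              If result = 0 Then 'there was a match

                  Dim q As Array
                  Dim obj1 As Array
                  Dim rect(3) As Double

                  ' Take the first quad of the matched word
                  obj1 = jso.getPageNthWordQuads(0, i)
                  q = obj1(0)

                  ' Build the link rectangle from two corners of the quad
                  rect(0) = Round(q(4), 2)
                  rect(1) = Round(q(5), 2)
                  rect(2) = Round(q(2), 2)
                  rect(3) = Round(q(3), 2)

                  ' Draw a bordered link over the matched word
                  l = jso.addLink(0, rect)
                  l.borderWidth = 1

              End If

          End If

      Next i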
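
      One possible workaround for the hyphen and slash problem described above, as a rough sketch only: split the search string the same way the word extraction seems to split the page text, then look for that run of consecutive words, skipping the blank ones. FindTokenRun and WordRunSketch are hypothetical names, not part of the Acrobat API, and the sketch assumes the words were already read into a list with jso.getPageNthWord, and that separators really do come back as blank strings.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module WordRunSketch
    ' Hypothetical helper, not part of the Acrobat API.
    ' words:  tokens already read with jso.getPageNthWord(page, i), in page order.
    ' target: the search string, e.g. "123-456".
    ' Returns {firstIndex, lastIndex} of the first matching run, or Nothing.
    Function FindTokenRun(words As IList(Of String), target As String) As Integer()
        ' Split the target on anything that is not a letter or digit
        ' (assumption: this mirrors how the page text gets split).
        Dim parts As New List(Of String)
        For Each p As String In Regex.Split(target, "[^A-Za-z0-9]+")
            If p.Length > 0 Then parts.Add(p)
        Next
        If parts.Count = 0 Then Return Nothing

        For start As Integer = 0 To words.Count - 1
            Dim matched As Integer = 0
            Dim j As Integer = start
            While j < words.Count AndAlso matched < parts.Count
                Dim w As String = words(j)
                If String.IsNullOrEmpty(w) Then
                    ' Blank token: allowed between parts, but a run may not start on one
                    If matched = 0 Then Exit While
                ElseIf String.Compare(w, parts(matched), StringComparison.OrdinalIgnoreCase) = 0 Then
                    matched += 1
                Else
                    Exit While
                End If
                j += 1
            End While
            If matched = parts.Count Then Return New Integer() {start, j - 1}
        Next
        Return Nothing
    End Function
End Module
```

      With the two indexes in hand, jso.getPageNthWordQuads could be called for the first and last word of the run and the two quads merged into one rectangle for addLink. Reading all the words of a page into a list once, and searching the list afterwards, may also cut down the number of calls into Acrobat. The Acrobat JavaScript reference also lists an optional third bStrip argument on getPageNthWord that controls whether punctuation and white space are removed; whether passing False changes the behavior seen here is untested.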