1 Reply Latest reply on Feb 19, 2010 5:33 AM by try67

    find string in pdf and return position

    dhs1223

      I am trying to create a 'find' function that returns the coordinates for a word. I have had mixed success so far.  I have been doing this in VB, but would branch out if needed to get this resolved.

       

      The problems I have hit so far are:

      -It is ungodly slow.

      -jso.getPageNthWord does not seem to treat hyphens, forward slashes, and a number of other non-alphanumeric characters as parts of words. It seems to return each of these special characters as a blank string instead of its actual contents, so 123-456 comes back as three words: '123', '', and '456'.
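One direction that may help with the second problem: the Acrobat JavaScript reference documents an optional third argument to getPageNthWord, bStrip, which defaults to true and removes punctuation and white space. Passing false should keep the '-' and '/' characters, so a split part number can be matched by joining neighbouring words. Below is a minimal sketch in Acrobat-style JavaScript. The helper names collectTokens, normalize and findTokenRuns are hypothetical, and exactly where Acrobat splits the words is an assumption, so the matcher simply concatenates consecutive words and does not rely on any particular split.

```javascript
// Sketch only. `doc` is assumed to be an Acrobat Doc object (the same object
// the VB code reaches through jso). All helper names here are hypothetical.

// Read every word on a page once, keeping punctuation (bStrip = false).
function collectTokens(doc, page) {
  var tokens = [];
  var n = doc.getPageNumWords(page);
  for (var i = 0; i < n; i++) {
    tokens.push(String(doc.getPageNthWord(page, i, false)));
  }
  return tokens;
}

// Drop white space and ignore case, like CompareMethod.Text in the VB code.
function normalize(s) {
  return String(s).replace(/\s+/g, "").toLowerCase();
}

// Return {first, last} word-index ranges whose joined text equals the target.
function findTokenRuns(tokens, target) {
  var wanted = normalize(target);
  var runs = [];
  if (wanted === "") return runs;
  for (var start = 0; start < tokens.length; start++) {
    var joined = "";
    for (var end = start; end < tokens.length; end++) {
      joined += normalize(tokens[end]);
      if (joined === "") break;                 // never start a run on an empty word
      if (joined.length >= wanted.length) {
        if (joined === wanted) runs.push({ first: start, last: end });
        break;                                  // long enough: match or give up
      }
      if (wanted.indexOf(joined) !== 0) break;  // not a prefix of the target
    }
  }
  return runs;
}
```

For example, findTokenRuns(collectTokens(this, 0), "123-456") would give the index range of the words that make up 123-456 on the first page, and getPageNthWordQuads can then be called only for those indices. A word that keeps trailing punctuation (such as "456,") will not match in this sketch and would need extra trimming.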

       

      Is there a proven way to find a word in the PDF and then return its 'quads'? If this does not already exist, can you point me in the direction I need to go to build it? Something with the API, maybe? Thank you.

       

      Here is a code snippet. It works, but it is too slow and misses items containing - and /.

       

       

      ' count and strPartNum are set elsewhere (not shown)
      For i = 0 To count

          ' Page 0 is the first page; i is the zero-based word index
          word = jso.getPageNthWord(0, i)

          If VarType(word) = vbString Then
              result = StrComp(word, strPartNum, CompareMethod.Text)

              If result = 0 Then 'there was a match
                  Dim q As Array
                  Dim obj1 As Array
                  Dim rect(3) As Double

                  ' Take the first quad of the matched word
                  obj1 = jso.getPageNthWordQuads(0, i)
                  q = obj1(0)

                  ' Build the link rectangle from the quad's third and second points
                  rect(0) = Round(q(4), 2)
                  rect(1) = Round(q(5), 2)
                  rect(2) = Round(q(2), 2)
                  rect(3) = Round(q(3), 2)

                  l = jso.addLink(0, rect)
                  l.borderWidth = 1
                  l.setAction("this.getURL('http://www.google.com/');")
              End If
          End If
      Next i
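If a match ends up spanning several words (for example a part number split at a hyphen), a single link rectangle has to cover the quads of all of them. The sketch below, again in Acrobat-style JavaScript, computes one bounding box. It assumes each quad is an array of eight numbers (four x, y pairs) and that getPageNthWordQuads returns an array of such quads per word. The name quadsToRect is hypothetical, and the output order (left, bottom, right, top) is an assumption, so reorder it if addLink expects the corners differently.

```javascript
// Sketch only: quadsToRect is a hypothetical helper, not part of the Acrobat API.
// quadLists holds one entry per matched word; each entry is that word's array
// of quads, and each quad is assumed to be [x1, y1, x2, y2, x3, y3, x4, y4].
function quadsToRect(quadLists) {
  var minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
  for (var i = 0; i < quadLists.length; i++) {
    var quads = quadLists[i];
    for (var j = 0; j < quads.length; j++) {
      var q = quads[j];
      // Walk the x, y pairs; corner order does not matter for a bounding box.
      for (var k = 0; k + 1 < q.length; k += 2) {
        if (q[k] < minX) minX = q[k];
        if (q[k] > maxX) maxX = q[k];
        if (q[k + 1] < minY) minY = q[k + 1];
        if (q[k + 1] > maxY) maxY = q[k + 1];
      }
    }
  }
  if (minX === Infinity) return null;   // no coordinates were supplied
  // Round to two decimals, as the VB snippet does with Round(..., 2).
  function round2(v) { return Math.round(v * 100) / 100; }
  return [round2(minX), round2(minY), round2(maxX), round2(maxY)];
}
```

A match that wraps onto a second line would produce one box covering both lines, so in that case it may be better to build one rectangle per quad instead.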