I believe there is an Assembler command for finding text in a document. I think it is DocumentText, and it has an attribute called withQuads that gives you the four points surrounding the text you are looking for.
I am going from memory, so I would check the DDX reference that comes with Assembler.
I use the WithQuads attribute, and it exports XML for every character in the PDF file. But sometimes it returns whole words instead. Why does it behave this way? Is it possible to make the export always go by word instead of by character?