1 Reply Latest reply on Dec 5, 2008 12:09 AM by try67

    Copy text out of PDF into new PDF or Word document

      Please help getting me going in the right direction.
      Every day I receive updated PDF files, and I need to copy one section out of each file and create a new PDF file from it.
      The files contain only text, no graphics.
      The section I need to copy always starts and ends with the same string.
      Some of this text will end up as header of the new document.
      Some of this text will end up in the file name.

      Right now this extraction is done manually.
      I want to create a script to automatically extract the required text and create a new file containing that text as well as the header and file name.

      How do I go about doing this?
        • 1. Re: Copy text out of PDF into new PDF or Word document
          try67 MVP & Adobe Community Professional
          Acrobat JavaScript has a method called "getPageNthWord" which retrieves word x from page y of a document. So what you should do is scan the PDF, looking for the words that mark the beginning of the segment you want, then record all the words from there until you reach the end marker. Then you can either dump all of this text to a new PDF report or export it as a text file using DataObjects.
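
A minimal sketch of that scan, for illustration only. It assumes `doc` is an Acrobat Doc object (for example `this` in the JavaScript console) and relies on the Doc members `numPages`, `getPageNumWords` and `getPageNthWord`. The helper names (`collectWords`, `indexOfSequence`, `extractSection`) and the marker strings in the usage note are hypothetical. Because the words are read with punctuation stripped, the markers have to be written without punctuation, and the original line breaks are not preserved.

```javascript
// Read every word of every page into one flat array.
function collectWords(doc) {
    var words = [];
    for (var p = 0; p < doc.numPages; p++) {
        var count = doc.getPageNumWords(p);
        for (var w = 0; w < count; w++) {
            // Third argument true: strip punctuation and whitespace from the word.
            words.push(doc.getPageNthWord(p, w, true));
        }
    }
    return words;
}

// Find the first position at or after `from` where the word sequence `seq` occurs.
function indexOfSequence(words, seq, from) {
    for (var i = from; i + seq.length <= words.length; i++) {
        var match = true;
        for (var j = 0; j < seq.length; j++) {
            if (words[i + j] !== seq[j]) {
                match = false;
                break;
            }
        }
        if (match) {
            return i;
        }
    }
    return -1;
}

// Return the text from the start marker through the end marker (inclusive),
// or null if either marker is missing.
function extractSection(doc, startMarker, endMarker) {
    var words = collectWords(doc);
    var startSeq = startMarker.split(/\s+/);
    var endSeq = endMarker.split(/\s+/);
    var start = indexOfSequence(words, startSeq, 0);
    if (start === -1) {
        return null;
    }
    var end = indexOfSequence(words, endSeq, start + startSeq.length);
    if (end === -1) {
        return null;
    }
    return words.slice(start, end + endSeq.length).join(" ");
}
```

In Acrobat this could be called as `var text = extractSection(this, "BEGIN REPORT", "END REPORT");` with your own markers in place of the made-up ones. One way to get the result out as a text file, assuming the security settings allow it, is `this.createDataObject("section.txt", text);` followed by `this.exportDataObject({cName: "section.txt"});`. Building the header and the file name from the extracted words is left out here.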

          You didn't specify if you're familiar with Acrobat JavaScript. If you're not, this is a bit complicated for a first project...