I am trying to automate a repetitive task. I need to create tif image files of each individual test question from pdfs of the full test books for our archive. The pdfs are output from an InDesign layout, and I'm using Acrobat Pro 10 on a Mac. There are many of these to do, and it'd be a big help to find a way to do this more quickly and with fewer human-introduced errors. Any suggestions are appreciated, big time.
There is a different number of questions on each page, and they are all different sizes. I think it might be possible to use some of the repeating text elements that appear in each question to tell an automation program that a new question is starting, and that it should clip out an area as tall as the interval between one text element and the next, and save each question as a separate pdf file. Ideally I'd like each pdf to be automatically named with the 'Y09_0000' number before each question.
I was thinking of using the small 'Y09' numbers before each question as a cue for the automation. Or maybe using the square behind the question number, which is a Zapf Dingbats character.
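To make the clipping idea concrete, here is a minimal Python sketch of just the geometry. It assumes the vertical position of each 'Y09_0000' label on a page has already been extracted by some PDF tool (that step is not shown), and that positions are measured downward from the top of the page. The function name `question_bands` and the sample coordinates are hypothetical, purely for illustration.

```python
def question_bands(markers, page_height):
    """Turn label positions into one crop band per question.

    markers: list of (question_id, y_top) pairs, y measured downward
             from the top of the page (an assumption of this sketch).
    page_height: bottom limit used for the last question on the page.
    Returns a list of (file_name, top, bottom) tuples in page order.
    """
    # Sort by vertical position so each band ends where the next label starts.
    ordered = sorted(markers, key=lambda m: m[1])
    bands = []
    for i, (qid, top) in enumerate(ordered):
        if i + 1 < len(ordered):
            bottom = ordered[i + 1][1]
        else:
            bottom = page_height  # last question runs to the page bottom
        bands.append((f"{qid}.pdf", top, bottom))
    return bands


if __name__ == "__main__":
    # Made-up label positions on a 792 pt tall page.
    sample = [("Y09_0002", 300.0), ("Y09_0001", 72.0)]
    for name, top, bottom in question_bands(sample, 792.0):
        print(name, top, bottom)
```

Each band gives a file name plus the top and bottom of the area to clip. In practice the last band on a page would probably need trimming to exclude the footer.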
If I can get these questions saved as separate pdfs with the correct file names, I should be able to easily batch process the pdfs into tifs with Photoshop.
An example of an original page:
An example of how the questions should look:
I'm not sure if this is the right forum to approach for this problem, and I apologise if I am way off track. Thank you for your help!
The only way I can think of doing this would be in InDesign - assuming each question starts with a distinct element that can be GREPped for, you could separate each one onto a new page by running a search/replace that adds a page break in front of that element. You can then export all the 'single-question' pages as images, and use PS automation to trim the whitespace.
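As a rough illustration of the "distinct element" idea, here is a small Python sketch that splits a run of text wherever a question label starts. It is not InDesign GREP itself, and the pattern `Y09_\d{4}` is only an assumption based on the 'Y09_0000' example in the question. The function name `split_questions` and the sample text are made up.

```python
import re

# Assumed label format: 'Y09_' followed by four digits.
ID_PATTERN = re.compile(r"Y09_\d{4}")


def split_questions(text):
    """Return a dict mapping each question label to the text it introduces."""
    matches = list(ID_PATTERN.finditer(text))
    chunks = {}
    for i, match in enumerate(matches):
        # A question runs from its own label up to the next label (or the end).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        chunks[match.group()] = text[match.start():end].strip()
    return chunks


if __name__ == "__main__":
    sample = "Y09_0001 What is 2 + 2?\nY09_0002 Name a primary colour.\n"
    for label, body in split_questions(sample).items():
        print(label, "->", body)
```

The same pattern, if it matches reliably, is what the search/replace would look for when inserting the page breaks.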