Hello, this issue has us contemplating leaving the IT environment to open a food joint. I hope this is the right place for this question.
We have hundreds of thousands of PDF files created over a 12-year span. None of these files contain any metadata, and they are scans of historical documents, so usable OCR text is nearly non-existent. The files are named using naming conventions given by historians, so each “collection” has its own naming structure, completely different from the others.
What we want: to automatically (through a batch file, a script, or third-party software) use the existing naming convention of each individual collection to populate the basic metadata fields of each of its files.
I could create a script that dumps the directory structure into a CSV file. From there, I am not sure whether I can convert it to an XML file, or whether it is even possible to produce an XMP or FDF file. And assuming that it can be done, how do you write a batch job that reads the file containing the directory structure and writes that information into the PDF files themselves?
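For the first step (directory structure into a CSV file), a minimal Python sketch could look like the following. It assumes that the top-level folder under the archive root is the collection name; the function name and the column names are made up for illustration.

```python
import csv
import os

def write_pdf_inventory(root_dir, csv_path):
    """Walk root_dir and write one CSV row per PDF: collection, file stem, full path."""
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["collection", "stem", "path"])
        for folder, _subdirs, files in os.walk(root_dir):
            for name in sorted(files):
                stem, ext = os.path.splitext(name)
                if ext.lower() != ".pdf":
                    continue  # skip anything that is not a PDF
                # Assumption: the first folder level below root_dir names the collection.
                relative = os.path.relpath(folder, root_dir)
                collection = relative.split(os.sep)[0]
                writer.writerow([collection, stem, os.path.join(folder, name)])
                count += 1
    return count
```

The resulting CSV can be opened in a spreadsheet to check the rows by eye before anything touches the PDF files.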
Collection 1: YYYYMMDD-Pub_Type-Pub_Number
Collection 2: Pub_Number-Pub_Type-Author-Desc-YYYYMMDD
From both examples the data can be manually entered into the metadata fields, but since each file is different, it will take forever and a day to accomplish that.
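As a rough sketch of how a script could pull those fields out of a file name instead, assuming the parts are always separated by hyphens and never contain a hyphen themselves: the field keys (`date`, `pub_type`, and so on) and the function name are illustrative only, and each new collection would get its own entry in the table.

```python
from datetime import datetime

# Field order per collection, taken from the two naming conventions above.
# The key names are illustrative, not a metadata standard.
COLLECTION_FIELDS = {
    "Collection 1": ["date", "pub_type", "pub_number"],
    "Collection 2": ["pub_number", "pub_type", "author", "desc", "date"],
}

def parse_stem(collection, stem):
    """Split a file name (without .pdf) on '-' and label the parts for its collection."""
    fields = COLLECTION_FIELDS[collection]
    # Strip stray spaces around each part before labeling it.
    parts = [part.strip() for part in stem.split("-")]
    if len(parts) != len(fields):
        raise ValueError(f"{stem!r} does not match the {collection} pattern")
    record = dict(zip(fields, parts))
    # Validate the YYYYMMDD date and normalize it to ISO form (YYYY-MM-DD).
    record["date"] = datetime.strptime(record["date"], "%Y%m%d").date().isoformat()
    return record
```

Files that do not fit the pattern raise an error instead of getting wrong metadata, so they can be collected and fixed by hand.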
I am not looking for a cookie-cutter solution; I know that the parameters will change from collection to collection. But when you consider that a single collection can have over 10k PDF files, a script is the only way to go, and it is definitely a lot easier to modify a script to fit each collection.
We also contemplated mass murder/suicide but figured it was better to ask for ideas/help… :-)
Answered on AcrobatUsers.com
I saw the answer you got. Looks a little bit complicated.
I think the answer depends on your scripting knowledge.
The simplest way, it seems to me, is to use VBS (or VBA, or whatever you know) together with the IAC section of the Acrobat SDK.
That way you can use file/folder statements and the special metadata (IAC) commands in one script,
and there is no need for special structured files like XML; a simple CSV or TXT file should be enough.
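To illustrate the "simple CSV is enough" idea, here is a rough sketch in Python rather than VBS. It assumes a CSV with a `path` column plus one column per metadata field (the column names are made up), and it leaves the actual writing to a hypothetical `set_info` callback. With Acrobat IAC that callback would wrap the metadata command; as far as I know the `PDDoc` object has a `SetInfo` method for this, but please check the SDK reference before relying on it.

```python
import csv

def apply_metadata_from_csv(csv_path, set_info):
    """Call set_info(pdf_path, key, value) for every non-empty metadata column."""
    updated = 0
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            pdf_path = row.pop("path")  # assumed column holding the PDF location
            for key, value in row.items():
                if key and value:  # skip empty cells and malformed extra columns
                    set_info(pdf_path, key, value)
            updated += 1
    return updated
```

For a dry run, pass a callback that only prints, for example `apply_metadata_from_csv("meta.csv", lambda p, k, v: print(p, k, v))`, and swap in the real writer once the output looks right.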