4 Replies Latest reply on Aug 22, 2012 1:31 PM by SuperZu

    Batch sort by keyword

    SuperZu

      Hi everyone. I get a ton of searchable PDFs over the course of the day, and each one goes into one of two batches for entry into a document imaging system. I sort them manually based on a keyword, but I suspect there's an easier way to do it as a batch process. I didn't see anything in the pre-made batch options, so it looks like JavaScript is the way to go.

       

      Here's an example. I'll get 100 PDFs dropped into a folder on my system. 35 will have "Keyword X" in the text of the document. The other 65 will have "Keyword Y". Right now I open each one, look to see what the keyword is, and then drop it into one of two folders. I just need a bit of code that searches the body of the document. If it finds "Keyword X", then it saves it to Folder1. If it finds "Keyword Y", then it saves it to Folder2. I didn't find anything on the forums about this issue, but it seems straightforward and logical enough. Could anyone help me out with the code?

       

      If it makes a difference, we have Acrobat Pro 8. Thanks!

        • 1. Re: Batch sort by keyword
          gkaiseril MVP & Adobe Community Professional

          I would use the Action or batch processing in the Professional version: open each PDF, check its keywords (info.Keywords), and save the file to the appropriate folder. You may need to allow for a PDF that contains both keywords.
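
A minimal sketch of the metadata check described above, assuming the keywords really are stored in the document's Keywords field. `folderForKeywords` is a hypothetical helper, the keyword strings and folder paths are placeholders, and `doc` stands for the Acrobat Doc object (`this` inside a batch sequence).

```javascript
// Sketch only: pick a destination folder from a PDF's Keywords metadata.
// "Keyword X", "Keyword Y" and both folder paths are placeholders.
function folderForKeywords(doc) {
    // info.Keywords may be missing, so fall back to an empty string.
    var keywords = String(doc.info.Keywords || "");
    var hasX = keywords.indexOf("Keyword X") !== -1;
    var hasY = keywords.indexOf("Keyword Y") !== -1;

    if (hasX && hasY) return null;    // both present: leave for manual review
    if (hasX) return "/C/Folder1/";
    if (hasY) return "/C/Folder2/";
    return null;                      // neither present
}
```

Returning `null` for the ambiguous cases keeps a file that carries both keywords, or neither, out of either batch until someone looks at it.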

          • 2. Re: Batch sort by keyword
            SuperZu Level 1

            Sorry for being confusing. It's not a keyword in the sense of metadata; it's a phrase within the body of the text that takes one of two values. I can run a search through JavaScript to pull out just the phrase I'm looking for, but I don't know where to go from there.

             

            For example, in a folder with 29 files, this code pulls out the 21 that have "Subject :I/O" in the body of the document.

            search.wordMatching = "MatchAllWords";
            search.query("Subject :I/O", "Folder", "/C/Documents and Settings/myfolder");

            • 3. Re: Batch sort by keyword
              gkaiseril MVP & Adobe Community Professional

              The search object returns the result to a window and not to a string or array variable within your JavaScript.

               

              Instead, you need to loop through each page, and through each word on that page, checking for your words with getPageNthWord.
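
Because `getPageNthWord` returns one word at a time, a multi-word phrase such as "Subject :I/O" can never equal a single returned word. One possible workaround is to rebuild each page's text and search that instead. This sketch assumes that passing `false` for `bStrip` keeps each word's trailing punctuation and whitespace, which is worth confirming in the console on a real file. `squash` and `pagesContaining` are hypothetical helper names, and `doc` stands for the Acrobat Doc object (`this` in a batch sequence).

```javascript
// Remove all whitespace and lower-case, so spacing differences between
// the extracted text and the search phrase do not prevent a match.
function squash(s) {
    return String(s).replace(/\s+/g, "").toLowerCase();
}

// Return the zero-based numbers of the pages whose text contains `phrase`.
function pagesContaining(doc, phrase) {
    var needle = squash(phrase);
    var hits = [];
    for (var p = 0; p < doc.numPages; p++) {
        var parts = [];
        var numWords = doc.getPageNumWords(p);
        for (var i = 0; i < numWords; i++) {
            // bStrip = false: keep punctuation attached to the word (assumed).
            parts.push(doc.getPageNthWord(p, i, false));
        }
        if (squash(parts.join("")).indexOf(needle) !== -1) {
            hits.push(p);
        }
    }
    return hits;
}
```

Dropping whitespace is a deliberately loose comparison: it tolerates odd spacing in the extracted text, but it can also match across word boundaries, so a distinctive phrase works best.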

              • 4. Re: Batch sort by keyword
                SuperZu Level 1

                Okay, so if I set it up like:

                 

                 

                for (var p = 0; p < this.numPages; p++) {
                    var numWords = this.getPageNumWords(p);
                    for (var i = 0; i < numWords; i++) {
                        var ckWord = this.getPageNthWord(p, i, true);
                        if (ckWord == "Subject :I/O") {
                            // action for a matching file goes here
                        }
                    }
                }

                 

                Then would it make more sense to use "this.extractPages" or "console.println(p);"? Or maybe a combination of them?
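
For sorting whole files, neither call moves the file: `extractPages` creates a new document from a page range, and `console.println` only writes to the console. A rough sketch of one alternative is to decide on a folder and then call the Doc object's `saveAs`. This assumes the script runs in a privileged context such as a batch sequence, where `saveAs` is permitted. `ROUTES`, `squashedText`, `destinationFor`, and `sortDocument` are hypothetical names, and the phrases and folder paths are placeholders.

```javascript
// Hypothetical routing table: phrase to look for -> destination folder.
var ROUTES = [
    { phrase: "Keyword X", folder: "/C/Folder1/" },
    { phrase: "Keyword Y", folder: "/C/Folder2/" }
];

// Concatenate every word in the document, whitespace removed, lower-cased.
function squashedText(doc) {
    var parts = [];
    for (var p = 0; p < doc.numPages; p++) {
        var numWords = doc.getPageNumWords(p);
        for (var i = 0; i < numWords; i++) {
            parts.push(doc.getPageNthWord(p, i, false));
        }
    }
    return parts.join("").replace(/\s+/g, "").toLowerCase();
}

// Return the full target path, or null when zero or several phrases match.
function destinationFor(doc, routes) {
    var text = squashedText(doc);
    var matches = [];
    for (var r = 0; r < routes.length; r++) {
        var needle = routes[r].phrase.replace(/\s+/g, "").toLowerCase();
        if (text.indexOf(needle) !== -1) matches.push(routes[r].folder);
    }
    return matches.length === 1 ? matches[0] + doc.documentFileName : null;
}

// Save the document into its folder; ambiguous files are left untouched.
function sortDocument(doc, routes) {
    var target = destinationFor(doc, routes);
    if (target !== null) doc.saveAs(target);
    return target;
}
```

In a batch sequence the call would be `sortDocument(this, ROUTES);`. Files that match both phrases, or neither, are skipped so they can still be sorted by hand.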