2 Replies Latest reply on Jun 19, 2010 3:17 PM by DV22

    Get page numbers from string search

    DV22

      I've found that I can match a single word and get its page number by looping over numPages and getPageNumWords and calling getPageNthWord, but I need to search for an entire phrase such as "api reference".

       

      The purpose is that I need to extract certain pages that have certain phrases. My thought was to search, record the page numbers, then extract and save.

       

      In v7+, every search.query call goes through the Search plug-in (the nifty-looking window pane), and hovering over a match shows the page it is on, so I know the data is there. But after scouring the API, I can't find anything that exposes those results to a script.

       

      Here is my current idea, but as you can surmise, for larger documents it takes an awfully long time:


      var extrList = [];
      searchIt("Indicate by check");
      app.alert(extrList.join(", "));

      function searchIt(srchStr) {
          var ckWord, numWords;
          var tmpArr = srchStr.split(" ");
          var tmpLen = tmpArr.length;
          for (var i = 0; i < this.numPages; i++) {
              numWords = this.getPageNumWords(i);
              // stop once the full phrase would run past the last word on the page
              for (var j = 0; j + tmpLen <= numWords; j++) {
                  ckWord = this.getPageNthWord(i, j);
                  if (ckWord != tmpArr[0]) continue; // cheap first-word check
                  for (var t = 1; t < tmpLen; t++) {
                      ckWord += " " + this.getPageNthWord(i, j + t);
                  }
                  if (ckWord == srchStr) {
                      extrList.push(i); // zero-based page index
                      break; // one entry per page is enough
                  } // end equal if
              } // end numword for
          } // end numpage for
      }

       

       

      I guess my question is: is there a better approach to getting the page numbers from a phrase search?

       

      Thank you.

        • 1. Re: Get page numbers from string search
          try67 MVP & Adobe Community Professional

          Can't answer your question directly, but I can tell you that it's possible to use getPageNthWord to match a phrase. You basically need to do what you did so far (a double loop to find the first word of the phrase), and then continue matching the following words. If you match all of them, you have a positive match. It's complicated, but doable.

          For example, I did just that in this script, which highlights all instances of a word or a phrase in a file:

          http://try67.blogspot.com/2008/11/acrobat-highlight-all-instances-of.html

           

          Edit: Didn't see the code when I posted this... It seems you got the hang of it.

          I don't know of any other way to achieve that with a script.
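
The approach described above (find the first word, then keep matching the words that follow) can be sketched so that each word on a page is read only once and then compared in memory, which should help on larger documents. This is only a rough sketch: findPhrasePages and containsPhrase are made-up helper names, and doc is assumed to be anything that exposes numPages, getPageNumWords and getPageNthWord the way the Acrobat Doc object does.

```javascript
// Returns true if phraseWords occurs as a consecutive run inside words.
function containsPhrase(words, phraseWords) {
    var last = words.length - phraseWords.length;
    for (var j = 0; j <= last; j++) {
        var t = 0;
        while (t < phraseWords.length && words[j + t] === phraseWords[t]) {
            t++;
        }
        if (t === phraseWords.length) {
            return true;
        }
    }
    return false;
}

// Hypothetical helper: doc is assumed to behave like an Acrobat Doc object.
function findPhrasePages(doc, phrase) {
    var phraseWords = phrase.split(" ");
    var pages = [];
    for (var i = 0; i < doc.numPages; i++) {
        var n = doc.getPageNumWords(i);
        var words = [];
        for (var j = 0; j < n; j++) {
            words.push(doc.getPageNthWord(i, j)); // one call per word
        }
        if (containsPhrase(words, phraseWords)) {
            pages.push(i); // zero-based page index, recorded once per page
        }
    }
    return pages;
}
```

From the console this would presumably be called as findPhrasePages(this, "Indicate by check"). The comparison is case-sensitive and expects the phrase words to be separated by single spaces, so it may need adjusting for real documents.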

          • 2. Re: Get page numbers from string search
            DV22 Level 1

            Thanks for confirming what I thought. I had the idea minutes after posting and wrote that snippet real quick. So far that's the only way I see possible without doing something crazier (like extracting a single page, searching it, deleting it, then doing the next page).

             

            Thanks again!
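
For the "search, record the page numbers, then extract and save" step from the original post, one possible follow-up is to collapse the recorded page numbers into contiguous runs and extract each run in one call. The sketch below is an assumption-laden illustration: toPageRanges and extractRanges are hypothetical helper names, basePath is a placeholder path prefix chosen by the caller, and the output file naming is invented. Doc.extractPages is the real Acrobat method, but as far as I know saving through cPath only works in a privileged context such as the console or a batch sequence.

```javascript
// Collapse zero-based page indices into [start, end] runs of adjacent pages.
function toPageRanges(pages) {
    var sorted = pages.slice().sort(function (a, b) { return a - b; });
    var ranges = [];
    for (var k = 0; k < sorted.length; k++) {
        var p = sorted[k];
        var lastRange = ranges[ranges.length - 1];
        if (lastRange && p <= lastRange[1] + 1) {
            // adjacent or duplicate page: extend the current run
            lastRange[1] = Math.max(lastRange[1], p);
        } else {
            ranges.push([p, p]);
        }
    }
    return ranges;
}

// Hypothetical helper: doc is assumed to be an Acrobat Doc object and
// basePath a device-independent path prefix (both supplied by the caller).
function extractRanges(doc, ranges, basePath) {
    for (var k = 0; k < ranges.length; k++) {
        var first = ranges[k][0];
        var last = ranges[k][1];
        doc.extractPages({
            nStart: first,
            nEnd: last,
            // invented naming scheme using one-based page numbers
            cPath: basePath + "_" + (first + 1) + "-" + (last + 1) + ".pdf"
        });
    }
}
```

With the extrList array from the original script, this might be used as extractRanges(this, toPageRanges(extrList), "/c/temp/extract"), where the path is only an example.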