2 Replies Latest reply on Nov 3, 2010 9:55 AM by try67

    How to check with extractPages if a file already exists?

    ingrimm

      Hi Folks,

       

      My company is running a fairly large PDF project right now. We send printed PDFs to our customers so they can check certain values for their household. They fill out the form by hand and send it back to us. This is where it gets interesting.

       

      Once all replies are in, we use a combination of OCR and JavaScript to read a document ID (or client ID) located in the header of the document. We then use this ID as the filename for a copy of the scanned file. It can happen that the person scanning the replies slips up and scans more than one document per ID (sometimes our clients send back not just the form but the complete letter as well). As a result, the script overwrites the files that were read first with the ones read last.

       

      This is the script:

       

      //read RWGE Type 08xxxx-xxx-xxxxx-xxx 

      var idNumber = /08\d{4}-\d{3}-\d{5}-\d{3}/;
      function ExtractFromDocument(reMatch)
      {
          try {
              // Only the first page is read; the id sits in the header,
              // so at most the first 30 words are needed.
              var numWords = Math.min(this.getPageNumWords(0), 30);
              var pageText = "";
              for (var j = 0; j < numWords; j++) {
                  // bStrip = false keeps punctuation such as the hyphens in the id
                  pageText += this.getPageNthWord(0, j, false);
              }
              var matches = pageText.match(reMatch);
              // Return the first match as a string, or null if nothing was found
              return matches ? matches[0] : null;
          } catch(e)
          {
              app.alert("Processing error: " + e);
              return null;
          }
      }


      var filename = ExtractFromDocument(idNumber);

      //if OCR fails to read the id, assign a random number as filename
      //(== null also covers undefined)
      if (filename == null) {
          filename = Math.round(Math.random()*999999999999);
          this.extractPages({nEnd:(this.numPages-1), cPath : "../GAG-out/"+filename+".pdf"});
      } else {
          this.extractPages({nEnd:(this.numPages-1), cPath : "../GAG-out/"+filename+"-001.pdf"});
      }

      The complete process looks like this:
      Open a folder and the current file, run OCR on the file and check whether an ID can be found
      using the regexp above. If so, use extractPages to save the complete file as a new document
      under the new filename, then close it.
      Repeat until no files are left.

      It is all pretty basic. I need to find a way to get the following running:
      Check whether a file named id+".pdf" already exists. If it does, open that
      file and use extractPages to add the current pages to the end of that document.
      If it does not, use extractPages to create a new document.
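
To make the intended branching concrete, here is a rough sketch, not tested inside Acrobat. Everything named here is an assumption for illustration: `appendOrCreate` is a hypothetical helper, `appObj` stands in for Acrobat's `app` object, and `sourceDoc` for the currently open scan (`this` in the script above). It also assumes that `app.openDoc` throws or returns nothing when the file is missing, and that `insertPages` rather than `extractPages` is the call that appends pages to an existing document.

```javascript
// Sketch only: appObj stands in for Acrobat's app object, sourceDoc for the
// currently open scan, outPath for the target file path (all hypothetical names).
function appendOrCreate(appObj, sourceDoc, outPath) {
    var target = null;
    try {
        // Assumption: opening a missing file throws or yields no document
        target = appObj.openDoc({ cPath: outPath, oDoc: sourceDoc, bHidden: true });
    } catch (e) {
        target = null;
    }
    if (target) {
        // File exists: append all pages of the scan after its last page
        target.insertPages({
            nPage: target.numPages - 1,
            cPath: sourceDoc.path,
            nStart: 0,
            nEnd: sourceDoc.numPages - 1
        });
        target.saveAs({ cPath: target.path });
        target.closeDoc(true);
        return "appended";
    }
    // File does not exist yet: create it from the scan
    sourceDoc.extractPages({ nStart: 0, nEnd: sourceDoc.numPages - 1, cPath: outPath });
    return "created";
}
```

In the script this might be called as `appendOrCreate(app, this, "../GAG-out/" + filename + ".pdf")`; whether saving is permitted there depends on the security context the batch runs in.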

      This script runs every two weeks on about 2000 files.
      Can I save the IDs in an array and use it throughout the whole loop?
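
What I have in mind for keeping track of the IDs is roughly the following sketch. `nextOutputName` and `seen` are hypothetical names. The sketch uses a plain object as a counter instead of an array, and it assumes the counter can be kept alive between files (for example on Acrobat's `global` object, which I have not verified for batch runs).

```javascript
// Hypothetical helper: counts how often each id has been seen in this run
// and returns a file name with a running suffix of at least three digits.
function nextOutputName(seen, id) {
    var key = "id:" + id; // prefix avoids clashes with built-in object properties
    seen[key] = (seen[key] || 0) + 1;
    var suffix = String(seen[key]);
    while (suffix.length < 3) {
        suffix = "0" + suffix; // left-pad with zeros: 1 becomes 001
    }
    return id + "-" + suffix + ".pdf";
}

// Usage: the same id scanned twice yields two different file names
var seen = {};
var first = nextOutputName(seen, "081234-123-12345-123");
var second = nextOutputName(seen, "081234-123-12345-123");
var other = nextOutputName(seen, "085555-123-12345-123");
```

With this, a second scan for the same ID would be written to a new file instead of overwriting the first, although the pages would still end up in separate files rather than being merged.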

      Greetings from Germany :-)

      Michael