19 Replies Latest reply on Jul 17, 2009 11:59 PM by colourit2000

    How can I get a total page count of a PDF before placing every page in the PDF?

    Level 1
      Before the [long] spiel, I'm using javascript in InDesign CS3.

      I'd like to create a script that places a multiPage PDF in any number of different impositions. (saddle stitch, 4up signatures, 16up signatures etc.)

      I've easily created a script that iterates through a PDF and places it into a new document as (4,1|2,3), (8,5|6,7) etc, which works for printing in duplex, folding each page in half, and gluing the resulting spines to make a simple thick book (for PDFs with more than, say, 64 pages).

      However, the next step is to re-write the script to create a saddle stitch document (16,1|2,15), (14,3|4,13) ... (10,7|8,9). For this I need to know how many pages there are in the PDF before I start placing the PDF pages, and then making the new document [int((PDFpages+3)/4)] pages long.
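The saddle-stitch order described above can be sketched as a small helper. This is only an illustration: the function name `saddleStitchSpreads` and the use of `null` for padding blanks are assumptions, not part of the original script. It returns one `[left, right]` pair per sheet side, front then back.

```javascript
// Hypothetical helper: saddle-stitch page pairs for a PDF with pdfPages pages.
// Pages past the end of the PDF are returned as null (blank padding).
function saddleStitchSpreads(pdfPages) {
  var sheets = Math.floor((pdfPages + 3) / 4); // same as int((PDFpages+3)/4)
  var total = sheets * 4;                      // page count padded to a multiple of 4
  var spreads = [];

  function pageOrBlank(n) {
    return n <= pdfPages ? n : null;
  }

  for (var s = 0; s < sheets; s++) {
    var lo = 2 * s + 1;    // 1, 3, 5, ...
    var hi = total - 2 * s; // 16, 14, 12, ... for a 16-page PDF
    spreads.push([pageOrBlank(hi), pageOrBlank(lo)]);         // front, e.g. (16,1)
    spreads.push([pageOrBlank(lo + 1), pageOrBlank(hi - 1)]); // back,  e.g. (2,15)
  }
  return spreads;
}
```

For a 16-page PDF the first two pairs are (16,1) and (2,15) and the last two are (10,7) and (8,9), matching the sequence in the post.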

      Is there a simple way to get the count of PDFpages without going through a loop and placing the next page from the PDF until the placed page number equals the first page number?

      This way seems wasteful:

      var totPDFPages = 1;
      var doneCounting = false;
      app.pdfPlacePreferences.pageNumber = totPDFPages;
      var myPDFPage = sheetFront.place(File(srcPDF), [0,0])[0];
      var pdfFirstPage = myPDFPage.pdfAttributes.pageNumber;
      myPDFPage.remove();
      while (!doneCounting) {
        totPDFPages += 1;
        app.pdfPlacePreferences.pageNumber = totPDFPages;
        myPDFPage = sheetFront.place(File(srcPDF), [0,0])[0];
        // We have run past the last page once the placed page number equals the first one again
        if (myPDFPage.pdfAttributes.pageNumber == pdfFirstPage) {
          totPDFPages -= 1;
          doneCounting = true;
        }
        myPDFPage.remove();
      }
      alert("PDF has " + totPDFPages + " pages!");

      NB. Javascript above *hasn't* been run, but should look similar once debugged.

      The only thing I've thought of to relieve the sheer duplication of placing the PDF twice (once for the count, and once for the imposition) is to create an array of impoPages[counter]=myPDFPage, and then shuffle the pages referenced by the array to the correct sheet and position.

      It'd be much easier to be able to assign pageCount = File(srcPDF).pageCount !!!

      Thanks for any help/tips or even a simple "What are you smoking, man!?"

      Cheers,
      Jezz
        • 1. Re: How can I get a total page count of a PDF before placing every page in the PDF?
          Level 1
          hi

          easiest way - ask user to type number of pages in PDF to impose ;) I do this in my tool ;)

          but if you really need to do this automagically - try to place pages, for example, for step = 100 ?
          place 100 - if placed (no error) place 200, ...
          if after placing 300 you have error - place 250, 275, 262, ...

          divide range when error - 100 / 50 / 25 / 12+13 / 6 (6+7) / 3 (3+4) / 1 (2+1)

          or try to open PDF in Acrobat
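The step-then-halve search described above can be sketched as follows. `findPageCount` and the `canPlace` callback are hypothetical names: `canPlace(n)` stands for whatever test tells you that page `n` exists (for example, placing page `n` and checking that no error occurred, or that the placed page number equals `n`).

```javascript
// Hypothetical sketch: find the last existing page with few probes.
// canPlace(n) must return true when page n of the PDF exists.
function findPageCount(canPlace, step) {
  step = step || 100;
  if (!canPlace(1)) return 0;

  var lo = 1;     // highest page known to exist
  var hi = step;  // next page to probe
  while (canPlace(hi)) { // march up in steps: 100, 200, 300, ...
    lo = hi;
    hi += step;
  }

  // Page lo exists, page hi does not: halve the gap until they are adjacent.
  while (hi - lo > 1) {
    var mid = Math.floor((lo + hi) / 2);
    if (canPlace(mid)) {
      lo = mid;
    } else {
      hi = mid;
    }
  }
  return lo;
}
```

With a step of 100, a 263-page PDF needs about ten probes instead of 264 placements.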

          robin

          --
          www.adobescripts.com
          • 2. Re: How can I get a total page count of a PDF before placing every page in the PDF?
            Level 1
            There's another thing that is bugging me...

            With this code: pdfFirstPage = myPDFPage.pdfAttributes.pageNumber, what happens if (bad practice, but it *could* happen) there are multiple pages with the same folio?

            I suppose I should iterate through a PDF with alert("Page Number "+ myPDFPage.pdfAttributes.pageNumber); but does myPDFPage.pdfAttributes.pageNumber return a 'real' page Nº like 'Sec 1: iv' or just a number equivalent to the position in the PDF?

            Getting off the subject at hand...
            Does anyone regularly use the Section Prefix?? I've never used it, but wished and hoped for at least *two* 'Section Markers'. Master Pages would benefit from this scenario: "Chapter <marker 1>  <marker 2>  <autoPageNumber>", given that SectionMarkers are treated as one character (making nested styles useless), and therefore causing grief if that 'one' character decides to be much longer than the text box that holds it... -20,000 kerning, anyone?? /sigh/

            And a further wish to add to the list: In the Page palette > "View spread at 90°" to allow for a fold across the width of the spread (Calendars, Timetables etc.). InDesign CS4, please!
            • 3. Re: How can I get a total page count of a PDF before placing every page in the PDF?
              Peter Kahrel Adobe Community Professional & MVP
              To find the number of pages in a PDF file I use this function:

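              var pp = get_number_of_pages (File('/d/romani/18-1/rs-18-1.pdf'));

              // Scan the PDF line by line for the "/N <count>/T" entry and return the count
              function get_number_of_pages (f)
                 {
                 if (f.exists)
                    {
                    f.open ('r');
                    var next_line = f.readln ();
                    // Stop at the end of the file so a PDF without '/N ' cannot loop forever
                    while ( next_line.indexOf ('/N ') < 0 && !f.eof )
                       next_line = f.readln ();
                    var m = next_line.match (/\/N (\d+)\/T/);
                    f.close ();
                    return m ? Number(m[1]) : -1;   // -1 means no page count was found
                    }
                 else
                    {
                    alert (f.name + ' does not exist.');
                    exit();
                    }
                 }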


              Peter
              • 6. Re: How can I get a total page count of a PDF before placing every page in the PDF?
                Harbs. Level 6
                Gerald_Singelmann@adobeforums.com wrote:
                > Wicked idea, Peter. I like it :)
                >
                I second that. Very nice!

                Harbs
                • 8. Re: How can I get a total page count of a PDF before placing every page in the PDF?
                  Eric @ MCA Level 3
                  FWIW, if you're on a Mac, Spotlight can tell you this info too. I work in AppleScript, but if there's any way to trigger a shell command from JS the theory is the same:

                  --Get result from Spotlight
                  set myResult to do shell script "mdls -name kMDItemNumberOfPages '" & POSIX path of myFile & "'"

                  --Split result on space equals space and grab the second word,
                  -- which should be the page count
                  set oldDelims to AppleScript's text item delimiters
                  set AppleScript's text item delimiters to " = "
                  set myPagesCount to second text item of myResult
                  set AppleScript's text item delimiters to oldDelims

                  As near as I can tell, Spotlight is responsible for all that info in the "More Info" section of the Get Info panel in the Finder.
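On the JavaScript side, the parsing half of the AppleScript above might look like the sketch below. `parseMdlsPageCount` is a hypothetical name, and the sketch assumes you already have the text printed by `mdls -name kMDItemNumberOfPages` (one possible route in InDesign is running an AppleScript `do shell script` through `app.doScript`, which is not shown here).

```javascript
// Hypothetical helper: pull the page count out of mdls output such as
//   kMDItemNumberOfPages = 12
// Returns -1 when Spotlight has no value, e.g. "kMDItemNumberOfPages = (null)".
function parseMdlsPageCount(output) {
  var m = /kMDItemNumberOfPages\s*=\s*(\d+)/.exec(output);
  return m ? Number(m[1]) : -1;
}
```

Returning -1 rather than throwing lets the caller fall back to another method when the file has not been indexed.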
                  • 9. Re: How can I get a total page count of a PDF before placing every page in the PDF?
                    Level 1
                    Superb!

                    I'm very comfortable with GREP in BBEdit, but really need to expand that into javascript.

                    That snippet of javascript to count the number of pages in a PDF is the sort of lateral thinking I love!

                    Once again, Peter, you're a marvel.

                    --Jezz
                    • 10. Re: How can I get a total page count of a PDF before placing every page in the PDF?
                      Level 1
                      Hi Peter,

                      I found that your method doesn't always work as it is...

                      The header from "InDesign CS2 Scripting Reference.pdf" looks like this:
                      %PDF-1.5
                      %
                      1 0 obj<</Contents 2 0 R/Type/Page/Parent 285639 0 R/Rotate 0/MediaBox[0.0 0.0 612.0 792.0]/CropBox[0.0 0.0 612.0 792.0]/BleedBox[0.0 0.0 612.0 792.0]/TrimBox[0.0 0.0 612.0 792.0]/ArtBox[0.0 0.0 612.0 792.0]/Resources<</Font<</C0_0 5899 0 R/C0_1 286131 0 R>>/ProcSet[/PDF/Text]/ExtGState<</GS0 286123 0 R>>>>/StructParents 4>>
                      endobj
                      2 0 obj<</Length 3252/Filter/FlateDecode>>stream

                      However, about three quarters through the PDF there appears a line:

                      285635 0 obj<</Count 1928/Kids[285636 0 R 285792 0 R 285948 0 R]/Type/Pages>>

                      It's not the only /Count object, but it *is* the only occurrence of a line containing "/Pages>>", so it's not all doom and gloom. :)

                      
                      
                      totPDFPages = getPDFPageCount(File.openDialog("Choose a PDF File"));

                      function getPDFPageCount(f) {
                        f.open ('r');
                        var gotCount = false;
                      while (! gotCount) {
                        try {next_line = f.readln();}
                         catch (myError) {
                          alert("We've got an error '"+myError+"\pAborting the script");
                          exit();
                          }
                        if (next_line.indexOf ('/N ') > 0) { // We've got the easy sort of PDF
                         var p = next_line.match (/\/N (\d+)\/T/)[1];
                         alert("Found a '/N' style PDF, with "+p+" pages");
                         gotCount = true;
                         }
                        else if  (next_line.indexOf ('/Pages>>') > 0 ) {  // We probably had to read nearly to the end of the file for the match...
                         var p = next_line.match (/\/Count (\d+)\/K/)[1];
                         alert("Found a '/Count ... /Pages>>' style PDF, with "+p+" pages");
                         gotCount = true;
                         }
                        }
                      f.close ();
                      return Number(p);
                        }


                      --Jezz
                      • 11. Re: How can I get a total page count of a PDF before placing every page in the PDF?
                        Level 1
                        aagh!

                        My try{} catch{} doesn't abort the script if it doesn't find either of the matches, and appears to just keep on trying to read. (endless loop :( )

                        I need to fix for reading past the end of the file.

                        --Jezz
                        • 12. Re: How can I get a total page count of a PDF before placing every page in the PDF?
                          Level 1
                          this catches reading past the end of the file:

                          function getPDFPageCount(f) {
                            f.open('r');
                            var gotCount = false;
                            var next_line, m, p;
                            while (!gotCount) {
                              next_line = f.readln();
                              if ((m = next_line.match(/\/N (\d+)\/T/))) { // We've got the easy sort of PDF
                                p = m[1];
                                alert("Found a '/N' style PDF, with " + p + " pages");
                                gotCount = true;
                              }
                              else if (next_line.indexOf('/Pages>>') >= 0 && (m = next_line.match(/\/Count (\d+)\/K/))) { // We probably had to read nearly to the end of the file for the match...
                                p = m[1];
                                alert("Found a '/Count ... /Pages>>' style PDF, with " + p + " pages");
                                gotCount = true;
                              }
                              else if (f.eof) { // Checked last, so the final line is still examined
                                alert("Aborting the script\nWe've got to the end of the file without finding a page count");
                                f.close();
                                exit();
                              }
                            }
                            f.close();
                            return Number(p);
                          }
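The two line patterns checked in the function above can also be pulled out into a pure helper, which makes them easy to try on sample lines without opening a file. `pageCountFromLine` is a hypothetical name, and this is a sketch of the same two checks rather than a complete PDF parser.

```javascript
// Hypothetical helper: return the page count found on one line of a PDF,
// or -1 if the line matches neither pattern discussed in this thread.
function pageCountFromLine(line) {
  // "/N <count>/T" style
  var m = line.match(/\/N (\d+)\/T/);
  if (m) return Number(m[1]);

  // "<</Count <count>/Kids[...]/Type/Pages>>" style
  if (line.indexOf('/Pages>>') >= 0) {
    m = line.match(/\/Count (\d+)\/K/);
    if (m) return Number(m[1]);
  }
  return -1;
}

// The line quoted earlier from "InDesign CS2 Scripting Reference.pdf":
var sample = '285635 0 obj<</Count 1928/Kids[285636 0 R 285792 0 R 285948 0 R]/Type/Pages>>';
var sampleCount = pageCountFromLine(sample);
```

The file-reading loop then only has to call the helper on each line and stop at the first result that is not -1.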

                          • 13. Re: How can I get a total page count of a PDF before placing every page in the PDF?
                            Peter Kahrel Adobe Community Professional & MVP
                            Well spotted! I had expected to come across PDF files that had their page count encoded in a different way, but I never have so far. I'll keep your solution in mind.

                            Thanks,

                            Peter
                            • 14. Re: How can I get a total page count of a PDF before placing every page in the PDF?
                              Level 1
                              This will get the page count of almost all PDFs.

                              jxswm
                              ///////////////////////////


                              function getPDFPageCount(f){
                                if(f.alias){f = f.resolve();}
                                if(f == null){return -1;}
                                if(f.hidden){f.hidden = false;}
                                if(f.readonly){f.readonly = false;}
                                f = new File(f.fsName);
                                f.encoding = "Binary";
                                if(!f.open("r","TEXT","R*ch")){return -1;}
                                f.seek(0, 0); var str = f.read(); f.close();
                                if(!str){return -1;}
                                //f = new File(Folder.temp+"/123.TXT");
                                //writeFile(f, str.toSource()); f.execute();
                                var ix, _ix, lim, ps;

                                ix = str.indexOf("/N ");
                                if(ix == -1){
                                  var src = str.toSource();
                                  _ix = src.indexOf("<< /Type /Pages /Kids [");
                                  if(_ix == -1){
                                    ps = src.match(/<<\/Count (\d+)\/Type\/Pages\/Kids\[/);
                                    if(ps == null){
                                      ps = src.match(/obj <<\\n\/Type \/Pages\\n\/Count (\d+)\\n\/Kids \[/);
                                      if(ps == null){
                                        ps = src.match(/obj\\n<<\\n\/Type \/Pages\\n\/Kids \[.+\]\\n\/Count (\d+)\\n\//);
                                        if(ps == null){return -1;}
                                        lim = parseInt(ps[1]);
                                        if(isNaN(lim)){return -1;}
                                        return lim;
                                      }
                                      lim = parseInt(ps[1]);
                                      if(isNaN(lim)){return -1;}
                                      return lim;
                                    }
                                    lim = parseInt(ps[1]);
                                    if(isNaN(lim)){return -1;}
                                    return lim;
                                  }
                                  ix = src.indexOf("] /Count ", _ix);
                                  if(ix == -1){return -1;}
                                  _ix = src.indexOf(">>", ix);
                                  if(_ix == -1){return -1;}
                                  lim = parseInt(src.substring(ix+9, _ix));
                                  if(isNaN(lim)){return -1;}
                                  return lim;
                                }
                                _ix = str.indexOf("/T", ix);
                                if(_ix == -1){
                                  ps = str.match(/<<\/Count (\d+)\/Type\/Pages\/Kids\[/);
                                  if(ps == null){return -1;}
                                  lim = parseInt(ps[1]);
                                  if(isNaN(lim)){return -1;}
                                  return lim;
                                }
                                lim = parseInt(str.substring(ix+3, _ix));
                                if(isNaN(lim)){return -1;}
                                return lim;
                              }
                              • 15. Re: How can I get a total page count of a PDF before placing every page in the PDF?
                                Level 1
                                Hi all!

                                I am writing a script in VB that takes a PDF file and places the
                                pages in the right position and so on.

                                Thank you all for your scripts and ideas.

                                I have a question.

                                What if the PDF contains pages that do not have the same width or height? It has happened to me with some PDF files.

                                So even though it is slow, I'm placing all the PDF pages (and counting them) and at the same time checking that the width and the height are
                                the same in every one of them.

                                I'm scripting for InDesign CS2

                                Nice Coding!
                                • 16. Re: How can I get a total page count of a PDF before placing every page in the PDF?
                                  Level 1
                                  Hi again!

                                  Well, that seems too slow; if you have too many pages it doesn't pay.

                                  While trying to find something for VB, instead of getting and paying for a PDF page counter DLL, I searched the internet and came up with this code.

                                  '===========================================
                                  OpenFileDialog1.ShowDialog()

                                  Dim FileName As String
                                  FileName = OpenFileDialog1.FileName
                                  Dim result As Integer

                                  ' Read the whole PDF as text, then close the reader
                                  Dim fileReader As System.IO.StreamReader
                                  fileReader = My.Computer.FileSystem.OpenTextFileReader(FileName)
                                  Dim pdfText As String
                                  pdfText = fileReader.ReadToEnd
                                  fileReader.Close()

                                  ' Count "/Type /Page" objects; [^s] skips the "/Type /Pages" tree nodes
                                  Dim regx As New System.Text.RegularExpressions.Regex("/Type\s*/Page[^s]")
                                  Dim matches As System.Text.RegularExpressions.MatchCollection
                                  matches = regx.Matches(pdfText)
                                  result = matches.Count

                                  MsgBox(result.ToString)
                                  '============================================

                                  It might be useful for someone who uses VB.

                                  Nice coding!
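For readers following the thread in JavaScript, the same counting idea as the VB snippet above could be sketched like this. `countPageObjects` is a hypothetical name, and the sketch assumes the page objects are stored uncompressed in the file; PDFs that pack their objects into compressed object streams will not show them as plain text, so the count can come out too low.

```javascript
// Hypothetical helper: count "/Type /Page" objects in the raw text of a PDF.
// The trailing [^s] skips "/Type /Pages" nodes, as in the VB regex.
function countPageObjects(pdfText) {
  var matches = pdfText.match(/\/Type\s*\/Page[^s]/g);
  return matches ? matches.length : 0;
}
```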
                                  • 17. Re: How can I get a total page count of a PDF before placing every page in the PDF?
                                    Level 1
                                    Hi Ervin,

                                    Take a look at the PlaceMultipagePDF.vbs script that comes with InDesign CS3--it solves this problem without attempting to read the PDF, and may already do most of what you want.

                                    The trouble with opening the PDF and using RegExp to try to get the page count is that you are not guaranteed a.) that you can open the PDF, and b.) that there will be only one instance of the page count in the PDF. The internal structure of PDFs can be quite complicated--especially if multiple PDFs have been merged.
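Point b.) can be illustrated with a sketch that collects every `/Type /Pages` dictionary it can see and keeps the largest `/Count`. This is a heuristic under stated assumptions, not a reliable method: `largestPagesCount` is a hypothetical name, it only sees dictionaries stored as plain text, and a merged or incrementally saved file may still contain stale page trees that make the answer wrong.

```javascript
// Hypothetical heuristic: among all plain-text "<< ... >>" dictionaries that
// are /Type /Pages nodes, return the largest /Count (or -1 if none is found).
// In a single page tree the root node carries the largest count.
function largestPagesCount(pdfText) {
  var dicts = pdfText.match(/<<[^<>]*>>/g) || [];
  var best = -1;
  for (var i = 0; i < dicts.length; i++) {
    // Skip anything that is not a Pages node (e.g. outlines also have /Count)
    if (!/\/Type\s*\/Pages(?![A-Za-z])/.test(dicts[i])) continue;
    var m = dicts[i].match(/\/Count\s+(\d+)/);
    if (m && Number(m[1]) > best) best = Number(m[1]);
  }
  return best;
}
```

Restricting the search to `/Type /Pages` dictionaries avoids the other `/Count` entries mentioned earlier in the thread, but it does not remove the underlying ambiguity Ole describes.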

                                    Thanks,

                                    Ole
                                    • 18. Re: How can I get a total page count of a PDF before placing every page in the PDF?
                                      Level 1
                                      Hi Olav,

                                      You are right about a) and b).

                                      What we all want, as far as I've understood, is to get the
                                      total number of pages in the PDF WITHOUT having to place each of them
                                      in order to count.

                                      Nice coding!
                                      • 19. Re: How can I get a total page count of a PDF before placing every page in the PDF?
                                        colourit2000

                                        Thank you, Eric for the solution in AS.

                                         

                                        Although the post is already more than a year old, I just happened to go looking for it recently.

                                         

                                        Tks!