8 Replies Latest reply on Sep 18, 2009 4:33 PM by [Jongware]

    Scripting Annotations

    Kathlene Ruhan Level 3

      My edit work comes from PDF annotations. I work off a server for my ID file but have the annotated PDF local. Switching back and forth is cumbersome - hoping to find a script available to transfer the annotations from the PDF to the ID file, in position, on its own layer. Then, when edit rounds #2, #3, #4, etc. (I often get 10 edits due to legal) come in, they each get a new layer with some sort of naming convention to track the edits. Does anyone know of one in production that would come close to those requirements?

        • 1. Re: Scripting Annotations
          [Jongware] Most Valuable Participant

          I agree, it could be hugely useful. Perhaps make it a Feature Request ...?

           

          I did a quick little experiment, and the document you get when "Exporting comments" from Acrobat is a structurally simple PDF-ity file. Unfortunately, its internally referring structure isn't something I could grasp with a single glance (one note and one strikeout yield 5 separate objects). I can imagine a Javascript that reads and parses this file, but, really, I can imagine quite a lot. It'd be easier if the Acrobat engineers got together with the ID crew and together created an 'import comments' function.

          • 2. Re: Scripting Annotations
            [Jongware] Most Valuable Participant

            So maybe it could work... Here is the result of a preliminary "feasibility study".

             

            For the moment, I'm ignoring all lines except those defining an object. It seems to be pairs of

            - popup object (containing a reference to next), usually at the outer boundary of your page

            - annotation object -- text, strikeout, whatever -- on its correct place.

             

            I'm ignoring the connection now, but perhaps I could add a line between the two.

             

            Getting values out of the single lines is a bit of a hit-and-miss affair -- parentheses? slashes? values? text encoding? I'd have to think about that.

            This scriptlet puts everything onto a single page, but the Page number is given in the annotations, so that oughta be a possibility.
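A minimal sketch of how the page number could be read, assuming the exported FDF stores it as a `/Page n` key on the same line as the object and that the index is zero-based. The helper names `pageIndexOf` and `groupByPage` are made up for illustration and are not part of the scriptlet in this post.

```javascript
// Pull the zero-based page index out of one FDF object line, or return
// null when the line carries no /Page key (assumed form: "/Page 2").
function pageIndexOf(line) {
  var m = line.match(/\/Page\s+(\d+)/);
  return m ? Number(m[1]) : null;
}

// Group object lines by page index so each group can be drawn on its own page.
function groupByPage(lines) {
  var groups = {};
  for (var i = 0; i < lines.length; i++) {
    var page = pageIndexOf(lines[i]);
    if (page === null) continue;
    if (!groups[page]) groups[page] = [];
    groups[page].push(lines[i]);
  }
  return groups;
}
```

Inside InDesign, the index could then pick the target page, for example by adding frames to `app.activeDocument.pages[pageIndex].textFrames` instead of to the document.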

            Note that the measurements are in points and upside down (in true PDF style). For this to work, you really need an annotated PDF of exactly the same size as your ID document (which is not the case with this hardcoded test!).
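The "upside down" conversion can be sketched as a small helper. It assumes the PDF rectangle is given as `[x1, y1, x2, y2]` in points with the origin at the bottom left, and that the InDesign rulers are set to points with the origin at the top left of a page of the same height. `pdfRectToBounds` is a hypothetical name.

```javascript
// Convert a PDF /Rect [x1, y1, x2, y2] (points, y running upward) into
// InDesign geometricBounds order [top, left, bottom, right] (y running downward).
function pdfRectToBounds(rect, pageHeight) {
  var left = Math.min(rect[0], rect[2]);
  var right = Math.max(rect[0], rect[2]);
  // The larger PDF y value is the upper edge, so it becomes the smaller InDesign y.
  var top = pageHeight - Math.max(rect[1], rect[3]);
  var bottom = pageHeight - Math.min(rect[1], rect[3]);
  return [top, left, bottom, right];
}
```

Taking the minimum and maximum means the result stays in top, left, bottom, right order even if the file lists the corners the other way round.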

             

            I'll have to go over my own needs before continuing, as this seems a sizeable project (unless Harbs or Marc or Laurent or Robin or Dave S. or Peter K. or any other of our band of scripters sees "the challenge" in this!)

             

            // Work in points so the FDF coordinates map directly onto the page.
            var doc = app.activeDocument;
            doc.viewPreferences.horizontalMeasurementUnits = MeasurementUnits.POINTS;
            doc.viewPreferences.verticalMeasurementUnits = MeasurementUnits.POINTS;
            var pageHeight = doc.documentPreferences.pageHeight;
            var myFile = File("D:/Temp/indesign cs4 sdk learning-indesign-plugin-development.fdf");
            if (myFile.open("r"))
            {
              while (!myFile.eof)
              {
                var line = myFile.readln();
                // Expect "digit(s) digit(s) obj"
                if (line.match(/^\d+ \d+ obj/) == null)
                  continue;
                // Expect "<</Rect[x1 y1 x2 y2]"
                var rect = line.match(/<<\/Rect\[(\d+\.?\d*) (\d+\.?\d*) (\d+\.?\d*) (\d+\.?\d*)\]/);
                if (rect == null)
                  continue;
                // Convert to numbers and invert vertically (PDF y runs bottom-up)
                var left = Number(rect[1]);
                var bottom = pageHeight - Number(rect[2]);
                var right = Number(rect[3]);
                var top = pageHeight - Number(rect[4]);

                // Skip objects without a subtype instead of failing on null
                var subtype = line.match(/\/Subtype\/(\w+)/);
                if (subtype == null)
                  continue;
                var frame, text;
                switch (subtype[1])
                {
                  case "Popup":
                    frame = doc.textFrames.add();
                    frame.geometricBounds = [ top, left, bottom, right ];
                    text = line.match(/\/Subj\(([^)]+)\)/);
                    if (text != null)
                      frame.contents = text[1];
                    break;
                  case "Text":
                    frame = doc.textFrames.add();
                    frame.geometricBounds = [ top, left, bottom, right ];
                    text = line.match(/\/Contents\(([^)]+)\)/);
                    if (text != null)
                      frame.contents = text[1];
                    break;
                  case "StrikeOut":
                    // Draw a line through the vertical middle of the rectangle.
                    // A separate variable keeps the input 'line' intact.
                    var strike = doc.graphicLines.add();
                    var midY = (top + bottom) / 2;
                    strike.paths[0].pathPoints[0].anchor = [ left, midY ];
                    strike.paths[0].pathPoints[1].anchor = [ right, midY ];
                    strike.strokeColor = doc.swatches.item("Red");
                    break;
                }
              }
              myFile.close();
            }
            
            • 3. Re: Scripting Annotations
              [Jongware] Most Valuable Participant

              I did some more research. Discovery #1: different versions of Acrobat use different FDF export syntaxes (theoretically, one should check the version number for this).

               

              You can get all kinds of properties out of annotations in the PDF, but exact placement is still a hit-and-miss. A strikethrough, for example, seems to have some internal bounding box apart from the one exported into the FDF. If you simply draw a line from edge to edge, it's too wide and you cannot really see which character or characters should be deleted.

               

              Another 'bad' discovery is that any text is encoded in either plain ASCII or in Unicode; and parentheses, backslashes, and returns are escaped by another backslash, making it hard to extract the 'plain note text' in a straightforward manner. Formatted text is even worse; it uses XHTML markup with loads of "span"s and CSS styling to format it.
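A rough sketch of undoing the backslash escapes, assuming the text has already been cut out from between its outer parentheses and was read byte for byte (in ExtendScript that would probably mean setting the file's encoding to "BINARY" before reading). The function names are hypothetical, the non-Unicode case is passed through unchanged, and the XHTML-formatted variant is not handled at all.

```javascript
// Decode the body of a PDF literal string, resolving backslash escapes.
function decodePdfLiteral(raw) {
  var simple = { n: "\n", r: "\r", t: "\t", b: "\b", f: "\f" };
  var out = "";
  var i = 0;
  while (i < raw.length) {
    var ch = raw.charAt(i++);
    if (ch !== "\\") { out += ch; continue; }
    if (i >= raw.length) break;            // lone trailing backslash
    var next = raw.charAt(i++);
    if (simple.hasOwnProperty(next)) {
      out += simple[next];
    } else if (next >= "0" && next <= "7") {
      // Octal character code, one to three digits
      var oct = next;
      while (oct.length < 3 && i < raw.length &&
             raw.charAt(i) >= "0" && raw.charAt(i) <= "7") {
        oct += raw.charAt(i++);
      }
      out += String.fromCharCode(parseInt(oct, 8) & 0xFF);
    } else if (next === "\r" || next === "\n") {
      // Backslash followed by a line end is a continuation: drop both
      if (next === "\r" && raw.charAt(i) === "\n") i++;
    } else {
      out += next;                         // \( \) \\ and unknown escapes
    }
  }
  return out;
}

// If the decoded bytes start with the UTF-16BE byte order mark (FE FF),
// pair the remaining bytes into 16-bit code units; otherwise return as-is.
function pdfTextToString(bytes) {
  if (bytes.length < 2 || bytes.charCodeAt(0) !== 0xFE || bytes.charCodeAt(1) !== 0xFF) {
    return bytes;
  }
  var out = "";
  for (var i = 2; i + 1 < bytes.length; i += 2) {
    out += String.fromCharCode((bytes.charCodeAt(i) << 8) | bytes.charCodeAt(i + 1));
  }
  return out;
}
```

The two steps are kept apart on purpose: the escapes have to be resolved first, because an escaped byte can be one half of a Unicode character.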

               

              Additional problems are that my PDFs for a book are created as separate articles, and the editor returns them concatenated into one huge PDF, as well as imposed onto an A4 (for easier printing, I guess). Well, that could be handled by even more clever scripting -- I hardcoded it for testing purposes.

               

              However, the biggest prob is that the annotations "live" on a separate layer from the text. That means that with the slightest edit, the text may move away from its annotations -- and there is no escaping that. You'd have to start correcting at the end of your file and work towards the beginning. It also makes keeping the annotation layer practically worthless.
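One way to get that end-to-beginning order is to sort the parsed annotations before working through them. This is only a sketch: the record shape (`page` as a zero-based index, `rect` as the PDF `[x1, y1, x2, y2]`) and the name `backToFront` are assumptions for illustration, and position on the page is at best an approximation of reading order.

```javascript
// Comparator for Array.sort: last page first, then lowest on the page
// (PDF y runs upward, so smaller y is further down), then right to left.
function backToFront(a, b) {
  if (a.page !== b.page) return b.page - a.page;
  if (a.rect[1] !== b.rect[1]) return a.rect[1] - b.rect[1];
  return b.rect[0] - a.rect[0];
}

// Example records; the ids are only there to make the order visible.
var notes = [
  { id: "a", page: 0, rect: [50, 700, 150, 712] },
  { id: "b", page: 1, rect: [50, 100, 150, 112] },
  { id: "c", page: 0, rect: [50, 200, 150, 212] },
  { id: "d", page: 1, rect: [300, 100, 400, 112] }
];
notes.sort(backToFront);
```

Working through the sorted list means each correction only shifts text that has already been dealt with.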

              • 4. Re: Scripting Annotations
                [Jongware] Most Valuable Participant

                Odd -- just checking InDesignSecrets.com, I stumbled upon this one-month-old post -- it seems you are not the only one looking for this feature after all!

                • 5. Re: Scripting Annotations
                  Kathlene Ruhan Level 3

                  Hey there!

                   

                  You have been great through this! I have been trying along with you to find a shortcut to paradise, but I think it will be more like following the yellow brick road. I did put in a recommendation to Adobe though. Thanks for all your help!

                  • 6. Re: Scripting Annotations
                    [Jongware] Most Valuable Participant

                    I always find this cross-application stuff loads of fun. I learned the structure of the FDF files; and did you know JS in Acrobat is as capable as InDesign at manipulating a PDF's contents? I discovered the (or "a") trick to run JS in any PDF: create a button, set its action to "Run Javascript". The Edit button gives you access to a teeny weeny script editor (so much unlike ESTK...), and you can press the button to run the script. Great for looking at notes' properties and other things ...

                     

                    So I added some potentially very useful stuff to my programming arsenal.

                    • 7. Re: Scripting Annotations
                      Marc Autret Level 4

                      Hey, Jong! Did you really parse a PDF internally from InDesign JS? Marvelous!

                       

                      (Well, it's true that we can javascript directly from Acrobat... but Acrobat JS is so boring. I love the idea that InDesign could inject in a PDF file various settings or widgets behaviors that the InDesign UI cannot control for now...)

                       

                      Thank you for opening promising prospects!

                       

                      @+

                      Marc

                      • 8. Re: Scripting Annotations
                        [Jongware] Most Valuable Participant

                        Ah -- not a PDF per se, rather its little brother the FDF. It has a structure similar to, but somewhat simpler than, a "real" PDF. It's a real pity you cannot reliably use GREP matching to fish out strings -- it'll fail, for example, on the first 'escaped' embedded parenthesis. The basics -- annotation type definition and specs, rectangle positions -- are actually straightforward, for a change.
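Where a pattern like `[^)]+` stops at the first escaped parenthesis, a small character scanner can walk the string instead. A sketch under the assumption that the key is immediately followed by the opening parenthesis (as in `/Contents(`), and that the whole string sits on one line; `extractLiteral` is a hypothetical name.

```javascript
// Return the raw body of the literal string that follows `key` in `line`,
// or null when the key or the closing parenthesis is missing.
function extractLiteral(line, key) {
  var start = line.indexOf(key + "(");
  if (start < 0) return null;
  var i = start + key.length + 1;   // first character after the "("
  var depth = 1;                    // unescaped parentheses may nest
  var body = "";
  while (i < line.length) {
    var ch = line.charAt(i++);
    if (ch === "\\") {
      // Copy the escape pair untouched; decoding is a separate step
      body += ch + line.charAt(i++);
      continue;
    }
    if (ch === "(") {
      depth++;
    } else if (ch === ")") {
      depth--;
      if (depth === 0) return body;
    }
    body += ch;
  }
  return null;                      // ran off the end: unterminated string
}
```

The returned text still contains its backslashes, so it would need unescaping before it goes into a text frame.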

                         

                        When it first became publicly available, I read the original PDF (1.3) specs cover to cover and was able to inject PDF commands into ... DOS WordPerfect 5.1. I also wrote a C program that could read an XML file and a spec sheet (which nowadays would be called an FPO style sheet) and in a single step produce a valid PDF from that. That's why I know my way around PDFs.

                         

                        It is possible to read and parse uncompressed PDFs fairly easily, even with Javascript, but when Flate encoding is used, and you find multiple xrefs scattered all over the file, even I give up.