6 Replies Latest reply on Mar 8, 2011 2:58 PM by Ian Proudfoot

    How to find x,y coordinates of objects in PDF document

    gopaljay78 Level 1

      Hi,

       

      Can anyone help me find the x,y coordinates of an object in a PDF document using JavaScript?

       

      The object could be text, an image, or a color panel. I want to get the x,y position of that object.

       

      Thanks,

      Gopal

        • 1. Re: How to find x,y coordinates of objects in PDF document
          [Jongware]-9BC6tI Level 4

          How much work are you prepared to do in Javascript?

           

          InDesign does not give access to the internal objects in a placed PDF. So, if you really really want to use the combination of InDesign and Javascript, you must:

           

          1. Get hold of the PDF specifications. Don't worry, they're free.

          2. Use Javascript to read your PDF file and parse it into recognizable objects, which means

            2a. you have to use binary read functions (as per PDF specification)

            2b. you have to write a decompression library -- more than one, by the way, as PDF supports several different kinds of compression, and Javascript supports none.

            2c. then you have to implement the PDF coordinate system, which is heavily based upon PostScript-style matrix operations, and supports several independent "layers" of transformations.

           

          Oh, and since you ask about text:

           

          3. You have to write a complete font system in Javascript (again, PDF supports several different kinds of font formats, and you'll have to implement all of them).
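To give a feel for step 2c, here is a minimal sketch (in Python rather than JavaScript, for brevity) of how PDF-style six-number matrices are concatenated and applied. The helper names `concat` and `apply_matrix` and the sample operands are made up for illustration; a real parser would also have to track the graphics-state stack (`q`/`Q`), text matrices, and form XObject matrices.

```python
from typing import Tuple

# A PDF matrix [a b c d e f] maps (x, y) to (a*x + c*y + e, b*x + d*y + f).
Matrix = Tuple[float, float, float, float, float, float]
IDENTITY: Matrix = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)


def concat(m: Matrix, ctm: Matrix) -> Matrix:
    """Return m x ctm, the update the `cm` operator applies to the current matrix."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = ctm
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


def apply_matrix(ctm: Matrix, x: float, y: float) -> Tuple[float, float]:
    """Map a point from the current user space to page space."""
    a, b, c, d, e, f = ctm
    return (a * x + c * y + e, b * x + d * y + f)


# Hypothetical content stream: translate, then scale, then paint an image.
# Images are painted into the unit square, so its corners give the placement.
ctm = IDENTITY
ctm = concat((1, 0, 0, 1, 72, 500), ctm)   # 1 0 0 1 72 500 cm
ctm = concat((200, 0, 0, 100, 0, 0), ctm)  # 200 0 0 100 0 0 cm
lower_left = apply_matrix(ctm, 0, 0)
upper_right = apply_matrix(ctm, 1, 1)
print(lower_left, upper_right)
```

The resulting coordinates are in PDF units with the origin at the bottom-left of the page, which is the opposite vertical direction from InDesign's default ruler.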

           

          Somehow I doubt it's worth all this trouble. Can't you just open the PDF in Illustrator?

          • 2. Re: How to find x,y coordinates of objects in PDF document
            John Hawkinson Level 5

            There are PDFs that Illustrator doesn't handle very well (not counting the large class of bitmapped PDFs that it can't handle at all).

             

            A reasonable compromise would be to use some tool that will regurgitate information about the objects on a page in a PDF, and to call that tool from JavaScript. (Of course, to do so you must go through AppleScript or Visual Basic, as appropriate for Mac or Windows.) There are several such command-line tools. I think the last time I had a similar application I used pdfminer, a tool written in Python; but my application was somewhat specialized, and there are probably other tools that would work better for this case.
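As a rough sketch of that approach, the following assumes the maintained `pdfminer.six` fork is installed and uses its `extract_pages` layout API. The function names `to_top_left` and `iter_objects` are invented for this example, and the y-flip assumes the page's own box starts at (0, 0), which is common but not guaranteed.

```python
from typing import Iterator, Tuple

BBox = Tuple[float, float, float, float]


def to_top_left(bbox: BBox, page_height: float) -> BBox:
    """Convert a PDF bbox (origin bottom-left) to a top-left origin.

    Input and output are both (x0, y0, x1, y1) in PDF points.
    """
    x0, y0, x1, y1 = bbox
    return (x0, page_height - y1, x1, page_height - y0)


def iter_objects(pdf_path: str) -> Iterator[Tuple[int, str, BBox]]:
    """Yield (page number, layout class name, bbox) for top-level page objects."""
    # Imported here so the helper above stays usable without pdfminer.six.
    from pdfminer.high_level import extract_pages

    for page_number, page in enumerate(extract_pages(pdf_path), start=1):
        for element in page:
            # Class names look like LTTextBoxHorizontal, LTFigure, LTRect, ...
            yield (
                page_number,
                type(element).__name__,
                to_top_left(element.bbox, page.height),
            )


# Usage (the file name is a placeholder):
#   for page_number, kind, bbox in iter_objects("sample.pdf"):
#       print(page_number, kind, bbox)
```

Printing one line per object like this gives something a JavaScript caller could read back from a text file. Images are often nested inside `LTFigure` containers, so a fuller version would recurse into those.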

            • 3. Re: How to find x,y coordinates of objects in PDF document
              [Jongware]-9BC6tI Level 4

              What an odd tool

               

              It could be this is what the OP ultimately was after (I still don't see what InDesign/scripting has to do with it). Its online demo shows how accurately it can work: a random PDF got converted to several thousand lines, one for each character in the PDF, all along the lines of

               

              <span style="position:absolute; left:113px; top:180px; font-size:13px;">A</span>

               

              Thanks for the tip; seems this Python scripter (!) has done all of the hard work I mentioned above. Definitely something to experiment with.
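If output of that shape is all you have, the positions can be pulled back out with a few lines of Python. This is only a sketch that assumes every span looks like the sample above (double-quoted `style` attribute, integer `px` values); `Glyph` and `parse_spans` are names made up here, and the `left`/`top` values are HTML pixels measured from the top-left, not PDF units.

```python
import html
import re
from typing import List, NamedTuple


class Glyph(NamedTuple):
    text: str
    left: int
    top: int


# Matches a span with a style attribute and captures the style and the text.
SPAN_RE = re.compile(r'<span style="([^"]*)">(.*?)</span>', re.DOTALL)
# Matches "left:113px" or "top:180px", but not e.g. "margin-left:5px".
PX_RE = re.compile(r"(?<![\w-])(left|top)\s*:\s*(-?\d+)px")


def parse_spans(html_text: str) -> List[Glyph]:
    """Return one Glyph per absolutely positioned span found in html_text."""
    glyphs = []
    for style, text in SPAN_RE.findall(html_text):
        props = dict(PX_RE.findall(style))
        if "left" in props and "top" in props:
            glyphs.append(
                Glyph(html.unescape(text), int(props["left"]), int(props["top"]))
            )
    return glyphs


sample = (
    '<span style="position:absolute; left:113px; top:180px; '
    'font-size:13px;">A</span>'
)
print(parse_spans(sample))
```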

              • 4. Re: How to find x,y coordinates of objects in PDF document
                [Jongware]-9BC6tI Level 4

                (On further examination: I don't think we'll ever know what the OP thinks of this.

                 

                He didn't follow up on even one of his twenty-something posts, and also didn't bother to declare any of them "answered" to his satisfaction.

                 

                Hard to please, eh? Some points would have been nice, too.)

                • 5. Re: How to find x,y coordinates of objects in PDF document
                  John Hawkinson Level 5

                  Yeah, I know you're in it for the points, Jongware. Not sure what to tell you. Did you know that you can write a JavaScript to give you more points on the forum? TRUE STORY!

                   

                  Perhaps, though, no one has pointed out the point system to him. After all, it's not intuitively obvious that it matters if you don't spend a lot of time hanging out here...

                  • 6. Re: How to find x,y coordinates of objects in PDF document
                    Ian Proudfoot Level 3

                    A good way to get this sort of information out of a PDF would be to use Adobe's own PDFXML format (formerly Mars). This gives an archive that presents each page as a separate SVG file. It's much easier than trying to work with a PDF binary file.

                    However, it's all gone a bit quiet on the Adobe Labs Mars pages, with no update for Acrobat X... Perhaps it's just a dead end.

                    Ian
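Once a page is available as SVG, reading positions is ordinary XML work. The sketch below is generic SVG handling, not a description of the actual Mars/PDFXML page layout, which is not shown in this thread: it assumes elements carry plain numeric `x` and `y` attributes and it ignores any `transform` attributes. The sample markup and the name `positioned_elements` are invented for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

SVG_NS = "{http://www.w3.org/2000/svg}"


def first_number(value: str) -> float:
    """SVG allows coordinate lists such as x="10 20 30"; keep the first entry."""
    return float(value.replace(",", " ").split()[0])


def positioned_elements(svg_text: str) -> List[Tuple[str, float, float]]:
    """Return (tag, x, y) for every element that has both x and y attributes."""
    root = ET.fromstring(svg_text)
    found = []
    for element in root.iter():
        if "x" in element.attrib and "y" in element.attrib:
            tag = element.tag.replace(SVG_NS, "")
            found.append(
                (tag, first_number(element.get("x")), first_number(element.get("y")))
            )
    return found


# Hypothetical page content, only to exercise the function.
SAMPLE = """<svg xmlns="http://www.w3.org/2000/svg" width="612" height="792">
  <text x="72" y="100">Hello</text>
  <image x="300" y="400" width="50" height="50"/>
</svg>"""

print(positioned_elements(SAMPLE))
```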
