10 Replies Latest reply: Dec 19, 2012 8:35 AM by stevejking

    Importing one Pdf document into another.

    DDahlgren Community Member

      I am trying to import one PDF document into another using a Reference XObject. I have been following the PDF specification and thought I was doing things the right way, but it's not working. The PDF specification doesn't have an example of how to do this, only an explanation.

      Below is the PDF file I am testing with (if anyone can be bothered looking at it). The file embeds another PDF file, and is supposed to show a part of the embedded file at a certain position.

      Again, if anyone can be bothered to look at my PDF and spot what I am doing wrong, I would appreciate it.

      (There is nothing wrong with the embedded file itself...)

       

      PDF-file embedding another PDF-file and using a Reference XObject to display the embedded file:

       

      %PDF-1.5

      1 0 obj

      <<

      /Type /XObject

      /Subtype /Form

      /BBox [0 0 100 100]

      /Ref <<

      /F (4fa27162e72547a00771606.pdf)

      /Page 1

      >>

      >>

      endobj

      2 0 obj

      <<

      /Type /Filespec

      /F (4fa27162e72547a00771606.pdf)

      /EF <</F 3 0 R>>

      >>

      endobj

      3 0 obj

      <<

      /Type /EmbeddedFile

      /Length 854

      >>

      stream

      %PDF-1.5

      1 0 obj

      <<

      /Length 334

      >>

      stream

      q

      1 0 0 1 128.10769621539 70.591821183642 cm

      1 0 0 1 0 0 cm

      1 0 0 1 0 0 cm

      1 0 0 1 0 0 cm

      1 0 0 1 -128.10769621539 -70.591821183642 cm

      /DeviceCMYK CS

      1 1 1 1 SCN

      /DeviceCMYK cs

      0 0.89 0.89 0.05 scn

      0.2743205486411 w

      2.743205486411 M

      0 J

      0 j

      -1.9202438404877 -2.6517653035306 260.05588011176 146.48717297435 re

      h

      B

      Q

      endstream

      endobj

      2 0 obj

      <<

      /Type /Page

      /Parent 3 0 R

      /Resources <<

      /Font <<>>

      /XObject <<>>

      >>

      /MediaBox [0 0 255.11811023622 141.73228346457]

      /Contents [1 0 R]

      >>

      endobj

      3 0 obj

      <<

      /Type /Pages

      /Kids [2 0 R]

      /Count 1

      >>

      4 0 obj

      <<

      /Type /Catalog

      /Pages 3 0 R

      >>

      endobj

      xref

      0 5

      0000000000 65535 f

      0000000010 00000 n

      0000000401 00000 n

      0000000568 00000 n

      0000000624 00000 n

      trailer

      <<

      /Size 5

      /Root 4 0 R

      >>

      startxref

      679

      %%EOF

      endstream

      endobj

      4 0 obj

      <<

      /Length 322

      >>

      stream

      q

      1 0 0 1 93.268986537973 71.826263652527 cm

      1 0 0 1 0 0 cm

      1 0 0 1 0 0 cm

      1 0 0 1 0 0 cm

      1 0 0 1 -93.268986537973 -71.826263652527 cm

      55.961391922784 45.628651257302 74.615189230378 52.39522479045 re

      W n

      1 0 0 1 55.961391922784 42.062484124968 cm

      74.615189230378 0 0 55.961391922784 0 0 cm

      /XobjectPDF1 Do

      Q

      endstream

      endobj

      5 0 obj

      <<

      /Type /Page

      /Parent 6 0 R

      /Resources <<

      /Font <<>>

      /XObject << /XobjectPDF1 1 0 R >>

      >>

      /MediaBox [0 0 255.11811023622 141.73228346457]

      /Contents [4 0 R]

      >>

      endobj

      6 0 obj

      <<

      /Type /Pages

      /Kids [5 0 R]

      /Count 1

      >>

      7 0 obj

      <<

      /Type /Catalog

      /Pages 6 0 R

      >>

      endobj

      xref

      0 8

      0000000000 65535 f

      0000000010 00000 n

      0000000144 00000 n

      0000000238 00000 n

      0000001172 00000 n

      0000001551 00000 n

      0000001738 00000 n

      0000001794 00000 n

      trailer

      <<

      /Size 8

      /Root 7 0 R

      >>

      startxref

      1849

      %%EOF

        • 1. Re: Importing one Pdf document into another.
          lrosenth Adobe Employee

          A Reference XObject is used for referring to ONE PAGE of another PDF – usually a completely separate file, though it could also be embedded.  It is not a well supported feature of PDF – it requires Acrobat/Reader 9 or later.  I am not aware of any other PDF viewer that supports it ☹.

           

          It’s extremely hard to read/follow what you post.  If you want a PDF looked at – please post a link to an actual PDF.

          • 2. Re: Importing one Pdf document into another.
            khkremer ACP

            PDF is a binary format - even though sometimes it looks like straight ASCII when you open it in an editor. This means that we cannot really do anything with what you've posted. If I saved it and opened it in my debug tool, it would report a corrupt file.

             

            Reference XObjects are a pretty tricky subject, and a lot of planets have

            to perfectly align to make them work. Take a look at this blog post from

            four years ago that explains what you need to do in terms of setting up

            Reader or Acrobat, but also has links to sample files in the comments:

             

            https://blogs.adobe.com/ReferenceXObjects/2008/06/reference_xobjects_1.html#comments

             

             

            Karl Heinz Kremer

            PDF Acrobatics Without a Net

             

            khk@khk.net

            http://www.khkonsulting.com

            • 3. Re: Importing one Pdf document into another.
              DDahlgren Community Member

              Thanks for your reply.

               

               

              If Reference XObjects are not well supported, are there any other ways of easily importing one PDF into another?

               

               

              I am still curious about what I am doing wrong, though.

              • 4. Re: Importing one Pdf document into another.
                DDahlgren Community Member

                "PDF is a binary format - even though sometimes it looks like straight ASCII when you open it in an editor. This means that we cannot really do anything with what you've posted. If I saved it and opened it in my debug tool, it would report a corrupt file."

                 

                 

                 

                I don't get that. According to the specification "A non-encrypted PDF can be entirely represented using byte values corresponding to the visible printable subset of the character set defined in ANSI X3.4-1986, plus white space characters."

                 

                Binary data can of course be included in the PDF, but again, "The tokens that delimit objects and that describe the structure of a PDF file shall use the ASCII character set."

                (From section 7.2.1 General)

                 

                My code does not contain any binary data, so if I copy and paste the code into any text editor and save it as an ASCII text file, it opens fine in the PDF readers I have tried.
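The "printable ASCII plus white space" claim can be checked mechanically. The following is a minimal sketch in Python, not part of the original post: the helper name `non_ascii_offsets` is hypothetical, and the white-space set is taken to be the six PDF white-space characters (NUL, HT, LF, FF, CR, SP).

```python
# Byte values treated as white space in PDF syntax:
# NUL, HT, LF, FF, CR and SP.
PDF_WHITESPACE = {0x00, 0x09, 0x0A, 0x0C, 0x0D, 0x20}


def non_ascii_offsets(data: bytes) -> list:
    """Return the offsets of bytes that are neither visible printable
    ASCII (0x21..0x7E) nor PDF white space."""
    return [
        i for i, b in enumerate(data)
        if not (0x21 <= b <= 0x7E or b in PDF_WHITESPACE)
    ]


# A purely textual fragment passes the check.
print(non_ascii_offsets(b"%PDF-1.5\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"))  # []

# A binary marker comment, as many PDF writers emit, does not.
print(non_ascii_offsets(b"%\xe2\xe3\n"))  # [1, 2]
```

An empty result only says the bytes are text-safe. It says nothing about whether the cross-reference offsets survive a copy and paste.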

                 

                I know it may look a little messy like this. Sorry about that, but I just thought it was an easy way of doing it.

                • 5. Re: Importing one Pdf document into another.
                  khkremer ACP

                  PDF is a binary format. The most important piece in a PDF file is the cross

                  reference table. It contains byte offsets to the different objects in a

                  file. If you insert just one space (which in a true ASCII file will not

                  change much), you are making all byte offsets after the insertion point

                  invalid. They now all point to something different than the start of an

                  object.

                   

                  Because there is no standard for how line endings are encoded in text files, every line ending may add (or subtract) a byte, depending on what computer system and what editor you use to convert your PDF to fake-ASCII, and then again on what I use to try to convert it back to PDF. If you are using a Windows system and I'm using a Mac, the cross reference table gets corrupted. And that's even before the web server does its magic and potentially adds data.
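The effect of line endings on the cross reference table can be reproduced in a few lines. The sketch below is an illustration only, written in Python with hypothetical helper names (`build_pdf`, `stale_xref_entries`). It assembles a tiny PDF with LF line endings, then checks whether each in-use xref entry still points at the start of its object. It assumes a single xref section in which object 0 is the free entry and the remaining objects are numbered consecutively from 1.

```python
import re


def build_pdf() -> bytes:
    """Assemble a minimal one-page PDF with LF line endings and a
    cross reference table computed from the real byte offsets."""
    bodies = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 100 100] >>",
    ]
    out = bytearray(b"%PDF-1.5\n")
    offsets = []
    for num, body in enumerate(bodies, start=1):
        offsets.append(len(out))  # byte offset where "N 0 obj" starts
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(bodies) + 1)
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(bodies) + 1)
    out += b"startxref\n%d\n" % xref_pos
    out += b"%%EOF\n"
    return bytes(out)


def stale_xref_entries(data: bytes) -> list:
    """Return the object numbers whose xref offset no longer points
    at 'N 0 obj' (simplified: in-use entries are objects 1, 2, 3, ...)."""
    entries = re.findall(rb"(\d{10}) \d{5} n", data)
    return [
        num for num, raw in enumerate(entries, start=1)
        if not data.startswith(b"%d 0 obj" % num, int(raw))
    ]


pdf = build_pdf()
print(stale_xref_entries(pdf))                           # []
print(stale_xref_entries(pdf.replace(b"\n", b"\r\n")))   # [1, 2, 3]
```

Converting LF to CRLF adds one byte per line, so every object after the header shifts while the recorded offsets stay the same. This is the kind of corruption a round trip through a forum post and a text editor can introduce.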

                   

                  If you want somebody to look at your file, please post a link to the actual PDF.

                   

                  Have you looked at the post I linked to? The settings in Acrobat are

                  important. You need to enable Reference XObjects, AND you have to declare

                  the directory where your target file is stored as trusted.

                   

                  From what I can see in your PDF code, you are doing a whole bunch of cm

                  operations. You may want to consolidate all those into one operation. I

                  have not done a detailed analysis, but you could potentially shift your

                  data off the page. Keep it simple and see if that makes a difference.

                   


                  • 6. Re: Importing one Pdf document into another.
                    DDahlgren Community Member

                    "PDF is a binary format. The most important piece in a PDF file is the cross reference table. It contains byte offsets to the different objects in a file. If you insert just one space (which in a true ASCII file will not change much), you are making all byte offsets after the insertion point invalid. They now all point to something different than the start of an object."

                     

                    True, you can't edit the file without updating the xref table. But as long as the xref table is updated along with any edit, and the file isn't saved in the wrong encoding, my code should work fine.

                    (I have tried to copy and paste it from the post without any problems.) 

                     

                    Thanks for even bothering to look at the code, though.

                    The reason there are so many 'cm' operations is that the PDF is generated by a little program I made.

                    The 'cm' operations don't have any effect in this case, though, because in the posted code they always start with a certain translation, followed by three identity matrices (1 0 0 1 0 0 cm), before being translated back again.

                    (I know the 'cm's are not the problem because I can do the same operations with an image XObject instead of a Reference XObject, and then things work fine...)
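The claim that these 'cm' operations cancel out can be verified by multiplying the matrices. The following is a small sketch in Python, not taken from the generator program. The helper names `concat` and `net_matrix` are hypothetical, and it assumes the standard PDF convention that each `cm` premultiplies the current transformation matrix.

```python
# A PDF matrix is written as six numbers (a, b, c, d, e, f), standing for
# the 3x3 matrix [[a, b, 0], [c, d, 0], [e, f, 1]] applied to row vectors.
IDENTITY = (1, 0, 0, 1, 0, 0)


def concat(m, ctm):
    """Return the matrix product m x ctm."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = ctm
    return (
        a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2, c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2, e1 * b2 + f1 * d2 + f2,
    )


def net_matrix(ops):
    """Fold a sequence of cm operands into one matrix."""
    ctm = IDENTITY
    for m in ops:
        ctm = concat(m, ctm)  # cm premultiplies the current matrix
    return ctm


# The five leading cm operators from the page content stream (object 4).
CM_OPS = [
    (1, 0, 0, 1, 93.268986537973, 71.826263652527),
    IDENTITY,
    IDENTITY,
    IDENTITY,
    (1, 0, 0, 1, -93.268986537973, -71.826263652527),
]

print(net_matrix(CM_OPS))
```

The result matches the identity matrix, so these five operators leave the coordinate system unchanged. The two placement matrices that follow them in the content stream are not covered by this check.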

                    • 7. Re: Importing one Pdf document into another.
                      lrosenth Adobe Employee

                      And as Karl explained, we don’t know what line ending format (Mac vs. Win vs. Unix) you used, so that the number of characters could be different thus breaking the xref.

                       

                      Bottom line – if you don’t post an actual PDF file, we are unable to review it.

                      • 8. Re: Importing one Pdf document into another.
                        DDahlgren Community Member

                        Ok, I uploaded it on a filesharing site. Hope it works.

                        This should be the link

                        http://i.minus.com/1336142392/P0Wc6W01M9-bfpohRCTqCg/dbxLJXYihJSUjU.pdf

                        • 9. Re: Importing one Pdf document into another.
                          khkremer ACP

                          That link does not work.

                           

                          How many pages do you have in your referenced document? You are trying to pull in the second page. If this is only a one-page document, it will of course not work. Remember that page numbers in the PDF world are zero-based. Assuming that your referenced document does indeed have a 100x100pt page as indicated by your BBox statement, the following content stream should give you a correctly placed reference XObject:

                           

                          q

                          1 0 0 1 0 0 cm

                          /XobjectPDF1 Do

                          Q

                           

                          Again, you need to make sure that your directory is correctly configured in Acrobat's preferences, and that you have Reference XObjects enabled for all documents (and not just PDF/X-5, which is the default).
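One reason an identity matrix is a sensible starting point: a form XObject, unlike an image XObject, is drawn in its own coordinate space rather than in a unit square, so its BBox is mapped directly through the current matrix. The sketch below (Python, hypothetical helper names) maps the corners of the `[0 0 100 100]` BBox through the two placement `cm` operators from the originally posted content stream and through the identity matrix. It assumes the form has no `/Matrix` entry and ignores the earlier `cm` operators, which cancel out.

```python
IDENTITY = (1, 0, 0, 1, 0, 0)


def concat(m, ctm):
    """Return m x ctm for PDF matrices given as (a, b, c, d, e, f)."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = ctm
    return (
        a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2, c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2, e1 * b2 + f1 * d2 + f2,
    )


def transform(ctm, x, y):
    """Map a point from form space to page space."""
    a, b, c, d, e, f = ctm
    return (a * x + c * y + e, b * x + d * y + f)


def bbox_on_page(cm_ops, bbox):
    """Return the page-space positions of the BBox's lower-left and
    upper-right corners after the given cm operators."""
    ctm = IDENTITY
    for m in cm_ops:
        ctm = concat(m, ctm)  # cm premultiplies the current matrix
    x0, y0, x1, y1 = bbox
    return transform(ctm, x0, y0), transform(ctm, x1, y1)


BBOX = (0, 0, 100, 100)

# Placement matrices from the posted content stream: a translation
# followed by an image-style scale.
POSTED = [
    (1, 0, 0, 1, 55.961391922784, 42.062484124968),
    (74.615189230378, 0, 0, 55.961391922784, 0, 0),
]

print(bbox_on_page(POSTED, BBOX))
print(bbox_on_page([IDENTITY], BBOX))
```

With the posted matrices the upper-right corner lands thousands of points beyond the roughly 255 x 142 pt MediaBox, because a scale meant for a unit-square image enlarges the 100 x 100 pt form about 75 times horizontally and 56 times vertically. With the identity matrix the form covers 0..100 pt on both axes.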

                           
