28 Replies. Latest reply: Jan 28, 2013 3:02 PM by Moris.Mihailidis

    PDF Size will increase in size dramatically with every submit.

    tarekahf Community Member

      I have a PDF form designed using Adobe LiveCycle Designer ES2.

       

      It has a submit button that submits the form to the server (IIS and ASP.NET) using this JavaScript command:

       

       

      event.target.submitForm( {cURL: "http://server/ASPNETWebPage.ASPX", aPackets:["datasets","pdf"], cSubmitAs: "XDP"});

       

      On the server, from ASP.NET, I use the following code to extract the submitted "chunk" element and convert it from Base64 to Binary PDF File:

       

                  fs = New System.IO.FileStream(mFormFileNameFolder, IO.FileMode.Create)
                  bw = New System.IO.BinaryWriter(fs)
                  ' Get the chunk element from the submitted XML
                  Dim srChunk As New StringReader(mXML.GetElementsByTagName("chunk")(0).InnerXml)
                  Do While True
                      Dim theChunkLine As String
                      theChunkLine = srChunk.ReadLine
                      If Not String.IsNullOrEmpty(theChunkLine) Then
                          theReadBytes = theChunkLine.Length
                      Else
                          theReadBytes = 0
                          Exit Do
                      End If
                      Dim theBase64Length = (theReadBytes * 3 / 4)
                      Dim buffer() As Byte
                      buffer = Convert.FromBase64String(theChunkLine)
                      bw.Write(buffer)
                  Loop
                  bw.Close()
                  bw = Nothing
                  fs.Close()
                  fs = Nothing
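For readers who want to experiment with the same idea outside VB.NET, here is a minimal sketch in JavaScript (Node.js), the language of the client-side script in this thread. It decodes a multi-line Base64 payload one line at a time and hands each decoded piece to a writer, so the whole PDF never has to sit in one string. The function name `decodeBase64Lines` and the `write` callback are hypothetical, and the sketch assumes every line has a length that is a multiple of 4, which line-by-line decoding depends on. Unlike the loop above, it skips blank lines instead of stopping at the first one.

```javascript
// Sketch only: decode Base64 text line by line and pass each decoded
// piece to a caller-supplied writer (for example, a file stream's write).
// Assumption: each non-empty line has a length that is a multiple of 4.
function decodeBase64Lines(chunkText, write) {
  let totalBytes = 0;
  for (const rawLine of chunkText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") {
      continue; // skip blank lines rather than ending the loop
    }
    const bytes = Buffer.from(line, "base64");
    write(bytes);
    totalBytes += bytes.length;
  }
  return totalBytes; // number of decoded bytes handed to the writer
}
```

With a real file, `write` could be something like `(b) => stream.write(b)` for a stream created with `fs.createWriteStream`; that wiring is left out here to keep the sketch self-contained.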

       

       

      The above code is working fine, and the PDF is generated successfully.

       

      I have one problem.

       

      With every submit, the size of the generated PDF increases dramatically. I reported this to Adobe Support, and they confirmed that this is by design: with every submit, the previous PDF state is saved and the new state is added. That is why I get a huge PDF file.

       

      I was told that the only way to solve this problem is to submit the form as PDF ONLY, and after I save the PDF File on a file system, I then must use Adobe Service/Process "exportData" to extract the XML Data from the PDF.

       

      I think this is a really big change for me. I was hoping that there is a way to identify the latest PDF state from the chunk element.

       

      Any help will be greatly appreciated.

       

      Tarek.

        • 1. Re: PDF Size will increase in size dramatically with every submit.
          pguerett techies

          Are you submitting as XDP because you want the data and PDF separately? If not, why not just submit the data and leave the PDF out of it? You can change the cSubmitAs parameter to XML and then you will get data only. Are there signatures involved in this scenario? Do you have LiveCycle Server at the back end?

           

          Paul

          • 2. Re: PDF Size will increase in size dramatically with every submit.
            tarekahf Community Member

            Thanks Paul,

             

            The main idea of using Adobe PDF is to save the result as a PDF on the server with all digital signatures. Otherwise, I would use HTML forms or ASP.NET forms.

             

            I am now looking for a method to remove all unwanted bytes from the chunk element, and keep only the minimum.

             

            I would appreciate it if anyone can help.

             

            Tarek.

            • 3. Re: PDF Size will increase in size dramatically with every submit.
              pguerett techies

              So it is the signatures themselves that are making the size of the PDF grow. There is nothing I can do about that.

               

              Paul

              • 4. Re: PDF Size will increase in size dramatically with every submit.
                tarekahf Community Member

                No, it is not the signatures that are causing the problem. Even if I do not add any signature, the size still increases.

                 

                Tarek

                • 5. Re: PDF Size will increase in size dramatically with every submit.
                  pguerett techies

                  Then that makes no sense. The file should grow slightly as more data is added, but the signatures will cause a copy of the PDF to be saved so that you can compare to the pre-signature version, and that is generally what causes the PDF size to grow large. Is the file Reader Extended?

                   

                  Paul

                  • 6. Re: PDF Size will increase in size dramatically with every submit.
                    tarekahf Community Member

                    Yes, the file is Reader Extended.

                     

                    According to Adobe Support, it is only because I am using this command:

                     

                    event.target.submitForm( {cURL: "http://server/ASPNETWebPage.ASPX", aPackets:["datasets","pdf"], cSubmitAs: "XDP"});

                     

                    The above command will send the XML data and the current PDF state, plus the last state of the PDF embedded (before any change). So basically, the PDF size will double with every submit.

                     

                    I was told, if I send only PDF (without XDP) then only the last PDF state is sent. But then, I have to use another service/process "exportData" to extract the XML Data from the PDF.

                     

                    I have never used "exportData" from .NET. So, I am now looking for a way to extract only the last state of the PDF from the submitted XDP.

                     

                    Tarek.

                    • 7. Re: PDF Size will increase in size dramatically with every submit.
                      pguerett techies

                      What version of Reader Extensions are you using? The last state of the PDF should not be changed if you have not signed it yet!

                       

                      Paul

                      • 8. Re: PDF Size will increase in size dramatically with every submit.
                        tarekahf Community Member

                        I am using Adobe LiveCycle Reader Extensions Server ES2. I can send you a short screenr.com video I recorded to show you the submission process and how the size is doubling!

                         

                        I can send you a private message with link to the video.

                         

                        Tarek

                        • 9. Re: PDF Size will increase in size dramatically with every submit.
                          pguerett techies

                          No need ......I am not doubting that it is doubling in size ...just trying to figure out why.

                           

                          There was an issue at one time where Reader Extensions was affecting the size of the PDF but that was in earlier versions than the one that you have.

                           

                          Have you submitted just the PDF and if so does the size get affected?

                           

                          Paul

                          • 10. Re: PDF Size will increase in size dramatically with every submit.
                            tarekahf Community Member

                            Do you mean that I should try to submit the form as PDF only?

                             

                            If so, I will add another button that will submit the PDF to a test URL, but I don't know how it will be received on the server. Will it be received as a stream of binary or text? How do I generate the PDF file on the server?

                             

                            I will give it a try.

                             

                            Tarek.

                            • 11. Re: PDF Size will increase in size dramatically with every submit.
                              tarekahf Community Member

                              I did a quick test, and I used the following JavaScript command:

                               

                              event.target.submitForm( {cURL: "http://server/ASPNETWebPage.ASPX", cSubmitAs: "PDF"});

                               

                              and on the server, using ASP.NET, I converted the input stream to a byte array, and saved the result as binary to a file stream. The result was a working PDF file and the size was OK. I did several submits with few changes, and the size was increasing by only 10-20 bytes max.

                               

                              So, if there is no way to use this command:

                               

                              event.target.submitForm( {cURL: "http://server/ASPNETWebPage.ASPX", aPackets:["datasets","pdf"], cSubmitAs: "XDP"});

                               

                              and be able to generate a PDF with reasonable size with every submit, then this means I have to use {cSubmitAs:"PDF"} and I have to look for a way to extract the XML data from the PDF using .NET.

                               

                              Appreciate your help.

                               

                              Tarek.

                              • 12. Re: PDF Size will increase in size dramatically with every submit.
                                Chuck Myers (Adobe) Community Member

                                Did you catch my comment on your recorded video (assuming that you're the same person)? The issue that I saw there is that you were including images in the form data, and that the image was a 1.9MB TIFF file. Each image that you include will become part of the saved PDF. You should use an appropriately sized image.

                                 

                                And yes, Lee, Paul and I have all been talking about this issue.

                                • 13. Re: PDF Size will increase in size dramatically with every submit.
                                  tarekahf Community Member

                                  Hi Chuck,

                                   

                                  Thanks for the feedback.

                                   

                                  I was recording the video over VPN Connection, and that is why there is no sound, and the mouse was moving in a funny way.

                                   

                                  Yes, I know I am using large images. But the problem is still there. If I use smaller images, the problem will also be there. Even if I don't use any image, the size will double with every submit.

                                   

                                  I just used large images to see how far the process can go without breaking. I have received reports from various users that they are getting OutOfMemoryException. When I analysed the situation, I discovered the root cause, which is the topic of this thread. Later, I decided to change the method for converting the "chunk" element from a Base64 string to binary, and I used buffering to avoid this error, and I succeeded.

                                   

                                  Now, I am not getting "OutOfMemoryException", but the size will continue to increase with every submit.

                                   

                                  I am now working on a new project for Staff Appraisal, and it involves 4 users: Staff, Manager, Director, and HR. Each one will have to submit the form at least 3 times (in one calendar year), and each time they have to sign the form. I need to do something now to solve the root cause of the problem for the new project. This new project is critical and upper management is watching!

                                  Tarek.

                                  • 14. Re: PDF Size will increase in size dramatically with every submit.
                                    pguerett techies

                                    Going back to our last test, where you submitted just the PDF and saw an insignificant increase in size: this should be no different from the submission as an XDP, except that the XDP will also carry all of the data, so it should be a few KB bigger. Chuck has mentioned the use of images. Are you including the images in your data stream? Are you also including the template in your data stream? If you are unsure, can you write the inbound XDP file for a couple of submissions to separate files and send me the results? We can have a look at them here and see where the file size is coming from. You can send the files to LiveCycle8@gmail.com

                                     

                                    Paul

                                    • 15. Re: PDF Size will increase in size dramatically with every submit.
                                      tarekahf Community Member

                                      Thanks Paul,

                                       

                                      When I did the test for submitting only the PDF, I forgot to test with the same large images. This is the new server-side code I used to generate the PDF:

                                       

                                          Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
                                              Dim thePDFStreem As New System.IO.BinaryReader(Page.Request.InputStream)
                                              Dim thePDFBytes() As Byte
                                              Dim theFile As New System.IO.FileStream(MapPath(".") & "\" & "thePDFFile.pdf", IO.FileMode.Create)
                                              Dim theWriter As New System.IO.BinaryWriter(theFile)
                                              Dim theURL As String
                                              thePDFBytes = thePDFStreem.ReadBytes(Page.Request.InputStream.Length)
                                              theWriter.Write(thePDFBytes)
                                              theWriter.Close()
                                              theFile.Close()
                                              theURL = CSLAIDB.Library.GetWebRoot() & System.IO.Path.GetFileName(theFile.Name)
                                              thePDFLink.NavigateUrl = System.IO.Path.GetFileName(theFile.Name)
                                              Response.AppendHeader("Refresh", "2; URL=" & theURL)
                                          End Sub

                                       

                                       

                                      I did the test again, the same as before, and the size increased by about 2MB with every submission, even if I made only a small change.

                                       

                                      What do you mean by "send you the inbound XDP"?

                                       

                                      Do you want me to send the resulting PDF file? Or the resulting XDP file in text format?

                                       

                                      Note: The support team has the PDF samples I generated: the one at the original size, and the one with the huge size.

                                       

                                      Tarek.

                                      • 16. Re: PDF Size will increase in size dramatically with every submit.
                                        pguerett techies

                                        Then let's leave it with support and let them get to the root of the issue.

                                         

                                        Paul

                                        • 17. Re: PDF Size will increase in size dramatically with every submit.
                                          Chuck Myers (Adobe) Community Member

                                          I've been in touch with the support person in Edinburgh and looked at your files and your various attempts to make the size smaller. 

                                          Simply stated, it really is working as designed, but it is difficult to appreciate that unless you go a bit deeper into the file. 

                                           

                                          Or, to say this another way, it grows dramatically in size because you have added dramatic amounts of data.

                                           

                                          So, using your example of this form with a 1.9MB TIF image embedded as an inch-square thumbnail five times, I picked up most of this information using publicly available tools, such as the document font list (Document Properties / Fonts), text editors, and Windjack's Canopener.

                                           

                                          I'll give you a few metrics and comments which may help:

                                           

                                          1. The base PDF file size is about 1.4MB.  Much of this is because of your embedded fonts, which take over 1.1MB.
                                          2. Your form is a reader-extended dynamic XFA form.  That means that the PDF itself does not contain the real pages as PDF marking operators...  It's generated each time you open it in Reader from the XFA form definition and your data.
                                          3. The image itself is 1.9MB.  But remember that this image is Base64-encoded, so it takes four bytes of XML for every three of image.  That makes the XML data 2.6MB/image.  And I'll note again that that's an incredibly large image to use in a square inch image.
                                          4. The file you've given us has the image repeated 5 times.  That explains the 14MB file size (2.6*5+1=14).  You can see a snapshot of the XML data and its size in the canopener view for "big".
                                          5. I presume that you know that PDF files have a versioned structure, where changes to the file add on in incremental change areas.  The file you sent has two areas...  One about 1MB and one 13MB.  You can see these if you open the file in a good text editor and search for %%EOF.  That happens at the end of each incremental change.  In other words, the incremental change is all the XML data and there is only one incremental update area.  See section 7.5.6 of the PDF reference manual if you'd like to know more about the incremental update.
                                          6. You also observed that if you open this file in Acrobat 9.1 and save it, the file shrinks from 14MB to 4MB.  This is due to a feature that Acrobat added where it will compress parts of the XFA data stream.  You can see this in the canopener view for small: it is the exact same uncompressed size, but is reduced 10MB by the flate_compression. So you can thank Acrobat engineering, but it won't help your form submission issue much.
                                          7. I'll also note that a basic check that I did on your file was to export the form data (tools/ forms/ more form options/ manage form data/ export in Acrobat 10) and saw the same size XML data stream for both of these.
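The size arithmetic in points 3 and 4 can be reproduced with a few lines of JavaScript. This is only a rough sketch: the helper name `base64Size` is made up for illustration, and it uses the plain 4-for-3 Base64 ratio, which gives about 2.53MB per 1.9MB image. The 2.6MB figure quoted above is a little higher, and line breaks inside the encoded data would add some overhead on top of the plain ratio.

```javascript
// Sketch: estimate how large binary data becomes once Base64-encoded.
// Base64 emits 4 characters for every 3 input bytes (padding the last group).
function base64Size(binaryBytes) {
  return Math.ceil(binaryBytes / 3) * 4;
}

const MB = 1024 * 1024;
const imageBytes = 1.9 * MB;                     // the TIFF from the thread
const encodedMB = base64Size(imageBytes) / MB;   // one encoded copy, in MB
const estimatedFileMB = 5 * encodedMB + 1.4;     // five copies plus the ~1.4MB base PDF

console.log("one encoded image (MB):", encodedMB.toFixed(2));
console.log("estimated file size (MB):", estimatedFileMB.toFixed(1));
```

The estimate lands close to the 14MB file described above, which supports the point that the XML image data accounts for nearly all of the growth.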

                                           

                                          You're basically running up against basic laws of space conservation: put a number of big things in a flexible sack, and the sack grows. I'd suggest that you give strong guidance to folks on the size of the image that they use.

                                           

                                          PDF can be a bit mysterious if you can't see what's happening.  That's why tools like Canopener are key to shedding daylight on the dark insides.

                                           

                                          Finally, I will note that your filesize WILL increase when you add digital signatures.  The size comes when you sign, not when you add the field.  Simply stated, Acrobat (or Reader) will make a pdf marking set of the pages each time that the form is signed... that's the record part of it and it is a new level of incremental change.  So you can expect it to grow as signatures are added.  Again, this is even more reason to use appropriately sized images.

                                          • 18. Re: PDF Size will increase in size dramatically with every submit.
                                            tarekahf Community Member

                                            Thanks a lot C. Myers,

                                             

                                            Your explanation helped me understand what is happening.

                                             

                                            I have been following the same method for the past 4 years, and I was hit by this problem (OutOfMemoryException) only when some users started using image size more than 500KB. Then, I decided to report this problem.

                                             

                                            I was able to rewrite the code to convert from Base64 to binary using buffering:

                                             

                                            http://forums.asp.net/t/1662571.aspx/1?URGENT+Exception+OutOfMemoryException+thrown+when+when+converting+to+String+

                                             

                                            So far, I am not getting OutOfMemoryExceptions, but the PDF size continues to grow with every submit. However, if the size of all the images is less than 50KB, the increase is not significant.

                                             

                                            Please allow me to ask this question:

                                             

                                            Is there a way to change the above code so that I can extract only the last version of the submitted PDF from the Data Stream "chunk" element ?

                                             

                                            Sooner or later, someone will notice that such PDF sizes are not logical. Even when the PDF does not have images, I have noticed in the past that some PDF sizes (for the Staff Profile Data Collection Form) are something like 15MB! I was not able to figure out why. But now I understand. I think the user must have submitted the form for saving many times.

                                             

                                            Now, things are OK. But I will post back if this problem comes back.

                                             

                                            Tarek.

                                            • 19. Re: PDF Size will increase in size dramatically with every submit.
                                              Chuck Myers (Adobe) Community Member

                                              I'd like to see one of the files that has grown so much.  Or, better yet, I'd like to see a sequence of files, base, after submit with one image, after the next submit that adds another image, etc., and we can diagnose from there; also, a step where just some of the form data, not pictures, are changed.  But I'd also suggest that you get the 10 day trial pdf canopener from Windjack to inspect the files yourself for the base data AND that you count the number of %%EOF so that you can see the number of incremental updates (sounds like a good use of GREP).  But let's get some scientific numbers on the problem.
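As an alternative to GREP, counting the %%EOF markers can be sketched in a few lines of JavaScript (Node.js). The function name `countEofMarkers` and the file name in the usage comment are hypothetical. Treat the result as an estimate of the number of incremental updates rather than an exact figure, since the marker can also occur for other reasons (for example, inside embedded content).

```javascript
// Sketch: count "%%EOF" markers in the raw bytes of a PDF file.
// Each incremental update normally ends with one such marker.
function countEofMarkers(pdfBytes) {
  const marker = Buffer.from("%%EOF", "latin1");
  let count = 0;
  let pos = pdfBytes.indexOf(marker);
  while (pos !== -1) {
    count += 1;
    pos = pdfBytes.indexOf(marker, pos + marker.length);
  }
  return count;
}

// Possible usage (the file name is a placeholder):
//   const fs = require("fs");
//   console.log(countEofMarkers(fs.readFileSync("submitted.pdf")));
```

Running this on each file in a sequence of submissions would show whether the growth comes from more update sections or from larger ones.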

                                               

                                              Best would be to send the files to support on the existing case number.

                                               

                                              And I'd like to take this to a point of conclusion and then even do a brief blog on this topic.  I can only imagine that other people have these same issues.

                                               

                                              As for "getting just the last chunk", it really depends on the software you are running on the server.  "Simple" PDF utilities will always just make an incremental update. Richer software, like Form Data Integration in LC processes, lets you export the data and then import it into a clean form. There are also tools in LiveCycle, like Assembler, that will consolidate the incremental updates.

                                               

                                              But the overall question is "what software are you using to merge the XML data into the form?"  Is it from Adobe or somewhere else?  Your forum posts don't shed any light on this.

                                              • 20. Re: PDF Size will increase in size dramatically with every submit.
                                                Tarek AHF Community Member

                                                I will try to prepare the files you requested, and I will send them all to support.

                                                 

                                                I am not using any tool to merge the XML Data with PDF. I have developed a .NET Program to merge XML with PDF using XDP format. The result is rendered to the client browser as XDP MIME Type using VB.NET "Response.write()"

                                                 

                                                When the PDF is rendered on the client and the user clicks "Submit" or "Save", the form is sent to an ASPX page on the server. There, the "chunk" element is extracted from "Page.InputStream", converted from Base64 to a binary array, and saved on the server as a PDF file. All this is done using a .NET program under IIS on Windows Server 2003.

                                                 

                                                I will try to use the LiveCycle Assembler service to consolidate the incremental updates, but I have never done that from ASP.NET.

                                                 

                                                Tarek.

                                                • 21. Re: PDF Size will increase in size dramatically with every submit.
                                                  Chuck Myers (Adobe) Community Member

                                                  You sent a file to support that shows the problem well.  The signed file had 7 incremental updates, and each update was about 1.3MB. But I noted that the image size varied significantly: some GIFs were 3KB, while the TIFs were 360KB (all measured on the Base64 data).  I would venture to say that you won't have a dramatic issue like this if the files are 1/100th the size.

                                                   

                                                  I kicked this around with a key form developer (see his blog) and he had a great idea.  You can check the size of the image that the user has attached and give them an error if they have added an image that is too large: that can give them some idea on how to create a thumbnail.  John's words were: "Just look at the length of the imagefield.rawValue – will tell them the size of the base64 image. If it’s too big, clear the field." That may be the most effective way to make the size increase less dramatic.  And it should not change your workflow.
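The check John describes might look roughly like the JavaScript sketch below. It is not the code from his blog. The helper name `isImageTooLarge`, the 40KB limit, and the field name `ImageField1` in the comment are all assumptions made for illustration; only the idea of comparing the length of the image field's `rawValue` against a limit comes from the quote above.

```javascript
// Sketch: decide whether a Base64 image value exceeds a size limit.
// Assumed limit: the Base64 length of a 40KB image, ignoring line breaks.
var MAX_BASE64_CHARS = Math.ceil((40 * 1024) / 3) * 4;

function isImageTooLarge(rawValue, maxChars) {
  // Treat a missing value (no image attached) as acceptable.
  if (rawValue === null || rawValue === undefined) {
    return false;
  }
  return String(rawValue).length > maxChars;
}

// Inside the form, this could be wired up along these lines
// (ImageField1 is a hypothetical field name):
//   if (isImageTooLarge(ImageField1.rawValue, MAX_BASE64_CHARS)) {
//     ImageField1.rawValue = null; // clear the field, as John suggests
//     xfa.host.messageBox("The image is too large. Please attach a smaller one.");
//   }
```

Keeping the size test in a small standalone function makes it easy to try different limits before putting the check into the form's script.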

                                                  • 22. Re: PDF Size will increase in size dramatically with every submit.
                                                    Chuck Myers (Adobe) Community Member

                                                    The heart of the problem was that large images were being placed in XFA image fields. Due to the design of PDF and incremental updates, copies of these images were being added to the file for each file save.  I'll write more on this later, most likely on the ADEP product blog.  But for now, the solution is to limit the size of the image in the field.  [As background, the image was used for a 1x1 inch thumbnail of a face, which is well-satisfied by a 72 DPI highly compressed JPG, or around 20-40K bytes or less.  The images in the file were on the order of megabytes, which caused massive issues.]

                                                     

                                                    John Brinkman did a blog post on how to check the image size and generate an error if it is too large.  You can see this on John's Formfeed blog, and it is quite elegant.

                                                    • 23. Re: PDF Size will increase in size dramatically with every submit.
                                                      tarekahf Community Member

                                                      Thanks a lot, Chuck and John.

                                                       

                                                      I think this will control the issues I am facing, and will catch the cause before it hits the server.

                                                       

                                                      I will implement this check in my form ASPA. I tested the sample form from John's blog, and it is working fine.

                                                       

                                                      Tarek.

                                                      • 24. Re: PDF Size will increase in size dramatically with every submit.
                                                        Jeff-Adobe Adobe Employee

                                                        Thanks John. That Formfeed blog post is a great answer to this question.

                                                        -Jeff

                                                        • 25. Re: PDF Size will increase in size dramatically with every submit.
                                                          tarekahf Community Member

                                                          Just to confirm that I implemented the JavaScript code to check the image size before submit several weeks ago, and so far I have not faced this problem again.

                                                           

                                                          Many thanks again.

                                                           

                                                          Tarek.

                                                          • 26. Re: PDF Size will increase in size dramatically with every submit.
                                                            Moris.Mihailidis

                                                            I have a similar problem with the size of a pdf increasing.

                                                            Using canopener did not reveal much at first, but then I used PDFXplorer and found a difference in results.

                                                             

                                                            Catalog

                                                            --- Acroform

                                                            ------ Fields

                                                            ------ XFA

                                                             

                                                            With canopener the Fields object is empty.

                                                            With PDFXplorer the Fields object is repeating (a lot!) and contains an XFA object. I am guessing this is where the huge size comes from.

                                                             

                                                            Can anybody advise me as to what the Fields object actually represents and how/when it is populated?

                                                             

                                                            If I save the pdf using Acrobat, the size is heavily reduced.

                                                            Then if I view in PDFXplorer, the Fields object is empty.

                                                            • 27. Re: PDF Size will increase in size dramatically with every submit.
                                                              tarekahf Community Member

                                                              Hi Moris,

                                                               

                                                              Thanks for the update on the same problem.

                                                               

                                                              I think, based on my understanding of the feedback from Adobe staff in this thread, that those fields and the XFA content are repeating because this is how it was programmed (or architected): to keep copies of incremental updates.

                                                               

                                                              It will be good if you can post more information about the version you are using ...etc.

                                                               

                                                              Tarek.

                                                              • 28. Re: PDF Size will increase in size dramatically with every submit.
                                                                Moris.Mihailidis Community Member

                                                                Hi Tarek

                                                                 

                                                                Appreciate your comment.

                                                                 

                                                                I do not have any specific version of Reader or Acrobat the users are using when this issue occurs. I have asked 1st level support to gather those details next time if possible.

                                                                 

                                                                I have raised this with Adobe Enterprise Support, hopefully they can shed some light.

                                                                 

                                                                I did find that Fields object is part of the Interactive Form Dictionary.

                                                                Also, following your feedback, there are only 2 instances of %%EOF in the PDF at fault.

                                                                 

                                                                Moris