15 Replies Latest reply on May 11, 2018 6:46 AM by kingaling

    Cascading Filters - Streams

    chellappanp

      I am trying to decode cascading filters. I have written my own ASCII85 encoding and decoding filters.

      For Flate encoding and decoding, I am using zlib.

       

      When I first do an ASCII85 decoding and then a FLATE decoding on a stream using

      /Filter [/FlateDecode /ASCII85Decode] it works fine

      but

      when I do a FLATE encoding and then an ASCII85 encoding on the same stream using

      /Filter [/ASCII85Decode /FlateDecode] it does not.

       

      Any pointers as to why this happens ?

       

      Thanks

      P.Chellappan

        • 1. Re: Cascading Filters - Streams
          lrosenth Adobe Employee

          You have encoding backwards.

           

          Encoding happens left to right (or in array index order) while decoding happens in reverse.

          • 2. Re: Cascading Filters - Streams
            chellappanp Level 1

            Yes. I understand.

            Assume "a" is a stream

            If I first encode "a" with ASCII85 and then with FLATE and use /Filter [/FlateDecode /ASCII85Decode] it works fine.

            But if I first encode "a" with FLATE and then with ASCII85 and use /Filter [/ASCII85Decode /FlateDecode] there is a problem.

             

            P.Chellappan

            • 3. Re: Cascading Filters - Streams
              lrosenth Adobe Employee

              It seems like you are not reading ISO 32000 (aka the PDF Standard).  It is very clear there:

               

               

              Key: Filter
              Type: name or array
              Value: (Optional) The name, or an array of zero, one or several names, of filter(s) that shall be applied in processing the stream data found between the keywords stream and endstream. Multiple filters shall be specified in the order in which they are to be applied.

               

              So if you encode A with ASCII85 and then Flate, then the correct value would be [/FlateDecode /ASCII85Decode] and NOT the other way around!

              • 4. Re: Cascading Filters - Streams
                chellappanp Level 1

                a = deflate(ascii85_encode(a))

                WORKS ONLY WITH /Filter [/FlateDecode /ASCII85Decode] and not the other way around

                 

                a = ascii85_encode(deflate(a))

                DOES NOT WORK WITH /Filter [/ASCII85Decode /FlateDecode] or /Filter [/FlateDecode /ASCII85Decode]

                 

                What I mean by works, is it opens in Acrobat.
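
                For reference, a minimal sketch of both encoding orders and the /Filter array that matches each one. It assumes Python's standard zlib and base64 modules stand in for the poster's own filters; the helper names (ascii85_encode, ascii85_decode, decode_stream) are hypothetical and not taken from any code in this thread.

```python
import base64
import zlib

def ascii85_encode(data: bytes) -> bytes:
    # PDF-style ASCII85: no "<~" prefix, but the "~>" EOD marker is required.
    return base64.a85encode(data) + b"~>"

def ascii85_decode(data: bytes) -> bytes:
    body = data.strip()
    if not body.endswith(b"~>"):
        raise ValueError("ASCII85 stream is missing the ~> EOD marker")
    return base64.a85decode(body[:-2])

# Hypothetical lookup table: filter name -> decoder function.
DECODERS = {"FlateDecode": zlib.decompress, "ASCII85Decode": ascii85_decode}

def decode_stream(raw: bytes, filters: list) -> bytes:
    # Apply the decoders in the order the names appear in the /Filter array.
    for name in filters:
        raw = DECODERS[name](raw)
    return raw

content = b"BT /F1 12 Tf (Hello) Tj ET\n" * 20

# Flate first, then ASCII85  ->  /Filter [/ASCII85Decode /FlateDecode]
flate_then_a85 = ascii85_encode(zlib.compress(content))
# ASCII85 first, then Flate  ->  /Filter [/FlateDecode /ASCII85Decode]
a85_then_flate = zlib.compress(ascii85_encode(content))

print(decode_stream(flate_then_a85, ["ASCII85Decode", "FlateDecode"]) == content)
print(decode_stream(a85_then_flate, ["FlateDecode", "ASCII85Decode"]) == content)
```

                If both orders round-trip with a reference implementation like this, a failure in only one of them points at the bytes written into the file rather than at the /Filter order.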

                • 5. Re: Cascading Filters - Streams
                  chellappanp Level 1

                  This is what is given in ISO 32000

                   

                  For example, data encoded using LZW and ASCII base-85 encoding (in that order) shall be decoded using the following entry in the stream dictionary:

                  EXAMPLE 2    /Filter [ /ASCII85Decode /LZWDecode ]

                   

                  This is exactly the way I have used it too.

                   

                  Could anything else be wrong ?

                  • 6. Re: Cascading Filters - Streams
                    kingaling

                    Do you have a sample of data that was deflated and then ascii85 encoded?
                    Also I would be interested to see the result of the ascii85 decoding prior to it being read by the flatedecoder.
                    Because based on what you've stated here, the only thing I can think of is that the output of the ascii85 decoding is not exactly correct. And I only suggest that because you said you wrote the ascii85 decoder yourself.

                    • 7. Re: Cascading Filters - Streams
                      chellappanp Level 1

                      Shared Files - Acrobat.com

                       

                      The above is the link to the following uploaded sample files. Please let me know if you need further samples.

                       

                      vc_ups_0.pdf - Object 24 not compressed

                      Remarks - PDF is ok.

                       

                      vc_ups_1.pdf - Object 24 uses /Filter [/FlateDecode /ASCII85Decode]

                      Remarks - PDF is ok.

                       

                      vc_ups_2.pdf - Object 24 uses /Filter [/ASCII85Decode /FlateDecode]

                      Remarks - PDF is not ok.

                       

                      vc_ups_3.pdf - Object 24 uses /Filter [/ASCII85Decode]

                      Remarks - PDF is ok.

                       

                      vc_ups_4.pdf - Object 24 uses /Filter [/FlateDecode]

                      Remarks - PDF is ok

                       

                      • 8. Re: Cascading Filters - Streams
                        kingaling Level 1

                        The stream in vc_ups_2.pdf - Object 24 appears to be invalid.

                        [Image: ascii85_stream.png]

                        Per the PDF Reference, sixth edition, Adobe® Portable Document Format, Version 1.7, November 2006:

                         

                        "The ASCII base-85 encoding uses the characters ! through u and the character z, with the 2-character sequence ~> as its EOD marker. The ASCII85Decode filter ignores all white-space characters (see Section 3.1, “Lexical Conventions”). Any other characters, and any character sequences that represent impossible combinations in the ASCII base-85 encoding, cause an error."

                         

                        You have a bunch of chars in that stream that violate the spec.
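
                        A rough sketch of how such a check could be automated, following the rule quoted above. The function name is hypothetical, and the white-space set (NUL, HT, LF, FF, CR, SP) is my reading of the spec's lexical conventions, so treat it as an assumption.

```python
# Assumed PDF white-space bytes: NUL, HT, LF, FF, CR, SP.
PDF_WHITESPACE = b"\x00\t\n\x0c\r "

def invalid_ascii85_bytes(stream: bytes):
    """Return ([(offset, byte), ...], has_eod) for an ASCII85-encoded stream."""
    # Everything before the first "~>" is the encoded body.
    body, marker, _ = stream.partition(b"~>")
    bad = [
        (offset, byte)
        for offset, byte in enumerate(body)
        if not (0x21 <= byte <= 0x75          # '!' through 'u'
                or byte == ord("z")           # shorthand for four zero bytes
                or byte in PDF_WHITESPACE)    # ignored by the decoder
    ]
    return bad, bool(marker)

# A well-formed stream: no bad bytes, EOD marker present.
print(invalid_ascii85_bytes(b'87cURD]i,"Ebo80~>'))
# Raw zlib output mistaken for ASCII85: every byte is flagged, no EOD marker.
print(invalid_ascii85_bytes(b"x\x9c\x01"))
```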

                        • 9. Re: Cascading Filters - Streams
                          chellappanp Level 1

                          As you can see vc_ups_1.pdf which is first ASCII85 encoded and then Flate encoded works fine.

                          vc_ups_3.pdf which is only ASCII85 encoded works fine.

                          vc_ups_4.pdf which is only Flate encoded also works fine.

                          So it appears that the ASCII85 encoder is working fine.

                          But the problem starts when the stream is first Flate encoded and then ASCII85 encoded.

                          So probably Flate encoding inserts some white space characters in the stream.

                          Can you please let me know what are the white space characters.

                          Do I simply remove all these white space characters from the Flate encoded stream before the ASCII85 encoding process ?

                           

                          Thanks

                          P.Chellappan

                          • 10. Re: Cascading Filters - Streams
                            kingaling Level 1

                            The invalid chars are in the picture I added to my last post (see all the values within the blue highlighted area that are less than 0x21).
                            Also, I have decoded object 24 from vc_ups_3.pdf and vc_ups_4.pdf. They are not the same.
                            If I had to guess, I would say your code is not properly accounting for non-printable characters.
                            Also your ASCII85 stream in object 24 does not terminate with the proper character sequence of "~>" as it is stated in the spec.

                             

                            And finally,
                            RE: Do I simply remove all these white space characters from the Flate encoded stream before the ASCII85 encoding process ?

                            Answer: No. If you did that, once you ASCII85Decode the stream, your next step would be to try to FlateDecode that stream which would fail since you previously removed characters from it.

                             

                            It is OK to have non-printable chars in a Flate encoded stream.
                            It is NOT OK to have non-printable chars in an ASCII85 encoded stream.
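
                            A small sketch of why stripping bytes from the Flate output cannot work. It uses Python's zlib; compression level 0 is chosen only so that the output predictably contains white-space-valued bytes, and the helper names are hypothetical.

```python
import zlib

# Byte values that PDF treats as white space (assumed set: NUL, HT, LF, FF, CR, SP).
PDF_WHITESPACE = b"\x00\t\n\x0c\r "

def strip_pdf_whitespace(data: bytes) -> bytes:
    return bytes(b for b in data if b not in PDF_WHITESPACE)

payload = bytes(range(256))
# Level 0 stores the payload verbatim, so white-space-valued bytes are guaranteed here.
compressed = zlib.compress(payload, 0)
stripped = strip_pdf_whitespace(compressed)

def inflates_ok(data: bytes) -> bool:
    try:
        return zlib.decompress(data) == payload
    except zlib.error:
        return False

print(len(compressed) - len(stripped), "bytes removed")
print("original inflates:", inflates_ok(compressed))
print("stripped inflates:", inflates_ok(stripped))
```

                            Those bytes are data, not formatting, so the ASCII85 encoder has to accept every value from 0x00 to 0xFF as input.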

                             

                            Your ASCII85 code needs to take into account the correct start and end points of the streams, and I mean the EXACT start and end points. See the PDF spec section about streams ending with CRLF (0x0D 0x0A) vs. just LF (0x0A) or just CR (0x0D). It also needs to take into account the length of each piece and the overall size to properly account for any required padding.
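
                            To make the padding point concrete, here is a minimal hand-rolled encoder sketch in Python. It is not the code from the sample files or from any link in this thread, just one way to handle the final partial group and the "~>" marker, cross-checked against the standard library's base64.a85encode.

```python
import base64

def ascii85_encode(data: bytes) -> bytes:
    out = bytearray()
    for i in range(0, len(data), 4):
        chunk = data[i:i + 4]
        n = len(chunk)
        # Pad a short final group with zero bytes before converting.
        value = int.from_bytes(chunk.ljust(4, b"\x00"), "big")
        if n == 4 and value == 0:
            out += b"z"  # 'z' is only allowed for a full group of four zero bytes
            continue
        digits = bytearray(5)
        for j in range(4, -1, -1):
            value, rem = divmod(value, 85)
            digits[j] = rem + 0x21  # map 0..84 onto '!'..'u'
        # A group of n input bytes yields n + 1 output characters.
        out += digits[:n + 1]
    return bytes(out) + b"~>"  # EOD marker is mandatory

# Compare with the standard library for every tail length.
for size in range(20):
    sample = bytes(range(size))
    assert ascii85_encode(sample) == base64.a85encode(sample) + b"~>"
print(ascii85_encode(b"Man "))
```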

                            • 11. Re: Cascading Filters - Streams
                              kingaling Level 1

                              I guess I can't edit my post so here's an update to it:

                              Object 24 from vc_ups_3.pdf and vc_ups_4.pdf after decoding ARE in fact the same. Sorry about that.

                              However the ASCII85 encoded stream in vc_ups_3.pdf is missing the terminating '~>'

                               

                              Everything else stated in my previous post stands.

                              I have no ASCII85 encoder myself, so I'm coding one up. I'll keep ya posted with my findings.

                              • 12. Re: Cascading Filters - Streams
                                chellappanp Level 1

                                Thanks for your efforts. Please let me know when you are done with the encoder.

                                 

                                P.Chellappan

                                • 13. Re: Cascading Filters - Streams
                                  kingaling Level 1

                                  ASCII85 Code

                                   

                                  Well, that was an adventure. I actually needed to do this anyway so it was not an issue for me.
                                  If you run across any issues with the code, feel free to drop me an issue on my github page noted in the code.

                                   

                                  Keep in mind though that the rules noted in the specification regarding EOL chars before and after the streams should still be respected. This is only for encoding / decoding.
                                  Ensuring you have the correct start and end points remains up to you.
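
                                  A sketch of what getting the start point right might look like, assuming the object's raw bytes and its /Length value are already in hand. The function name is hypothetical, and searching for the first "stream" keyword is a simplification that a real parser would replace with proper tokenizing.

```python
def extract_stream_data(obj: bytes, length: int) -> bytes:
    """Return exactly `length` bytes of stream data from a raw PDF object."""
    start = obj.index(b"stream") + len(b"stream")
    # The keyword must be followed by CRLF or a lone LF; a lone CR is not allowed.
    if obj[start:start + 2] == b"\r\n":
        start += 2
    elif obj[start:start + 1] == b"\n":
        start += 1
    else:
        raise ValueError("'stream' keyword not followed by CRLF or LF")
    data = obj[start:start + length]
    if len(data) != length:
        raise ValueError("object is shorter than /Length says")
    return data

raw_obj = b"<< /Length 5 >>\nstream\r\nab\ncd\nendstream"
print(extract_stream_data(raw_obj, 5))
```

                                  The EOL before endstream is not part of the data, which is why slicing by /Length is safer than searching for the keyword.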

                                   

                                  Take care.

                                   

                                  Shane

                                  • 14. Re: Cascading Filters - Streams
                                    chellappanp Level 1

                                    When I click on the ASCII85 Code link, I get a file not found error.

                                    Can you please send it again ?

                                     

                                    P.Chellappan

                                    • 15. Re: Cascading Filters - Streams
                                      kingaling Level 1

                                      Derp sorry. Try this: ASCII85 Code