8 Replies Latest reply: Apr 30, 2013 5:31 AM by lazydba

    Parsing postscript file

    lazydba Community Member

      Hi.

      To extract the Purchase Order number from the PostScript file produced by Crystal Reports 8.5 in our current environment (Windows 2003 (32-bit), Crystal Reports 8.5),

      we parse the file with gawk/awk and rename it using this number; as a result, the vendor receives a .pdf file named after the PO number.

       

      The same task fails with Crystal Reports 11.5 (Windows 8 (64-bit), Crystal Reports 11.5, Apple (or HP) virtual printer): gawk/awk no longer finds the number.

      Could you please let us know what technology we can use to extract this information now?

      Thanks.

        • 1. Re: Parsing postscript file
          Helge Blischke Community Member

          Please post (a URL to) sample files for both versions for me to look into.

           

          Helge

          • 2. Re: Parsing postscript file
            lazydba Community Member

            Thank you for the response, Helge.

            Can I send these files by email?

            Thanks,

            Alex

            • 3. Re: Parsing postscript file
              lazydba Community Member

              The image below shows the string we are looking for in the good old PostScript file (Crystal Reports 8.5).

              The PostScript file from CR 11.5 does not have this section at all, and we cannot find the requested string anywhere in it

              (searching as ASCII).

              When we convert the PS file from CR 11.5 with ps2txt.exe we can find something related to our numbers, but we are not sure whether this is the only way.

              Thanks.

               

               

              716046_OK.PNG

              • 4. Re: Parsing postscript file
                Helge Blischke Community Member

                Yes, you may send the files by email (address: h dot blischke at acm dot org).

                • 5. Re: Parsing postscript file
                  Helge Blischke Community Member

                  Hi Alex,

                   

                  the CR 11.5 file has been produced with the feature "font subsetting" switched on.

                  That leads to a font encoding quite different from ordinary ASCII, for example

                  (from your 70016046.ps file):

                   

                  <0201010304010504>                    (hex codes)

                  "  7  0  0  1  6  0  4  6"                    (equivalent ASCII characters)

                   

                  Note that all text characters are encoded as hex strings, probably

                  due to the font subsetting feature.

                   

                  Try to switch off the font subsetting (as I don't have a Win8 at hand, I cannot

                  tell you how to do this) and try again. Maybe the hex representation is the

                  default in your system, so you may need to try looking for the respective

                  hex string as well.
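A minimal sketch of decoding such hex strings in Python, assuming the code-to-character table can be recovered for the file at hand. The table below is taken only from the single sample above (02 is "7", 01 is "0", and so on); a subset font assigns its codes per file, so `CODE_TO_CHAR` and the function name `decode_hex_strings` are illustrative assumptions, not something the report tool guarantees.

```python
import re

# Assumption: mapping read off the one sample <0201010304010504> -> 70016046.
# Subset fonts assign codes per file, so this table will differ elsewhere.
CODE_TO_CHAR = {0x01: "0", 0x02: "7", 0x03: "1", 0x04: "6", 0x05: "4"}

# A PostScript hex string: hex digits (whitespace allowed) between < and >.
HEX_STRING = re.compile(r"<([0-9A-Fa-f\s]+)>")


def decode_hex_strings(ps_text, table=CODE_TO_CHAR):
    """Return the decoded text of every hex string <...> found in ps_text."""
    decoded = []
    for match in HEX_STRING.finditer(ps_text):
        digits = re.sub(r"\s+", "", match.group(1))  # whitespace is ignored
        if len(digits) % 2:
            digits += "0"  # an odd final digit is treated as followed by 0
        codes = bytes.fromhex(digits)
        # Codes missing from the table are shown as "?" rather than guessed.
        decoded.append("".join(table.get(code, "?") for code in codes))
    return decoded
```

Called on the sample line, `decode_hex_strings("<0201010304010504> Tj")` yields a one-element list holding the PO number. Because the table is file-specific, this only helps if the mapping is stable across reports or is rebuilt from each file's font data.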

                   

                  Helge

                  • 6. Re: Parsing postscript file
                    lazydba Community Member

                    Thank you, Helge.

                    I will do more research tomorrow.

                    Maybe you can suggest a simple script to convert hex to ASCII, in case we are not able to switch the font encoding to ASCII? We run CR 11.5 from within an application, so I will also ask the application developers how we can get the font encoding changed.

                    Thanks a lot one more time,

                    Alex

                    • 7. Re: Parsing postscript file
                      Helge Blischke Community Member

                      Alex,

                       

                      I just ran both PS files through Ghostscript's ps2ascii utility, and wow, in both cases

                      I got human-readable ASCII output, which shows that this utility copes fairly well

                      with the ad-hoc encoding used in the 11.5 PS file.

                       

                      For confidentiality reasons I won't post the converted texts here but will mail them to you in a

                      separate email.

                      You will see from these examples that Crystal Reports 11.5 apparently preprocesses the constant parts

                      of the report contents and emits the variable parts afterwards, so that the variable data in the

                      ASCII text do not appear next to the labels they belong to.

                       

                      I'd suggest installing Ghostscript (a fairly recent version, >= 9.01, is recommended) and using the

                      ps2ascii utility instead of your awk script.
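A rough sketch of that workflow in Python, under stated assumptions: `ps2ascii` is reachable on the PATH and writes the text to standard output when given only an input file (on Windows the launcher may be a batch wrapper, so the command name is left as a parameter), and the PO number is the first standalone run of eight digits, as in the 70016046 sample. Since the variable data do not sit next to their labels, the search relies on the digit pattern alone. The function names and the default pattern are hypothetical.

```python
import re
import subprocess


def ps_to_text(ps_path, command="ps2ascii"):
    """Run the converter on ps_path and return the text it prints.

    Assumption: `command` is on the PATH and writes to standard output.
    """
    result = subprocess.run(
        [command, ps_path], capture_output=True, text=True, check=True
    )
    return result.stdout


def find_po_number(text, pattern=r"(?<!\d)\d{8}(?!\d)"):
    """Return the first match of pattern in text, or None if there is none.

    Assumption: the default pattern (exactly eight digits) reflects only
    the sample discussed in this thread; adjust it to the real PO format.
    """
    match = re.search(pattern, text)
    return match.group(0) if match else None
```

A typical use would be `find_po_number(ps_to_text("report.ps"))`, with the result then used to name the PDF. If other eight-digit values such as dates or phone numbers can appear earlier in the report, the pattern needs to be tightened.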

                       

                      Helge

                      • 8. Re: Parsing postscript file
                        lazydba Community Member

                        Thanks a lot, Helge.

                        Very valuable advice.

                        I will review our scripts today and will post results later.

                        Best regards,

                        Alex