6 Replies Latest reply on Jul 15, 2013 4:23 AM by Ashu1780

    CIDset in subset font is incomplete error by pdf/a-1A validation

    Ashu1780

      Hi,

       

      I have 100 PDF files that were generated by OCR software (with the tagging option enabled in the OCR software). These PDFs are generated as PDF/A-1a compliant.

       

      While verifying compliance with the Acrobat Preflight tool, I get the following error on every page:

       

      "CIDset in subset font is incomplete"

       

      Can anyone please help me to fix these errors?

       

      The customer requires that the documents be scanned, that the PDFs be created by OCR software, and that they be compliant with PDF/A-1a.

       

      Thanks,

      Naushad

        • 1. Re: CIDset in subset font is incomplete error by pdf/a-1A validation
          raeben3 Level 3

          What software are you using to do OCR?

           

          Is there a way you can adjust which font the OCR software uses, or use different OCR software? This is not a standard OpenType font or one that can be mapped to Unicode.

           

           

          I'm not sure exactly how OCR software places fonts in PDFs. Does the font show up in the PDF Properties under Fonts? If so, what is the name of the font?

           

          When you examine the tagged text in the Tags panel, if you open up a tag and look at the content, does it make sense or is it nonsense (gobbledygook)?

           

          While OCR text is not directly visible to the end user, it can be selected using the text tool and it is recognized by Acrobat for tagging purposes. Assistive technology, e.g. a screen reader, will read from this text. So if it is not understandable, you do not have an accessible file.

           

          You would probably obtain better results using the Acrobat OCR feature, preferably on a Windows machine. It's been a while since I've exchanged files between Mac and Windows, and I wouldn't trust that the encodings would be the same without testing it.

          • 2. Re: CIDset in subset font is incomplete error by pdf/a-1A validation
            Test Screen Name Most Valuable Participant

            I believe that the report is absolutely correct, and many files fail this test. What it describes is of no importance, but it is one of the rules of PDF/A.

            If you are making them with Acrobat, try using the latest version.
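
            For readers wondering what the check looks at: per the PDF Reference, a CIDSet is a stream in the font descriptor holding a table of bits indexed by CID, stored high-order bit first, so the first bit of the first byte is CID 0. The sketch below is only an illustration of that layout in plain Python. The function names are hypothetical, it does not read a real PDF, and it is not how Acrobat Preflight or FineReader is implemented. It assumes you already have the CIDSet bytes and the list of CIDs present in the embedded font subset.

```python
def build_cidset(cids):
    """Pack a collection of CIDs into CIDSet bytes (one bit per CID, MSB first)."""
    cids = set(cids)
    if not cids:
        return b""
    bitmap = bytearray(max(cids) // 8 + 1)
    for cid in cids:
        # Bit 7 of byte 0 is CID 0, bit 6 is CID 1, and so on.
        bitmap[cid // 8] |= 0x80 >> (cid % 8)
    return bytes(bitmap)


def cids_in_cidset(data):
    """Unpack CIDSet bytes back into the set of CIDs they flag."""
    return {
        index * 8 + bit
        for index, byte in enumerate(data)
        for bit in range(8)
        if byte & (0x80 >> bit)
    }


def missing_from_cidset(cidset_bytes, embedded_cids):
    """List CIDs that are in the embedded subset but not flagged in the CIDSet."""
    return sorted(set(embedded_cids) - cids_in_cidset(cidset_bytes))


if __name__ == "__main__":
    # Hypothetical data: the CIDSet flags CIDs 0, 1 and 9,
    # but the embedded subset also contains CID 12.
    cidset = build_cidset({0, 1, 9})
    print(missing_from_cidset(cidset, [0, 1, 9, 12]))
```

            A non-empty result from missing_from_cidset would correspond to an incomplete CIDSet in this simplified model: the embedded subset contains glyphs that the bitmap does not list.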

            • 3. Re: CIDset in subset font is incomplete error by pdf/a-1A validation
              Ashu1780 Level 1

              Thank you for your valuable response. Please see below my response -

               

              1. I am using ABBYY FineReader (version 11) to OCR scanned images and create searchable PDFs with corrected OCR text. The PDFs generated by the OCR software should be tagged and compliant with PDF/A-1a. The OCR software has an option to save the PDF (text under page image) as PDF/A.

               

              2. Yes, all the fonts are shown in the PDF properties, but with the CIDset embedded option.

               

              3. When I review the tagged text in the Tags panel, most of it is tagged in the wrong sequence. We are looking into correcting this manually with Acrobat's TouchUp Text option.

               

              4. We are using the Windows versions of the OCR software and Acrobat.

               

               

              I believe there should be some way to rectify these problems.

               

              Looking forward to hearing from you.

               

              Regards,

              Naushad

              • 4. Re: CIDset in subset font is incomplete error by pdf/a-1A validation
                Ashu1780 Level 1

                Thank you for your valuable response.

                 

                I am using Acrobat 9 and the latest version of FineReader. Please advise if either of these needs to be upgraded.

                 

                I have provided complete details of the process and the software versions used in my other reply.

                 

                 

                Looking forward to hearing from you.

                 

                Regards,

                Naushad

                • 5. Re: CIDset in subset font is incomplete error by pdf/a-1A validation
                  Test Screen Name Most Valuable Participant

                  I believe that the Acrobat report is correct, so it is the PDF creation software that is at fault. You should contact the maker and see if they know about the problem (rather than just upgrading on the assumption it is fixed).

                   

                  I believe I read that Acrobat Pro XI can also fix this problem in preflight, if that is the only option.

                   

                  Tagging order is a different question. Incorrect order from OCR is not a software fault, but normal. Arguably the whole idea of automatic tagging is completely wrong, since if it could be automated properly, tagging would not have been invented. It is best to think of automatic tagging as a time saver, and the first part of a process which must include detailed and expert examination of all tags in every page.

                  • 6. Re: CIDset in subset font is incomplete error by pdf/a-1A validation
                    Ashu1780 Level 1

                    Thanks a lot! I have written to FineReader support about this.

                     

                    I will post an update on this shortly.

                     

                    Best regards,

                    Naushad