2 Replies Latest reply on May 20, 2012 1:56 AM by Dave Merchant

    Preflight method to identify fonts which will fail tagging/extraction




      I've run across several PDFs that contain problematic fonts. Examples of the symptoms:


      1.  Adding tagging to the document inserts strange characters around the text set in those fonts (e.g. "@" or "É").

      2.  Copying and pasting the text into Word produces strange boxes, question marks, and garbled letters.


      I've been investigating this problem, and it seems to be caused by a failure to convert character codes to Unicode values. I've noticed that some of the problematic text uses a custom or Identity-H encoding and doesn't contain a CMap reference to Unicode (a ToUnicode entry).


      However, I've also seen several instances of text that is missing a Unicode mapping where copying it into Word seems to work fine.


      Is there a custom preflight profile that could proactively identify text that will fail when copied and pasted, or that will produce strange characters when accessibility tagging is added?


      Alternatively, is there a check that will at least flag text that is likely to fail in this way?


      I'm using Acrobat 9 Pro.