1 Reply Latest reply: Jul 10, 2014 3:01 PM by Test Screen Name

    How to display Cyrillic characters in a PDF

    mwebb987 Community Member

      I am fairly green when it comes to representing text in PDF documents and need some assistance. My main question is: how do I represent Cyrillic characters in PDF files?

       

      I know the basics of how to represent text in PDF files and the PostScript commands to use. I know that bytes written to the file in the range of 0 to 255 will print correctly when using the correct encoding (we are using the WinAnsiEncoding). What I cannot seem to figure out is how to represent extended character sets and different glyphs (such as those used in the Cyrillic alphabet) in a PDF file. Do I need to use CID fonts and CMaps?

       

      Here is an example of the text I understand how to print:

       

      stream
      0.00000000 0.00000000 0.00000000 RG
      0.00000000 0.00000000 0.00000000 rg
      BT
      /Helvetica 14 Tf
      7.2 768.96 Td
      (Hello World!) Tj
      ET
      endstream

       

      I'm really not clear on how to represent any of the Chinese or Japanese fonts either, so really any help here is appreciated. Any examples are appreciated as well.

       

      Thanks!

        • 1. Re: How to display Cyrillic characters in a PDF
          Test Screen Name CommunityMVP

          You don't need to use CIDFonts and CMaps for Cyrillic (though you can). The crucial thing to realise is that displaying Latin1 (that is, English and related text) leaves you in a very simple corner of PDF. Doing anything with other encodings instantly makes a project more complex, perhaps 10 times more complex. Far Eastern fonts are perhaps 10 times more complex again.

           

          The principle is the same for all of them. To use any character you need

          1. A font containing that character. PDF has built-in fonts containing Latin1, such as Helvetica, but there is no such luxury for other encodings.

          2. The right (license) to embed the font.

          3. The technical ability to embed the font. In many cases this isn't just a matter of embedding a file as a stream; you also need to analyse the tables in the font, and sometimes trim or modify them.

          4. An encoding for the font.

          5. Text streams which use character positions in the encoding to show the text.
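
          Steps 4 and 5 can be sketched in Python for the simple-font route (no CIDFonts). This is only a minimal sketch under stated assumptions, not a complete PDF writer: the function names are hypothetical, the base encoding is assumed to be WinAnsiEncoding (so ASCII text keeps its usual codes), the uniXXXX glyph names follow the Adobe glyph naming convention and only help if the embedded font really contains glyphs under those names, and one font can carry at most 128 distinct non-ASCII characters this way.

```python
def build_encoding(text, first_code=128):
    """Assign one byte code (128..255) to each distinct non-ASCII character.

    Hypothetical helper: codes are handed out in order of first appearance.
    """
    codes = {}
    for ch in text:
        if ord(ch) < 128 or ch in codes:
            continue  # ASCII keeps its own code; repeats are already mapped
        code = first_code + len(codes)
        if code > 255:
            raise ValueError("too many distinct characters for one simple font")
        codes[ch] = code
    return codes


def differences_array(codes):
    """Render the mapping as the body of a /Differences array.

    Glyph names use the uniXXXX form (BMP characters only); this assumes
    the embedded font names its glyphs that way.
    """
    parts = []
    prev = None
    for ch, code in sorted(codes.items(), key=lambda item: item[1]):
        if prev is None or code != prev + 1:
            parts.append(str(code))  # start a new run of consecutive codes
        parts.append("/uni%04X" % ord(ch))
        prev = code
    return "[" + " ".join(parts) + "]"


def show_text_operator(text, codes):
    """Encode text to single bytes and emit a hex-string Tj operator."""
    data = bytes(codes.get(ch, ord(ch)) for ch in text)
    return "<" + data.hex().upper() + "> Tj"


if __name__ == "__main__":
    sample = "Привет"
    mapping = build_encoding(sample)
    print(differences_array(mapping))
    print(show_text_operator(sample, mapping))
```

          For the sample word this prints a /Differences array starting at code 128 and the operator <808182838485> Tj. The array would go into the font's /Encoding dictionary, and the operator into a content stream like the one in the question, after a Tf that selects the embedded font.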

           

          Basically you need to read and read and reread the chapter on text in the PDF Reference, along with the documents it refers to (such as the font format specifications). This will become your constant friend or tormentor for the many months of the project.

           

          If you don't like the sound of that, or it doesn't make economic sense to do that, there are many PDF libraries whose authors have already spent the necessary months or years on this.