4 Replies Latest reply: Dec 15, 2010 7:40 AM by Linda Stern

    Copy cyrillic text from Adobe Reader

    Linda Stern

      Hi,

       

      I have some PDF documents that contain text in Russian (Cyrillic letters). I open these PDFs with Adobe Reader, select the text and copy it to the clipboard. When I then paste the text into Word or any other application, there is no Russian, just special symbols. Note that the text is displayed correctly in the PDF itself.

       

      I use Windows 7 Ultimate, 64-bit, in German. Is there any way to get Cyrillic text onto the clipboard without garbling it?

       

      Thank you,

      Linda

        • 1. Re: Copy cyrillic text from Adobe Reader
          Claudio González MVP

          My guess is that the fonts were not properly embedded in the PDF, and you don't have them on your system...

          • 2. Re: Copy cyrillic text from Adobe Reader
            Linda Stern Community Member

            Claudio, thanks for your guess.

             

            I have asked the creator of the PDF files which fonts were used to generate them, and the answer was Arial MT. I have installed Arial MT TrueType on my system, but it did not help. I assume that I have to use Type 1 fonts. However, I am not sure where I can get the Arial MT Type 1 font, or whether the font is commercial.

             

            I have also found a hint that Adobe Type Manager does not work with Windows 7, 64-bit, which is my system. But I am not at all sure whether that has anything to do with my issue.

            • 3. Re: Copy cyrillic text from Adobe Reader
              Claudio González MVP

              Linda, I very much doubt that any variant of Arial contains Cyrillic characters. And, as far as I know, there is no way to copy and paste text from a PDF into another format if its fonts are not on your system (sorry, I had overlooked this part).

              • 4. Re: Copy cyrillic text from Adobe Reader
                Linda Stern Community Member

                Hi Claudio,

                 

                I have spent some hours researching this issue. This is a very old bug in Adobe Reader that has not been fixed for years.

                 

                The problem is that every font is internally "organized" as a table in which the Latin characters come first and all other characters follow after them. See this table as an example: http://www.azfonts.de/images/fonts/A/R/arialmt/table.gif

                 

                Adobe Reader picks the right letters when displaying the PDF. However, when copying text to the clipboard, it falls back to the standard encoding of the system. In my case that is German, so Adobe Reader ignores the encoding in the PDF, converts the text into Latin characters, and thus messes up the Cyrillic text.
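
                To illustrate the kind of garbling described above, here is a minimal Python sketch. It assumes (this is not confirmed anywhere in this thread) that the Cyrillic bytes are in Windows-1251 and get read as Windows-1252, the code page of a German system. The function names garble and repair are made up for the illustration, and the code does not reproduce what Adobe Reader does internally.

```python
# Assumption: the text is stored as Windows-1251 (Cyrillic) bytes but is
# interpreted as Windows-1252 (Western European, e.g. a German system).
# Both code pages are hypothetical stand-ins for the real PDF encoding.

def garble(text, real="cp1251", assumed="cp1252"):
    """Encode text in its real code page, then misread it in the assumed one."""
    return text.encode(real).decode(assumed)

def repair(garbled, real="cp1251", assumed="cp1252"):
    """Undo garble(): recover the original bytes, then decode them correctly."""
    return garbled.encode(assumed).decode(real)

if __name__ == "__main__":
    original = "Привет"
    broken = garble(original)
    print(broken)          # accented Latin letters instead of Cyrillic
    print(repair(broken))  # the original Cyrillic text again
```

                If the pasted text looks like a run of accented Latin letters, the same round trip can sometimes recover it. If it contains unrelated symbols, the cause is probably a different one and this sketch will not help.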

                 

                I hope that somebody from Adobe can confirm this bug and fix it in the next version.

                 

                Linda