12 Replies Latest reply on Feb 8, 2015 11:07 AM by Azinov

    Converting Russian PageMaker 6 to InDesign 5.5

    ddfx

      I'm converting a Russian PageMaker 6 file to InDesign 5.5 and the text is garbled, even after changing the font to known Russian-friendly fonts...

       

      Âëàäèìèð  îçëîâ (Ðîññèÿ)

      Ñòåíëè  ðèïïíåð (ÑØÀ)

      Ëåîíèä  ðîëü (Ðîññèÿ)

       

      Any tips on what's going wrong and how to do this?

       

      Thank you :-)

        • 1. Re: Converting Russian PageMaker 6 to InDesign 5.5
          Joel Cherney Adobe Community Professional & MVP

          That's a classic encoding problem. The Russian text was set in one of ye olden Cyrillic text encodings - I can't tell by looking but I suspect Windows 1251. These were PageMaker 6 files from Windows PageMaker, right? You have a bunch of different ways to approach this, depending on the volume of stuff you're trying to rescue from the twentieth century, and whether or not you have twentieth-century platforms available to you. The two main options are:

           

          1) Download a trial copy of PM7, open, resave, and then try to open the resultant PMD files with InDesign

          2) Use Javascript to do a massive find-and-replace operation (assuming that there is no non-Cyrillic text that might use any of those accented Latin glyphs)

           

          There are a few things to try within PageMaker if you try option 1 (resetting text in a different font, diagnostic recompose, etc.). Option 2 is more useful if you have a genuinely unusual Cyrillic encoding in those PM docs - I've typically used option 2 myself due to the variety of non-standard encodings in our archives.

          • 2. Re: Converting Russian PageMaker 6 to InDesign 5.5
            ddfx Level 1

            Thank you. That's great info. Yes, the PageMaker 6 docs are most likely from a Windows environment... and no, I don't have a twentieth-century platform available, sadly...

            Downloading PageMaker 7 for Mac resulted in an error message... StuffIt archives are not supported. Apparently they're discontinuing PageMaker and all support.

            Any tips on where to learn about option 2?

            Thank you!!!!!!

            • 3. Re: Converting Russian PageMaker 6 to InDesign 5.5
              Joel Cherney Adobe Community Professional & MVP

              Well, you can still download the Windows PM trial; I suspect that your problem downloading the PM trial was local to your computer. It doesn't matter; if your PM docs are Windows documents then Mac PM wouldn't have helped you fix the Windows-platform encoding problems. As far as Option 2 goes... I have a Javascript squirreled away on my old mirror-door G4 for doing automated find-and-replace; you could probably adjust it to fit. Or you could use FindChangeByList. This script is installed by default in your scripting panel; look at Scripts -> Application -> Samples -> Javascript -> FindChangeSupport, and right-click on "FindChangeList.txt" to see what you'd have to edit in order to use that script to do this for you. However, in order to use either option you'd need to be able to know the correct mapping of Unicode values, i.e. if ÑØÀ = США then

               

              Latin capital letter n with tilde = Cyrillic capital letter es

              Latin capital letter o with stroke = Cyrillic capital letter sha

              Latin capital letter a with grave = Cyrillic capital letter a

               

              then convert the hex Unicode values from the old value to the new:

              00D1 -> 0421

              00D8 -> 0428

              00C0 -> 0410

               

              So if you don't know what mapping is being used (Is it WinCP1251? Is it MacCyrillic? Is it KOI-8?) then you would need to figure out which one it is - only possible if you know what the Russian is supposed to be. For a few years, I did this a great deal, so I've seen quite a few rare encoding systems. However, I am pretty sure that you are dealing with Win1251 - a quick Google search for "ÑØÀ" seems to imply that I am correct in my guess. However, most of the "rare" encoding systems I've seen for Cyrillic are either "pretty much 1251 with a few idiosyncrasies" or "pretty much MacCyrillic with a few inconsistencies." If it really is 1251, then of course another quick Google search for "convert 1251 to unicode javascript" shows that someone else has already done most of the hard work.
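
As a rough illustration of the hex conversion described above, here is a minimal sketch. It assumes the text really is Windows-1251, where the Cyrillic letters occupy bytes 0xC0-0xFF in the same order as Unicode U+0410-U+044F. The names `hex4`, `fixChar`, and `describeMapping` are made up for this example, and it handles letters only, not the punctuation and extra characters in 0x80-0xBF.

```javascript
// Assumption: Windows-1251, where bytes 0xC0-0xFF hold the Cyrillic
// letters in the same order as Unicode U+0410-U+044F.
var CP1251_LETTER_OFFSET = 0x0410 - 0xC0; // 0x350

// Format a code point as four uppercase hex digits, e.g. 209 -> "00D1".
function hex4(code) {
    var s = code.toString(16).toUpperCase();
    while (s.length < 4) s = "0" + s;
    return s;
}

// Map one garbled character to its Cyrillic counterpart; anything
// outside the letter range is returned unchanged.
function fixChar(c) {
    var code = c.charCodeAt(0);
    if (code >= 0xC0 && code <= 0xFF) {
        return String.fromCharCode(code + CP1251_LETTER_OFFSET);
    }
    return c;
}

// List the mapping for each character of a sample in "old -> new" hex form.
function describeMapping(sample) {
    var lines = [];
    for (var i = 0; i < sample.length; i++) {
        var fixed = fixChar(sample.charAt(i));
        lines.push(hex4(sample.charCodeAt(i)) + " -> " + hex4(fixed.charCodeAt(0)));
    }
    return lines;
}

// "\u00D1\u00D8\u00C0" is the garbled sample (N with tilde, O with stroke, A with grave).
var pairs = describeMapping("\u00D1\u00D8\u00C0");
```

If the encoding guess is wrong, the result will not spell the expected word, which is why a known sample is needed to check the mapping before running it over a whole document.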

               

              Still, as you can see, pulling old translations out of the twentieth century can be quite an undertaking; sometimes it's cheaper and faster to just have someone rekey the whole thing from PDF/hardcopy.

              • 4. Re: Converting Russian PageMaker 6 to InDesign 5.5
                [Jongware] Most Valuable Participant

                Good guess, Joel -- it is Windows-1251!

                 

                I copied it into TextWrangler, saved as "test.xml", with this little header added:

                 

                <?xml encoding="windows-1251" ?>
                

                 

                but with the "encoding" set to "Windows Latin1". TextWrangler rightly complains that the document specifies another character set, but you can opt to save anyway. That step is necessary because it will save the text file but not change the actual content -- the literal character codes.

                 

                On opening the same file again, TextWrangler is smart enough to examine this header I added, and then translates the characters according to the Windows-1251 (Cyrillic) character set. With the correct characters now available, I can remove the header and save the file as UTF-8, or simply select-copy-paste into InDesign and get this little list (and apparently it's a name-and-country list).

                 

                <?xml encoding="windows-1251" ?>
                Владимир  озлов (Россия)
                Стенли  риппнер (США)
                Леонид  роль (Россия)
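
The TextWrangler round trip above (save the characters as single bytes, then reread the same bytes as Windows-1251) can also be sketched outside InDesign in Node.js. This is only a sketch under assumptions: `toOriginalBytes` and `redecode` are names invented here, it is not ExtendScript, and it needs a Node build with full ICU data so that `TextDecoder` accepts the `windows-1251` label.

```javascript
// Node.js only, not ExtendScript. Assumes a Node build with full ICU,
// so that TextDecoder recognises the "windows-1251" label.

// Step 1: write the garbled characters back out as single bytes.
// Buffer's "latin1" encoding keeps the low byte of each character. That
// matches "Windows Latin1" for the accented letters (0xA0-0xFF), but not
// for the punctuation that Windows-1252 places in 0x80-0x9F.
function toOriginalBytes(garbled) {
    return Buffer.from(garbled, "latin1");
}

// Step 2: read the same bytes again, this time as Windows-1251.
function redecode(garbled) {
    return new TextDecoder("windows-1251").decode(toOriginalBytes(garbled));
}

// "\u00D1\u00D8\u00C0" is the garbled sample from the first post.
var fixed = redecode("\u00D1\u00D8\u00C0");
```

The bytes never change in this process; only the character set used to interpret them does, which is the same reason the header trick works.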
                

                 


                • 5. Re: Converting Russian PageMaker 6 to InDesign 5.5
                  [Jongware] Most Valuable Participant

                  ... As it happens I have a Win-1251 translation array somewhere sitting on my HD, so here is a Javascript, ready to be run. Save the following as "ConvertCharset.jsx" in your Scripts folder, select any text and then double-click the script to translate.

                   

                  if (app.documents.length > 0 && app.selection.length == 1 && app.selection[0].hasOwnProperty("contents"))
                  {
                            for (var t=0; t<app.selection[0].characters.length; t++)
                            {
                                      var ch = app.selection[0].characters[t].contents.charCodeAt(0);
                                      if (ch >= 0x80 && ch <= 0xff)
                                                app.selection[0].characters[t].contents = ['\u0402','\u0403','\u201A','\u0453','\u201E','\u2026','\u2020','\u2021','\u20AC','\u2030','\u0409','\u2039','\u040A','\u040C','\u040B','\u040F','\u0452','\u2018','\u2019','\u201C','\u201D','\u2022','\u2013','\u2014','\u003F','\u2122','\u0459','\u203A','\u045A','\u045C','\u045B','\u045F','\u00A0','\u040E','\u045E','\u0408','\u00A4','\u0490','\u00A6','\u00A7','\u0401','\u00A9','\u0404','\u00AB','\u00AC','\u00AD','\u00AE','\u0407','\u00B0','\u00B1','\u0406','\u0456','\u0491','\u00B5','\u00B6','\u00B7','\u0451','\u2116','\u0454','\u00BB','\u0458','\u0405','\u0455','\u0457','\u0410','\u0411','\u0412','\u0413','\u0414','\u0415','\u0416','\u0417','\u0418','\u0419','\u041A','\u041B','\u041C','\u041D','\u041E','\u041F','\u0420','\u0421','\u0422','\u0423','\u0424','\u0425','\u0426','\u0427','\u0428','\u0429','\u042A','\u042B','\u042C','\u042D','\u042E','\u042F','\u0430','\u0431','\u0432','\u0433','\u0434','\u0435','\u0436','\u0437','\u0438','\u0439','\u043A','\u043B','\u043C','\u043D','\u043E','\u043F','\u0440','\u0441','\u0442','\u0443','\u0444','\u0445','\u0446','\u0447','\u0448','\u0449','\u044A','\u044B','\u044C','\u044D','\u044E','\u044F'][ch-0x80];
                            }
                  }
                  

                   


                  • 6. Re: Converting Russian PageMaker 6 to InDesign 5.5
                    ddfx Level 1

                    Wow!!! You guys are amazing and very generous. Thank you!!!!

                     

                    Installed PageMaker 7 in a Parallels Desktop environment... and same problem.

                     

                    Created ConvertCharset.jsx in the script folder and got this message...

                     

                    what's next?

                     

                    thank you thank you enormously!!!!!

                     

                    [Attached screenshot: oops 2012-05-15_1955.png]

                    • 7. Re: Converting Russian PageMaker 6 to InDesign 5.5
                      [Jongware] Most Valuable Participant

                      The script works fine for me. I am able to get that exact error but only if I select a text frame instead of text.

                      • 8. Re: Converting Russian PageMaker 6 to InDesign 5.5
                        Azinov

                        How can I change this script to work with special characters?

                        If the content has an em dash, quotes, etc., the script doesn't work...

                        • 9. Re: Converting Russian PageMaker 6 to InDesign 5.5
                          [Jongware] Most Valuable Participant

                          Please expand on "doesn't work".

                          • 10. Re: Converting Russian PageMaker 6 to InDesign 5.5
                            Azinov Level 1

                            If the text contains symbols such as an em dash or quotes, the ESTK says "app.selection[0].characters[t].contents.charCodeAt is not a function".

                            • 11. Re: Converting Russian PageMaker 6 to InDesign 5.5
                              Kasyan Servetsky Level 5

                              I suggest checking whether the character's contents is an Enumerator; if so, skip it:

                              if (app.documents.length > 0 && app.selection.length == 1 && app.selection[0].hasOwnProperty("contents")) {  
                                  for (var t=0; t<app.selection[0].characters.length; t++) {
                                      if (app.selection[0].characters[t].contents.constructor.name != "Enumerator") {
                                          var ch = app.selection[0].characters[t].contents.charCodeAt(0);
                                          if (ch >= 0x80 && ch <= 0xff) app.selection[0].characters[t].contents = ['\u0402','\u0403','\u201A','\u0453','\u201E','\u2026','\u2020','\u2021','\u20AC','\u2030','\u0409','\u2039','\u040A','\u040C','\u040B','\u040F','\u0452','\u2018','\u2019','\u201C','\u201D','\u2022','\u2013','\u2014','\u003F','\u2122','\u0459','\u203A','\u045A','\u045C','\u045B','\u045F','\u00A0','\u040E','\u045E','\u0408','\u00A4','\u0490','\u00A6','\u00A7','\u0401','\u00A9','\u0404','\u00AB','\u00AC','\u00AD','\u00AE','\u0407','\u00B0','\u00B1','\u0406','\u0456','\u0491','\u00B5','\u00B6','\u00B7','\u0451','\u2116','\u0454','\u00BB','\u0458','\u0405','\u0455','\u0457','\u0410','\u0411','\u0412','\u0413','\u0414','\u0415','\u0416','\u0417','\u0418','\u0419','\u041A','\u041B','\u041C','\u041D','\u041E','\u041F','\u0420','\u0421','\u0422','\u0423','\u0424','\u0425','\u0426','\u0427','\u0428','\u0429','\u042A','\u042B','\u042C','\u042D','\u042E','\u042F','\u0430','\u0431','\u0432','\u0433','\u0434','\u0435','\u0436','\u0437','\u0438','\u0439','\u043A','\u043B','\u043C','\u043D','\u043E','\u043F','\u0440','\u0441','\u0442','\u0443','\u0444','\u0445','\u0446','\u0447','\u0448','\u0449','\u044A','\u044B','\u044C','\u044D','\u044E','\u044F'][ch-0x80];  
                                      }
                                  }  
                              }
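
A slightly different guard, sketched here as a plain function so it can be tried outside InDesign: rather than checking the constructor name, it only converts values that are strings. The assumption (not verified in InDesign here) is that `contents` is an ordinary string for normal characters and something else for special characters. `convertContents`, `fakeEmDash`, and the shortened `table` are hypothetical names for this example; in a real script the full 128-entry array from the script above would be passed in.

```javascript
// Stand-in for the 128-entry array in the script above: only the letter
// range 0xC0-0xFF is filled in here, everything else maps to "?".
var table = [];
for (var i = 0; i < 128; i++) {
    table[i] = i >= 0x40 ? String.fromCharCode(0x0410 + (i - 0x40)) : "?";
}

// Return the converted character, or the input untouched when it is not
// a string whose first character lies in the 0x80-0xFF range.
function convertContents(contents, table) {
    if (typeof contents !== "string" || contents.length === 0) {
        return contents; // special character (non-string) or empty: skip
    }
    var ch = contents.charCodeAt(0);
    return (ch >= 0x80 && ch <= 0xFF) ? table[ch - 0x80] : contents;
}

// Hypothetical stand-in for the non-string value returned for an em dash.
var fakeEmDash = { toString: function () { return "EM_DASH"; } };

var results = [
    convertContents("\u00D1", table),   // garbled letter, gets converted
    convertContents("A", table),        // plain ASCII, left alone
    convertContents(fakeEmDash, table)  // non-string contents, left alone
];
```

Inside the loop, the returned value would be assigned back to `contents` only when it differs from the original, so special characters are never rewritten.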
                              
                              • 12. Re: Converting Russian PageMaker 6 to InDesign 5.5
                                Azinov Level 1

                                Thanks, it works fine!