2 Replies Latest reply on Nov 21, 2013 2:13 PM by G P Shannon

    Mapping non-Unicode fonts to Unicode

    G P Shannon Level 1


      I have a document in Lao, originally built in PageMaker 7.0 then opened in ID CS6 (Mac OS), that uses Saysettha, a non-Unicode Lao font. Looks great now, but I need to convert it to Unicode in order to post it to the web in a wiki that my client's company maintains. When I apply a Unicode Lao font (I have several, including the Unicode version of Saysettha, and they all behave the same) or copy-paste into TextEdit, it becomes garbled Roman like this: ๚ö๛êó 1£¸¾ ¹¨÷ɤ¨¾¡-Ã--¡¾-³ñ¤ -Áì½ ¡¾--¦ˆ¦¾-. Other SE Asian scripts were OK, such as Burmese.


      Anybody have suggestions on how to port this doc to a Unicode version?




        • 1. Re: Mapping non-Unicode fonts to Unicode
          Joel Cherney Level 5

          Do you have any access to an install of Office on Windows? What you have there is one of the many pre-Unicode Lao encodings. There are many Saysetthas with a variety of encodings going back to the early 90s. I think your sample is Saysettha 2000, which makes sense because that was the variant that worked best in PageMaker. Unless you can hire Lao-fluent JavaScript developers at less than three cents per hour, the most cost-effective route to convert this stuff to Unicode will be to buy a license for LaoScript, which sits on top of most recent versions of MS Word and performs conversions like this.
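
For anyone wondering what such a conversion involves under the hood, here is a minimal Python sketch of a table-driven remap. Everything in it is an assumption for illustration: the function name `convert_legacy` is made up, and the three entries in `LEGACY_TO_UNICODE` are placeholders, not the real Saysettha 2000 layout. A working converter would need the complete, verified table for the exact font variant, and may also have to reorder or merge marks, which this sketch does not attempt.

```python
# HYPOTHETICAL mapping table: these three pairs are placeholders chosen
# for illustration only. They are NOT a verified Saysettha 2000 layout.
LEGACY_TO_UNICODE = {
    "\u00a1": "\u0e81",  # placeholder: legacy glyph slot -> LAO LETTER KO
    "\u00be": "\u0eb2",  # placeholder: legacy glyph slot -> LAO VOWEL SIGN AA
    "-": "\u0e99",       # placeholder: legacy glyph slot -> LAO LETTER NO
}


def convert_legacy(text, table=LEGACY_TO_UNICODE):
    """Remap each character through the table.

    Returns the converted string plus the set of non-whitespace
    characters that had no table entry, so gaps in the table are
    visible instead of silently passing through.
    """
    converted = []
    unmapped = set()
    for ch in text:
        if ch in table:
            converted.append(table[ch])
        else:
            converted.append(ch)
            if not ch.isspace():
                unmapped.add(ch)
    return "".join(converted), unmapped


if __name__ == "__main__":
    result, missing = convert_legacy("\u00a1\u00be- x")
    print(result)
    print("no mapping for:", sorted(missing))
```

Returning the unmapped characters alongside the result is a deliberate choice: with an incomplete table, the leftovers tell you which glyph slots still need a Lao reader to identify them.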


          Although I'm not 100% certain, I think that your sample is missing zero-width spaces, which you'll need if you are going to post this content on a wiki without hard breaks at the end of every line. You won't know where they go unless you read the language. And unless you have Burmese-literate participants in your project, it's even money that the same is true of the Burmese, or of any other non-Vietnamese SE Asian script: keyed without zero-width spaces, and therefore broken, not actually OK as you assume. These languages don't wrap correctly without some extra work in this area, and I doubt that PageMaker-era complex-script DTP forced good Unicode practices in complex-script layout.
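
To make the zero-width space point concrete, here is a rough Python sketch, assuming the text has already been converted to Unicode. The helper names are hypothetical. It can only check whether break opportunities are present and join words that a Lao reader has already segmented; it cannot decide where the breaks belong.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE, an invisible line-break opportunity


def has_zwsp(text):
    """Return True if the text contains at least one zero-width space."""
    return ZWSP in text


def join_with_zwsp(words):
    """Join words that a human reader has already segmented."""
    return ZWSP.join(words)


def longest_unbroken_lao_run(text):
    """Length of the longest run of consecutive Lao-block characters.

    Any character outside U+0E80..U+0EFF (a space, ZWSP, punctuation)
    ends the run. A very long run suggests missing break opportunities.
    This is a heuristic, not a word segmenter.
    """
    longest = current = 0
    for ch in text:
        if "\u0e80" <= ch <= "\u0eff":
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


if __name__ == "__main__":
    word = "\u0e81\u0eb2\u0e99"  # three Lao characters used as a stand-in word
    print(longest_unbroken_lao_run(word + word))
    print(longest_unbroken_lao_run(join_with_zwsp([word, word])))
```

Running the run-length check over each paragraph is a cheap way to flag text that was keyed without break opportunities before it goes onto the wiki.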

          • 2. Re: Mapping non-Unicode fonts to Unicode
            G P Shannon Level 1

            Thank you Joel for your historical knowledge and very helpful reply. It seems the client's contact in Laos will be trying your suggestions and possibly rebuilding the document using more current tools.