2 Replies Latest reply: Nov 10, 2009 4:38 AM by [Jongware]

    [CS4] Broken Symbol import from Word (and a Fix script)

    [Jongware] Community Member

      If you import a Word document into CS4, the automatic translation of Symbol characters to Unicode fails. Some characters look okay at first, such as the alpha -- but when you change the font to another one (containing Greek characters, of course), you get an "a". Others get the pink box treatment (less-than-or-equal), or seem to disappear entirely (the upwards arrow gets 'translated' into a discretionary hyphen).

       

      This only happens when importing the text as a Word document -- saving the same file as RTF and importing that works just fine, so whenever possible, you should probably do that.

       

      In case you cannot convert a Word document to RTF, attached is a script, "fixSymbol.jsx", that searches for all characters in the Symbol set and replaces each one with its correct Unicode character.

       

      The script has an extra file extension ".txt", but that's just to fool this forum software. Save as "fixSymbol.jsx" into your scripting folder -- without .txt -- and double-click to run.

        • 1. Re: [CS4] Broken Symbol import from Word (and a Fix script)
          [Jongware] Community Member

          Uh-oh. An unforeseen problem with the fixing script.

           

          I copied the base-to-unicode list verbatim, but neglected to consider the order of the find-and-changes. Lo and behold, there were a few clashes. As an example, the MULTIPLICATION SIGN U+00B4 (bad code) got replaced by Unicode U+00D7 (good code), but then all U+00D7 codes (bad and good) got replaced again, by U+22C5. Now this happens to be DOT OPERATOR, which is another way of writing a multiplication, but that's really a coincidence. There were several other characters which got replaced twice -- messing things up the second time around.
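
To make the clash concrete, here is a minimal sketch in plain JavaScript. It is not the attached script: it works on an ordinary string instead of a document, the table holds only the two entries mentioned above, and the names `PAIRS`, `replaceSequentially` and `replaceInOnePass` are made up for illustration.

```javascript
// Illustrative two-entry subset of a Symbol-to-Unicode table:
// [code found after import ("bad"), intended Unicode ("good")].
var PAIRS = [
    ["\u00B4", "\u00D7"], // Symbol 0xB4 -> MULTIPLICATION SIGN
    ["\u00D7", "\u22C5"]  // Symbol 0xD7 -> DOT OPERATOR
];

// One find-and-change per entry, in list order. The output of the
// first entry is picked up again by the second one.
function replaceSequentially(text) {
    for (var i = 0; i < PAIRS.length; i++) {
        text = text.split(PAIRS[i][0]).join(PAIRS[i][1]);
    }
    return text;
}

// Order-independent alternative: look at every character of the
// original text exactly once, so a replaced character is never re-read.
function replaceInOnePass(text) {
    var lookup = {};
    for (var i = 0; i < PAIRS.length; i++) {
        lookup[PAIRS[i][0]] = PAIRS[i][1];
    }
    var out = "";
    for (var j = 0; j < text.length; j++) {
        var ch = text.charAt(j);
        out += lookup.hasOwnProperty(ch) ? lookup[ch] : ch;
    }
    return out;
}
```

With the input "2 \u00B4 3", `replaceSequentially` ends up with U+22C5 in the middle, while `replaceInOnePass` yields the intended U+00D7. Handling the 0xD7 entry before the 0xB4 entry would also avoid this particular clash; which approach the revised script takes is not stated here.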

           

          It oughta be fixed in the accompanying new version, but really ... Adobe's the one that should fix it ...

          • 2. Re: [CS4] Broken Symbol import from Word (and a Fix script)
            [Jongware] Community Member
            .. It oughta be fixed in the accompanying new version ..

             

            Great. This is one of those things where you don't notice anything going wrong unless the planets are in some specific alignment, or something like that.

            Another try -- now it switches on Case Sensitive as well, just to give it a certain je ne sais quoi.
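
The post does not say what went wrong without Case Sensitive. One plausible failure mode, assumed here purely for illustration, is that the Symbol font keeps different Greek letters on upper- and lowercase keys ("D" is capital Delta, "d" is small delta), so a case-insensitive search for one also hits the other. The sketch below shows this on a plain JavaScript string; `changeAll` is a hypothetical helper, not a function from the attached script.

```javascript
// Replace every occurrence of findWhat in text with changeTo.
// caseSensitive mimics switching a "Case Sensitive" find option on or off.
function changeAll(text, findWhat, changeTo, caseSensitive) {
    // Escape regex metacharacters so findWhat is matched literally.
    var escaped = findWhat.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    var pattern = new RegExp(escaped, caseSensitive ? "g" : "gi");
    // Use a function so "$" in changeTo is never treated specially.
    return text.replace(pattern, function () { return changeTo; });
}

// "Dd" set in Symbol should become capital Delta + small delta.
var insensitive = changeAll("Dd", "d", "\u03B4", false); // both become small delta
var sensitive = changeAll("Dd", "d", "\u03B4", true);    // only the "d" is changed
```

In the case-insensitive run the capital "D" is consumed by the rule for "d", so a later rule for "D" has nothing left to match; with case sensitivity on, each rule touches only its own character.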