    [AS, CS4] Export/import Tagged Text File: Encoding Problem

    tobias.wantzen Level 1

      Hi there,


      I have a script which exports a text selection to a temp folder as a (Unicode) Tagged Text file. This file is then loaded into an AS variable and processed line by line to make some textual changes. Afterwards a new text file with the changed content is written.

      Up to CS2 (and 10.4 PPC) this script worked like a charm. With the new system on Intel it stopped working. In the meantime I have found that it must be an encoding problem. Every »as unicode text« in the old script raised an error (since AS 2, all text is treated as UTF-16), so I changed those lines to »as text«.

      Furthermore, I figured out that InDesign exports Unicode Tagged Text as UTF-16LE (little endian), but needs UTF-16BE (big endian) when importing tagged text. AS variables and the »write« command use MACROMAN, so I added some conversions with »iconv« and it worked.
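      To make the byte-order difference concrete, here is a minimal shell sketch (shell, because the script already calls »iconv« through »do shell script«). It encodes a single character both ways and dumps the raw bytes. The variable names `le_bytes` and `be_bytes` are only illustrative, and `od` and `tr` are assumed to be the standard command-line tools.

```shell
# Encode the same character in both byte orders and dump the raw bytes as hex.
# le_bytes / be_bytes are illustrative names, not part of the original script.
le_bytes=$(printf 'A' | iconv -f UTF-8 -t UTF-16LE | od -An -v -tx1 | tr -d ' \n')
be_bytes=$(printf 'A' | iconv -f UTF-8 -t UTF-16BE | od -An -v -tx1 | tr -d ' \n')

echo "UTF-16LE: $le_bytes"   # low byte first
echo "UTF-16BE: $be_bytes"   # high byte first
```

      Running the same `od` dump on the first few bytes of an exported file should show which byte order it actually uses.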


      Sadly, the last conversion from MACROMAN to UTF-16BE turns all En Spaces in the output string into normal spaces.
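      One possible explanation, which is an assumption and not a confirmed diagnosis: MacRoman has no En Space (U+2002), so the character may already be lost when the string is written as MACROMAN, before »iconv« ever sees it. The sketch below only demonstrates that an En Space survives the trip to UTF-16BE when the intermediate data is UTF-8. Whether the rest of the script can be switched to a UTF-8 intermediate file is untested. `en_space` and `be_bytes` are illustrative names.

```shell
# U+2002 EN SPACE is the three bytes E2 80 82 in UTF-8 (octal 342 200 202).
en_space=$(printf '\342\200\202')

# Convert from UTF-8 (not MacRoman) to UTF-16BE and dump the result as hex.
be_bytes=$(printf '%s' "$en_space" | iconv -f UTF-8 -t UTF-16BE | od -An -v -tx1 | tr -d ' \n')

echo "En Space as UTF-16BE: $be_bytes"   # the code point is still 2002
```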


      So I'm a bit at my wits' end with this problem. Can someone please help me?





      [InDesign 6.0.4 with AppleScript v2.0.1 on Mac OS X 10.5.8, Intel]


      Some code for the relevant parts:


      Selection export as Tagged Text:


      on TMPexportTaggedText(myFileName, myText, myEncoding)
          tell application "Adobe InDesign CS4"
              tell tagged text export preferences
                  if myEncoding = "ascii" or myEncoding = 1 then
                      set character set to ASCII
                  else if myEncoding = "ansi" or myEncoding = 2 then
                      set character set to ansi
                  else if myEncoding = "unicode" or myEncoding = 3 then
                      set character set to unicode
                  else if myEncoding = "jis" or myEncoding = 4 then
                      set character set to shift JIS
                  else if myEncoding = "gb" or myEncoding = 5 then
                      set character set to GB18030
                  else if myEncoding = "ks" or myEncoding = 6 then
                      set character set to KSC5601
                  else
                      -- fall back to ANSI for any other value
                      set character set to ansi
                  end if
                  set tag form to abbreviated
              end tell
              export myText format tagged text to (myFileName) without showing options
          end tell
          tell application "Finder"
              try
                  do shell script "iconv -f UTF-16LE -t UTF-16 " & (POSIX path of myFileName) & " > " & (POSIX path of myFileName) & "1"
                  set myFileName to myFileName & "1"
              end try
          end tell
      end TMPexportTaggedText



      String output and placing in InDesign (circleLines makes the textual changes and returns the result as a string):


      set theOutput to my circleLines(IN_File & "1", myOptions)
      if theOutput ≠ "" then
        tell application "Finder"
          open for access file (OUT_File & "1") with write permission
          write (theOutput) to file (OUT_File & "1")
          close access file (OUT_File & "1")
          try
             do shell script "iconv -f MACROMAN -t UTF-16BE " & (POSIX path of (OUT_File & "1")) & " > " & (POSIX path of OUT_File)
             -- note: the output of tr is only returned to the script; OUT_File itself is not modified
             do shell script "cat " & (POSIX path of OUT_File) & " | tr '\\n' '\\r'"
          end try
        end tell
      end if
      place (OUT_File) on curSel without showing options
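      For comparison, a sketch of a line-ending conversion whose result actually ends up in a file: `tr` reads from one file, and its output is piped into »iconv« and redirected to the target. The replacement is done on UTF-8 data before the UTF-16BE conversion, so `tr` never touches individual bytes of a two-byte character. The file names under `tmp_dir` are hypothetical, and the UTF-8 intermediate file is an assumption, not what the script above does.

```shell
# Hypothetical scratch files; the real script would use its own OUT_File paths.
tmp_dir=$(mktemp -d)
printf 'line one\nline two\n' > "$tmp_dir/out_utf8.txt"

# Replace LF with CR first, then convert to UTF-16BE and write the result to a file.
tr '\n' '\r' < "$tmp_dir/out_utf8.txt" | iconv -f UTF-8 -t UTF-16BE > "$tmp_dir/out_utf16be.txt"

# Count CR (0d) and LF (0a) bytes in the converted file to check the result.
cr_count=$(od -An -v -tx1 "$tmp_dir/out_utf16be.txt" | tr -s ' ' '\n' | grep -c '^0d$')
lf_count=$(od -An -v -tx1 "$tmp_dir/out_utf16be.txt" | tr -s ' ' '\n' | grep -c '^0a$')
echo "CR bytes: $cr_count, LF bytes: $lf_count"

rm -rf "$tmp_dir"
```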



      String output with


          write (theOutput) to file (OUT_File & "1") as unicode text
      produces an InDesign error (placing failed).