8 Replies Latest reply on Jun 18, 2010 11:16 AM by sboerner2

    How to incorporate "Remap break, Whitespace, and Special Characters" feature in Javascript to export

      How to incorporate "Remap break, Whitespace, and Special Characters" feature in Javascript to export XML from InDesign CS3?
      Pramod.
      pramod.pant@hotmail.com
        • 1. Re: How to incorporate "Remap break, Whitespace, and Special Characters" feature in Javascript to export
          [Jongware] Most Valuable Participant
          If the existing Export as XML does not already export these, you cannot add them in some magical way. That has nothing to do with JS -- the Export function is (shockingly) not written in JS, so you can't just change it in the way you want.

          By the way, I'm assuming that by Special Characters you do NOT mean those that are in a font (as these are exported just fine), but some of InDesign's special markers -- hyphens, spaces, variables, page number placeholders.

          You can try one of these two methods:

          1. Search and replace your codes of interest with placeholders (that do get exported), then export as usual. In the exported XML, replace the placeholders with the XML entities/commands you need.

          2. Save the ID file as INX. That's sure to contain almost everything that was in the original document. Then process this file with XSLT to create your final XML.
          "Almost everything", because this is not 100.1% reliable; then again, that final 0.1% is not likely to mean something useful in any XML output.
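The second half of method 1 (swapping the placeholders back after export) could look roughly like the sketch below. The placeholder tokens and the entities they map to are made up for illustration; nothing in InDesign defines them, so pick tokens that cannot occur in your real content. Reading and writing the exported file is left out.

```javascript
// Hypothetical placeholder tokens mapped to the XML character references wanted
// in the final file. Both sides of this table are assumptions for illustration.
var PLACEHOLDER_MAP = {
  "[[DISC_HYPHEN]]": "&#xAD;",
  "[[EM_SPACE]]": "&#x2003;"
};

// Replace every occurrence of each placeholder token in the exported XML text.
// split/join is used instead of a regular expression so that the tokens need
// no escaping.
function restorePlaceholders(xml, map) {
  var result = xml;
  for (var token in map) {
    if (map.hasOwnProperty(token)) {
      result = result.split(token).join(map[token]);
    }
  }
  return result;
}
```

For example, restorePlaceholders("<p>co[[DISC_HYPHEN]]op</p>", PLACEHOLDER_MAP) gives "<p>co&#xAD;op</p>".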
          • 2. Re: How to incorporate "Remap break, Whitespace, and Special Characters" feature in Javascript to export
            Level 1
            Dear friend,
            Yes I mean InDesign special markers -- hyphens, spaces, variables, page number placeholders.
            I already have a JavaScript batch export program which runs successfully, except that it is not able to remove the special character '^-' (Discretionary Hyphen) from the exported XML file.
            • 3. Re: How to incorporate "Remap break, Whitespace, and Special Characters" feature in Javascript to export
              [Jongware] Most Valuable Participant
              Hmm. I'm sorry, still not very clear.
              Your export works fine, right? By JavaScript, but that doesn't change a thing, it's the same export as you get when doing it manually. Only thing is, there are discretionary hyphens in the exported XML, right?
              You cannot change the Export XML function itself. So, there are (again) two opportunities.

              1. Remove the discretionary hyphens from your InDesign document before exporting. A simple search-and-replace will do the job.

              2. Remove them from your XML file after exporting. I'm no star in JS file input/output; but apparently, post-processing of XML is possible. And of course you can also consider another tool for the post-processing (that's why I mentioned XSLT before).
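As a rough sketch of option 2, the function below strips discretionary hyphens from XML that has already been read into a string. It assumes the discretionary hyphen arrives in the export as the soft hyphen U+00AD, either as the raw character or as a numeric character reference; check that against your own output first. The function name is made up for this example.

```javascript
// Remove discretionary (soft) hyphens from exported XML text.
// Assumption: they appear as the raw character U+00AD or as a numeric
// character reference such as &#xAD; or &#173;.
function stripDiscretionaryHyphens(xml) {
  return xml.replace(/\u00AD|&#x0*AD;|&#0*173;/gi, "");
}
```

Since this only touches the exported text, the InDesign document keeps its hyphens, which avoids having to change hyphenation settings in the layout.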
              • 4. Re: How to incorporate "Remap break, Whitespace, and Special Characters" feature in Javascript to export
                [Jongware] Most Valuable Participant
                >1. Remove the discretionary hyphens from your InDesign document before exporting. A simple search-and-replace will do the job.

                Actually, it won't :-) You also need to turn off hyphenation for all text. But the main point was, "don't have any hyphens in your doc at the time you do the export".
                • 5. Re: How to incorporate "Remap break, Whitespace, and Special Characters" feature in Javascript to ex
                  sboerner2 Level 1

                  I'd like to reopen this topic. There seemed to be some confusion in the original discussion, so let me see if I can clarify the original post and (hopefully) get an answer or two.

                   

                  The problem described by pramod_pant still exists in IDCS4. That is, when scripting for XML export it is possible to set all but two of the preferences available in the Export XML panel. In JavaScript, these preferences are set in the XMLExportPreference class. The two that are missing are 1) Export Untagged Tables as CALS XML, and 2) Remap Break, Whitespace, and Special Characters.

                   

                  I use a batch script to export all of the XML files in a document. It works fine except that on occasion I get some odd characters in the output files -- including Unicode formatting characters such as 0x2029 after every return. It seems that the script picks up the most recent setting used in the Export XML panel. If Remap Break ... was selected before, then the script remaps these characters, and all is well. If Remap Break ... was not selected, then the script does not, and that's a problem.

                   

                  For the time being, I'm just checking this option in the Export XML panel to make sure it's set before using the script. But it would be nice to be able to set this preference within the script itself. Is it hiding somewhere else? Or is it just not scriptable?

                   

                  SB

                  • 6. Re: How to incorporate "Remap break, Whitespace, and Special Characters" feature in Javascript to ex
                    [Jongware] Most Valuable Participant
                    .. missing are 1) Export Untagged Tables as CALS XML ...

                     

                    That is in XMLExportPreference ... doesn't it work?

                     

                    Property: exportUntaggedTablesFormat
                    Type: XMLExportUntaggedTablesFormat:
                      XMLExportUntaggedTablesFormat.NONE
                      XMLExportUntaggedTablesFormat.CALS
                    Access: r/w
                    Description: The export format for untagged tables in tagged stories. (default: XMLExportUntaggedTablesFormat.CALS)

                     

                    Your Remap Break (etc.) doesn't seem to be scriptable, at the mo'. It must be an oversight in writing the Scripting interface.

                     

                    (Only FYI)

                     

                    .. including Unicode formatting characters such as 0x2029 after every return ..

                     

                    U+2029 is the recommended Paragraph Separator Code. (And I wonder why they add this, as the return is also exported.)
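To tell whether a batch export ran with the remap option off, one possibility is to scan the exported text for the raw formatting characters mentioned in this thread. The sketch below counts them; the list of watched characters is an assumption and can be extended, and the function name is invented for this example.

```javascript
// Count raw Unicode formatting characters left in exported XML text.
// The watched set is an assumption: line separator, paragraph separator,
// and soft (discretionary) hyphen.
function countFormattingChars(xml) {
  var watched = { "\u2028": "U+2028", "\u2029": "U+2029", "\u00AD": "U+00AD" };
  var counts = {};
  for (var i = 0; i < xml.length; i++) {
    var label = watched[xml.charAt(i)];
    if (label) {
      counts[label] = (counts[label] || 0) + 1;
    }
  }
  return counts;
}
```

An empty result suggests that none of these characters reached the file as raw characters; a non-zero count for U+2029 matches the symptom described above.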

                    • 7. Re: How to incorporate "Remap break, Whitespace, and Special Characters" feature in Javascript to ex
                      sboerner2 Level 1

                      Took another look and found the Export Untagged Tables setting -- thank you for pointing that out, I'd overlooked it. But too bad that Remap Break seems to have gone missing. Seems like an important one to have been left out. Maybe Adobe will run across this and provide a fix in the next revision.

                       

                      I wondered about the paragraph separator code, too. A bit redundant, but it's probably used internally. Not much use in XML, that's for sure.

                       

                      SB

                      • 8. Re: How to incorporate "Remap break, Whitespace, and Special Characters" feature in Javascript to ex
                        sboerner2 Level 1

                        Just wanted to add this update ...

                         

                        This topic addresses the same issue and seems to have an answer. Setting myDocument.xmlExportPreferences.characterReferences to "true" seems to have the same effect as selecting "Remap Break, Whitespace and Special Characters" in the XML Export dialog box. Works great so far.

                         

                        SB
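A minimal sketch of the fix described in the last reply, setting the preference from the script before exporting. The helper name is made up; the characterReferences property is the one named above. The guard on app lets the snippet load outside InDesign without errors, where it simply does nothing.

```javascript
// Turn on the scripting counterpart of "Remap Break, Whitespace, and Special
// Characters" for the given document, and return the resulting value.
function enableCharacterRemap(doc) {
  doc.xmlExportPreferences.characterReferences = true;
  return doc.xmlExportPreferences.characterReferences;
}

// Inside InDesign the global `app` exists; anywhere else this block is skipped.
if (typeof app !== "undefined" && app.documents.length > 0) {
  enableCharacterRemap(app.activeDocument);
}
```

Calling this at the top of a batch export script means the result no longer depends on whatever was last chosen in the Export XML panel.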