If the existing Export as XML does not already export these, you cannot export them in some magical way. That has nothing to do with JS -- the Export function is (shockingly) not written in JS, so you can't just change it in the way you want.
By the way, I'm assuming that by Special Characters you do NOT mean those that are in a font (as these are exported just fine), but some of InDesign's special markers -- hyphens, spaces, variables, page number placeholders.
You can try one of these two methods:
1. Search and replace your codes of interest with placeholders (that get exported), then export as usual. In the exported XML, replace the placeholders with the XML entities/commands you need.
2. Save the ID file as INX. That's sure to contain almost everything that was in the original document. Then process this file with XSLT to create your final XML.
"Almost everything", because this is not 100% reliable; then again, that final 0.1% is not likely to mean something useful in any XML output.
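For method 1, the second half (swapping the placeholders back after export) could look roughly like the sketch below. The placeholder tokens and the replacement strings are made up for illustration; use whatever tokens you typed into the document and whatever entities or elements your target XML needs. It sticks to old-style JavaScript so it should also run in ExtendScript, but it only works on a string; reading and writing the file is left out.

```javascript
// Hypothetical placeholder tokens (typed into the InDesign document before
// export) mapped to the XML text that should replace them afterwards.
var PLACEHOLDER_MAP = {
    "[[DHYPHEN]]": "&#x00AD;",   // discretionary hyphen as a character reference
    "[[EMSPACE]]": "&#x2003;",   // em space as a character reference
    "[[PAGENUM]]": "<pagenum/>"  // made-up element for a page number marker
};

// Replace every occurrence of each token in the exported XML text.
// split/join does a literal (non-regex) replace, so brackets need no escaping.
function replacePlaceholders(xmlText, map) {
    var result = xmlText;
    for (var token in map) {
        if (map.hasOwnProperty(token)) {
            result = result.split(token).join(map[token]);
        }
    }
    return result;
}
```

Call it as `replacePlaceholders(exportedXml, PLACEHOLDER_MAP)` on the text of the exported file, then save the result.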
Yes I mean InDesign special markers -- hyphens, spaces, variables, page number placeholders.
Hmm. I'm sorry, still not very clear.
You cannot change the Export XML function itself. So, there are (again) two opportunities.
1. Remove the discretionary hyphens from your InDesign document before exporting. A simple search-and-replace will do the job.
2. Remove them from your XML file after exporting. I'm no star in JS file input/output; but apparently, post-processing of XML is possible. And of course you can also consider another tool for the post-processing (that's why I mentioned XSLT before).
>1. Remove the discretionary hyphens from your InDesign document before exporting. A simple search-and-replace will do the job.
Actually, it won't :-) You also need to turn off hyphenation for all text. But the main point was, "don't have any hyphens in your doc at the time you do the export".
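If you go for option 2 (cleaning up after the export), a sketch of the string part is below. It assumes the discretionary hyphen ends up in the XML as U+00AD (soft hyphen), either as the raw character or as a numeric character reference; check your own output before relying on that. The function name is made up, and file input/output is not shown.

```javascript
// Remove discretionary (soft) hyphens from exported XML text.
// Assumption: they appear as the raw character U+00AD or as a numeric
// character reference such as &#xAD;, &#x00AD; or &#173;.
function stripDiscretionaryHyphens(xmlText) {
    return xmlText
        .replace(/\u00AD/g, "")               // raw soft hyphen characters
        .replace(/&#x0*AD;|&#0*173;/gi, "");  // hex and decimal references
}
```

Note this does nothing about hyphens that automatic hyphenation inserts at line ends; as said above, those only go away by turning hyphenation off before exporting.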
I'd like to reopen this topic. There seemed to be some confusion in the original discussion, so let me see if I can clarify the original post and (hopefully) get an answer or two.
I use a batch script to export all of the XML files in a document. It works fine except that on occasion I get some odd characters in the output files -- including Unicode formatting characters such as 0x2029 after every return. It seems that the script picks up the most recent setting used in the Export XML panel. If Remap Break ... was selected before, then the script remaps these characters, and all is well. If Remap Break ... was not selected, then the script does not, and that's a problem.
For the time being, I'm just checking this option in the Export XML panel to make sure it's set before using the script. But it would be nice to be able to set this preference within the script itself. Is it hiding somewhere else? Or is it just not scriptable?
.. missing are 1) Export Untagged Tables as CALS XML ...
That is in XMLExportPreference ... doesn't it work?
r/w The export format for untagged tables in tagged stories. (default: XMLExportUntaggedTablesFormat.CALS)
Your Remap Break (etc.) doesn't seem to be scriptable, at the mo'. It must be an oversight in writing the Scripting interface.
.. including Unicode formatting characters such as 0x2029 after every return ..
U+2029 is the recommended Paragraph Separator Code. (And I wonder why they add this, as the return is also exported.)
Took another look and found the Export Untagged Tables setting -- thank you for pointing that out, I'd overlooked it. But too bad that Remap Break seems to have gone missing. Seems like an important one to have been left out. Maybe Adobe will run across this and provide a fix in the next revision.
I wondered about the paragraph separator code, too. A bit redundant, but it's probably used internally. Not much use in XML, that's for sure.
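For a batch run, one way to notice that the option was off is to scan each exported file for these separators. A rough sketch, working on the file contents as a string (the function name is made up, and reading the file is left out):

```javascript
// Count Unicode line (U+2028) and paragraph (U+2029) separators in a string,
// so a batch script can flag exports where they were not remapped.
function countSeparators(xmlText) {
    var counts = { lineSeparator: 0, paragraphSeparator: 0 };
    for (var i = 0; i < xmlText.length; i++) {
        var code = xmlText.charCodeAt(i);
        if (code === 0x2028) {
            counts.lineSeparator++;
        } else if (code === 0x2029) {
            counts.paragraphSeparator++;
        }
    }
    return counts;
}
```

If either count is above zero, the file was most likely exported without the remap option.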
Just wanted to add this update ...
This topic addresses the same issue and seems to have an answer. Setting myDocument.xmlExportPreferences.characterReferences to true seems to have the same effect as selecting "Remap Break, Whitespace and Special Characters" in the XML Export dialog box. Works great so far.
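In a batch script that would go right before the export call. A minimal sketch, assuming the property behaves as described above; the helper name is made up, and the `typeof app` check is only there so the snippet does nothing outside InDesign:

```javascript
// Turn on character references for XML export on the given document and
// return the resulting value so the caller can verify it.
// `doc` is expected to be an InDesign Document object.
function enableCharacterReferences(doc) {
    doc.xmlExportPreferences.characterReferences = true;
    return doc.xmlExportPreferences.characterReferences;
}

// Inside InDesign, `app` is the application object provided by the host.
if (typeof app !== "undefined" && app.documents.length > 0) {
    enableCharacterReferences(app.activeDocument);
}
```

Setting it in the script means the result no longer depends on whatever was last ticked in the Export XML panel.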