Namespace in xml element

Report · Feb 23, 2011

I have an XML file tagged with namespaces, for example ce:section and ce:para.

I am just trying to collect the 'ce:floats' XML nodes using the following code:

' Define the namespace mappings as an array of arrays of 2 strings (prefix, URI)
Dim NmSpArr(,) As String = {{"sb", "http://www.elsevier.com/xml/common/struct-bib/dtd"}, {"cl", "http://xml.cengage-learning.com/cendoc-core"}, {"ce", "http://www.elsevier.com/xml/common/dtd"}, {"aid", "http://ns.adobe.com/AdobeInDesign/4.0/"}, {"aid5", "http://ns.adobe.com/AdobeInDesign/5.0/"}, {"xmlns:xsi", "http://www.w3.org/2001/XMLSchema-instance"}, {"xmlns:mml", "http://www.w3.org/1998/Math/MathML"}, {"xmlns:xlink", "http://www.w3.org/1999/xlink"}}
' Try to collect the 'ce:floats' elements
Dim obj As InDesign.Objects = IndDoc.XMLElements(1).EvaluateXPathExpression("//ce:floats", NmSpArr)

ID CS4
Vb.net

The above code always returns a count of 0. It seems there is a problem with the XML namespace definitions.

Any help would be greatly appreciated.

Regards,
Suresh

Report · Feb 26, 2011

It works for XML that doesn't have namespaces, for example XML nodes such as book, author, price.

It fails only when namespaces are involved, for example XML nodes such as ce:book, ce:author, ce:price.

Any help?

Regards,

Suresh

Report · Feb 27, 2011

I assume it doesn't work with //float? You could use some ugly XPath expression like "contains(@name,'float')"?

Report · Feb 28, 2011

It works properly if the XML node doesn't have a namespace. For example, float ('//float') works fine for me.

It does not work at all if the XML node has a namespace. For example, ce:float ('//ce:float') does not work for me.

I have been breaking my head over this for the last few days. Any help?

Regards,

Suresh

Report · Feb 28, 2011

What about my suggested XPath expression?

Report · Feb 28, 2011

OK, no, I guess that's not going to work.

For one thing, "InDesign’s XML rules support a limited subset of the XPath 1.0 specification." I'm not 100% sure that applies to InDesign's xmlElements objects, but I suspect it does.

In my XPath expression, @name wasn't correct, because you need to call the name() function. So I suppose something like "//*[contains(name(),'float')]" but I don't think that's going to work for you.

I'm not sure exactly how you introduced namespaces (perhaps it would be good for you to post the actual XML you are importing?), but I created a "ce:float" tag, and found that the above XPath did not match it, even when a "float" tag did match. That's similar to your problem.

I suspect that namespace support is just broken. They don't even seem to match against "*", when the "float" tags do. Good luck.

Maybe you should do this another way.

Report · Feb 28, 2011

John,

Yes, it's true; I think InDesign supports only XML nodes without namespaces. But if that is the case, why is there a second argument called Nametable in the EvaluateXPathExpression function? Below are the things I have tried, which also failed:

' Supposed to return all the nodes

IndDoc.XMLElements(1).EvaluateXPathExpression("//*",NmSpArr).count
count = 0

' Same as above

IndDoc.XMLElements(1).EvaluateXPathExpression("//*[contains(name(),'figure')]",NmSpArr).count
count = 0

The above code does not collect a single node. Is this a bug, or a limitation of InDesign's support for XPath expressions?

Regards,

Suresh

Report · Feb 28, 2011

Suresh,

EvaluateXPathExpression works only on the root tag, not on a child.

Try

XMLElements(0) instead of XMLElements(1).

Hope it helps!

Anil Yadav

Report · Feb 28, 2011

I tested this in JavaScript and used the zero-index and got the same results. XPath's count of child nodes does not include those that have a : in their .markupTag.name. (I had assumed VB was 1-indexed, but I guess not). So I don't think that is the problem.

Yes, Suresh, maybe the namespace argument is the key, but I would still expect XPath's count to be right for "//*", even if "//ce:float" did not work.

So I think maybe XPath expressions are broken in this case. But it shouldn't be too much work to just traverse the tree.

(It is disappointing that there's no method to convert a subtree to a string XML representation so you can feed it into better XML tools...).

But I guess I'm just speculating and Suresh is hoping for someone who really knows. Anyone?

Report · Mar 01, 2011

What is the conclusion for this issue?

Is this a bug in ID CS4, or is there a solution and what we are trying is simply wrong?

Does anyone know any alternative methods to collect the XML nodes with namespaces?

Regards,

Suresh

Report · Mar 01, 2011

Suresh:

Does anyone know any alternative methods to collect the XML nodes with namespaces?

Again, I wrote:

 But it shouldn't be too much work to just traverse the tree.

Here's an example in JavaScript. It should be easy to translate it into VB. Of course all you need is the traverse() function -- the rest is just examples and test cases:

(function() {
  var
    x = app.activeDocument.xmlElements[0],
    star, all, floats, cefloats, anyfloats;

    function traverse(t, p) {
        var r = [], i;
        if (p(t)) {
            r = r.concat(t);
        }
        for (i=0; i<t.xmlElements.length; i++) {
            r = r.concat(traverse(t.xmlElements[i], p));
        }
        return r;
    }

    function dumplist(l) {
        var i,
            rv = [];
        rv.push("has "+l.length+" elements:");
        for (i=0; i<l.length; i++) {
            rv.push("  "+i+"\t"+l[i].toSpecifier()+
                "\t"+l[i].markupTag.name);
        }
        return rv.join("\n");
    }


  star = x.evaluateXPathExpression("*");
  all = traverse(x,
      function() { return true; }
  );
  floats = traverse(x,
       function(n) { return n.markupTag.name==="float"; }
  );
  cefloats = traverse(x,
      function(n) { return n.markupTag.name==="ce:float"; }
  );
  anyfloats = traverse(x,
      function(n) { return n.markupTag.name.match(/float$/); }
  );
  
  $.writeln("star: "+dumplist(star));
  $.writeln("all "+dumplist(all));
  $.writeln("floats "+dumplist(floats));
  $.writeln("cefloats "+dumplist(cefloats));
  $.writeln("anyfloats "+dumplist(anyfloats));

}());

So, if you run this on a document with an XML structure like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Root>
  <ce:float>A</ce:float>
  <ce:float>B</ce:float>
  <float>AC</float>
  <float>Ad</float>
  <ce:float>Af</ce:float>
  <div>
    <ce:float>Ai</ce:float>
    <float>Ag</float>
  </div>
</Root>

It gives you this output:

star: has 3 elements:
  0     /document[@id=1]/XML-element[@id=2]/XML-element[@id=5]     float
  1     /document[@id=1]/XML-element[@id=2]/XML-element[@id=6]     float
  2     /document[@id=1]/XML-element[@id=2]/XML-element[@id=11]     div
all has 9 elements:
  0     /document[@id=1]/XML-element[0]     Root
  1     /document[@id=1]/XML-element[0]/XML-element[0]     ce:float
  2     /document[@id=1]/XML-element[0]/XML-element[1]     ce:float
  3     /document[@id=1]/XML-element[0]/XML-element[2]     float
  4     /document[@id=1]/XML-element[0]/XML-element[3]     float
  5     /document[@id=1]/XML-element[0]/XML-element[4]     ce:float
  6     /document[@id=1]/XML-element[0]/XML-element[5]     div
  7     /document[@id=1]/XML-element[0]/XML-element[5]/XML-element[0]     ce:float
  8     /document[@id=1]/XML-element[0]/XML-element[5]/XML-element[1]     float
floats has 3 elements:
  0     /document[@id=1]/XML-element[0]/XML-element[2]     float
  1     /document[@id=1]/XML-element[0]/XML-element[3]     float
  2     /document[@id=1]/XML-element[0]/XML-element[5]/XML-element[1]     float
cefloats has 4 elements:
  0     /document[@id=1]/XML-element[0]/XML-element[0]     ce:float
  1     /document[@id=1]/XML-element[0]/XML-element[1]     ce:float
  2     /document[@id=1]/XML-element[0]/XML-element[4]     ce:float
  3     /document[@id=1]/XML-element[0]/XML-element[5]/XML-element[0]     ce:float
anyfloats has 7 elements:
  0     /document[@id=1]/XML-element[0]/XML-element[0]     ce:float
  1     /document[@id=1]/XML-element[0]/XML-element[1]     ce:float
  2     /document[@id=1]/XML-element[0]/XML-element[2]     float
  3     /document[@id=1]/XML-element[0]/XML-element[3]     float
  4     /document[@id=1]/XML-element[0]/XML-element[4]     ce:float
  5     /document[@id=1]/XML-element[0]/XML-element[5]/XML-element[0]     ce:float
  6     /document[@id=1]/XML-element[0]/XML-element[5]/XML-element[1]     float
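The counts above can be reproduced outside InDesign with a small mock tree. This is only a sketch: the node() helper is hypothetical and uses plain arrays where InDesign exposes an xmlElements collection, but the traverse() logic is the same predicate-driven depth-first walk.

```javascript
// Hypothetical stand-in for an InDesign XMLElement: a tag name plus child nodes.
function node(name, children) {
    return { markupTag: { name: name }, xmlElements: children || [] };
}

// Depth-first walk that collects every node for which predicate p returns true.
function traverse(t, p) {
    var r = [], i;
    if (p(t)) {
        r.push(t);
    }
    for (i = 0; i < t.xmlElements.length; i++) {
        r = r.concat(traverse(t.xmlElements[i], p));
    }
    return r;
}

// Same structure as the sample XML above.
var root = node("Root", [
    node("ce:float"), node("ce:float"), node("float"), node("float"),
    node("ce:float"),
    node("div", [node("ce:float"), node("float")])
]);

var all = traverse(root, function () { return true; });
var floats = traverse(root, function (n) { return n.markupTag.name === "float"; });
var cefloats = traverse(root, function (n) { return n.markupTag.name === "ce:float"; });
var anyfloats = traverse(root, function (n) { return /float$/.test(n.markupTag.name); });
```

Because the walk never calls XPath, it does not depend on how namespaces are declared in the document.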

Report · Mar 01, 2011

OK, I found the correct solution!

Please note, Suresh, I would have found it a lot faster if you had included the XML of your document. I wish I had asked you to do this, but I didn't.

In order for InDesign to deal properly with namespaces, it is critical that the namespace be defined in an xmlns attribute.

(I discovered this because it turns out I was wrong when I said:

(It is disappointing that there's no method
to convert a subtree to a string XML representation
so you can feed it into better XML tools...).

There is such a thing: the exportFile(ExportFormat.XML, file) method of an XMLElement. And if I do that and try to import my above XML with an XML parser, either the one in InDesign's JavaScript environment (the XML constructor that implements E4X), or the one in Firefox, etc., I get an error about namespaces.

)
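The requirement that every prefix used in a tag name be declared by an xmlns attribute can be checked before running any XPath query. The sketch below works on plain arrays of strings; undeclaredPrefixes() and its helpers are hypothetical names, and in a real script the tag and attribute names would have to be gathered from the document first (for example from each element's markupTag.name).

```javascript
// Safe own-property check, so prefixes such as "constructor" are not misread.
function hasOwn(obj, key) {
    return Object.prototype.hasOwnProperty.call(obj, key);
}

// Returns the prefix of a qualified tag name, or "" when there is none.
function prefixOf(tagName) {
    var pos = tagName.indexOf(":");
    return pos === -1 ? "" : tagName.substring(0, pos);
}

// Lists each prefix used in tagNames that has no matching "xmlns:<prefix>"
// entry in rootAttributeNames.
function undeclaredPrefixes(tagNames, rootAttributeNames) {
    var declared = {}, seen = {}, missing = [], i, prefix;
    for (i = 0; i < rootAttributeNames.length; i++) {
        if (rootAttributeNames[i].indexOf("xmlns:") === 0) {
            declared[rootAttributeNames[i].substring(6)] = true;
        }
    }
    for (i = 0; i < tagNames.length; i++) {
        prefix = prefixOf(tagNames[i]);
        if (prefix !== "" && !hasOwn(declared, prefix) && !hasOwn(seen, prefix)) {
            seen[prefix] = true;
            missing.push(prefix);
        }
    }
    return missing;
}

var tags = ["Root", "ce:float", "float", "div", "ce:float"];
var missingBefore = undeclaredPrefixes(tags, []);           // no xmlns attribute yet
var missingAfter = undeclaredPrefixes(tags, ["xmlns:ce"]);  // after declaring ce
```

A non-empty result points at the prefixes that still need an xmlns declaration on the root element.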

I just added an xmlns attribute to the Root tag using Structure > New Attribute. This gives a Root element like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Root xmlns:ce="http://www.elsevier.com/xml/common/dtd">
  <ce:float>A</ce:float>

Suddenly, the XPath methods become usable.

For instance, "*" returns all of the child nodes, including the ce:floats:

dumplist(x.evaluateXPathExpression("*"))
Result: has 6 elements:
  0     /document[@id=1]/XML-element[@id=2]/XML-element[@id=3]     ce:float
  1     /document[@id=1]/XML-element[@id=2]/XML-element[@id=4]     ce:float
  2     /document[@id=1]/XML-element[@id=2]/XML-element[@id=5]     float
  3     /document[@id=1]/XML-element[@id=2]/XML-element[@id=6]     float
  4     /document[@id=1]/XML-element[@id=2]/XML-element[@id=8]     ce:float
  5     /document[@id=1]/XML-element[@id=2]/XML-element[@id=11]     div

Now, "//float" doesn't match ce:float nodes and "//ce:float" also does not match them. But that's OK, because an XPath predicate on name() works fine:

dumplist(x.evaluateXPathExpression("//*[name()='ce:float']"))
Result: has 4 elements:
  0     /document[@id=1]/XML-element[@id=2]/XML-element[@id=3]     ce:float
  1     /document[@id=1]/XML-element[@id=2]/XML-element[@id=4]     ce:float
  2     /document[@id=1]/XML-element[@id=2]/XML-element[@id=8]     ce:float
  3     /document[@id=1]/XML-element[@id=2]/XML-element[@id=11]/XML-element[@id=13]     ce:float

And if you want all the floats regardless of namespace, you can use XPath's local-name() function:

dumplist(x.evaluateXPathExpression("//*[local-name()='float']"))
Result: has 7 elements:
  0     /document[@id=1]/XML-element[@id=2]/XML-element[@id=3]     ce:float
  1     /document[@id=1]/XML-element[@id=2]/XML-element[@id=4]     ce:float
  2     /document[@id=1]/XML-element[@id=2]/XML-element[@id=5]     float
  3     /document[@id=1]/XML-element[@id=2]/XML-element[@id=6]     float
  4     /document[@id=1]/XML-element[@id=2]/XML-element[@id=8]     ce:float
  5     /document[@id=1]/XML-element[@id=2]/XML-element[@id=11]/XML-element[@id=12]     float
  6     /document[@id=1]/XML-element[@id=2]/XML-element[@id=11]/XML-element[@id=13]     ce:float
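The difference between the two predicates comes down to whether the prefix is part of the comparison. As a rough illustration in plain JavaScript (not InDesign's XPath engine; splitQName() and the two filter functions are hypothetical helpers), name() compares the whole qualified name while local-name() compares only the part after the colon:

```javascript
// Splits "ce:float" into { prefix: "ce", local: "float" };
// unprefixed names get an empty prefix.
function splitQName(qname) {
    var pos = qname.indexOf(":");
    if (pos === -1) {
        return { prefix: "", local: qname };
    }
    return { prefix: qname.substring(0, pos), local: qname.substring(pos + 1) };
}

// Mimics //*[name()='...']: the full qualified name must match.
function filterByName(tagNames, wanted) {
    var out = [], i;
    for (i = 0; i < tagNames.length; i++) {
        if (tagNames[i] === wanted) {
            out.push(tagNames[i]);
        }
    }
    return out;
}

// Mimics //*[local-name()='...']: only the part after the prefix must match.
function filterByLocalName(tagNames, wanted) {
    var out = [], i;
    for (i = 0; i < tagNames.length; i++) {
        if (splitQName(tagNames[i]).local === wanted) {
            out.push(tagNames[i]);
        }
    }
    return out;
}

// Tag names of the sample document's descendants, in document order.
var tagNames = ["ce:float", "ce:float", "float", "float", "ce:float",
                "div", "ce:float", "float"];
var byName = filterByName(tagNames, "ce:float");
var byLocalName = filterByLocalName(tagNames, "float");
```

With the sample document's tag names, the first filter keeps the four ce:float entries and the second keeps all seven floats, matching the two results above.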

I was going to offer you some code (again, JavaScript, not directly relevant to you) to export the XML node tree and then use E4X on it, but that's pointless, since E4X needs the same xmlns attribute that InDesign's object model needs. Adding it kills two birds with one stone.

Does this solve your problem, Suresh?

Report · Mar 01, 2011

Hi John,

Thank you very much first for all your help.

Please find the below indesign xml structure.

Yes, it is true that you have to add the namespace to the root element if your XML nodes contain a namespace prefix. Since it was already added in my ID structure, it seems that the VB.NET DOM doesn't support collecting XML nodes with namespaces.

I am still working on finding the solution. Let me try with a different namespace and a different, simpler XML file. I will get back to you shortly.

I will check the traverse function with my file. If it works, then that is fine too.

I'm happy that at least it is working in JavaScript.

Report · Mar 01, 2011

Hi, Suresh:

Please find the below indesign xml structure.

If it's not too much to ask, can you please export it as XML and post that? I'm happy to import your structure and try it, but I'm not going to type the whole thing in from the screenshot...

Yes, it is true that you have to add the namespace to the root element if your XML nodes contain a namespace prefix. Since it was already added in my ID structure, it seems that the VB.NET DOM doesn't support collecting XML nodes with namespaces.

I don't think that is really possible. The DOM is the same for all the scripting languages. It's just a different skin. Note that I did not use the optional namespace prefix options to the evaluateXPathExpression() call.

I am still working on finding the solution. Let me try with a different namespace and a different, simpler XML file. I will get back to you shortly.

Great.

I will check the traverse function with my file. If it works, then that is fine too.

Honestly, I don't see how it could fail. Since it traverses the tree by hand...

Oh, I guess I should mention -- I was doing all my tests in CS5. But I just now went and confirmed that they work in CS4 also, so that's not the problem.

Report · Apr 13, 2011

I was recently reminded of this thread...I think that in early March Suresh figured out the problem and it had to do with Adobe's handling of XML namespaces, and it turned out he needed to adjust the declaration of XML namespaces.

Suresh, could you please post your solution so others can benefit?

Report · Apr 13, 2011

John, forgive me for knowing nothing of XML… but if you need to edit/fix it for use in ID, there is a JavaScript XML object… about a dozen pages cover it in the tool kit guide. It's just that I don't see you using it, although I wouldn't know how… Just an observation in case it had been overlooked…

Report · Apr 13, 2011

Hey, MuppetMark.

That almost-native JavaScript XML support is called E4X. But it is completely separate from the InDesign DOM.

That is, it lets you construct a series of XML objects from an external XML file or from a string of XML data, and then lets you manipulate that data as you see fit (traverse the tree, perform some operations, etc.). But it has no connection with InDesign documents, the InDesign object model, or items that are tagged with XML inside your InDesign document.

That said, one solution to some problems is to export InDesign's XML into a file (or string), and then process it with E4X.

But that's not usually what you want to be doing, and it doesn't help for a lot of things, where you need to be able to refer to InDesign DOM objects (like a text frame) without exporting it (because if you export to XML, you lose most of the attributes beyond the text contents of the frame -- e.g. its position on the page, just for starters).

So what's E4X useful for?

Well, suppose you had a script that read a CSV file and created one textFrame per row at the locations specified by the columns of the row. You could have it read an XML file instead of a CSV file. You would use E4X for that.

(You're right, in another recent thread I mentioned you could use an external XSLT processor or XPath evaluator; you could also use E4X for that...but I don't think you would want to.)

Report · Jan 24, 2013

What changes need to be made in the InDesign file to access an XML element with a namespace?

Report · Feb 12, 2015

Hi,

This will return the elements whose names carry a namespace prefix.

var myDocument = app.activeDocument;
var alltags = [];
var i1 = 0;
var myXPath = "//*";
var myRuleSet = [new ProcessProduct()];
// Note: as written, ^ anchors only the first alternative and $ only the last.
var patt1 = new RegExp("^ce:section-title|ce:table|ce:figure|ce:textbox$");
//var patt1 = new RegExp(arguments[0]);

with (myDocument) {
    try {
        var elements = xmlElements;
        __processRuleSet(elements.item(0), myRuleSet);
    } catch (err) {
        alert(err);
    }
}

output();

// Returns the collected elements as the script result.
function output() {
    return alltags;
}

// XML rule: visits every element ("//*") and keeps those whose tag name matches patt1.
function ProcessProduct() {
    this.name = "ProcessProduct";
    this.xpath = myXPath;
    this.apply = function (myElement, myRuleProcessor) {
        if (patt1.test(myElement.markupTag.name)) {
            //alltags[i1++] = myElement.markupTag.name;
            alltags[i1++] = myElement;
        }
        return true;
    };
}

// Helpers that run a rule set against the XML tree.
function __processRuleSet(root, ruleSet, prefixMappingTable) {
    var mainRProcessor = __makeRuleProcessor(ruleSet, prefixMappingTable);
    try {
        __processTree(root, mainRProcessor);
        __deleteRuleProcessor(mainRProcessor);
    } catch (e) {
        __deleteRuleProcessor(mainRProcessor);
        throw e;
    }
}

function __makeRuleProcessor(ruleSet, prefixMappingTable) {
    var pathArray = [];
    // Collect the XPath of every rule in the set.
    for (var i = 0; i < ruleSet.length; i++) {
        pathArray.push(ruleSet[i].xpath);
    }
    try {
        var ruleProcessor = app.xmlRuleProcessors.add(pathArray, prefixMappingTable);
    } catch (e) {
        throw e;
    }
    var rProcessor = new ruleProcessorObject(ruleSet, ruleProcessor);
    return rProcessor;
}

function ruleProcessorObject(ruleSet, ruleProcessor) {
    this.ruleSet = ruleSet;
    this.ruleProcessor = ruleProcessor;
}

function __deleteRuleProcessor(rProcessor) {
    rProcessor.ruleProcessor.remove();
    delete rProcessor.ruleProcessor;
    delete rProcessor.ruleSet;
    delete rProcessor;
}

function __processTree(root, rProcessor) {
    var ruleProcessor = rProcessor.ruleProcessor;
    try {
        var matchData = ruleProcessor.startProcessingRuleSet(root);
        __processMatchData(matchData, rProcessor);
        ruleProcessor.endProcessingRuleSet();
    } catch (e) {
        ruleProcessor.endProcessingRuleSet();
        throw e;
    }
}

function __processMatchData(matchData, rProcessor) {
    var ruleProcessor = rProcessor.ruleProcessor;
    var ruleSet = rProcessor.ruleSet;
    while (matchData != undefined) {
        var element = matchData.element;
        var matchRules = matchData.matchRules;
        var applyMatchedRules = true;
        // Apply each matched rule until one returns true or processing is halted.
        for (var i = 0; i < matchRules.length && applyMatchedRules && !ruleProcessor.halted; i++) {
            applyMatchedRules = (false == ruleSet[matchRules[i]].apply(element, rProcessor));
        }
        matchData = ruleProcessor.findNextMatch();
    }
}
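One detail of the patt1 expression in the script above is worth checking against your own tag names: alternation binds more loosely than the ^ and $ anchors, so only the first alternative is anchored at the start and only the last at the end. The sketch below compares the pattern as posted with a grouped variant; the extra tag names such as ce:table-footnote are made up for illustration.

```javascript
// The pattern as posted: "^" applies to ce:section-title only, "$" to ce:textbox only.
var loose = new RegExp("^ce:section-title|ce:table|ce:figure|ce:textbox$");
// Grouped variant: the whole tag name must equal one of the alternatives.
var strict = new RegExp("^(ce:section-title|ce:table|ce:figure|ce:textbox)$");

// Returns the names from list that the expression re matches.
function matching(re, list) {
    var out = [], i;
    for (i = 0; i < list.length; i++) {
        if (re.test(list[i])) {
            out.push(list[i]);
        }
    }
    return out;
}

// Hypothetical tag names, some of which merely contain one of the alternatives.
var names = ["ce:table", "ce:table-footnote", "xce:figure",
             "ce:textbox", "my-ce:textbox", "ce:section-title-group"];

var looseHits = matching(loose, names);
var strictHits = matching(strict, names);
```

Whether the looser behaviour is wanted depends on the document's tag set; the grouped form is the safer choice when only exact tag names should be collected.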

Regards,

Parthiban Paramanathan

Adobe Community
